summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2010-12-10inet6: Remove redundant unlikely()Tobias Klauser
IS_ERR() already implies unlikely(), so it can be omitted here. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10xfrm: Traffic Flow Confidentiality for IPv6 ESPMartin Willi
Add TFC padding to all packets smaller than the boundary configured on the xfrm state. If the boundary is larger than the PMTU, limit padding to the PMTU. Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10net/ipv6/udp.c: fix typo in flush_stack()Jiri Pirko
skb1 should be passed as parameter to sk_rcvqueues_full() here. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10ipv6: Fix 'release_it' logic in tcp_v6_get_peer()David S. Miller
We accidently set it to "true" for the case where we are using a route bound peer. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-09net: optimize INET input path furtherEric Dumazet
Followup of commit b178bb3dfc30 (net: reorder struct sock fields) Optimize INET input path a bit further, by : 1) moving sk_refcnt close to sk_lock. This reduces number of dirtied cache lines by one on 64bit arches (and 64 bytes cache line size). 2) moving inet_daddr & inet_rcv_saddr at the beginning of sk (same cache line than hash / family / bound_dev_if / nulls_node) This reduces number of accessed cache lines in lookups by one, and dont increase size of inet and timewait socks. inet and tw sockets now share same place-holder for these fields. Before patch : offsetof(struct sock, sk_refcnt) = 0x10 offsetof(struct sock, sk_lock) = 0x40 offsetof(struct sock, sk_receive_queue) = 0x60 offsetof(struct inet_sock, inet_daddr) = 0x270 offsetof(struct inet_sock, inet_rcv_saddr) = 0x274 After patch : offsetof(struct sock, sk_refcnt) = 0x44 offsetof(struct sock, sk_lock) = 0x48 offsetof(struct sock, sk_receive_queue) = 0x68 offsetof(struct inet_sock, inet_daddr) = 0x0 offsetof(struct inet_sock, inet_rcv_saddr) = 0x4 compute_score() (udp or tcp) now use a single cache line per ignored item, instead of two. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-09net: Abstract away all dst_entry metrics accesses.David S. Miller
Use helper functions to hide all direct accesses, especially writes, to dst_entry metrics values. This will allow us to: 1) More easily change how the metrics are stored. 2) Implement COW for metrics. In particular this will help us put metrics into the inetpeer cache if that is what we end up doing. We can make the _metrics member a pointer instead of an array, initially have it point at the read-only metrics in the FIB, and then on the first set grab an inetpeer entry and point the _metrics member there. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-12-08Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath/ath9k/ar9003_eeprom.c net/llc/af_llc.c
2010-12-02ipv6: use ND_REACHABLE_TIME and ND_RETRANS_TIMER instead of magic numberShan Wei
ND_REACHABLE_TIME and ND_RETRANS_TIMER have defined since v2.6.12-rc2, but never been used. So use them instead of magic number. This patch also changes original code style to read comfortably . Thank YOSHIFUJI Hideaki for pointing it out. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02tcp: Implement ipv6 ->get_peer() and ->tw_get_peer().David S. Miller
Now ipv6 timewait recycling is fully implemented and enabled. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02tcp: Add timewait recycling bits to ipv6 connect code.David S. Miller
This will also improve handling of ipv6 tcp socket request backlog when syncookies are not enabled. When backlog becomes very deep, last quarter of backlog is limited to validated destinations. Previously only ipv4 implemented this logic, but now ipv6 does too. Now we are only one step away from enabling timewait recycling for ipv6, and that step is simply filling in the implementation of tcp_v6_get_peer() and tcp_v6_tw_get_peer(). Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02ipv6: Create inet6_csk_route_req().David S. Miller
Brother of ipv4's inet_csk_route_req(). Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01timewait_sock: Create and use getpeer op.David S. Miller
The only thing AF-specific about remembering the timestamp for a time-wait TCP socket is getting the peer. Abstract that behind a new timewait_sock_ops vector. Support for real IPV6 sockets is not filled in yet, but curiously this makes timewait recycling start to work for v4-mapped ipv6 sockets. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01net/ipv6/sit.c: return unhandled skb to tunnel4_rcvDavid McCullough
I found a problem using an IPv6 over IPv4 tunnel. When CONFIG_IPV6_SIT was enabled, the packets would be rejected as net/ipv6/sit.c was catching all IPPROTO_IPV6 packets and returning an ICMP port unreachable error. I think this patch fixes the problem cleanly. I believe the code in net/ipv4/tunnel4.c:tunnel4_rcv takes care of it properly if none of the handlers claim the skb. Signed-off-by: David McCullough <david_mccullough@mcafee.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01Make the ip6_tunnel reflect the true mtu.Anders Franzen
The ip6_tunnel always assumes it consumes 40 bytes (ip6 hdr) of the mtu of the underlaying device. So for a normal ethernet bearer, the mtu of the ip6_tunnel is 1460. However, when creating a tunnel the encap limit option is enabled by default, and it consumes 8 bytes more, so the true mtu shall be 1452. I dont really know if this breaks some statement in some RFC, so this is a request for comments. Signed-off-by: Anders Franzen <anders.franzen@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inet: Turn ->remember_stamp into ->get_peer in connection AF ops.David S. Miller
Then we can make a completely generic tcp_remember_stamp() that uses ->get_peer() as a helper, minimizing the AF specific code and minimizing the eventual code duplication when we implement the ipv6 side of TW recycling. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30ipv6: Add infrastructure to bind inet_peer objects to routes.David S. Miller
They are only allowed on cached ipv6 routes. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28ipv6: Prepare the tree for un-inlined jhash.Jozsef Kadlecsik
jhash is widely used in the kernel and because the functions are inlined, the cost in size is significant. Also, the new jhash functions are slightly larger than the previous ones so better un-inline. As a preparation step, the calls to the internal macros are replaced with the plain jhash function calls. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28ipv6: kill two unused macro definitionShan Wei
1. IPV6_TLV_TEL_DST_SIZE This has not been using for several years since created. 2. RT6_INFO_LEN commit 33120b30 kill all RT6_INFO_LEN's references, but only this definition remained. commit 33120b30cc3b8665204d4fcde7288638b0dd04d5 Author: Alexey Dobriyan <adobriyan@sw.ru> Date: Tue Nov 6 05:27:11 2007 -0800 [IPV6]: Convert /proc/net/ipv6_route to seq_file interface Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-27rtnl: make link af-specific updates atomicThomas Graf
As David pointed out correctly, updates to af-specific attributes are currently not atomic. If multiple changes are requested and one of them fails, previous updates may have been applied already leaving the link behind in a undefined state. This patch splits the function parse_link_af() into two functions validate_link_af() and set_link_at(). validate_link_af() is placed to validate_linkmsg() check for errors as early as possible before any changes to the link have been made. set_link_af() is called to commit the changes later. This method is not fail proof, while it is currently sufficient to make set_link_af() inerrable and thus 100% atomic, the validation function method will not be able to detect all error scenarios in the future, there will likely always be errors depending on states which are f.e. not protected by rtnl_mutex and thus may change between validation and setting. Also, instead of silently ignoring unknown address families and config blocks for address families which did not register a set function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are returned to avoid comitting 4 out of 5 update requests without notifying the user. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24ipv6: mcast: RCU conversionEric Dumazet
ipv6_sk_mc_lock rwlock becomes a spinlock. readers (inet6_mc_check()) now takes rcu_read_lock() instead of read lock. Writers dont need to disable BH anymore. struct ipv6_mc_socklist objects are reclaimed after one RCU grace period. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22Net: ipv6: netfiliter: Makefile: Remove deprecated kbuild goal definitionsTracey Dent
Changed Makefile to use <modules>-y instead of <modules>-objs because -objs is deprecated and not mentioned in Documentation/kbuild/makefiles.txt. Signed-off-by: Tracey Dent <tdent48227@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22ipv6: fix missing in6_ifa_put in addrconfJohn Fastabend
Fix ref count bug introduced by commit 2de795707294972f6c34bae9de713e502c431296 Author: Lorenzo Colitti <lorenzo@google.com> Date: Wed Oct 27 18:16:49 2010 +0000 ipv6: addrconf: don't remove address state on ifdown if the address is being kept Fix logic so that addrconf_ifdown() decrements the inet6_ifaddr refcnt correctly with in6_ifa_put(). Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-19Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c net/core/net-sysfs.c net/ipv6/addrconf.c
2010-11-18ipv6: Expose reachable and retrans timer values as msecsThomas Graf
Expose reachable and retrans timer values in msecs instead of jiffies. Both timer values are already exposed as msecs in the neighbour table netlink interface. The creation timestamp format with increased precision is kept but cleaned up. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18ipv6: Expose IFLA_PROTINFO timer values in msecs instead of jiffiesThomas Graf
IFLA_PROTINFO exposes timer related per device settings in jiffies. Change it to expose these values in msecs like the sysctl interface does. I did not find any users of IFLA_PROTINFO which rely on any of these values and even if there are, they are likely already broken because there is no way for them to reliably convert such a value to another time format. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17net: use the macros defined for the members of flowiChangli Gao
Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17ipv6: AF_INET6 link address familyThomas Graf
IPv6 already exposes some address family data via netlink in the IFLA_PROTINFO attribute if RTM_GETLINK request is sent with the address family set to AF_INET6. We take over this format and reuse all the code. Signed-off-by: Thomas Graf <tgraf@infradead.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16udp: use atomic_inc_not_zero_hintEric Dumazet
UDP sockets refcount is usually 2, unless an incoming frame is going to be queued in receive or backlog queue. Using atomic_inc_not_zero_hint() permits to reduce latency, because processor issues less memory transactions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16ipv6: fix missing in6_ifa_put in addrconfJohn Fastabend
Fix ref count bug introduced by commit 2de795707294972f6c34bae9de713e502c431296 Author: Lorenzo Colitti <lorenzo@google.com> Date: Wed Oct 27 18:16:49 2010 +0000 ipv6: addrconf: don't remove address state on ifdown if the address is being kept Fix logic so that addrconf_ifdown() decrements the inet6_ifaddr refcnt correctly with in6_ifa_put(). Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15net/ipv6/mcast.c: Remove unnecessary semicolonsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12ipv6: Warn users if maximum number of routes is reached.Ben Greear
This gives users at least some clue as to what the problem might be and how to go about fixing it. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12ipv6: addrconf: don't remove address state on ifdown if the address is being ↵Lorenzo Colitti
kept Currently, addrconf_ifdown does not delete statically configured IPv6 addresses when the interface is brought down. The intent is that when the interface comes back up the address will be usable again. However, this doesn't actually work, because the system stops listening on the corresponding solicited-node multicast address, so the address cannot respond to neighbor solicitations and thus receive traffic. Also, the code notifies the rest of the system that the address is being deleted (e.g, RTM_DELADDR), even though it is not. Fix it so that none of this state is updated if the address is being kept on the interface. Tested: Added a statically configured IPv6 address to an interface, started ping, brought link down, brought link up again. When link came up ping kept on going and "ip -6 maddr" showed that the host was still subscribed to there Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6David S. Miller
2010-11-12netfilter: ipv6: fix overlap check for fragmentsShan Wei
The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int, and offset is int. Without this patch, type conversion occurred to this expression, when (FRAG6_CB(prev)->offset + prev->len) is less than offset. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-08ipv6: fix overlap check for fragmentsShan Wei
The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int, and offset is int. Without this patch, type conversion occurred to this expression, when (FRAG6_CB(prev)->offset + prev->len) is less than offset. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03netfilter: ip6_tables: fix information leak to userspaceJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03net dst: fix percpu_counter list corruption and poison overwrittenXiaotian Feng
There're some percpu_counter list corruption and poison overwritten warnings in recent kernel, which is resulted by fc66f95c. commit fc66f95c switches to use percpu_counter, in ip6_route_net_init, kernel init the percpu_counter for dst entries, but, the percpu_counter is never destroyed in ip6_route_net_exit. So if the related data is freed by kernel, the freed percpu_counter is still on the list, then if we insert/remove other percpu_counter, list corruption resulted. Also, if the insert/remove option modifies the ->prev,->next pointer of the freed value, the poison overwritten is resulted then. With the following patch, the percpu_counter list corruption and poison overwritten warnings disappeared. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-30ipv6/udp: report SndbufErrors and RcvbufErrorsEric Dumazet
commit a18135eb9389 (Add UDP_MIB_{SND,RCV}BUFERRORS handling.) forgot to make the necessary changes in net/ipv6/proc.c to report additional counters in /proc/net/snmp6 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27tunnels: Fix tunnels change rcu protectionPavel Emelyanov
After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27net: add __rcu annotations to protocolEric Dumazet
Add __rcu annotations to : struct net_protocol *inet_protos struct net_protocol *inet6_protos And use appropriate casts to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27ipv6: fix refcnt problem related to POSTDAD stateUrsula Braun
After running this bonding setup script modprobe bonding miimon=100 mode=0 max_bonds=1 ifconfig bond0 10.1.1.1/16 ifenslave bond0 eth1 ifenslave bond0 eth3 on s390 with qeth-driven slaves, modprobe -r fails with this message unregister_netdevice: waiting for bond0 to become free. Usage count = 1 due to twice detection of duplicate address. Problem is caused by a missing decrease of ifp->refcnt in addrconf_dad_failure. An extra call of in6_ifa_put(ifp) solves it. Problem has been introduced with commit f2344a131bccdbfc5338e17fa71a807dee7944fa. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26IPv6: Temp addresses are immediately deleted.Glenn Wurster
There is a bug in the interaction between ipv6_create_tempaddr and addrconf_verify. Because ipv6_create_tempaddr uses the cstamp and tstamp from the public address in creating a private address, if we have not received a router advertisement in a while, tstamp + temp_valid_lft might be < now. If this happens, the new address is created inside ipv6_create_tempaddr, then the loop within addrconf_verify starts again and the address is immediately deleted. We are left with no temporary addresses on the interface, and no more will be created until the public IP address is updated. To avoid this, set the expiry time to be the minimum of the time left on the public address or the config option PLUS the current age of the public interface. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26IPv6: Create temporary address if none exists.Glenn Wurster
If privacy extentions are enabled, but no current temporary address exists, then create one when we get a router advertisement. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26netfilter: Add missing CONFIG_SYSCTL checks in ipv6's nf_conntrack_reasm.cDavid S. Miller
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25net: add __rcu annotation to sk_filterEric Dumazet
Add __rcu annotation to : (struct sock)->sk_filter And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25netfilter: fix module dependency issues with IPv6 defragmentation, ip6tables ↵KOVACS Krisztian
and xt_TPROXY One of the previous tproxy related patches split IPv6 defragmentation and connection tracking, but did not correctly add Kconfig stanzas to handle the new dependencies correctly. This patch fixes that by making the config options mirror the setup we have for IPv4: a distinct config option for defragmentation that is automatically selected by both connection tracking and xt_TPROXY/xt_socket. The patch also changes the #ifdefs enclosing IPv6 specific code in xt_socket and xt_TPROXY: we only compile these in case we have ip6tables support enabled. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25tunnels: add _rcu annotationsEric Dumazet
(struct ip6_tnl)->next is rcu protected : (struct ip_tunnel)->next is rcu protected : (struct xfrm6_tunnel)->next is rcu protected : add __rcu annotation and proper rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24tproxy: Add missing CAP_NET_ADMIN check to ipv6 sideBalazs Scheidler
IP_TRANSPARENT requires root (more precisely CAP_NET_ADMIN privielges) for IPV6. However as I see right now this check was missed from the IPv6 implementation. Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24ip6_tunnel dont update the mtu on the route.Anders Franzen
The ip6_tunnel device did not unset the flag, IFF_XMIT_DST_RELEASE. This will make the dev layer to release the dst before calling the tunnel. The tunnel will not update any mtu/pmtu info, since it does not have a dst on the skb. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6