summaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)Author
2010-11-18bonding: IGMP handling cleanupEric Dumazet
Instead of iterating in_dev->mc_list from bonding driver, its better to call a helper function provided by igmp.c Details of implementation (locking) are private to igmp code. ip_mc_rejoin_group(struct ip_mc_list *im) becomes ip_mc_rejoin_groups(struct in_device *in_dev); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17net: ipv4: tcp_probe: cleanup snprintf() useVasiliy Kulikov
snprintf() returns number of bytes that were copied if there is no overflow. This code uses return value as number of copied bytes. Theoretically format string '%lu.%09lu %pI4:%u %pI4:%u %d %#x %#x %u %u %u %u\n' may be expanded up to 163 bytes. In reality tv.tv_sec is just few bytes instead of 20, 2 ports are just 5 bytes each instead of 10, length is 5 bytes instead of 10. The rest is an unstrusted input. Theoretically if tv_sec is big then copy_to_user() would overflow tbuf. tbuf was increased to fit in 163 bytes. snprintf() is used to follow return value semantic. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17net: use the macros defined for the members of flowiChangli Gao
Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17ipv4: AF_INET link address familyThomas Graf
Implements the AF_INET link address family exposing the per device configuration settings via netlink using the attribute IFLA_INET_CONF. The format of IFLA_INET_CONF differs depending on the direction the attribute is sent. The attribute sent by the kernel consists of a u32 array, basically a 1:1 copy of in_device->cnf.data[]. The attribute expected by the kernel must consist of a sequence of nested u32 attributes, each representing a change request, e.g. [IFLA_INET_CONF] = { [IPV4_DEVCONF_FORWARDING] = 1, [IPV4_DEVCONF_NOXFRM] = 0, } libnl userspace API documentation and example available from: http://www.infradead.org/~tgr/libnl/doc-git/group__link__inet.html Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17network: tcp_connect should return certain errors up the stackEric Paris
The current tcp_connect code completely ignores errors from sending an skb. This makes sense in many situations (like -ENOBUFFS) but I want to be able to immediately fail connections if they are denied by the SELinux netfilter hook. Netfilter does not normally return ECONNREFUSED when it drops a packet so we respect that error code as a final and fatal error that can not be recovered. Based-on-patch-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16xfrm: update flowi saddr in icmp_send if unsetUlrich Weber
otherwise xfrm_lookup will fail to find correct policy Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16udp: use atomic_inc_not_zero_hintEric Dumazet
UDP sockets refcount is usually 2, unless an incoming frame is going to be queued in receive or backlog queue. Using atomic_inc_not_zero_hint() permits to reduce latency, because processor issues less memory transactions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15xfrm: use gre key as flow upper protocol infoTimo Teräs
The GRE Key field is intended to be used for identifying an individual traffic flow within a tunnel. It is useful to be able to have XFRM policy selector matches to have different policies for different GRE tunnels. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15ipv4: Fix build with multicast disabled.David S. Miller
net/ipv4/igmp.c: In function 'ip_mc_inc_group': net/ipv4/igmp.c:1228: error: implicit declaration of function 'for_each_pmc_rtnl' net/ipv4/igmp.c:1228: error: expected ';' before '{' token net/ipv4/igmp.c: In function 'ip_mc_unmap': net/ipv4/igmp.c:1333: error: expected ';' before 'igmp_group_dropped' ... Move for_each_pmc_rcu and for_each_pmc_rtnl macro definitions outside of multicast ifdef protection. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-14Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-11-12tcp: Don't change unlocked socket state in tcp_v4_err().David S. Miller
Alexey Kuznetsov noticed a regression introduced by commit f1ecd5d9e7366609d640ff4040304ea197fbc618 ("Revert Backoff [v3]: Revert RTO on ICMP destination unreachable") The RTO and timer modification code added to tcp_v4_err() doesn't check sock_owned_by_user(), which if true means we don't have exclusive access to the socket and therefore cannot modify it's critical state. Just skip this new code block if sock_owned_by_user() is true and eliminate the now superfluous sock_owned_by_user() code block contained within. Reported-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net> CC: Damian Lukowski <damian@tvk.rwth-aachen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-11-12igmp: RCU conversion of in_dev->mc_listEric Dumazet
in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock). This can easily be converted to a RCU protection. Writers hold RTNL, so mc_list_lock is removed, not replaced by a spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cypher Wu <cypher.w@gmail.com> Cc: Américo Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11ipv4: Make rt->fl.iif tests lest obscure.David S. Miller
When we test rt->fl.iif against zero, we're seeing if it's an output or an input route. Make that explicit with some helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11net: get rid of rtable->idevEric Dumazet
It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-10tcp: Increase TCP_MAXSEG socket option minimum.David S. Miller
As noted by Steve Chen, since commit f5fff5dc8a7a3f395b0525c02ba92c95d42b7390 ("tcp: advertise MSS requested by user") we can end up with a situation where tcp_select_initial_window() does a divide by a zero (or even negative) mss value. The problem is that sometimes we effectively subtract TCPOLEN_TSTAMP_ALIGNED and/or TCPOLEN_MD5SIG_ALIGNED from the mss. Fix this by increasing the minimum from 8 to 64. Reported-by: Steve Chen <schen@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-10net: avoid limits overflowEric Dumazet
Robin Holt tried to boot a 16TB machine and found some limits were reached : sysctl_tcp_mem[2], sysctl_udp_mem[2] We can switch infrastructure to use long "instead" of "int", now atomic_long_t primitives are available for free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Robin Holt <holt@sgi.com> Reviewed-by: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-09net/ipv4/tcp.c: Update WARN usesJoe Perches
Coalesce long formats. Align arguments. Remove KERN_<level>. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-09inet: fix ip_mc_drop_socket()Eric Dumazet
commit 8723e1b4ad9be4444 (inet: RCU changes in inetdev_by_index()) forgot one call site in ip_mc_drop_socket() We should not decrease idev refcount after inetdev_by_index() call, since refcount is not increased anymore. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-04inet_diag: Make sure we actually run the same bytecode we audited.Nelson Elhage
We were using nlmsg_find_attr() to look up the bytecode by attribute when auditing, but then just using the first attribute when actually running bytecode. So, if we received a message with two attribute elements, where only the second had type INET_DIAG_REQ_BYTECODE, we would validate and run different bytecode strings. Fix this by consistently using nlmsg_find_attr everywhere. Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-04fib: fib_result_assign() should not change fib refcountsEric Dumazet
After commit ebc0ffae5 (RCU conversion of fib_lookup()), fib_result_assign() should not change fib refcounts anymore. Thanks to Michael who did the bisection and bug report. Reported-by: Michael Ellerman <michael@ellerman.id.au> Tested-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-11-03ipv4: netfilter: ip_tables: fix information leak to userlandVasiliy Kulikov
Structure ipt_getinfo is copied to userland with the field "name" that has the last elements unitialized. It leads to leaking of contents of kernel stack memory. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-03ipv4: netfilter: arp_tables: fix information leak to userlandVasiliy Kulikov
Structure arpt_getinfo is copied to userland with the field "name" that has the last elements unitialized. It leads to leaking of contents of kernel stack memory. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-30ip_gre: fix fallback tunnel setupEric Dumazet
Before making the fallback tunnel visible to lookups, we should make sure it is completely setup, once ipgre_tunnel_init() had been called and tstats per_cpu pointer allocated. move rcu_assign_pointer(ign->tunnels_wc[0], tunnel); from ipgre_fb_tunnel_init() to ipgre_init_net() Based on a patch from Pavel Emelyanov Reported-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-29netfilter: nf_nat: fix compiler warning with CONFIG_NF_CT_NETLINK=nPatrick McHardy
net/ipv4/netfilter/nf_nat_core.c:52: warning: 'nf_nat_proto_find_get' defined but not used net/ipv4/netfilter/nf_nat_core.c:66: warning: 'nf_nat_proto_put' defined but not used Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-28fib: Fix fib zone and its hash leak on namespace stopPavel Emelyanov
When we stop a namespace we flush the table and free one, but the added fn_zone-s (and their hashes if grown) are leaked. Need to free. Tries releases all its stuff in the flushing code. Shame on us - this bug exists since the very first make-fib-per-net patches in 2.6.27 :( Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27tunnels: Fix tunnels change rcu protectionPavel Emelyanov
After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27inetpeer: __rcu annotationsEric Dumazet
Adds __rcu annotations to inetpeer (struct inet_peer)->avl_left (struct inet_peer)->avl_right This is a tedious cleanup, but removes one smp_wmb() from link_to_pool() since we now use more self documenting rcu_assign_pointer(). Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in all cases we dont need a memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27tunnels: add __rcu annotationsEric Dumazet
Add __rcu annotations to : (struct ip_tunnel)->prl (struct ip_tunnel_prl_entry)->next (struct xfrm_tunnel)->next struct xfrm_tunnel *tunnel4_handlers struct xfrm_tunnel *tunnel64_handlers And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27net: add __rcu annotations to protocolEric Dumazet
Add __rcu annotations to : struct net_protocol *inet_protos struct net_protocol *inet6_protos And use appropriate casts to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27ipv4: add __rcu annotations to routes.cEric Dumazet
Add __rcu annotations to : (struct dst_entry)->rt_next (struct rt_hash_bucket)->chain And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26fib_hash: fix rcu sparse and logical errorsEric Dumazet
While fixing CONFIG_SPARSE_RCU_POINTER errors, I had to fix accesses to fz->fz_hash for real. - &fz->fz_hash[fn_hash(f->fn_key, fz)] + rcu_dereference(fz->fz_hash) + fn_hash(f->fn_key, fz) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25ipv4: add __rcu annotations to ip_ra_chainEric Dumazet
Add __rcu annotations to : (struct ip_ra_chain)->next struct ip_ra_chain *ip_ra_chain; And use appropriate rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25net: add __rcu annotation to sk_filterEric Dumazet
Add __rcu annotation to : (struct sock)->sk_filter And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25tunnels: add _rcu annotationsEric Dumazet
(struct ip6_tnl)->next is rcu protected : (struct ip_tunnel)->next is rcu protected : (struct xfrm6_tunnel)->next is rcu protected : add __rcu annotation and proper rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing|ed] => interest[ing|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c
2010-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits) bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL. vlan: Calling vlan_hwaccel_do_receive() is always valid. tproxy: use the interface primary IP address as a default value for --on-ip tproxy: added IPv6 support to the socket match cxgb3: function namespace cleanup tproxy: added IPv6 support to the TPROXY target tproxy: added IPv6 socket lookup function to nf_tproxy_core be2net: Changes to use only priority codes allowed by f/w tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled tproxy: added tproxy sockopt interface in the IPV6 layer tproxy: added udp6_lib_lookup function tproxy: added const specifiers to udp lookup functions tproxy: split off ipv6 defragmentation to a separate module l2tp: small cleanup nf_nat: restrict ICMP translation for embedded header can: mcp251x: fix generation of error frames can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set can-raw: add msg_flags to distinguish local traffic 9p: client code cleanup rds: make local functions/variables static ... Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and drivers/net/wireless/ath/ath9k/debug.c as per David
2010-10-22Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek
2010-10-21Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits) sched: fix RCU lockdep splat from task_group() rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value sched: suppress RCU lockdep splat in task_fork_fair net: suppress RCU lockdep false positive in sock_update_classid rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter rcu: Add tracing data to support queueing models rcu: fix sparse errors in rcutorture.c rcu: only one evaluation of arg in rcu_dereference_check() unless sparse kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC rcu: fix _oddness handling of verbose stall warnings rcu: performance fixes to TINY_PREEMPT_RCU callback checking rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes vhost: add __rcu annotations rcu: add comment stating that list_empty() applies to RCU-protected lists rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods rcu: document ways of stalling updates in low-memory situations rcu: repair code-duplication FIXMEs ...
2010-10-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2010-10-21Merge branch 'for-patrick' of ↵Patrick McHardy
git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6
2010-10-21nf_nat: restrict ICMP translation for embedded headerJulian Anastasov
Skip ICMP translation of embedded protocol header if NAT bits are not set. Needed for IPVS to see the original embedded addresses because for IPVS traffic the IPS_SRC_NAT_BIT and IPS_DST_NAT_BIT bits are not set. It happens when IPVS performs DNAT for client packets after using nf_conntrack_alter_reply to expect replies from real server. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21tproxy: fix hash locking issue when using port redirection in ↵Balazs Scheidler
__inet_inherit_port() When __inet_inherit_port() is called on a tproxy connection the wrong locks are held for the inet_bind_bucket it is added to. __inet_inherit_port() made an implicit assumption that the listener's port number (and thus its bind bucket). Unfortunately, if you're using the TPROXY target to redirect skbs to a transparent proxy that assumption is not true anymore and things break. This patch adds code to __inet_inherit_port() so that it can handle this case by looking up or creating a new bind bucket for the child socket and updates callers of __inet_inherit_port() to gracefully handle __inet_inherit_port() failing. Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>. See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21fib: introduce fib_alias_accessed() helperEric Dumazet
Perf tools session at NFWS 2010 pointed out a false sharing on struct fib_alias that can be avoided pretty easily, if we set FA_S_ACCESSED bit only if needed (ie : not already set) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21secmark: export secctx, drop secmark in procfsEric Paris
The current secmark code exports a secmark= field which just indicates if there is special labeling on a packet or not. We drop this field as it isn't particularly useful and instead export a new field secctx= which is the actual human readable text label. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-20net: avoid RCU for NOCACHE dstEric Dumazet
There is no point using RCU for dst we allocate for a very short time (used once). Change dst_release() to take DST_NOCACHE into account, but also change skb_dst_set_noref() to force a refcount increment for such dst. This is a _huge_ gain, because we dont waste memory to store xx thousand of dsts. Instead of queueing them to RCU, we can free them instantly. CPU caches can stay hot, re-using same memory blocks to hold temporary dsts. Note : remove unneeded smp_mb__before_atomic_dec(); in dst_release(), since atomic_dec_return() implies a full memory barrier. Stress test, 160.000.000 udp frames sent, IP route cache disabled (DDOS). Before: real 0m38.091s user 0m13.189s sys 7m53.018s After: real 0m29.946s user 0m12.157s sys 7m40.605s For reference, if IP route cache was enabled : real 0m32.030s user 0m10.521s sys 8m15.243s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-19inet: RCU changes in inetdev_by_index()Eric Dumazet
Convert inetdev_by_index() to not increment in_dev refcount. Callers hold RCU or RTNL, and should not decrement in_dev refcount. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-19net: avoid a dev refcount in ip_mc_find_dev()Eric Dumazet
We hold RTNL in ip_mc_find_dev(), no need to touch device refcount. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18IPv4: route.c: Change checks against 0xffffffff to ipv4_is_lbcast()Andy Walls
Change a few checks against the hardcoded broadcast address, 0xffffffff, to ipv4_is_lbcast(). Remove some existing checks using ipv4_is_lbcast() that are now obviously superfluous. Signed-off-by: Andy Walls <awalls@md.metrocast.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18netfilter: fix kconfig unmet dependency warningRandy Dunlap
Fix netfilter kconfig unmet dependencies warning & spell out "compatible" while there. warning: (IP_NF_TARGET_TTL && NET && INET && NETFILTER && IP_NF_IPTABLES && NETFILTER_ADVANCED || IP6_NF_TARGET_HL && NET && INET && IPV6 && NETFILTER && IP6_NF_IPTABLES && NETFILTER_ADVANCED) selects NETFILTER_XT_TARGET_HL which has unmet direct dependencies ((IP_NF_MANGLE || IP6_NF_MANGLE) && NETFILTER_ADVANCED) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>