summaryrefslogtreecommitdiff
path: root/net/xfrm
AgeCommit message (Collapse)Author
2014-02-21xfrm: Cleanup error handling of xfrm_state_cloneSteffen Klassert
The error pointer passed to xfrm_state_clone() is unchecked, so remove it and indicate an error by returning a null pointer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Clone states properly on migrationSteffen Klassert
We loose a lot of information of the original state if we clone it with xfrm_state_clone(). In particular, there is no crypto algorithm attached if the original state uses an aead algorithm. This patch add the missing information to the clone state. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Take xfrm_state_lock in xfrm_migrate_state_findSteffen Klassert
A comment on xfrm_migrate_state_find() says that xfrm_state_lock is held. This is apparently not the case, but we need it to traverse through the state lists. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Fix NULL pointer dereference on sub policy usageSteffen Klassert
xfrm_state_sort() takes the unsorted states from the src array and stores them into the dst array. We try to get the namespace from the dst array which is empty at this time, so take the namespace from the src array instead. Fixes: 283bc9f35bbbc ("xfrm: Namespacify xfrm state/policy locks") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-19xfrm: Remove caching of xfrm_policy_sk_bundlesSteffen Klassert
We currently cache socket policy bundles at xfrm_policy_sk_bundles. These cached bundles are never used. Instead we create and cache a new one whenever xfrm_lookup() is called on a socket policy. Most protocols cache the used routes to the socket, so let's remove the unused caching of socket policy bundles in xfrm. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-17ipsec: add support of limited SA dumpNicolas Dichtel
The goal of this patch is to allow userland to dump only a part of SA by specifying a filter during the dump. The kernel is in charge to filter SA, this avoids to generate useless netlink traffic (it save also some cpu cycles). This is particularly useful when there is a big number of SA set on the system. Note that I removed the union in struct xfrm_state_walk to fix a problem on arm. struct netlink_callback->args is defined as a array of 6 long and the first long is used in xfrm code to flag the cb as initialized. Hence, we must have: sizeof(struct xfrm_state_walk) <= sizeof(long) * 5. With the union, it was false on arm (sizeof(struct xfrm_state_walk) was sizeof(long) * 7), due to the padding. In fact, whatever the arch is, this union seems useless, there will be always padding after it. Removing it will not increase the size of this struct (and reduce it on arm). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-13xfrm: avoid creating temporary SA when there are no listenersHoria Geanta
In the case when KMs have no listeners, km_query() will fail and temporary SAs are garbage collected immediately after their allocation. This causes strain on memory allocation, leading even to OOM since temporary SA alloc/free cycle is performed for every packet and garbage collection does not keep up the pace. The sane thing to do is to make sure we have audience before temporary SA allocation. Signed-off-by: Horia Geanta <horia.geanta@freescale.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-12flowcache: Make flow cache name space awareFan Du
Inserting a entry into flowcache, or flushing flowcache should be based on per net scope. The reason to do so is flushing operation from fat netns crammed with flow entries will also making the slim netns with only a few flow cache entries go away in original implementation. Since flowcache is tightly coupled with IPsec, so it would be easier to put flow cache global parameters into xfrm namespace part. And one last thing needs to do is bumping flow cache genid, and flush flow cache should also be made in per net style. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-12xfrm: Don't prohibit AH from using ESN featureFan Du
Clear checking when user try to use ESN through netlink keymgr for AH. As only ESP and AH support ESN feature according to RFC. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-23Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit update from Eric Paris: "Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready. I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc rgb@redhat.com. He'll make fixing things his top priority" * git://git.infradead.org/users/eparis/audit: (50 commits) audit: whitespace fix in kernel-parameters.txt audit: fix location of __net_initdata for audit_net_ops audit: remove pr_info for every network namespace audit: Modify a set of system calls in audit class definitions audit: Convert int limit uses to u32 audit: Use more current logging style audit: Use hex_byte_pack_upper audit: correct a type mismatch in audit_syscall_exit() audit: reorder AUDIT_TTY_SET arguments audit: rework AUDIT_TTY_SET to only grab spin_lock once audit: remove needless switch in AUDIT_SET audit: use define's for audit version audit: documentation of audit= kernel parameter audit: wait_for_auditd rework for readability audit: update MAINTAINERS audit: log task info on feature change audit: fix incorrect set of audit_sock audit: print error message when fail to create audit socket audit: fix dangling keywords in audit_log_set_loginuid() output audit: log on errors from filter user rules ...
2014-01-14net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane
This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: net/xfrm/xfrm_policy.c Steffen Klassert says: ==================== This pull request has a merge conflict between commits be7928d20bab ("net: xfrm: xfrm_policy: fix inline not at beginning of declaration") and da7c224b1baa ("net: xfrm: xfrm_policy: silence compiler warning") from the net-next tree and commit 2f3ea9a95c58 ("xfrm: checkpatch erros with inline keyword position") from the ipsec-next tree. The version from net-next can be used, like it is done in linux-next. 1) Checkpatch cleanups, from Weilong Chen. 2) Fix lockdep complaints when pktgen is used with IPsec, from Fan Du. 3) Update pktgen to allow any combination of IPsec transport/tunnel mode and AH/ESP/IPcomp type, from Fan Du. 4) Make pktgen_dst_metrics static, Fengguang Wu. 5) Compile fix for pktgen when CONFIG_XFRM is not set, from Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13audit: convert all sessionid declaration to unsigned intEric Paris
Right now the sessionid value in the kernel is a combination of u32, int, and unsigned int. Just use unsigned int throughout. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-07net: xfrm: xfrm_policy: silence compiler warningYing Xue
Fix below compiler warning: net/xfrm/xfrm_policy.c:1644:12: warning: ‘xfrm_dst_alloc_copy’ defined but not used [-Wunused-function] Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07net: xfrm: xfrm_policy: fix inline not at beginning of declarationDaniel Borkmann
Fix three warnings related to: net/xfrm/xfrm_policy.c:1644:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] net/xfrm/xfrm_policy.c:1656:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] net/xfrm/xfrm_policy.c:1668:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] Just removing the inline keyword is sufficient as the compiler will decide on its own about inlining or not. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03{pktgen, xfrm} Introduce xfrm_state_lookup_byspi for pktgenFan Du
Introduce xfrm_state_lookup_byspi to find user specified by custom from "pgset spi xxx". Using this scheme, any flow regardless its saddr/daddr could be transform by SA specified with configurable spi. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Correct xfrm_state_lock usage in xfrm_stateonly_findFan Du
Acquiring xfrm_state_lock in process context is expected to turn BH off, as this lock is also used in BH context, namely xfrm state timer handler. Otherwise it surprises LOCKDEP with below messages. [ 81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 81.725194] [ 81.725211] ========================================================= [ 81.725212] [ INFO: possible irq lock inversion dependency detected ] [ 81.725215] 3.13.0-rc2+ #92 Not tainted [ 81.725216] --------------------------------------------------------- [ 81.725218] kpktgend_0/2780 just changed the state of lock: [ 81.725220] (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past: [ 81.725232] (&(&x->lock)->rlock){+.-...} [ 81.725232] [ 81.725232] and interrupts could create inverse lock ordering between them. [ 81.725232] [ 81.725235] [ 81.725235] other info that might help us debug this: [ 81.725237] Possible interrupt unsafe locking scenario: [ 81.725237] [ 81.725238] CPU0 CPU1 [ 81.725240] ---- ---- [ 81.725241] lock(xfrm_state_lock); [ 81.725243] local_irq_disable(); [ 81.725244] lock(&(&x->lock)->rlock); [ 81.725246] lock(xfrm_state_lock); [ 81.725248] <Interrupt> [ 81.725249] lock(&(&x->lock)->rlock); [ 81.725251] [ 81.725251] *** DEADLOCK *** [ 81.725251] [ 81.725254] no locks held by kpktgend_0/2780. [ 81.725255] [ 81.725255] the shortest dependencies between 2nd lock and 1st lock: [ 81.725269] -> (&(&x->lock)->rlock){+.-...} ops: 8 { [ 81.725274] HARDIRQ-ON-W at: [ 81.725276] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725282] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725284] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725289] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725292] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725300] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725303] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725305] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725308] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725313] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725316] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725329] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725333] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725338] IN-SOFTIRQ-W at: [ 81.725340] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 81.725342] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725344] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725347] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725349] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725352] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725355] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725358] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725360] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725363] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725365] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725368] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725370] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725373] INITIAL USE at: [ 81.725375] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725385] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725388] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725390] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725394] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725398] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725401] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725404] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725407] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725409] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725412] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725415] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725417] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725420] } [ 81.725421] ... key at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8 [ 81.725445] ... acquired at: [ 81.725446] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725449] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725452] [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140 [ 81.725454] [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50 [ 81.725456] [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0 [ 81.725458] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725465] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725468] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725471] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725476] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725479] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725482] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725484] [ 81.725486] -> (xfrm_state_lock){+.+...} ops: 11 { [ 81.725490] HARDIRQ-ON-W at: [ 81.725493] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725504] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725507] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725510] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725513] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725516] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725519] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725522] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725525] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725527] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725530] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725533] SOFTIRQ-ON-W at: [ 81.725534] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725537] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725539] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725541] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725544] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725547] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725550] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725555] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725565] INITIAL USE at: [ 81.725567] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725569] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725572] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725574] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725576] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725580] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725583] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725586] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725589] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725594] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725597] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725599] } [ 81.725600] ... key at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50 [ 81.725606] ... acquired at: [ 81.725607] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725609] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725611] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725614] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725616] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725627] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725629] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725632] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725635] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725637] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725640] [ 81.725641] [ 81.725641] stack backtrace: [ 81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92 [ 81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 81.725649] ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007 [ 81.725652] ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80 [ 81.725655] ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8 [ 81.725659] Call Trace: [ 81.725664] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 81.725667] [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0 [ 81.725670] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725672] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725675] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 81.725685] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725691] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725694] [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120 [ 81.725697] [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70 [ 81.725699] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725702] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725704] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725707] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725710] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725712] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725715] [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0 [ 81.725717] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725721] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725724] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725727] [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen] [ 81.725729] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 81.725733] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 81.725745] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 81.725751] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725753] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725757] [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 81.725759] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725762] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 81.725765] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725768] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with inline keyword positionWeilong Chen
Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: fix checkpatch errorWeilong Chen
Fix that "else should follow close brace '}'". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with space prohibitedWeilong Chen
Fix checkpatch error "space prohibited xxx". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with foo * barWeilong Chen
This patch clean up some checkpatch errors like this: ERROR: "foo * bar" should be "foo *bar" ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with spaceWeilong Chen
This patch cleanup some space errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-16xfrm: export verify_userspi_info for pkfey and netlink interfaceFan Du
In order to check against valid IPcomp spi range, export verify_userspi_info for both pfkey and netlink interface. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-16xfrm: check user specified spi for IPCompFan Du
IPComp connection between two hosts is broken if given spi bigger than 0xffff. OUTSPI=0x87 INSPI=0x11112 ip xfrm policy update dst 192.168.1.101 src 192.168.1.109 dir out action allow \ tmpl dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI ip xfrm policy update src 192.168.1.101 dst 192.168.1.109 dir in action allow \ tmpl src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI ip xfrm state add src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI \ comp deflate ip xfrm state add dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI \ comp deflate tcpdump can capture outbound ping packet, but inbound packet is dropped with XfrmOutNoStates errors. It looks like spi value used for IPComp is expected to be 16bits wide only. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Remove ancient sleeping when the SA is in acquire stateSteffen Klassert
We now queue packets to the policy if the states are not yet resolved, this replaces the ancient sleeping code. Also the sleeping can cause indefinite task hangs if the needed state does not get resolved. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Namespacify xfrm state/policy locksFan Du
By semantics, xfrm layer is fully name space aware, so will the locks, e.g. xfrm_state/pocliy_lock. Ensure exclusive access into state/policy link list for different name space with one global lock is not right in terms of semantics aspect at first place, as they are indeed mutually independent with each other, but also more seriously causes scalability problem. One practical scenario is on a Open Network Stack, more than hundreds of lxc tenants acts as routers within one host, a global xfrm_state/policy_lock becomes the bottleneck. But onces those locks are decoupled in a per-namespace fashion, locks contend is just with in specific name space scope, without causing additional SPD/SAD access delay for other name space. Also this patch improve scalability while as without changing original xfrm behavior. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Using the right namespace to migrate key infoFan Du
because the home agent could surely be run on a different net namespace other than init_net. The original behavior could lead into inconsistent of key info. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Try to honor policy index if it's supplied by userFan Du
xfrm code always searches for unused policy index for newly created policy regardless whether or not user space policy index hint supplied. This patch enables such feature so that using "ip xfrm ... index=xxx" can be used by user to set specific policy index. Currently this beahvior is broken, so this patch make it happen as expected. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-11-07net: move pskb_put() to core codeMathias Krause
This function has usage beside IPsec so move it to the core skbuff code. While doing so, give it some documentation and change its return type to 'unsigned char *' to be in line with skb_put(). Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/netconsole.c net/bridge/br_private.h Three mostly trivial conflicts. The net/bridge/br_private.h conflict was a function signature (argument addition) change overlapping with the extern removals from Joe Perches. In drivers/net/netconsole.c we had one change adjusting a printk message whilst another changed "printk(KERN_INFO" into "pr_info(". Lastly, the emulex change was a new inline function addition overlapping with Joe Perches's extern removals. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: net/xfrm/xfrm_policy.c Minor merge conflict in xfrm_policy.c, consisting of overlapping changes which were trivial to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/qmi_wwan.c include/net/dst.h Trivial merge conflicts, both were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21xfrm: Don't queue retransmitted packets if the original is still on the hostSteffen Klassert
It does not make sense to queue retransmitted packets if the original packet is still in some queue of this host. So add a check to xdst_queue_output() and drop the packet if the original packet is not yet sent. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com>
2013-10-21xfrm: use vmalloc_node() for percpu scratchesEric Dumazet
scratches are per cpu, we can use vmalloc_node() for proper NUMA affinity. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-19net: misc: Remove extern from function prototypesJoe Perches
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-18xfrm: prevent ipcomp scratch buffer race conditionMichal Kubecek
In ipcomp_compress(), sortirq is enabled too early, allowing the per-cpu scratch buffer to be rewritten by ipcomp_decompress() (called on the same CPU in softirq context) between populating the buffer and copying the compressed data to the skb. v2: as pointed out by Steffen Klassert, if we also move the local_bh_disable() before reading the per-cpu pointers, we can get rid of get_cpu()/put_cpu(). v3: removed ipcomp_decompress part (as explained by Herbert Xu, it cannot be called from process context), get rid of cpu variable (thanks to Eric Dumazet) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-08xfrm: check for a vaild skb in xfrm_policy_queue_processSteffen Klassert
We might dreference a NULL pointer if the hold_queue is empty, so add a check to avoid this. Bug was introduced with git commit a0073fe18 ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-08xfrm: Add refcount handling to queued policiesSteffen Klassert
We need to ensure that policies can't go away as long as the hold timer is armed, so take a refcont when we arm the timer and drop one if we delete it. Bug was introduced with git commit a0073fe18 ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-01xfrm: Simplify SA looking up when using wildcard sourceFan Du
__xfrm4/6_state_addr_check is a four steps check, all we need to do is checking whether the destination address match when looking SA using wildcard source address. Passing saddr from flow is worst option, as the checking needs to reach the fourth step while actually only one time checking will do the work. So, simplify this process by only checking destination address when using wildcard source address for looking up SAs. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-01xfrm: Force SA to be lookup again if SA in acquire stateFan Du
If SA is in the process of acquiring, which indicates this SA is more promising and precise than the fall back option, i.e. using wild card source address for searching less suitable SA. So, here bail out, and try again. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: include/net/xfrm.h Simple conflict between Joe Perches "extern" removal for function declarations in header files and the changes in Steffen's tree. Steffen Klassert says: ==================== Two patches that are left from the last development cycle. Manual merging of include/net/xfrm.h is needed. The conflict can be solved as it is currently done in linux-next. 1) We announce the creation of temporary acquire state via an asyc event, so the deletion should be annunced too. From Nicolas Dichtel. 2) The VTI tunnels do not real tunning, they just provide a routable IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-25xfrm: Fix aevent generation for each received packetThomas Egerer
If asynchronous events are enabled for a particular netlink socket, the notify function is called by the advance function. The notify function creates and dispatches a km_event if a replay timeout occurred, or at least replay_maxdiff packets have been received since the last asynchronous event has been sent. The function is supposed to return if neither of the two events were detected for a state, or replay_maxdiff is equal to zero. Replay_maxdiff is initialized in xfrm_state_construct to the value of the xfrm.sysctl_aevent_rseqth (2 by default), and updated if for a state if the netlink attribute XFRMA_REPLAY_THRESH is set. If, however, replay_maxdiff is set to zero, then all of the three notify implementations perform a break from the switch statement instead of checking whether a timeout occurred, and -- if not -- return. As a result an asynchronous event is generated for every replay update of a state that has a zero replay_maxdiff value. This patch modifies the notify functions such that they immediately return if replay_maxdiff has the value zero, unless a timeout occurred. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-17xfrm: Guard IPsec anti replay window against replay bitmapFan Du
For legacy IPsec anti replay mechanism: bitmap in struct xfrm_replay_state could only provide a 32 bits window size limit in current design, thus user level parameter sadb_sa_replay should honor this limit, otherwise misleading outputs("replay=244") by setkey -D will be: 192.168.25.2 192.168.22.2 esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000) E: aes-cbc 9a8d7468 7655cf0b 719d27be b0ddaac2 A: hmac-sha1 2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331 seq=0x00000000 replay=244 flags=0x00000000 state=mature created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013 diff: 22(s) hard: 30(s) soft: 26(s) last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s) current: 1408(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 22 hard: 0 soft: 0 sadb_seq=1 pid=4854 refcnt=0 192.168.22.2 192.168.25.2 esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000) E: aes-cbc 6485d990 f61a6bd5 e5660252 608ad282 A: hmac-sha1 0cca811a eb4fa893 c47ae56c 98f6e413 87379a88 seq=0x00000000 replay=244 flags=0x00000000 state=mature created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013 diff: 22(s) hard: 30(s) soft: 26(s) last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s) current: 1408(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 22 hard: 0 soft: 0 sadb_seq=0 pid=4854 refcnt=0 And also, optimizing xfrm_replay_check window checking by setting the desirable x->props.replay_window with only doing the comparison once for all when xfrm_state is first born. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-16xfrm: Fix replay size checking on async eventsSteffen Klassert
We pass the wrong netlink attribute to xfrm_replay_verify_len(). It should be XFRMA_REPLAY_ESN_VAL and not XFRMA_REPLAY_VAL as we currently doing. This causes memory corruptions if the replay esn attribute has incorrect length. Fix this by passing the right attribute to xfrm_replay_verify_len(). Reported-by: Michael Rossberg <michael.rossberg@tu-ilmenau.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c net/bridge/br_multicast.c net/ipv6/sit.c The conflicts were minor: 1) sit.c changes overlap with change to ip_tunnel_xmit() signature. 2) br_multicast.c had an overlap between computing max_delay using msecs_to_jiffies and turning MLDV2_MRC() into an inline function with a name using lowercase instead of uppercase letters. 3) stmmac had two overlapping changes, one which conditionally allocated and hooked up a dma_cfg based upon the presence of the pbl OF property, and another one handling store-and-forward DMA made. The latter of which should not go into the new of_find_property() basic block. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-28xfrm: Fix potential null pointer dereference in xdst_queue_outputSteffen Klassert
The net_device might be not set on the skb when we try refcounting. This leads to a null pointer dereference in xdst_queue_output(). It turned out that the refcount to the net_device is not needed after all. The dst_entry has a refcount to the net_device before we queue the skb, so it can't go away. Therefore we can remove the refcount on queueing to fix the null pointer dereference. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-26xfrm: announce deleation of temporary SANicolas Dichtel
Creation of temporary SA are announced by netlink, but there is no notification for the deletion. This patch fix this asymmetric situation. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-19xfrm: remove irrelevant comment in xfrm_input().Rami Rosen
This patch removes a comment in xfrm_input() which became irrelevant due to commit 2774c13, "xfrm: Handle blackhole route creation via afinfo". That commit removed returning -EREMOTE in the xfrm_lookup() method when the packet should be discarded and also removed the correspoinding -EREMOTE handlers. This was replaced by calling the make_blackhole() method. Therefore the comment about -EREMOTE is not relevant anymore. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-19xfrm: choose protocol family by skb protocolHannes Frederic Sowa
We need to choose the protocol family by skb->protocol. Otherwise we call the wrong xfrm{4,6}_local_error handler in case an ipv6 sockets is used in ipv4 mode, in which case we should call down to xfrm4_local_error (ip6 sockets are a superset of ip4 ones). We are called before before ip_output functions, so skb->protocol is not reset. Cc: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>