summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2012-05-01netem: fix possible skb leakEric Dumazet
skb_checksum_help(skb) can return an error, we must free skb in this case. qdisc_drop(skb, sch) can also be feeded with a NULL skb (if skb_unshare() failed), so lets use this generic helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30tcp: fix infinite cwnd in tcp_complete_cwr()Yuchung Cheng
When the cwnd reduction is done, ssthresh may be infinite if TCP enters CWR via ECN or F-RTO. If cwnd is not undone, i.e., undo_marker is set, tcp_complete_cwr() falsely set cwnd to the infinite ssthresh value. The correct operation is to keep cwnd intact because it has been updated in ECN or F-RTO. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller
2012-04-30netfilter: xt_CT: fix wrong checking in the timeout assignment pathPablo Neira Ayuso
The current checking always succeeded. We have to check the first character of the string to check that it's empty, thus, skipping the timeout path. This fixes the use of the CT target without the timeout option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-30ipvs: kernel oops - do_ip_vs_get_ctlHans Schillstrom
Change order of init so netns init is ready when register ioctl and netlink. Ver2 Whitespace fixes and __init added. Reported-by: "Ryan O'Hara" <rohara@redhat.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: take care of return value from protocol init_netnsHans Schillstrom
ip_vs_create_timeout_table() can return NULL All functions protocol init_netns is affected of this patch. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: null check of net->ipvs in lblc(r) shedulersHans Schillstrom
Avoid crash when registering shedulers after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-28drop_monitor: Make updating data->skb smp safeNeil Horman
Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in its smp protections. Specifically, its possible to replace data->skb while its being written. This patch corrects that by making data->skb an rcu protected variable. That will prevent it from being overwritten while a tracepoint is modifying it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-28drop_monitor: fix sleeping in invalid context warningNeil Horman
Eric Dumazet pointed out this warning in the drop_monitor protocol to me: [ 38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85 [ 38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch [ 38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71 [ 38.352582] Call Trace: [ 38.352592] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352599] [<ffffffff81063f2a>] __might_sleep+0xca/0xf0 [ 38.352606] [<ffffffff81655b16>] mutex_lock+0x26/0x50 [ 38.352610] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352616] [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90 [ 38.352621] [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170 [ 38.352625] [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40 [ 38.352630] [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0 [ 38.352636] [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0 [ 38.352640] [<ffffffff8154a600>] ? genl_rcv+0x30/0x30 [ 38.352645] [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0 [ 38.352649] [<ffffffff8154a5f0>] genl_rcv+0x20/0x30 [ 38.352653] [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0 [ 38.352658] [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310 [ 38.352663] [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130 [ 38.352668] [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0 [ 38.352673] [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0 [ 38.352677] [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390 [ 38.352682] [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210 [ 38.352687] [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0 [ 38.352693] [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0 [ 38.352699] [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10 [ 38.352703] [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140 [ 38.352708] [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80 [ 38.352713] [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b It stems from holding a spinlock (trace_state_lock) while attempting to register or unregister tracepoint hooks, making in_atomic() true in this context, leading to the warning when the tracepoint calls might_sleep() while its taking a mutex. Since we only use the trace_state_lock to prevent trace protocol state races, as well as hardware stat list updates on an rcu write side, we can just convert the spinlock to a mutex to avoid this problem. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-27tcp: clean up use of jiffies in tcp_rcv_rtt_measure()Neal Cardwell
Clean up a reference to jiffies in tcp_rcv_rtt_measure() that should instead reference tcp_time_stamp. Since the result of the subtraction is passed into a function taking u32, this should not change any behavior (and indeed the generated assembly does not change on x86_64). However, it seems worth cleaning this up for consistency and clarity (and perhaps to avoid bugs if this is copied and pasted somewhere else). Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-266lowpan: add missing spin_lock_init()alex.bluesman.smirnov@gmail.com
Add missing spin_lock_init() for frames list lock. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-266lowpan: clean up fragments list if module unloadedalex.bluesman.smirnov@gmail.com
Clean all the pending fragments and relative timers if 6lowpan link is going to be deleted. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-266lowpan: fix segmentation fault caused by mlme requestalex.bluesman.smirnov@gmail.com
Add nescesary mlme callbacks to satisfy "iz list" request from user space. Due to 6lowpan device doesn't have its own phy, mlme implemented as a pipe to a real phy to which 6lowpan is attached. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-26ipvs: reset ipvs pointer in netnsJulian Anastasov
Make sure net->ipvs is reset on netns cleanup or failed initialization. It is needed for IPVS applications to know that IPVS core is not loaded in netns. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-26ipvs: add check in ftp for initialized coreJulian Anastasov
Avoid crash when registering ip_vs_ftp after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-25udp_diag: implement idiag_get_info for udp/udplite to get queue informationShan Wei
When we use netlink to monitor queue information for udp socket, idiag_rqueue and idiag_wqueue of inet_diag_msg are returned with 0. Keep consistent with netstat, just return back allocated rmem/wmem size. Signed-off-by: Shan Wei <davidshan@tencent.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-25Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-04-25ipvs: fix crash in ip_vs_control_net_cleanup on unloadJulian Anastasov
commit 14e405461e664b777e2a5636e10b2ebf36a686ec (2.6.39) ("Add __ip_vs_control_{init,cleanup}_sysctl()") introduced regression due to wrong __net_init for __ip_vs_control_cleanup_sysctl. This leads to crash when the ip_vs module is unloaded. Fix it by changing __net_init to __net_exit for the function that is already renamed to ip_vs_control_net_cleanup_sysctl. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-25ipvs: Verify that IP_VS protocol has been registeredSasha Levin
The registration of a protocol might fail, there were no checks and all registrations were assumed to be correct. This lead to NULL ptr dereferences when apps tried registering. For example: [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0 [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP [ 1293.227038] CPU 1 [ 1293.227038] Pid: 19609, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57 [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>] [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP: 0018:ffff880038c1dd18 EFLAGS: 00010286 [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000 [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282 [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000 [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668 [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000 [ 1293.227038] FS: 00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000 [ 1293.227038] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0 [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000) [ 1293.227038] Stack: [ 1293.227038] ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580 [ 1293.227038] ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b [ 1293.227038] 0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580 [ 1293.227038] Call Trace: [ 1293.227038] [<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180 [ 1293.227038] [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70 [ 1293.227038] [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140 [ 1293.227038] [<ffffffff821c9060>] ops_init+0x80/0x90 [ 1293.227038] [<ffffffff821c90cb>] setup_net+0x5b/0xe0 [ 1293.227038] [<ffffffff821c9416>] copy_net_ns+0x76/0x100 [ 1293.227038] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190 [ 1293.227038] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80 [ 1293.227038] [<ffffffff810afd1f>] sys_unshare+0xff/0x290 [ 1293.227038] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1293.227038] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d [ 1293.227038] RIP [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP <ffff880038c1dd18> [ 1293.227038] CR2: 0000000000000018 [ 1293.379284] ---[ end trace 364ab40c7011a009 ]--- [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-24mac80211: call ieee80211_mgd_stop() on interface stopEliad Peller
ieee80211_mgd_teardown() is called on netdev removal, which occurs after the vif was already removed from the low-level driver, resulting in the following warning: [ 4809.014734] ------------[ cut here ]------------ [ 4809.019861] WARNING: at net/mac80211/driver-ops.h:12 ieee80211_bss_info_change_notify+0x200/0x2c8 [mac80211]() [ 4809.030388] wlan0: Failed check-sdata-in-driver check, flags: 0x4 [ 4809.036862] Modules linked in: wlcore_sdio(-) wl12xx wlcore mac80211 cfg80211 [last unloaded: cfg80211] [ 4809.046849] [<c001bd4c>] (unwind_backtrace+0x0/0x12c) [ 4809.055937] [<c047cf1c>] (dump_stack+0x20/0x24) [ 4809.065385] [<c003e334>] (warn_slowpath_common+0x5c/0x74) [ 4809.075589] [<c003e408>] (warn_slowpath_fmt+0x40/0x48) [ 4809.088291] [<bf033630>] (ieee80211_bss_info_change_notify+0x200/0x2c8 [mac80211]) [ 4809.102844] [<bf067f84>] (ieee80211_destroy_auth_data+0x80/0xa4 [mac80211]) [ 4809.116276] [<bf068004>] (ieee80211_mgd_teardown+0x5c/0x74 [mac80211]) [ 4809.129331] [<bf043f18>] (ieee80211_teardown_sdata+0xb0/0xd8 [mac80211]) [ 4809.141595] [<c03b5e58>] (rollback_registered_many+0x228/0x2f0) [ 4809.153056] [<c03b5f48>] (unregister_netdevice_many+0x28/0x50) [ 4809.165696] [<bf041ea8>] (ieee80211_remove_interfaces+0xb4/0xdc [mac80211]) [ 4809.179151] [<bf032174>] (ieee80211_unregister_hw+0x50/0xf0 [mac80211]) [ 4809.191043] [<bf0bebb4>] (wlcore_remove+0x5c/0x7c [wlcore]) [ 4809.201491] [<c02c6918>] (platform_drv_remove+0x24/0x28) [ 4809.212029] [<c02c4d50>] (__device_release_driver+0x8c/0xcc) [ 4809.222738] [<c02c4e84>] (device_release_driver+0x30/0x3c) [ 4809.233099] [<c02c4258>] (bus_remove_device+0x10c/0x128) [ 4809.242620] [<c02c26f8>] (device_del+0x11c/0x17c) [ 4809.252150] [<c02c6de0>] (platform_device_del+0x28/0x68) [ 4809.263051] [<bf0df49c>] (wl1271_remove+0x3c/0x50 [wlcore_sdio]) [ 4809.273590] [<c03806b0>] (sdio_bus_remove+0x48/0xf8) [ 4809.283754] [<c02c4d50>] (__device_release_driver+0x8c/0xcc) [ 4809.293729] [<c02c4e2c>] (driver_detach+0x9c/0xc4) [ 4809.303163] [<c02c3d7c>] (bus_remove_driver+0xc4/0xf4) [ 4809.312973] [<c02c5a98>] (driver_unregister+0x70/0x7c) [ 4809.323220] [<c03809c4>] (sdio_unregister_driver+0x24/0x2c) [ 4809.334213] [<bf0df458>] (wl1271_exit+0x14/0x1c [wlcore_sdio]) [ 4809.344930] [<c009b1a4>] (sys_delete_module+0x228/0x2a8) [ 4809.354734] ---[ end trace 515290ccf5feb522 ]--- Rename ieee80211_mgd_teardown() to ieee80211_mgd_stop(), and call it on ieee80211_do_stop(). Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-24set fake_rtable's dst to NULL to avoid kernel OopsPeter Huang (Peng)
bridge: set fake_rtable's dst to NULL to avoid kernel Oops when bridge is deleted before tap/vif device's delete, kernel may encounter an oops because of NULL reference to fake_rtable's dst. Set fake_rtable's dst to NULL before sending packets out can solve this problem. v4 reformat, change br_drop_fake_rtable(skb) to {} v3 enrich commit header v2 introducing new flag DST_FAKE_RTABLE to dst_entry struct. [ Use "do { } while (0)" for nop br_drop_fake_rtable() implementation -DaveM ] Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Peter Huang <peter.huangpeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-22tcp: fix TCP_MAXSEG for established IPv6 passive socketsNeal Cardwell
Commit f5fff5d forgot to fix TCP_MAXSEG behavior IPv6 sockets, so IPv6 TCP server sockets that used TCP_MAXSEG would find that the advmss of child sockets would be incorrect. This commit mirrors the advmss logic from tcp_v4_syn_recv_sock in tcp_v6_syn_recv_sock. Eventually this logic should probably be shared between IPv4 and IPv6, but this at least fixes this issue. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-21drop_monitor: allow more events per secondEric Dumazet
It seems there is a logic error in trace_drop_common(), since we store only 64 drops, even if they are from same location. This fix is a one liner, but we probably need more work to avoid useless atomic dec/inc Now I can watch 1 Mpps drops through dropwatch... Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-19net ax25: Reorder ax25_exit to remove races.Eric W. Biederman
While reviewing the sysctl code in ax25 I spotted races in ax25_exit where it is possible to receive notifications and packets after already freeing up some of the data structures needed to process those notifications and updates. Call unregister_netdevice_notifier early so that the rest of the cleanup code does not need to deal with network devices. This takes advantage of my recent enhancement to unregister_netdevice_notifier to send unregister notifications of all network devices that are current registered. Move the unregistration for packet types, socket types and protocol types before we cleanup any of the ax25 data structures to remove the possibilities of other races. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18tcp: fix retransmit of partially acked framesEric Dumazet
Alexander Beregalov reported skb_over_panic errors and provided stack trace. I occurs commit a21d45726aca (tcp: avoid order-1 allocations on wifi and tx path) added a regression, when a retransmit is done after a partial ACK. tcp_retransmit_skb() tries to aggregate several frames if the first one has enough available room to hold the following ones payload. This is controlled by /proc/sys/net/ipv4/tcp_retrans_collapse tunable (default : enabled) Problem is we must make sure _pskb_trim_head() doesnt fool skb_availroom() when pulling some bytes from skb (this pull is done when receiver ACK part of the frame). Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Cc: Marc MERLIN <marc@merlins.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless From John: Another batch of fixes intended for 3.4... First up, we have a minor signedness fix for libertas from Amitkumar Karwar. Next, Arend gives us a brcm80211 fix for correctly enabling Tx FIFOs on channels 12 and 13. Bing Zhao gives us some register address corrections for mwifiex. Felix give us a trio of fixes -- one for ath9k to wake the hardware properly from full sleep, one for mac80211 to properly handle packets in cooked monitor mode, and one for ensuring that the proper HT mode selection is honored. Hauke gives us a bcma fix for handling the lack of an sprom. Jonathon Bither gives us an ath5k fix for a missing THIS_MODULE build issue, and another ath5k fix for an io mapping leak. Lukasz Kucharczyk fixes a bitwise check in cfg80211, and Sujith gives us an ath9k fix for assigning sequence numbers for fragmented frames. Finally, we have a MAINTAINERS change from Wey-Yi Guy -- congrats to Johannes Berg for taking the lead on iwlwifi. :-) Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-04-18netns: do not leak net_generic data on failed initJulian Anastasov
ops_init should free the net_generic data on init failure and __register_pernet_operations should not call ops_free when NET_NS is not enabled. Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17tcp: fix tcp_grow_window() for large incoming framesEric Dumazet
tcp_grow_window() has to grow rcv_ssthresh up to window_clamp, allowing sender to increase its window. tcp_grow_window() still assumes a tcp frame is under MSS, but its no longer true with LRO/GRO. This patch fixes one of the performance issue we noticed with GRO on. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17mac80211: fix logic error in ibss channel type checkFelix Fietkau
The broken check leads to rate control attempting to use HT40 while the driver is configured for HT20. This leads to interesting hardware issues. HT40 can only be used if the channel type is either HT40- or HT40+ and if the channel type of the cell matches the local type. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-17mac80211: fix truncated packets in cooked monitor rxFelix Fietkau
Cooked monitor rx was recently changed to use ieee80211_add_rx_radiotap_header instead of generating only limited radiotap information. ieee80211_add_rx_radiotap_header assumes that FCS info is still present if the hardware supports receiving it, however when cooked monitor rx packets are processed, FCS info has already been stripped. Fix this by adding an extra flag indicating FCS presence. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-16net_sched: gred: Fix oops in gred_dump() in WRED modeDavid Ward
A parameter set exists for WRED mode, called wred_set, to hold the same values for qavg and qidlestart across all VQs. The WRED mode values had been previously held in the VQ for the default DP. After these values were moved to wred_set, the VQ for the default DP was no longer created automatically (so that it could be omitted on purpose, to have packets in the default DP enqueued directly to the device without using RED). However, gred_dump() was overlooked during that change; in WRED mode it still reads qavg/qidlestart from the VQ for the default DP, which might not even exist. As a result, this command sequence will cause an oops: tc qdisc add dev $DEV handle $HANDLE parent $PARENT gred setup \ DPs 3 default 2 grio tc qdisc change dev $DEV handle $HANDLE gred DP 0 prio 8 $RED_OPTIONS tc qdisc change dev $DEV handle $HANDLE gred DP 1 prio 8 $RED_OPTIONS This fixes gred_dump() in WRED mode to use the values held in wred_set. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13cfg80211: fix interface combinations check.Lukasz Kucharczyk
Signed-off-by: Lukasz Kucharczyk <lukasz.kucharczyk@tieto.com> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-13ipv6: fix problem with expired dst cacheGao feng
If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet. this dst cache will not check expire because it has no RTF_EXPIRES flag. So this dst cache will always be used until the dst gc run. Change the struct dst_entry,add a union contains new pointer from and expires. When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use. we can use this field to point to where the dst cache copy from. The dst.from is only used in IPV6. rt6_check_expired check if rt6_info.dst.from is expired. ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF and RTF_DEFAULT.then hold the ort. ip6_dst_destroy release the ort. Add some functions to operate the RTF_EXPIRES flag and expires(from) together. and change the code to use these new adding functions. Changes from v5: modify ip6_route_add and ndisc_router_discovery to use new adding functions. Only set dst.from when the ort has flag RTF_ADDRCONF and RTF_DEFAULT.then hold the ort. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13caif: Fix memory leakage in the chnl_net.c.Tomasz Gregorek
Added kfree_skb() calls in the chnk_net.c file on the error paths. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13l2tp: don't overwrite source address in l2tp_ip_bind()James Chapman
Applications using L2TP/IP sockets want to be able to bind() an L2TP/IP socket to set the local tunnel id while leaving the auto-assigned source address alone. So if no source address is supplied, don't overwrite the address already stored in the socket. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13l2tp: fix refcount leak in l2tp_ip socketsJames Chapman
The l2tp_ip socket close handler does not update the module refcount correctly which prevents module unload after the first bind() call on an L2TPv3 IP encapulation socket. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13net/key/af_key.c: add missing kfree_skbJulia Lawall
At the point of this error-handling code, alloc_skb has succeded, so free the resulting skb by jumping to the err label. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13phonet: Sort out initiailziation and cleanup code.Eric W. Biederman
Recently an oops was reported in phonet if there was a failure during network namespace creation. [ 163.733755] ------------[ cut here ]------------ [ 163.734501] kernel BUG at include/net/netns/generic.h:45! [ 163.734501] invalid opcode: 0000 [#1] PREEMPT SMP [ 163.734501] CPU 2 [ 163.734501] Pid: 19145, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57 [ 163.734501] RIP: 0010:[<ffffffff824d6062>] [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0 [ 163.734501] RSP: 0018:ffff8800674d5ca8 EFLAGS: 00010246 [ 163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8 [ 163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282 [ 163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000 [ 163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920 [ 163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000 [ 163.734501] FS: 00007f055e8de700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 [ 163.734501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0 [ 163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 163.734501] Process trinity (pid: 19145, threadinfo ffff8800674d4000, task ffff8800678c8000) [ 163.734501] Stack: [ 163.734501] ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 00000000ffffffd4 [ 163.734501] ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 00000000ffffffd4 [ 163.734501] ffffffff836b90c0 0000000000000000 ffff8800674d5d18 ffffffff824d707d [ 163.734501] Call Trace: [ 163.734501] [<ffffffff824d5f00>] ? phonet_pernet+0x20/0x1a0 [ 163.734501] [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50 [ 163.734501] [<ffffffff824d667a>] phonet_device_destroy+0x1a/0x100 [ 163.734501] [<ffffffff824d707d>] phonet_device_notify+0x3d/0x50 [ 163.734501] [<ffffffff810dd96e>] notifier_call_chain+0xee/0x130 [ 163.734501] [<ffffffff810dd9d1>] raw_notifier_call_chain+0x11/0x20 [ 163.734501] [<ffffffff821cce12>] call_netdevice_notifiers+0x52/0x60 [ 163.734501] [<ffffffff821cd235>] rollback_registered_many+0x185/0x270 [ 163.734501] [<ffffffff821cd334>] unregister_netdevice_many+0x14/0x60 [ 163.734501] [<ffffffff823123e3>] ipip_exit_net+0x1b3/0x1d0 [ 163.734501] [<ffffffff82312230>] ? ipip_rcv+0x420/0x420 [ 163.734501] [<ffffffff821c8515>] ops_exit_list+0x35/0x70 [ 163.734501] [<ffffffff821c911b>] setup_net+0xab/0xe0 [ 163.734501] [<ffffffff821c9416>] copy_net_ns+0x76/0x100 [ 163.734501] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190 [ 163.734501] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80 [ 163.734501] [<ffffffff810afd1f>] sys_unshare+0xff/0x290 [ 163.734501] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 163.734501] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b [ 163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 85 db 75 0e <0f> 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 89 d8 [ 163.734501] RIP [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0 [ 163.734501] RSP <ffff8800674d5ca8> [ 163.861289] ---[ end trace fb5615826c548066 ]--- After investigation it turns out there were two issues. 1) Phonet was not implementing network devices but was using register_pernet_device instead of register_pernet_subsys. This was allowing there to be cases when phonenet was not initialized and the phonet net_generic was not set for a network namespace when network device events were being reported on the netdevice_notifier for a network namespace leading to the oops above. 2) phonet_exit_net was implementing a confusing and special case of handling all network devices from going away that it was hard to see was correct, and would only occur when the phonet module was removed. Now that unregister_netdevice_notifier has been modified to synthesize unregistration events for the network devices that are extant when called this confusing special case in phonet_exit_net is no longer needed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13net: In unregister_netdevice_notifier unregister the netdevices.Eric W. Biederman
We already synthesize events in register_netdevice_notifier and synthesizing events in unregister_netdevice_notifier allows to us remove the need for special case cleanup code. This change should be safe as it adds no new cases for existing callers of unregiser_netdevice_notifier to handle. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix bluetooth userland regression reported by Keith Packard, from Gustavo Padovan. 2) Revert ath9k PS idle change, from Sujith Manoharan. 3) Correct default TCP memory limits (again), from Eric Dumazet. 4) Fix tcp_rcv_rtt_update() accidental use of unscaled RTT, from Neal Cardwell. 5) We made a facility for layers like wireless to say how much tailroom they need in the SKB for link layer stuff such as wireless encryption etc., but TCP works hard to fill every SKB out to the end defeating this specification. This leads to every TCP packet getting reallocated by the wireless code in order to have the right amount of tailroom available. Fix TCP to only fill SKBs out to the real amount of data area it asked for during the allocation, this way it won't eat into the slack added for the device's tailroom needs. Reported by Marc Merlin and fixed by Eric Dumazet. 6) Leaks, endian bugs, and new device IDs in bluetooth from Santosh Nayak, João Paulo Rechi Vita, Cho, Yu-Chen, Andrei Emeltchenko, AceLan Kao, and Andrei Emeltchenko. 7) OOPS on tty_close fix in bluetooth's hci_ldisc from Johan Hovold. 8) netfilter erroneously scales TCP window twice, fix from Changli Gao. 9) Memleak fix in wext-core from Julia Lawall. 10) Consistently handle invalid TCP packets in ipv4 vs. ipv6 conntrack, from Jozsef Kadlecsik. 11) Validate IP header length properly in netfilter conntrack's ipv4_get_l4proto(). * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits) NFC: Fix the LLCP Tx fragmentation loop rtlwifi: Add missing DMA buffer unmapping for PCI drivers rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine tcp: avoid order-1 allocations on wifi and tx path net: allow pskb_expand_head() to get maximum tailroom bridge: Do not send queries on multicast group leaves MAINTAINERS: Mark NATSEMI driver as orphan'd. tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample tcp: restore correct limit Revert "ath9k: fix going to full-sleep on PS idle" rt2x00: Fix rfkill_polling register function. bcma: fix build error on MIPS; implicit pcibios_enable_device netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net netfilter: nf_ct_ipv4: packets with wrong ihl are invalid netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently net/wireless/wext-core.c: add missing kfree rtlwifi: Fix oops on rate-control failure mac80211: Convert WARN_ON to WARN_ON_ONCE rtlwifi: rtl8192de: Fix firmware initialization nl80211: ensure interface is up in various APIs ...
2012-04-12Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-04-11NFC: Fix the LLCP Tx fragmentation loopSamuel Ortiz
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-11tcp: avoid order-1 allocations on wifi and tx pathEric Dumazet
Marc Merlin reported many order-1 allocations failures in TX path on its wireless setup, that dont make any sense with MTU=1500 network, and non SG capable hardware. After investigation, it turns out TCP uses sk_stream_alloc_skb() and used as a convention skb_tailroom(skb) to know how many bytes of data payload could be put in this skb (for non SG capable devices) Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER + sizeof(struct skb_shared_info) being above 2048) Later, mac80211 layer need to add some bytes at the tail of skb (IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is available has to call pskb_expand_head() and request order-1 allocations. This patch changes sk_stream_alloc_skb() so that only sk->sk_prot->max_header bytes of headroom are reserved, and use a new skb field, avail_size to hold the data payload limit. This way, order-0 allocations done by TCP stack can leave more than 2 KB of tailroom and no more allocation is performed in mac80211 layer (or any layer needing some tailroom) avail_size is unioned with mark/dropcount, since mark will be set later in IP stack for output packets. Therefore, skb size is unchanged. Reported-by: Marc MERLIN <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11net: allow pskb_expand_head() to get maximum tailroomEric Dumazet
Marc Merlin reported many order-1 allocations failures in TX path on its wireless setup, that dont make any sense with MTU=1500 network, and non SG capable hardware. Turns out part of the problem comes from pskb_expand_head() not using ksize() to get exact head size given by kmalloc(). Doing the same thing than __alloc_skb() allows more tailroom in skb and can prevent future reallocations. As a bonus, struct skb_shared_info becomes cache line aligned. Reported-by: Marc MERLIN <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11bridge: Do not send queries on multicast group leavesHerbert Xu
As it stands the bridge IGMP snooping system will respond to group leave messages with queries for remaining membership. This is both unnecessary and undesirable. First of all any multicast routers present should be doing this rather than us. What's more the queries that we send may end up upsetting other multicast snooping swithces in the system that are buggy. In fact, we can simply remove the code that send these queries because the existing membership expiry mechanism doesn't rely on them anyway. So this patch simply removes all code associated with group queries in response to group leave messages. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10Merge tag 'dmaengine-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: 1/ regression fix for Xen as it now trips over a broken assumption about the dma address size on 32-bit builds 2/ new quirk for netdma to ignore dma channels that cannot meet netdma alignment requirements 3/ fixes for two long standing issues in ioatdma (ring size overflow) and iop-adma (potential stack corruption) * tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: netdma: adding alignment check for NETDMA ops ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata ioat: ring size variables need to be 32bit to avoid overflow iop-adma: Corrected array overflow in RAID6 Xscale(R) test. ioat: fix size of 'completion' for Xen
2012-04-10tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sampleNeal Cardwell
Fix a code path in tcp_rcv_rtt_update() that was comparing scaled and unscaled RTT samples. The intent in the code was to only use the 'm' measurement if it was a new minimum. However, since 'm' had not yet been shifted left 3 bits but 'new_sample' had, this comparison would nearly always succeed, leading us to erroneously set our receive-side RTT estimate to the 'm' sample when that sample could be nearly 8x too high to use. The overall effect is to often cause the receive-side RTT estimate to be significantly too large (up to 40% too large for brief periods in my tests). Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10tcp: restore correct limitEric Dumazet
Commit c43b874d5d714f (tcp: properly initialize tcp memory limits) tried to fix a regression added in commits 4acb4190 & 3dc43e3, but still get it wrong. Result is machines with low amount of memory have too small tcp_rmem[2] value and slow tcp receives : Per socket limit being 1/1024 of memory instead of 1/128 in old kernels, so rcv window is capped to small values. Fix this to match comment and previous behavior. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller