Age | Commit message (Collapse) | Author |
|
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :
WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x41c kernel/panic.c:181
__warn+0x1c4/0x1e0 kernel/panic.c:544
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286
RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000
RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4
RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000
R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9
R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc
__in6_ifa_put include/net/addrconf.h:369 [inline]
ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
call_netdevice_notifiers net/core/dev.c:1715 [inline]
__dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
dev_change_flags+0xf5/0x140 net/core/dev.c:6879
do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
__sys_sendmsg+0xe5/0x210 net/socket.c:2083
SYSC_sendmsg net/socket.c:2094 [inline]
SyS_sendmsg+0x2d/0x50 net/socket.c:2090
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7fa9174d3320
RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320
RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016
RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0
R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:2389]
Modules linked in:
irq event stamp: 329692056
hardirqs last enabled at (329692055): [<ffffffff824b8381>] _raw_spin_unlock_irqrestore+0x31/0x75
hardirqs last disabled at (329692056): [<ffffffff824b9e58>] apic_timer_interrupt+0x98/0xb0
softirqs last enabled at (35659740): [<ffffffff824bc958>] __do_softirq+0x328/0x48c
softirqs last disabled at (35659731): [<ffffffff811c796c>] irq_exit+0xbc/0xd0
CPU: 0 PID: 2389 Comm: repro Not tainted 4.14.0-rc7 #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880009452140 task.stack: ffff880006a20000
RIP: 0010:_raw_spin_lock_irqsave+0x11/0x80
RSP: 0018:ffff880006a27c50 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: ffff880009ac68d0 RBX: ffff880006a27ce0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff880006a27ce0 RDI: ffff880009ac6900
RBP: ffff880006a27c60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 000000000063ff00 R12: ffff880009ac6900
R13: ffff880006a27cf8 R14: 0000000000000001 R15: ffff880006a27cf8
FS: 00007f4be4838700(0000) GS:ffff88000cc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020101000 CR3: 0000000009616000 CR4: 00000000000006f0
Call Trace:
prepare_to_wait+0x26/0xc0
sock_alloc_send_pskb+0x14e/0x270
? remove_wait_queue+0x60/0x60
tun_get_user+0x2cc/0x19d0
? __tun_get+0x60/0x1b0
tun_chr_write_iter+0x57/0x86
__vfs_write+0x156/0x1e0
vfs_write+0xf7/0x230
SyS_write+0x57/0xd0
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7f4be4356df9
RSP: 002b:00007ffc18101c08 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4be4356df9
RDX: 0000000000000046 RSI: 0000000020101000 RDI: 0000000000000005
RBP: 00007ffc18101c40 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000293 R12: 0000559c75f64780
R13: 00007ffc18101d30 R14: 0000000000000000 R15: 0000000000000000
Fixes: 33dccbb050bb ("tun: Limit amount of queued packets per device")
Fixes: 20d29d7a916a ("net: macvtap driver")
Signed-off-by: Craig Gallek <kraig@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It fixes a problem for the last chunk where 'chunk_size' is smaller than
MLXSW_I2C_BLK_MAX and data is copied to the wrong offset, overriding
previous data.
Fixes: 6882b0aee180 ("mlxsw: Introduce support for I2C bus")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2017-11-01
1) Fix a memleak when a packet matches a policy
without a matching state.
2) Reset the socket cached dst_entry when inserting
a socket policy, otherwise the policy might be
ignored. From Jonathan Basseri.
3) Fix GSO for a IPsec, GRE tunnel combination.
We reset the encapsulation field at the skb
too erly, as a result GRE does not segment
GSO packets. Fix this by resetting the the
encapsulation field right before the
transformation where the inner headers get
invalid.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ASIC has the ability to generate events whenever a sensor indicates
the temperature goes above or below its high or low thresholds,
respectively.
In new firmware versions the firmware enforces a minimum of 5
degrees Celsius difference between both thresholds. Make the driver
conform to this requirement.
Note that this is required even when the events are disabled, as in
certain systems interrupts are generated via GPIO based on these
thresholds.
Fixes: 85926f877040 ("mlxsw: reg: Add definition of temperature management registers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide a mailing list for maintenance of the module instead.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For the time being I will be available in my private mail. Update both the
MAINTAINERS file and the individual modules MODULE_AUTHOR directive with
the new address.
Signed-off-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function of_parse_phandle() returns a NULL pointer if it cannot
resolve a phandle property to a device_node pointer. In function
hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract
the error code. However, in this case, the extracted error code will
always be zero, which is unexpected.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function netdev_priv() returns the private data of the device. The
memory to store the private data is allocated in alloc_netdev() and is
released in netdev_free(). Calling kfree() on the return value of
netdev_priv() after netdev_free() results in a double free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.
Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The fix 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo")
uncovered another bug in the Marvell PHY driver, which broke the
Marvell OpenRD platform. It relies on the bootloader configuring the
RGMII delays and does not specify a phy-mode in its device tree. The
PHY driver should only configure RGMII delays if the phy mode
indicates it is using RGMII. Without anything in device tree, the
mv643xx Ethernet driver defaults to GMII.
Fixes: 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.14
The most important here is the security vulnerabitility fix for
ath10k.
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ath.git fixes for 4.14. Major changes:
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
|
|
We reset the encapsulation field of the skb too early
in xfrm_output. As a result, the GRE GSO handler does
not segment the packets. This leads to a performance
drop down. We fix this by resetting the encapsulation
field right before we do the transformation, when
the inner headers become invalid.
Fixes: f1bd7d659ef0 ("xfrm: Add encapsulation header offsets while SKB is not encrypted")
Reported-by: Vicente De Luca <vdeluca@zendesk.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Fixes: 31c2611b66e0 ("selftests: Introduce a new test case to tc testsuite")
Fixes: 76b903ee198d ("selftests: Introduce tc testsuite")
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
I defer tcf_chain_flush() to a workqueue, this causes a use-after-free
because qdisc is already destroyed after we queue this work.
The tcf_block_put_deferred() is no longer necessary after we get RTNL
for each tc filter destroy work, no others could jump in at this point.
Same for tcf_chain_hold(), we are fully serialized now.
This also reduces one indirection therefore makes the code more
readable. Note this brings back a rcu_barrier(), however comparing
to the code prior to commit 7aa0045dadb6 we still reduced one
rcu_barrier(). For net-next, we can consider to refcnt tcf block to
avoid it.
Fixes: 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use l2tp_tunnel_get() in pppol2tp_connect() to ensure the tunnel isn't
going to disappear while processing the rest of the function.
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Thorsten reported on <fa6e3ee2-91b5-a54b-afe3-87f30aac7a48@leemhuis.info> that
commit c9353bf483d3 made ath10k unstable with QCA6174 on his Dell XPS13 (9360)
with an error message:
ath10k_pci 0000:3a:00.0: failed to extract amsdu: -11
It only seemed to happen with certain APs, not all, but when it happened the
only way to get ath10k working was to switch the wifi off and on with a hotkey.
As this commit made things even worse (a warning vs breaking the whole
connection) let's revert the commit for now and while the issue is being fixed.
Link: http://lists.infradead.org/pipermail/ath10k/2017-October/010227.html
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
|
|
Rx data frames notified through HTT_T2H_MSG_TYPE_RX_IND and
HTT_T2H_MSG_TYPE_RX_FRAG_IND expect PN/TSC check to be done
on host (mac80211) rather than firmware. Rebuild cipher header
in every received data frames (that are notified through those
HTT interfaces) from the rx_hdr_status tlv available in the
rx descriptor of the first msdu. Skip setting RX_FLAG_IV_STRIPPED
flag for the packets which requires mac80211 PN/TSC check support
and set appropriate RX_FLAG for stripped crypto tail. Hw QCA988X,
QCA9887, QCA99X0, QCA9984, QCA9888 and QCA4019 currently need the
rebuilding of cipher header to perform PN/TSC check for replay
attack.
Please note that removing crypto tail for CCMP-256, GCMP and GCMP-256 ciphers
in raw mode needs to be fixed. Since Rx with these ciphers in raw
mode does not work in the current form even without this patch and
removing crypto tail for these chipers needs clean up, raw mode related
issues in CCMP-256, GCMP and GCMP-256 can be addressed in follow up
patches.
Tested-by: Manikanta Pubbisetty <mpubbise@qti.qualcomm.com>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
|
|
|
|
Pull networking fixes from David Miller:
1) Fix route leak in xfrm_bundle_create().
2) In mac80211, validate user rate mask before configuring it. From
Johannes Berg.
3) Properly enforce memory limits in fair queueing code, from Toke
Hoiland-Jorgensen.
4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.
5) Fix TSO header allocation and management in mvpp2 driver, from Yan
Markman.
6) Don't take socket lock in BH handler in strparser code, from Tom
Herbert.
7) Don't show sockets from other namespaces in AF_UNIX code, from
Andrei Vagin.
8) Fix double free in error path of tap_open(), from Girish Moodalbail.
9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
and Alexander Duyck.
10) Fix DCB mode programming in stmmac driver, from Jose Abreu.
11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
Long.
12) Properly align SKB head before building SKB in tuntap, from Jason
Wang.
13) Avoid matching qdiscs with a zero handle during lookups, from Cong
Wang.
14) Fix various endianness bugs in sctp, from Xin Long.
15) Fix tc filter callback races and add selftests which trigger the
problem, from Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
selftests: Introduce a new test case to tc testsuite
selftests: Introduce a new script to generate tc batch file
net_sched: fix call_rcu() race on act_sample module removal
net_sched: add rtnl assertion to tcf_exts_destroy()
net_sched: use tcf_queue_work() in tcindex filter
net_sched: use tcf_queue_work() in rsvp filter
net_sched: use tcf_queue_work() in route filter
net_sched: use tcf_queue_work() in u32 filter
net_sched: use tcf_queue_work() in matchall filter
net_sched: use tcf_queue_work() in fw filter
net_sched: use tcf_queue_work() in flower filter
net_sched: use tcf_queue_work() in flow filter
net_sched: use tcf_queue_work() in cgroup filter
net_sched: use tcf_queue_work() in bpf filter
net_sched: use tcf_queue_work() in basic filter
net_sched: introduce a workqueue for RCU callbacks of tc filter
sctp: fix some type cast warnings introduced since very beginning
sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
sctp: fix some type cast warnings introduced by transport rhashtable
sctp: fix some type cast warnings introduced by stream reconf
...
|
|
Cong Wang says:
====================
net_sched: fix races with RCU callbacks
Recently, the RCU callbacks used in TC filters and TC actions keep
drawing my attention, they introduce at least 4 race condition bugs:
1. A simple one fixed by Daniel:
commit c78e1746d3ad7d548bdf3fe491898cc453911a49
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed May 20 17:13:33 2015 +0200
net: sched: fix call_rcu() race on classifier module unloads
2. A very nasty one fixed by me:
commit 1697c4bb5245649a23f06a144cc38c06715e1b65
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon Sep 11 16:33:32 2017 -0700
net_sched: carefully handle tcf_block_put()
3. Two more bugs found by Chris:
https://patchwork.ozlabs.org/patch/826696/
https://patchwork.ozlabs.org/patch/826695/
Usually RCU callbacks are simple, however for TC filters and actions,
they are complex because at least TC actions could be destroyed
together with the TC filter in one callback. And RCU callbacks are
invoked in BH context, without locking they are parallel too. All of
these contribute to the cause of these nasty bugs.
Alternatively, we could also:
a) Introduce a spinlock to serialize these RCU callbacks. But as I
said in commit 1697c4bb5245 ("net_sched: carefully handle
tcf_block_put()"), it is very hard to do because of tcf_chain_dump().
Potentially we need to do a lot of work to make it possible (if not
impossible).
b) Just get rid of these RCU callbacks, because they are not
necessary at all, callers of these call_rcu() are all on slow paths
and holding RTNL lock, so blocking is allowed in their contexts.
However, David and Eric dislike adding synchronize_rcu() here.
As suggested by Paul, we could defer the work to a workqueue and
gain the permission of holding RTNL again without any performance
impact, however, in tcf_block_put() we could have a deadlock when
flushing workqueue while hodling RTNL lock, the trick here is to
defer the work itself in workqueue and make it queued after all
other works so that we keep the same ordering to avoid any
use-after-free. Please see the first patch for details.
Patch 1 introduces the infrastructure, patch 2~12 move each
tc filter to the new tc filter workqueue, patch 13 adds
an assertion to catch potential bugs like this, patch 14
closes another rcu callback race, patch 15 and patch 16 add
new test cases.
====================
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
# sudo ./tdc.py -d enp4s0f0
This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.
In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
# ./tdc_batch.py -h
usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file
TC batch file generator
positional arguments:
device device name
file batch file name
optional arguments:
-h, --help show this help message and exit
-n NUMBER, --number NUMBER
how many lines in batch file
-o, --skip_sw skip_sw (offload), by default skip_hw
-s, --share_action all filters share the same action
-p, --prio all filters have different prio
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to commit c78e1746d3ad
("net: sched: fix call_rcu() race on classifier module unloads"),
we need to wait for flying RCU callback tcf_sample_cleanup_rcu().
Cc: Yotam Gigi <yotamg@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After previous patches, it is now safe to claim that
tcf_exts_destroy() is always called with RTNL lock.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch introduces a dedicated workqueue for tc filters
so that each tc filter's RCU callback could defer their
action destroy work to this workqueue. The helper
tcf_queue_work() is introduced for them to use.
Because we hold RTNL lock when calling tcf_block_put(), we
can not simply flush works inside it, therefore we have to
defer it again to this workqueue and make sure all flying RCU
callbacks have already queued their work before this one, in
other words, to ensure this is the last one to execute to
prevent any use-after-free.
On the other hand, this makes tcf_block_put() ugly and
harder to understand. Since David and Eric strongly dislike
adding synchronize_rcu(), this is probably the only
solution that could make everyone happy.
Please also see the code comments below.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Xin Long says:
====================
sctp: a bunch of fixes for some sparse warnings
As Eric noticed, when running 'make C=2 M=net/sctp/', a plenty of
warnings or errors checked by sparse appear. They are all problems
about Endian and type cast.
Most of them are just warnings by which no issues could be caused
while some might be bugs.
This patchset fixes them with four patches basically according to
how they are introduced.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These warnings were found by running 'make C=2 M=net/sctp/'.
They are there since very beginning.
Note after this patch, there still one warning left in
sctp_outq_flush():
sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM)
Since it has been moved to sctp_stream_outq_migrate on net-next,
to avoid the extra job when merging net-next to net, I will post
the fix for it after the merging is done.
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These warnings were found by running 'make C=2 M=net/sctp/'.
Commit d4d6fb5787a6 ("sctp: Try not to change a_rwnd when faking a
SACK from SHUTDOWN.") expected to use the peers old rwnd and add
our flight size to the a_rwnd. But with the wrong Endian, it may
not work as well as expected.
So fix it by converting to the right value.
Fixes: d4d6fb5787a6 ("sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These warnings were found by running 'make C=2 M=net/sctp/'.
They are introduced by not aware of Endian for the port when
coding transport rhashtable patches.
Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These warnings were found by running 'make C=2 M=net/sctp/'.
They are introduced by not aware of Endian when coding stream
reconf patches.
Since commit c0d8bab6ae51 ("sctp: add get and set sockopt for
reconf_enable") enabled stream reconf feature for users, the
Fixes tag below would use it.
Fixes: c0d8bab6ae51 ("sctp: add get and set sockopt for reconf_enable")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Davide found the following script triggers a NULL pointer
dereference:
ip l a name eth0 type dummy
tc q a dev eth0 parent :1 handle 1: htb
This is because for a freshly created netdevice noop_qdisc
is attached and when passing 'parent :1', kernel actually
tries to match the major handle which is 0 and noop_qdisc
has handle 0 so is matched by mistake. Commit 69012ae425d7
tries to fix a similar bug but still misses this case.
Handle 0 is not a valid one, should be just skipped. In
fact, kernel uses it as TC_H_UNSPEC.
Fixes: 69012ae425d7 ("net: sched: fix handling of singleton qdiscs with qdisc_hash")
Fixes: 59cc1f61f09c ("net: sched:convert qdisc linked list to hashtable")
Reported-by: Davide Caratti <dcaratti@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now when migrating sock to another one in sctp_sock_migrate(), it only
resets owner sk for the data in receive queues, not the chunks on out
queues.
It would cause that data chunks length on the sock is not consistent
with sk sk_wmem_alloc. When closing the sock or freeing these chunks,
the old sk would never be freed, and the new sock may crash due to
the overflow sk_wmem_alloc.
syzbot found this issue with this series:
r0 = socket$inet_sctp()
sendto$inet(r0)
listen(r0)
accept4(r0)
close(r0)
Although listen() should have returned error when one TCP-style socket
is in connecting (I may fix this one in another patch), it could also
be reproduced by peeling off an assoc.
This issue is there since very beginning.
This patch is to reset owner sk for the chunks on out queues so that
sk sk_wmem_alloc has correct value after accept one sock or peeloff
an assoc to one sock.
Note that when resetting owner sk for chunks on outqueue, it has to
sctp_clear_owner_w/skb_orphan chunks before changing assoc->base.sk
first and then sctp_set_owner_w them after changing assoc->base.sk,
due to that sctp_wfree and it's callees are using assoc->base.sk.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
John Fastabend says:
====================
net: sockmap fixes
Last two fixes (as far as I know) for sockmap code this round.
First, we are using the qdisc cb structure when making the data end
calculation. This is really just wrong so, store it with the other
metadata in the correct tcp_skb_cb sturct to avoid breaking things.
Next, with recent work to attach multiple programs to a cgroup a
specific enumeration of return codes was agreed upon. However,
I wrote the sk_skb program types before seeing this work and used
a different convention. Patch 2 in the series aligns the return
codes to avoid breaking with this infrastructure and also aligns
with other programming conventions to avoid being the odd duck out
forcing programs to remember SK_SKB programs are different. Pusing
to net because its a user visible change. With this SK_SKB program
return codes are the same as other cgroup program types.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.
To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.
This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove
SK_ABORTED to remove any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and allow programs to break existing policies.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SK_SKB program types use bpf_compute_data to store the end of the
packet data. However, bpf_compute_data assumes the cb is stored in the
qdisc layer format. But, for SK_SKB this is the wrong layer of the
stack for this type.
It happens to work (sort of!) because in most cases nothing happens
to be overwritten today. This is very fragile and error prone.
Fortunately, we have another hole in tcp_skb_cb we can use so lets
put the data_end value there.
Note, SK_SKB program types do not use data_meta, they are failed by
sk_skb_is_valid_access().
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix O= building on dash
- remove unused dependency in Makefile
- fix default of a choice in Kconfig
- fix typos and documentation style
- fix command options unrecognized by sparse
* tag 'kbuild-fixes-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: clang: fix build failures with sparse check
kbuild doc: a bundle of fixes on makefiles.txt
Makefile: kselftest: fix grammar typo
kbuild: Fix optimization level choice default
kbuild: drop unused symverfile in Makefile.modpost
kbuild: revert $(realpath ...) to $(shell cd ... && /bin/pwd)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- fix gtco tablet driver, tightening parsing of HID descriptors
- add ACPI ID added to Elan driver to be able to handle touchpads found
in Lenovo Ideapad 320/520
- fix the Symaptics RMI4 driver to adjust handling of buttons
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - limit the range of what GPIOs are buttons
Input: gtco - fix potential out-of-bound access
Input: elan_i2c - add ELAN0611 to the ACPI table
|