Age | Commit message (Collapse) | Author |
|
Anatoly has been fuzzing with kBdysch harness and reported a hang in one
of the outcomes:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (85) call bpf_get_socket_cookie#46
1: R0_w=invP(id=0) R10=fp0
1: (57) r0 &= 808464432
2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
2: (14) w0 -= 810299440
3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
3: (c4) w0 s>>= 1
4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
4: (76) if w0 s>= 0x30303030 goto pc+216
221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
221: (95) exit
processed 6 insns (limit 1000000) [...]
Taking a closer look, the program was xlated as follows:
# ./bpftool p d x i 12
0: (85) call bpf_get_socket_cookie#7800896
1: (bf) r6 = r0
2: (57) r6 &= 808464432
3: (14) w6 -= 810299440
4: (c4) w6 s>>= 1
5: (76) if w6 s>= 0x30303030 goto pc+216
6: (05) goto pc-1
7: (05) goto pc-1
8: (05) goto pc-1
[...]
220: (05) goto pc-1
221: (05) goto pc-1
222: (95) exit
Meaning, the visible effect is very similar to f54c7898ed1c ("bpf: Fix
precision tracking for unbounded scalars"), that is, the fall-through
branch in the instruction 5 is considered to be never taken given the
conclusion from the min/max bounds tracking in w6, and therefore the
dead-code sanitation rewrites it as goto pc-1. However, real-life input
disagrees with verification analysis since a soft-lockup was observed.
The bug sits in the analysis of the ARSH. The definition is that we shift
the target register value right by K bits through shifting in copies of
its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the
register into 32 bit mode, same happens after simulating the operation.
However, for the case of simulating the actual ARSH, we don't take the
mode into account and act as if it's always 64 bit, but location of sign
bit is different:
dst_reg->smin_value >>= umin_val;
dst_reg->smax_value >>= umin_val;
dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val);
Consider an unknown R0 where bpf_get_socket_cookie() (or others) would
for example return 0xffff. With the above ARSH simulation, we'd see the
following results:
[...]
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (57) r0 &= 808464432
-> R0_runtime = 0x3030
3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
3: (14) w0 -= 810299440
-> R0_runtime = 0xcfb40000
4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
(0xffffffff)
4: (c4) w0 s>>= 1
-> R0_runtime = 0xe7da0000
5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0x7ffbfff8)
[...]
In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011
0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that
is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly
retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into
0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000'
and means here that the simulation didn't retain the sign bit. With above
logic, the updates happen on the 64 bit min/max bounds and given we coerced
the register, the sign bits of the bounds are cleared as well, meaning, we
need to force the simulation into s32 space for 32 bit alu mode.
Verification after the fix below. We're first analyzing the fall-through branch
on 32 bit signed >= test eventually leading to rejection of the program in this
specific case:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r2 = 808464432
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (bf) r6 = r0
3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0
3: (57) r6 &= 808464432
4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
4: (14) w6 -= 810299440
5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
5: (c4) w6 s>>= 1
6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0xfffbfff8)
6: (76) if w6 s>= 0x30303030 goto pc+216
7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
7: (30) r0 = *(u8 *)skb[808464432]
BPF_LD_[ABS|IND] uses reserved fields
processed 8 insns (limit 1000000) [...]
Fixes: 9cbe1f5a32dc ("bpf/verifier: improve register value range tracking with ARSH")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net
|
|
For plain text output, it incorrectly prints the pointer value
"void *data". The "void *data" is actually pointing to memory that
contains a bpf-map's value. The intention is to print the content of
the bpf-map's value instead of printing the pointer pointing to the
bpf-map's value.
In this case, a member of the bpf-map's value is a pointer type.
Thus, it should print the "*(void **)data".
Fixes: 22c349e8db89 ("tools: bpftool: fix format strings and arguments for jsonw_printf()")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20200110231644.3484151-1-kafai@fb.com
|
|
It's possible to leak time wait and request sockets via the following
BPF pseudo code:
sk = bpf_skc_lookup_tcp(...)
if (sk)
bpf_sk_release(sk)
If sk->sk_state is TCP_NEW_SYN_RECV or TCP_TIME_WAIT the refcount taken
by bpf_skc_lookup_tcp is not undone by bpf_sk_release. This is because
sk_flags is re-used for other data in both kinds of sockets. The check
!sock_flag(sk, SOCK_RCU_FREE)
therefore returns a bogus result. Check that sk_flags is valid by calling
sk_fullsock. Skip checking SOCK_RCU_FREE if we already know that sk is
not a full socket.
Fixes: edbf8c01de5a ("bpf: add skc_lookup_tcp helper")
Fixes: f7355a6c0497 ("bpf: Check sk_fullsock() before returning from bpf_sk_lookup()")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110132336.26099-1-lmb@cloudflare.com
|
|
Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue
if not empty than psock->ingress_msg otherwise. If a FIN packet arrives
and there's also some data in psock->ingress_msg, the data in
psock->ingress_msg will be purged. It is always happen when request to a
HTTP1.0 server like python SimpleHTTPServer since the server send FIN
packet after data is sent out.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Reported-by: Arika Chen <eaglesora@gmail.com>
Suggested-by: Arika Chen <eaglesora@gmail.com>
Signed-off-by: Lingpeng Chen <forrest0579@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200109014833.18951-1-forrest0579@gmail.com
|
|
When using fixed link we don't need the MDIO bus support.
Reported-by: Heiko Stuebner <heiko@sntech.de>
Reported-by: kernelci.org bot <bot@kernelci.org>
Fixes: d3e014ec7d5e ("net: stmmac: platform: Fix MDIO init for platforms without PHY")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Acked-by: Sriram Dash <Sriram.dash@samsung.com>
Tested-by: Patrice Chotard <patrice.chotard@st.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail> # Lamobo R1 (fixed-link + MDIO sub node for roboswitch).
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric Dumazet says:
====================
vlan: rtnetlink newlink fixes
First patch fixes a potential memory leak found by syzbot
Second patch makes vlan_changelink() aware of errors
and report them to user.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both vlan_dev_change_flags() and vlan_dev_set_egress_priority()
can return an error. vlan_changelink() should not ignore them.
Fixes: 07b5b17e157b ("[VLAN]: Use rtnl_link API")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are few cases where the ndo_uninit() handler might be not
called if an error happens while device is initialized.
Since vlan_newlink() calls vlan_changelink() before
trying to register the netdevice, we need to make sure
vlan_dev_uninit() has been called at least once,
or we might leak allocated memory.
BUG: memory leak
unreferenced object 0xffff888122a206c0 (size 32):
comm "syz-executor511", pid 7124, jiffies 4294950399 (age 32.240s)
hex dump (first 32 bytes):
00 00 00 00 00 00 61 73 00 00 00 00 00 00 00 00 ......as........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000000eb3bb85>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<000000000eb3bb85>] slab_post_alloc_hook mm/slab.h:586 [inline]
[<000000000eb3bb85>] slab_alloc mm/slab.c:3320 [inline]
[<000000000eb3bb85>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549
[<000000007b99f620>] kmalloc include/linux/slab.h:556 [inline]
[<000000007b99f620>] vlan_dev_set_egress_priority+0xcc/0x150 net/8021q/vlan_dev.c:194
[<000000007b0cb745>] vlan_changelink+0xd6/0x140 net/8021q/vlan_netlink.c:126
[<0000000065aba83a>] vlan_newlink+0x135/0x200 net/8021q/vlan_netlink.c:181
[<00000000fb5dd7a2>] __rtnl_newlink+0x89a/0xb80 net/core/rtnetlink.c:3305
[<00000000ae4273a1>] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3363
[<00000000decab39f>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424
[<00000000accba4ee>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<00000000319fe20f>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
[<00000000d51938dc>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000d51938dc>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328
[<00000000e539ac79>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917
[<000000006250c27e>] sock_sendmsg_nosec net/socket.c:639 [inline]
[<000000006250c27e>] sock_sendmsg+0x54/0x70 net/socket.c:659
[<00000000e2a156d1>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330
[<000000008c87466e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384
[<00000000110e3054>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417
[<00000000d71077c8>] __do_sys_sendmsg net/socket.c:2426 [inline]
[<00000000d71077c8>] __se_sys_sendmsg net/socket.c:2424 [inline]
[<00000000d71077c8>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424
Fixe: 07b5b17e157b ("[VLAN]: Use rtnl_link API")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2020-01-07
The following pull-request contains BPF updates for your *net* tree.
We've added 2 non-merge commits during the last 1 day(s) which contain
a total of 2 files changed, 16 insertions(+), 4 deletions(-).
The main changes are:
1) Fix a use-after-free in cgroup BPF due to auto-detachment, from Roman Gushchin.
2) Fix skb out-of-bounds access in ld_abs/ind instruction, from Daniel Borkmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add one notifier for udev changes net device name.
Fixes: b6601323ef9e ("net: stmmac: debugfs entry name is not be changed when udev rename")
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-01-06
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v5.3
('net/mlx5: Move devlink registration before interfaces load')
For -stable v5.4
('net/mlx5e: Fix hairpin RSS table size')
('net/mlx5: DR, Init lists that are used in rule's member')
('net/mlx5e: Always print health reporter message to dmesg')
('net/mlx5: DR, No need for atomic refcount for internal SW steering resources')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Whenever adding new member of rule object we attach it to 2 lists,
These 2 lists should be initialized first.
Fixes: 41d07074154c ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Set hairpin table size to the corret size, based on the groups that
would be created in it. Groups are laid out on the table such that a
group occupies a range of entries in the table. This implies that the
group ranges should have correspondence to the table they are laid upon.
The patch cited below made group 1's size to grow hence causing
overflow of group range laid on the table.
Fixes: a795d8db2a6d ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
No need for an atomic refcounter for the STE and hashtables.
These are internal SW steering resources and they are always
under domain mutex.
This also fixes the following refcount error:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 9 PID: 3527 at lib/refcount.c:25 refcount_warn_saturate+0x81/0xe0
Call Trace:
dr_table_init_nic+0x10d/0x110 [mlx5_core]
mlx5dr_table_create+0xb4/0x230 [mlx5_core]
mlx5_cmd_dr_create_flow_table+0x39/0x120 [mlx5_core]
__mlx5_create_flow_table+0x221/0x5f0 [mlx5_core]
esw_create_offloads_fdb_tables+0x180/0x5a0 [mlx5_core]
...
Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
This reverts commit 7dee607ed0e04500459db53001d8e02f8831f084.
During cleanup path, FTE's parent node group is removed which is
referenced by the FTE while freeing the FTE.
Hence FTE's lockless read lookup optimization done in cited commit is
not possible at the moment.
Hence, revert the commit.
This avoid below KAZAN call trace.
[ 110.390896] BUG: KASAN: use-after-free in find_root.isra.14+0x56/0x60
[mlx5_core]
[ 110.391048] Read of size 4 at addr ffff888c19e6d220 by task
swapper/12/0
[ 110.391219] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.5.0-rc1+
[ 110.391222] Hardware name: HP ProLiant DL380p Gen8, BIOS P70
08/02/2014
[ 110.391225] Call Trace:
[ 110.391229] <IRQ>
[ 110.391246] dump_stack+0x95/0xd5
[ 110.391307] ? find_root.isra.14+0x56/0x60 [mlx5_core]
[ 110.391320] print_address_description.constprop.5+0x20/0x320
[ 110.391379] ? find_root.isra.14+0x56/0x60 [mlx5_core]
[ 110.391435] ? find_root.isra.14+0x56/0x60 [mlx5_core]
[ 110.391441] __kasan_report+0x149/0x18c
[ 110.391499] ? find_root.isra.14+0x56/0x60 [mlx5_core]
[ 110.391504] kasan_report+0x12/0x20
[ 110.391511] __asan_report_load4_noabort+0x14/0x20
[ 110.391567] find_root.isra.14+0x56/0x60 [mlx5_core]
[ 110.391625] del_sw_fte_rcu+0x4a/0x100 [mlx5_core]
[ 110.391633] rcu_core+0x404/0x1950
[ 110.391640] ? rcu_accelerate_cbs_unlocked+0x100/0x100
[ 110.391649] ? run_rebalance_domains+0x201/0x280
[ 110.391654] rcu_core_si+0xe/0x10
[ 110.391661] __do_softirq+0x181/0x66c
[ 110.391670] irq_exit+0x12c/0x150
[ 110.391675] smp_apic_timer_interrupt+0xf0/0x370
[ 110.391681] apic_timer_interrupt+0xf/0x20
[ 110.391684] </IRQ>
[ 110.391695] RIP: 0010:cpuidle_enter_state+0xfa/0xba0
[ 110.391703] Code: 3d c3 9b b5 50 e8 56 75 6e fe 48 89 45 c8 0f 1f 44
00 00 31 ff e8 a6 94 6e fe 45 84 ff 0f 85 f6 02 00 00 fb 66 0f 1f 44 00
00 <45> 85 f6 0f 88 db 06 00 00 4d 63 fe 4b 8d 04 7f 49 8d 04 87 49 8d
[ 110.391706] RSP: 0018:ffff888c23a6fce8 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffff13
[ 110.391712] RAX: dffffc0000000000 RBX: ffffe8ffff7002f8 RCX:
000000000000001f
[ 110.391715] RDX: 1ffff11184ee6cb5 RSI: 0000000040277d83 RDI:
ffff888c277365a8
[ 110.391718] RBP: ffff888c23a6fd40 R08: 0000000000000002 R09:
0000000000035280
[ 110.391721] R10: ffff888c23a6fc80 R11: ffffed11847485d0 R12:
ffffffffb1017740
[ 110.391723] R13: 0000000000000003 R14: 0000000000000003 R15:
0000000000000000
[ 110.391732] ? cpuidle_enter_state+0xea/0xba0
[ 110.391738] cpuidle_enter+0x4f/0xa0
[ 110.391747] call_cpuidle+0x6d/0xc0
[ 110.391752] do_idle+0x360/0x430
[ 110.391758] ? arch_cpu_idle_exit+0x40/0x40
[ 110.391765] ? complete+0x67/0x80
[ 110.391771] cpu_startup_entry+0x1d/0x20
[ 110.391779] start_secondary+0x2f3/0x3c0
[ 110.391784] ? set_cpu_sibling_map+0x2500/0x2500
[ 110.391795] secondary_startup_64+0xa4/0xb0
[ 110.391841] Allocated by task 290:
[ 110.391917] save_stack+0x21/0x90
[ 110.391921] __kasan_kmalloc.constprop.8+0xa7/0xd0
[ 110.391925] kasan_kmalloc+0x9/0x10
[ 110.391929] kmem_cache_alloc_trace+0xf6/0x270
[ 110.391987] create_root_ns.isra.36+0x58/0x260 [mlx5_core]
[ 110.392044] mlx5_init_fs+0x5fd/0x1ee0 [mlx5_core]
[ 110.392092] mlx5_load_one+0xc7a/0x3860 [mlx5_core]
[ 110.392139] init_one+0x6ff/0xf90 [mlx5_core]
[ 110.392145] local_pci_probe+0xde/0x190
[ 110.392150] work_for_cpu_fn+0x56/0xa0
[ 110.392153] process_one_work+0x678/0x1140
[ 110.392157] worker_thread+0x573/0xba0
[ 110.392162] kthread+0x341/0x400
[ 110.392166] ret_from_fork+0x1f/0x40
[ 110.392218] Freed by task 2742:
[ 110.392288] save_stack+0x21/0x90
[ 110.392292] __kasan_slab_free+0x137/0x190
[ 110.392296] kasan_slab_free+0xe/0x10
[ 110.392299] kfree+0x94/0x250
[ 110.392357] tree_put_node+0x257/0x360 [mlx5_core]
[ 110.392413] tree_remove_node+0x63/0xb0 [mlx5_core]
[ 110.392469] clean_tree+0x199/0x240 [mlx5_core]
[ 110.392525] mlx5_cleanup_fs+0x76/0x580 [mlx5_core]
[ 110.392572] mlx5_unload+0x22/0xc0 [mlx5_core]
[ 110.392619] mlx5_unload_one+0x99/0x260 [mlx5_core]
[ 110.392666] remove_one+0x61/0x160 [mlx5_core]
[ 110.392671] pci_device_remove+0x10b/0x2c0
[ 110.392677] device_release_driver_internal+0x1e4/0x490
[ 110.392681] device_driver_detach+0x36/0x40
[ 110.392685] unbind_store+0x147/0x200
[ 110.392688] drv_attr_store+0x6f/0xb0
[ 110.392693] sysfs_kf_write+0x127/0x1d0
[ 110.392697] kernfs_fop_write+0x296/0x420
[ 110.392702] __vfs_write+0x66/0x110
[ 110.392707] vfs_write+0x1a0/0x500
[ 110.392711] ksys_write+0x164/0x250
[ 110.392715] __x64_sys_write+0x73/0xb0
[ 110.392720] do_syscall_64+0x9f/0x3a0
[ 110.392725] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 7dee607ed0e0 ("net/mlx5: Support lockless FTE read lookups")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Register devlink before interfaces are added.
This will allow interfaces to use devlink while initalizing. For example,
call mlx5_is_roce_enabled.
Fixes: aba25279c100 ("net/mlx5e: Add TX reporter support")
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In case a reporter exists, error message is logged only to the devlink
tracer. The devlink tracer is a visibility utility only, which user can
choose not to monitor.
After cited patch, 3rd party monitoring tools that tracks these error
message will no longer find them in dmesg, causing a regression.
With this patch, error messages are also logged into the dmesg.
Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Following scenario easily break driver logic and crash the kernel:
1. Add rule with mirred actions to same device.
2. Delete this rule.
In described scenario rule is not added to database and on deletion
driver access invalid entry.
Example:
$ tc filter add dev ens1f0_0 ingress protocol ip prio 1 \
flower skip_sw \
action mirred egress mirror dev ens1f0_1 pipe \
action mirred egress redirect dev ens1f0_1
$ tc filter del dev ens1f0_0 ingress protocol ip prio 1
Dmesg output:
[ 376.634396] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 3439): DESTROY_FLOW_GROUP(0x934) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x563e2f)
[ 376.654983] mlx5_core 0000:82:00.0: del_hw_flow_group:567:(pid 3439): flow steering can't destroy fg 89 of ft 3145728
[ 376.673433] kasan: CONFIG_KASAN_INLINE enabled
[ 376.683769] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 376.695229] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
[ 376.705069] CPU: 7 PID: 3439 Comm: tc Not tainted 5.4.0-rc5+ #76
[ 376.714959] Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0a 08/12/2016
[ 376.726371] RIP: 0010:mlx5_del_flow_rules+0x105/0x960 [mlx5_core]
[ 376.735817] Code: 01 00 00 00 48 83 eb 08 e8 28 d9 ff ff 4c 39 e3 75 d8 4c 8d bd c0 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 84 04 00 00 48 8d 7d 28 8b 9 d
[ 376.761261] RSP: 0018:ffff888847c56db8 EFLAGS: 00010202
[ 376.770054] RAX: dffffc0000000000 RBX: ffff8888582a6da0 RCX: ffff888847c56d60
[ 376.780743] RDX: 0000000000000058 RSI: 0000000000000008 RDI: 0000000000000282
[ 376.791328] RBP: 0000000000000000 R08: fffffbfff0c60ea6 R09: fffffbfff0c60ea6
[ 376.802050] R10: fffffbfff0c60ea5 R11: ffffffff8630752f R12: ffff8888582a6da0
[ 376.812798] R13: dffffc0000000000 R14: ffff8888582a6da0 R15: 00000000000002c0
[ 376.823445] FS: 00007f675f9a8840(0000) GS:ffff88886d200000(0000) knlGS:0000000000000000
[ 376.834971] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 376.844179] CR2: 00000000007d9640 CR3: 00000007d3f26003 CR4: 00000000001606e0
[ 376.854843] Call Trace:
[ 376.868542] __mlx5_eswitch_del_rule+0x49/0x300 [mlx5_core]
[ 376.877735] mlx5e_tc_del_fdb_flow+0x6ec/0x9e0 [mlx5_core]
[ 376.921549] mlx5e_flow_put+0x2b/0x50 [mlx5_core]
[ 376.929813] mlx5e_delete_flower+0x5b6/0xbd0 [mlx5_core]
[ 376.973030] tc_setup_cb_reoffload+0x29/0xc0
[ 376.980619] fl_reoffload+0x50a/0x770 [cls_flower]
[ 377.015087] tcf_block_playback_offloads+0xbd/0x250
[ 377.033400] tcf_block_setup+0x1b2/0xc60
[ 377.057247] tcf_block_offload_cmd+0x195/0x240
[ 377.098826] tcf_block_offload_unbind+0xe7/0x180
[ 377.107056] __tcf_block_put+0xe5/0x400
[ 377.114528] ingress_destroy+0x3d/0x60 [sch_ingress]
[ 377.122894] qdisc_destroy+0xf1/0x5a0
[ 377.129993] qdisc_graft+0xa3d/0xe50
[ 377.151227] tc_get_qdisc+0x48e/0xa20
[ 377.165167] rtnetlink_rcv_msg+0x35d/0x8d0
[ 377.199528] netlink_rcv_skb+0x11e/0x340
[ 377.219638] netlink_unicast+0x408/0x5b0
[ 377.239913] netlink_sendmsg+0x71b/0xb30
[ 377.267505] sock_sendmsg+0xb1/0xf0
[ 377.273801] ___sys_sendmsg+0x635/0x900
[ 377.312784] __sys_sendmsg+0xd3/0x170
[ 377.338693] do_syscall_64+0x95/0x460
[ 377.344833] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 377.352321] RIP: 0033:0x7f675e58e090
To avoid this, for every mirred action check if output device was
already processed. If so - drop rule with EOPNOTSUPP error.
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Anatoly has been fuzzing with kBdysch harness and reported a KASAN
slab oob in one of the outcomes:
[...]
[ 77.359642] BUG: KASAN: slab-out-of-bounds in bpf_skb_load_helper_8_no_cache+0x71/0x130
[ 77.360463] Read of size 4 at addr ffff8880679bac68 by task bpf/406
[ 77.361119]
[ 77.361289] CPU: 2 PID: 406 Comm: bpf Not tainted 5.5.0-rc2-xfstests-00157-g2187f215eba #1
[ 77.362134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 77.362984] Call Trace:
[ 77.363249] dump_stack+0x97/0xe0
[ 77.363603] print_address_description.constprop.0+0x1d/0x220
[ 77.364251] ? bpf_skb_load_helper_8_no_cache+0x71/0x130
[ 77.365030] ? bpf_skb_load_helper_8_no_cache+0x71/0x130
[ 77.365860] __kasan_report.cold+0x37/0x7b
[ 77.366365] ? bpf_skb_load_helper_8_no_cache+0x71/0x130
[ 77.366940] kasan_report+0xe/0x20
[ 77.367295] bpf_skb_load_helper_8_no_cache+0x71/0x130
[ 77.367821] ? bpf_skb_load_helper_8+0xf0/0xf0
[ 77.368278] ? mark_lock+0xa3/0x9b0
[ 77.368641] ? kvm_sched_clock_read+0x14/0x30
[ 77.369096] ? sched_clock+0x5/0x10
[ 77.369460] ? sched_clock_cpu+0x18/0x110
[ 77.369876] ? bpf_skb_load_helper_8+0xf0/0xf0
[ 77.370330] ___bpf_prog_run+0x16c0/0x28f0
[ 77.370755] __bpf_prog_run32+0x83/0xc0
[ 77.371153] ? __bpf_prog_run64+0xc0/0xc0
[ 77.371568] ? match_held_lock+0x1b/0x230
[ 77.371984] ? rcu_read_lock_held+0xa1/0xb0
[ 77.372416] ? rcu_is_watching+0x34/0x50
[ 77.372826] sk_filter_trim_cap+0x17c/0x4d0
[ 77.373259] ? sock_kzfree_s+0x40/0x40
[ 77.373648] ? __get_filter+0x150/0x150
[ 77.374059] ? skb_copy_datagram_from_iter+0x80/0x280
[ 77.374581] ? do_raw_spin_unlock+0xa5/0x140
[ 77.375025] unix_dgram_sendmsg+0x33a/0xa70
[ 77.375459] ? do_raw_spin_lock+0x1d0/0x1d0
[ 77.375893] ? unix_peer_get+0xa0/0xa0
[ 77.376287] ? __fget_light+0xa4/0xf0
[ 77.376670] __sys_sendto+0x265/0x280
[ 77.377056] ? __ia32_sys_getpeername+0x50/0x50
[ 77.377523] ? lock_downgrade+0x350/0x350
[ 77.377940] ? __sys_setsockopt+0x2a6/0x2c0
[ 77.378374] ? sock_read_iter+0x240/0x240
[ 77.378789] ? __sys_socketpair+0x22a/0x300
[ 77.379221] ? __ia32_sys_socket+0x50/0x50
[ 77.379649] ? mark_held_locks+0x1d/0x90
[ 77.380059] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 77.380536] __x64_sys_sendto+0x74/0x90
[ 77.380938] do_syscall_64+0x68/0x2a0
[ 77.381324] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 77.381878] RIP: 0033:0x44c070
[...]
After further debugging, turns out while in case of other helper functions
we disallow passing modified ctx, the special case of ld/abs/ind instruction
which has similar semantics (except r6 being the ctx argument) is missing
such check. Modified ctx is impossible here as bpf_skb_load_helper_8_no_cache()
and others are expecting skb fields in original position, hence, add
check_ctx_reg() to reject any modified ctx. Issue was first introduced back
in f1174f77b50c ("bpf/verifier: rework value tracking").
Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200106215157.3553-1-daniel@iogearbox.net
|
|
Igor Russkikh says:
====================
Aquantia/Marvell atlantic bugfixes 2020/01
Here is a set of recently discovered bugfixes,
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Function entries were duplicated accidentally, removing the dups.
Fixes: ea4b4d7fc106 ("net: atlantic: loopback tests via private flags")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initial loopback configuration should be called earlier, before
starting traffic on HW blocks. Otherwise depending on race conditions
it could be kept disabled.
Fixes: ea4b4d7fc106 ("net: atlantic: loopback tests via private flags")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Last code/checkpatch cleanup did a copy paste error where code from
firmware 3 API logic was moved to firmware 1 logic.
This resulted in FW1.x users would never see the link state as active.
Fixes: 7b0c342f1f67 ("net: atlantic: code style cleanup")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before commit 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
cgroup bpf structures were released with
corresponding cgroup structures. It guaranteed the hierarchical order
of destruction: children were always first. It preserved attached
programs from being released before their propagated copies.
But with cgroup auto-detachment there are no such guarantees anymore:
cgroup bpf is released as soon as the cgroup is offline and there are
no live associated sockets. It means that an attached program can be
detached and released, while its propagated copy is still living
in the cgroup subtree. This will obviously lead to an use-after-free
bug.
To reproduce the issue the following script can be used:
#!/bin/bash
CGROOT=/sys/fs/cgroup
mkdir -p ${CGROOT}/A ${CGROOT}/B ${CGROOT}/A/C
sleep 1
./test_cgrp2_attach ${CGROOT}/A egress &
A_PID=$!
./test_cgrp2_attach ${CGROOT}/B egress &
B_PID=$!
echo $$ > ${CGROOT}/A/C/cgroup.procs
iperf -s &
S_PID=$!
iperf -c localhost -t 100 &
C_PID=$!
sleep 1
echo $$ > ${CGROOT}/B/cgroup.procs
echo ${S_PID} > ${CGROOT}/B/cgroup.procs
echo ${C_PID} > ${CGROOT}/B/cgroup.procs
sleep 1
rmdir ${CGROOT}/A/C
rmdir ${CGROOT}/A
sleep 1
kill -9 ${S_PID} ${C_PID} ${A_PID} ${B_PID}
On the unpatched kernel the following stacktrace can be obtained:
[ 33.619799] BUG: unable to handle page fault for address: ffffbdb4801ab002
[ 33.620677] #PF: supervisor read access in kernel mode
[ 33.621293] #PF: error_code(0x0000) - not-present page
[ 33.622754] Oops: 0000 [#1] SMP NOPTI
[ 33.623202] CPU: 0 PID: 601 Comm: iperf Not tainted 5.5.0-rc2+ #23
[ 33.625545] RIP: 0010:__cgroup_bpf_run_filter_skb+0x29f/0x3d0
[ 33.635809] Call Trace:
[ 33.636118] ? __cgroup_bpf_run_filter_skb+0x2bf/0x3d0
[ 33.636728] ? __switch_to_asm+0x40/0x70
[ 33.637196] ip_finish_output+0x68/0xa0
[ 33.637654] ip_output+0x76/0xf0
[ 33.638046] ? __ip_finish_output+0x1c0/0x1c0
[ 33.638576] __ip_queue_xmit+0x157/0x410
[ 33.639049] __tcp_transmit_skb+0x535/0xaf0
[ 33.639557] tcp_write_xmit+0x378/0x1190
[ 33.640049] ? _copy_from_iter_full+0x8d/0x260
[ 33.640592] tcp_sendmsg_locked+0x2a2/0xdc0
[ 33.641098] ? sock_has_perm+0x10/0xa0
[ 33.641574] tcp_sendmsg+0x28/0x40
[ 33.641985] sock_sendmsg+0x57/0x60
[ 33.642411] sock_write_iter+0x97/0x100
[ 33.642876] new_sync_write+0x1b6/0x1d0
[ 33.643339] vfs_write+0xb6/0x1a0
[ 33.643752] ksys_write+0xa7/0xe0
[ 33.644156] do_syscall_64+0x5b/0x1b0
[ 33.644605] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by grabbing a reference to the bpf structure of each ancestor
on the initialization of the cgroup bpf structure, and dropping the
reference at the end of releasing the cgroup bpf structure.
This will restore the hierarchical order of cgroup bpf releasing,
without adding any operations on hot paths.
Thanks to Josef Bacik for the debugging and the initial analysis of
the problem.
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix calling multiple tee_client_close_context in case of shm allocation
fails.
Fixes: 246880958ac9 (“firmware: broadcom: add OP-TEE based BNXT f/w manager”)
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 6390 family uses an extended register to set the port connected to
the CPU. The lower 5 bits indicate the port, the upper three bits are
the priority of the frames as they pass through the switch, what
egress queue they should use, etc. Since frames being set to the CPU
are typically management frames, BPDU, IGMP, ARP, etc set the priority
to 7, the reset default, and the highest.
Fixes: 33641994a676 ("net: dsa: mv88e6xxx: Monitor and Management tables")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix up inconsistent usage of upper and lowercase letters in "Samsung"
name.
"SAMSUNG" is not an abbreviation but a regular trademarked name.
Therefore it should be written with lowercase letters starting with
capital letter.
Although advertisement materials usually use uppercase "SAMSUNG", the
lowercase version is used in all legal aspects (e.g. on Wikipedia and in
privacy/legal statements on
https://www.samsung.com/semiconductor/privacy-global/).
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since net_device.mem_start is unsigned long, it should not be cast to
int right before casting to pointer. This fixes warning (compile
testing on alpha architecture):
drivers/net/wan/sdla.c: In function ‘sdla_transmit’:
drivers/net/wan/sdla.c:711:13: warning:
cast to pointer from integer of different size [-Wint-to-pointer-cast]
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to fix a memleak caused by no place to free cmd->obj.chunk
for the unprocessed SCTP_CMD_REPLY. This issue occurs when failing to
process a cmd while there're still SCTP_CMD_REPLY cmds on the cmd seq
with an allocated chunk in cmd->obj.chunk.
So fix it by freeing cmd->obj.chunk for each SCTP_CMD_REPLY cmd left on
the cmd seq when any cmd returns error. While at it, also remove 'nomem'
label.
Reported-by: syzbot+107c4aff5f392bf1517f@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot found the following crash on:
=====================================================
BUG: KMSAN: uninit-value in __nlmsg_parse include/net/netlink.h:661 [inline]
BUG: KMSAN: uninit-value in nlmsg_parse_deprecated
include/net/netlink.h:706 [inline]
BUG: KMSAN: uninit-value in __tipc_nl_compat_dumpit+0x553/0x11e0
net/tipc/netlink_compat.c:215
CPU: 0 PID: 12425 Comm: syz-executor062 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
__msan_warning+0x57/0xa0 mm/kmsan/kmsan_instr.c:245
__nlmsg_parse include/net/netlink.h:661 [inline]
nlmsg_parse_deprecated include/net/netlink.h:706 [inline]
__tipc_nl_compat_dumpit+0x553/0x11e0 net/tipc/netlink_compat.c:215
tipc_nl_compat_dumpit+0x761/0x910 net/tipc/netlink_compat.c:308
tipc_nl_compat_handle net/tipc/netlink_compat.c:1252 [inline]
tipc_nl_compat_recv+0x12e9/0x2870 net/tipc/netlink_compat.c:1311
genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:717 [inline]
genl_rcv_msg+0x1dd0/0x23a0 net/netlink/genetlink.c:734
netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
genl_rcv+0x63/0x80 net/netlink/genetlink.c:745
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0xfa0/0x1100 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x11f0/0x1480 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:639 [inline]
sock_sendmsg net/socket.c:659 [inline]
____sys_sendmsg+0x1362/0x13f0 net/socket.c:2330
___sys_sendmsg net/socket.c:2384 [inline]
__sys_sendmsg+0x4f0/0x5e0 net/socket.c:2417
__do_sys_sendmsg net/socket.c:2426 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x444179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 1b d8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffd2d6409c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000444179
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 00000000006ce018 R08: 0000000000000000 R09: 00000000004002e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401e20
R13: 0000000000401eb0 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:86
slab_alloc_node mm/slub.c:2774 [inline]
__kmalloc_node_track_caller+0xe47/0x11f0 mm/slub.c:4382
__kmalloc_reserve net/core/skbuff.c:141 [inline]
__alloc_skb+0x309/0xa50 net/core/skbuff.c:209
alloc_skb include/linux/skbuff.h:1049 [inline]
nlmsg_new include/net/netlink.h:888 [inline]
tipc_nl_compat_dumpit+0x6e4/0x910 net/tipc/netlink_compat.c:301
tipc_nl_compat_handle net/tipc/netlink_compat.c:1252 [inline]
tipc_nl_compat_recv+0x12e9/0x2870 net/tipc/netlink_compat.c:1311
genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:717 [inline]
genl_rcv_msg+0x1dd0/0x23a0 net/netlink/genetlink.c:734
netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
genl_rcv+0x63/0x80 net/netlink/genetlink.c:745
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0xfa0/0x1100 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x11f0/0x1480 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:639 [inline]
sock_sendmsg net/socket.c:659 [inline]
____sys_sendmsg+0x1362/0x13f0 net/socket.c:2330
___sys_sendmsg net/socket.c:2384 [inline]
__sys_sendmsg+0x4f0/0x5e0 net/socket.c:2417
__do_sys_sendmsg net/socket.c:2426 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x44/0xa9
=====================================================
The complaint above occurred because the memory region pointed by attrbuf
variable was not initialized. To eliminate this warning, we use kcalloc()
rather than kmalloc_array() to allocate memory for attrbuf.
Reported-by: syzbot+b1fd2bf2c89d8407e15f@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The only clk init function in this driver that register a clk is
fu540_c000_clk_init(), and thus we need to unregister the clk when this
driver is removed on that platform. Other init functions, for example
macb_clk_init(), don't register clks and therefore we shouldn't
unregister the clks when this driver is removed. Convert this
registration path to devm so it gets auto-unregistered when this driver
is removed and drop the clk_unregister() calls in driver remove (and
error paths) so that we don't erroneously remove a clk from the system
that isn't registered by this driver.
Otherwise we get strange crashes with a use-after-free when the
devm_clk_get() call in macb_clk_init() calls clk_put() on a clk pointer
that has become invalid because it is freed in clk_unregister().
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Yash Shah <yash.shah@sifive.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The emails to ks.giri@samsung.com and vipul.pandya@samsung.com bounce
with 550 error code:
host mailin.samsung.com[203.254.224.12] said: 550
5.1.1 Recipient address rejected: User unknown (in reply to RCPT TO
command)"
Drop Girish K S and Vipul Pandya from sxgbe maintainers entry.
Cc: Byungho An <bh74.an@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The len used for skb_put_padto is wrong, it need to add len of hdr.
In qrtr_node_enqueue, local variable size_t len is assign with
skb->len, then skb_push(skb, sizeof(*hdr)) will add skb->len with
sizeof(*hdr), so local variable size_t len is not same with skb->len
after skb_push(skb, sizeof(*hdr)).
Then the purpose of skb_put_padto(skb, ALIGN(len, 4)) is to add add
pad to the end of the skb's data if skb->len is not aligned to 4, but
unfortunately it use len instead of skb->len, at this line, skb->len
is 32 bytes(sizeof(*hdr)) more than len, for example, len is 3 bytes,
then skb->len is 35 bytes(3 + 32), and ALIGN(len, 4) is 4 bytes, so
__skb_put_padto will do nothing after check size(35) < len(4), the
correct value should be 36(sizeof(*hdr) + ALIGN(len, 4) = 32 + 4),
then __skb_put_padto will pass check size(35) < len(36) and add 1 byte
to the end of skb's data, then logic is correct.
function of skb_push:
void *skb_push(struct sk_buff *skb, unsigned int len)
{
skb->data -= len;
skb->len += len;
if (unlikely(skb->data < skb->head))
skb_under_panic(skb, len, __builtin_return_address(0));
return skb->data;
}
function of skb_put_padto
static inline int skb_put_padto(struct sk_buff *skb, unsigned int len)
{
return __skb_put_padto(skb, len, true);
}
function of __skb_put_padto
static inline int __skb_put_padto(struct sk_buff *skb, unsigned int len,
bool free_on_error)
{
unsigned int size = skb->len;
if (unlikely(size < len)) {
len -= size;
if (__skb_pad(skb, len, free_on_error))
return -ENOMEM;
__skb_put(skb, len);
}
return 0;
}
Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Atomic operations that span cache lines are super-expensive on x86
(not just to the current processor, but also to other processes as all
memory operations are blocked until the operation completes). Upcoming
x86 processors have a switch to cause such operations to generate a #AC
trap. It is expected that some real time systems will enable this mode
in BIOS.
In preparation for this, it is necessary to fix code that may execute
atomic instructions with operands that cross cachelines because the #AC
trap will crash the kernel.
Since "pwol_mask" is local and never exposed to concurrency, there is
no need to set bits in pwol_mask using atomic operations.
Directly operate on the byte which contains the bit instead of using
__set_bit() to avoid any big endian concern due to type cast to
unsigned long in __set_bit().
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current code use dma_wmb() to ensure Rx/Tx descriptors are visible
to device before writing to doorbell.
However, these dma_wmb() are wrong and unnecessary. Therefore,
they should be removed.
iowrite32be() called from gve_rx_write_doorbell()/gve_tx_put_doorbell()
should guaratee that all previous writes to WB/UC memory is visible to
device before the write done by iowrite32be().
E.g. On ARM64, iowrite32be() calls __iowmb() which expands to dma_wmb()
and only then calls __raw_writel().
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The kernel test robot reports a boot failure with qemu in 5.5-rc,
referencing commit 2203cbf2c8b5 ("net: sfp: move fwnode parsing into
sfp-bus layer"). This is caused by phylink_create() being passed a
NULL fwnode, causing fwnode_property_get_reference_args() to return
-EINVAL.
Don't attempt to attach to a SFP bus if we have no fwnode, which
avoids this issue.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Fixes: 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DaveM's git tree have been moved into a named subdir 'netdev' to deal with
allowing Jakub Kicinski to help co-maintain the trees.
Link: https://www.kernel.org/doc/html/latest/networking/netdev-FAQ.html
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The loopback feature is only supported on a few drivers like broadcom,
mellanox, etc. The default veth driver has not supported it yet. To avoid
returning failed and making the runner feel confused, let's just skip
the test on drivers that not support loopback.
Fixes: ad11340994d5 ("selftests: Add loopback test")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2020-01-02
this is a pull request of 9 patches for net/master.
The first 5 patches target all the tcan4x5x driver. The first 3 patches
of them are by Dan Murphy and Sean Nyekjaer and improve the device
initialization (power on, reset and get device out of standby before
register access). The next patch is by Dan Murphy and disables the INH
pin device-state if the GPIO is unavailable. The last patch for the
tcan4x5x driver is by Gustavo A. R. Silva and fixes an inconsistent
PTR_ERR check in the tcan4x5x_parse_config() function.
The next patch is by Oliver Hartkopp and targets the generic CAN device
infrastructure. It ensures that an initialized headroom in outgoing CAN
sk_buffs (e.g. if injected by AF_PACKET).
The last 2 patches are by Johan Hovold and fix the kvaser_usb and gs_usb
drivers by always using the current alternate setting not blindly the
first one.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to dump the FECs registers the clocks have to be ticking,
otherwise a data abort occurs. Add calls to runtime PM so they are
enabled and later disabled.
Fixes: e8fcfcd5684a ("net: fec: optimize the clock management to save power")
Reported-by: Chris Healy <Chris.Healy@zii.aero>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before ip_tunnel_ecn_encap() and udp_tunnel_xmit_skb() we should filter
tos value by RT_TOS() instead of using config tos directly.
vxlan_get_route() would filter the tos to fl4.flowi4_tos but we didn't
return it back, as geneve_get_v4_rt() did. So we have to use RT_TOS()
directly in function ip_tunnel_ecn_encap().
Fixes: 206aaafcd279 ("VXLAN: Use IP Tunnels tunnel ENC encap API")
Fixes: 1400615d64cf ("vxlan: allow setting ipv6 traffic class")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The variables 'window_interval' is u64 and do_div()
truncates it to 32 bits, which means it can test
non-zero and be truncated to zero for division.
The unit of window_interval is nanoseconds,
so its lower 32-bit is relatively easy to exceed.
Fix this issue by using div64_u64() instead.
Fixes: 7298de9cd725 ("sch_cake: Add ingress mode")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: cake@lists.bufferbloat.net
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It can take on the values of '0', '1', and '2' and thus
is not a boolean.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we receive a D-SACK, where the sequence number satisfies:
undo_marker <= start_seq < end_seq <= prior_snd_una
we consider this is a valid D-SACK and tcp_is_sackblock_valid()
returns true, then this D-SACK is discarded as "old stuff",
but the variable first_sack_index is not marked as negative
in tcp_sacktag_write_queue().
If this D-SACK also carries a SACK that needs to be processed
(for example, the previous SACK segment was lost), this SACK
will be treated as a D-SACK in the following processing of
tcp_sacktag_write_queue(), which will eventually lead to
incorrect updates of undo_retrans and reordering.
Fixes: fd6dad616d4f ("[TCP]: Earlier SACK block verification & simplify access to them")
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mv88e6xxx_port_set_cmode() relies on cmode stored in struct
mv88e6xxx_port to skip cmode update when the requested value matches the
cached value. It turns out that mv88e6xxx_port_hidden_write() might
change the port cmode setting as a side effect, so we can't rely on the
cached value to determine that cmode update in not necessary.
Force cmode update in mv88e6341_port_set_cmode(), to make
serdes configuration work again. Other mv88e6xxx_port_set_cmode()
callers keep the current behaviour.
This fixes serdes configuration of the 6141 switch on SolidRun Clearfog
GT-8K.
Fixes: 7a3007d22e8 ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family")
Reported-by: Denis Odintsov <d.odintsov@traviangames.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
to irq mode
Under load, the RX side of the mscan driver can get stuck while TX still
works. Restarting the interface locks up the system. This behaviour
could be reproduced reliably on a MPC5121e based system.
The patch fixes the return value of the NAPI polling function (should be
the number of processed packets, not constant 1) and the condition under
which IRQs are enabled again after polling is finished.
With this patch, no more lockups were observed over a test period of ten
days.
Fixes: afa17a500a36 ("net/can: add driver for mscan family & mpc52xx_mscan")
Signed-off-by: Florian Faber <faber@faberman.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Make sure to always use the descriptors of the current alternate setting
to avoid future issues when accessing fields that may differ between
settings.
Signed-off-by: Johan Hovold <johan@kernel.org>
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Make sure to use the current alternate setting when verifying the
interface descriptors to avoid binding to an invalid interface.
Failing to do so could cause the driver to misbehave or trigger a WARN()
in usb_submit_urb() that kernels with panic_on_warn set would choke on.
Fixes: aec5fb2268b7 ("can: kvaser_usb: Add support for Kvaser USB hydra family")
Cc: stable <stable@vger.kernel.org> # 4.19
Cc: Jimmy Assarsson <extja@kvaser.com>
Cc: Christer Beskow <chbe@kvaser.com>
Cc: Nicklas Johansson <extnj@kvaser.com>
Cc: Martin Henriksson <mh@kvaser.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
CAN sk_buffs
KMSAN sysbot detected a read access to an untinitialized value in the
headroom of an outgoing CAN related sk_buff. When using CAN sockets this
area is filled appropriately - but when using a packet socket this
initialization is missing.
The problematic read access occurs in the CAN receive path which can
only be triggered when the sk_buff is sent through a (virtual) CAN
interface. So we check in the sending path whether we need to perform
the missing initializations.
Fixes: d3b58c47d330d ("can: replace timestamp as unique skb attribute")
Reported-by: syzbot+b02ff0707a97e4e79ebb@syzkaller.appspotmail.com
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.1
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|