Age | Commit message (Collapse) | Author |
|
DST_NOCOUNT in dst_entry::flags tracks whether the entry counts
toward route cache size (net->ipv6.sysctl.ip6_rt_max_size).
If the flag is NOT set, dst_ops::pcpuc_entries counter is incremented
in dist_init() and decremented in dst_destroy().
This flag is tied to allocation/deallocation of dst_entry and
should not be copied from another dst/route. Otherwise it can happen
that dst_ops::pcpuc_entries counter grows until no new routes can
be allocated because the counter reached ip6_rt_max_size due to
DST_NOCOUNT not set and thus no counter decrements on gc-ed routes.
Fixes: 3b6761d18bc1 ("net/ipv6: Move dst flags to booleans in fib entries")
Cc: David Ahern <dsahern@gmail.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 072222b488bc55cce92ff246bdc10115fd57d3ab.
I just read that this causes regressions.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current code assumes kcm users know they need to look for the
strparser offset within their bpf program, which is not documented
anywhere and examples laying around do not do.
The actual recv function does handle the offset well, so we can create a
temporary clone of the skb and pull that one up as required for parsing.
The pull itself has a cost if we are pulling beyond the head data,
measured to 2-3% latency in a noisy VM with a local client stressing
that path. The clone's impact seemed too small to measure.
This bug can be exhibited easily by implementing a "trivial" kcm parser
taking the first bytes as size, and on the client sending at least two
such packets in a single write().
Note that bpf sockmap has the same problem, both for parse and for recv,
so it would pulling twice or a real pull within the strparser logic if
anyone cares about that.
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In kTLS MSG_PEEK behavior is currently failing, strace example:
[pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
[pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2430] listen(4, 10) = 0
[pid 2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2430] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
[pid 2430] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2430] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2430] close(4) = 0
[pid 2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
[pid 2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
[pid 2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64
As can be seen from strace, there are two TLS records sent,
i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
peeking 'test_read_peektest_read_peektest'. This is clearly
wrong, and what happens is that given peek cannot call into
tls_sw_advance_skb() to unpause strparser and proceed with
the next skb, we end up looping over the current one, copying
the 'test_read_peek' over and over into the user provided
buffer.
Here, we can only peek into the currently held skb (current,
full TLS record) as otherwise we would end up having to hold
all the original skb(s) (depending on the peek depth) in a
separate queue when unpausing strparser to process next
records, minimally intrusive is to return only up to the
current record's size (which likely was what c46234ebb4d1
("tls: RX path for ktls") originally intended as well). Thus,
after patch we properly peek the first record:
[pid 2046] wait4(2075, <unfinished ...>
[pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
[pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2075] listen(4, 10) = 0
[pid 2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2075] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2075] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
[pid 2075] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2075] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2075] close(4) = 0
[pid 2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
[pid 2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
[pid 2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14
Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the unlikely case ip6_xmit() has to call skb_realloc_headroom(),
we need to call skb_set_owner_w() before consuming original skb,
otherwise we risk a use-after-free.
Bring IPv6 in line with what we do in IPv4 to fix this.
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-09-16
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix end boundary calculation in BTF for the type section, from Martin.
2) Fix and revert subtraction of pointers that was accidentally allowed
for unprivileged programs, from Alexei.
3) Fix bpf_msg_pull_data() helper by using __GFP_COMP in order to avoid
a warning in linearizing sg pages into a single one for large allocs,
from Tushar.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the UDPv6 early demux rx code path lacks some mandatory
checks, already implemented into the normal RX code path - namely
the checksum conversion and no_check6_rx check.
Similar to the previous commit, we move the common processing to
an UDPv6 specific helper and call it from both edemux code path
and normal code path. In respect to the UDPv4, we need to add an
explicit check for non zero csum according to no_check6_rx value.
Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Xin Long <lucien.xin@gmail.com>
Fixes: c9f2c1ae123a ("udp6: fix socket leak on early demux")
Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 2abb7cdc0dc8 ("udp: Add support for doing checksum
unnecessary conversion") left out the early demux path for
connected sockets. As a result IP_CMSG_CHECKSUM gives wrong
values for such socket when GRO is not enabled/available.
This change addresses the issue by moving the csum conversion to a
common helper and using such helper in both the default and the
early demux rx path.
Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Matteo reported the following splat, testing the datapath of TC 'sample':
BUG: KASAN: null-ptr-deref in tcf_sample_act+0xc4/0x310
Read of size 8 at addr 0000000000000000 by task nc/433
CPU: 0 PID: 433 Comm: nc Not tainted 4.19.0-rc3-kvm #17
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
Call Trace:
kasan_report.cold.6+0x6c/0x2fa
tcf_sample_act+0xc4/0x310
? dev_hard_start_xmit+0x117/0x180
tcf_action_exec+0xa3/0x160
tcf_classify+0xdd/0x1d0
htb_enqueue+0x18e/0x6b0
? deref_stack_reg+0x7a/0xb0
? htb_delete+0x4b0/0x4b0
? unwind_next_frame+0x819/0x8f0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
__dev_queue_xmit+0x722/0xca0
? unwind_get_return_address_ptr+0x50/0x50
? netdev_pick_tx+0xe0/0xe0
? save_stack+0x8c/0xb0
? kasan_kmalloc+0xbe/0xd0
? __kmalloc_track_caller+0xe4/0x1c0
? __kmalloc_reserve.isra.45+0x24/0x70
? __alloc_skb+0xdd/0x2e0
? sk_stream_alloc_skb+0x91/0x3b0
? tcp_sendmsg_locked+0x71b/0x15a0
? tcp_sendmsg+0x22/0x40
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
ip_finish_output2+0x495/0x590
? ip_copy_metadata+0x2e0/0x2e0
? skb_gso_validate_network_len+0x6f/0x110
? ip_finish_output+0x174/0x280
__tcp_transmit_skb+0xb17/0x12b0
? __tcp_select_window+0x380/0x380
tcp_write_xmit+0x913/0x1de0
? __sk_mem_schedule+0x50/0x80
tcp_sendmsg_locked+0x49d/0x15a0
? tcp_rcv_established+0x8da/0xa30
? tcp_set_state+0x220/0x220
? clear_user+0x1f/0x50
? iov_iter_zero+0x1ae/0x590
? __fget_light+0xa0/0xe0
tcp_sendmsg+0x22/0x40
__sys_sendto+0x1b0/0x250
? __ia32_sys_getpeername+0x40/0x40
? _copy_to_user+0x58/0x70
? poll_select_copy_remaining+0x176/0x200
? __pollwait+0x1c0/0x1c0
? ktime_get_ts64+0x11f/0x140
? kern_select+0x108/0x150
? core_sys_select+0x360/0x360
? vfs_read+0x127/0x150
? kernel_write+0x90/0x90
__x64_sys_sendto+0x6f/0x80
do_syscall_64+0x5d/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fefef2b129d
Code: ff ff ff ff eb b6 0f 1f 80 00 00 00 00 48 8d 05 51 37 0c 00 41 89 ca 8b 00 85 c0 75 20 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 6b f3 c3 66 0f 1f 84 00 00 00 00 00 41 56 41
RSP: 002b:00007fff2f5350c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 000056118d60c120 RCX: 00007fefef2b129d
RDX: 0000000000002000 RSI: 000056118d629320 RDI: 0000000000000003
RBP: 000056118d530370 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
R13: 000056118d5c2a10 R14: 000056118d5c2a10 R15: 000056118d5303b8
tcf_sample_act() tried to update its per-cpu stats, but tcf_sample_init()
forgot to allocate them, because tcf_idr_create() was called with a wrong
value of 'cpustats'. Setting it to true proved to fix the reported crash.
Reported-by: Matteo Croce <mcroce@redhat.com>
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As reported by Reobert O'Callahan, since Viro's commit to kill
dev_ifsioc() we attempt to copy too much data in compat mode,
which may lead to EFAULT when the 32-bit version of struct ifreq
sits at/near the end of a page boundary, and the next page isn't
mapped.
Fix this by passing the approprate compat/non-compat size to copy
and using that, as before the dev_ifsioc() removal. This works
because only the embedded "struct ifmap" has different size, and
this is only used in SIOCGIFMAP/SIOCSIFMAP which has a different
handler. All other parts of the union are naturally compatible.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199469.
Fixes: bf4405737f9f ("kill dev_ifsioc()")
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When splitting a GSO segment that consists of encapsulated packets, the
skb->mac_len of the segments can end up being set wrong, causing packet
drops in particular when using act_mirred and ifb interfaces in
combination with a qdisc that splits GSO packets.
This happens because at the time skb_segment() is called, network_header
will point to the inner header, throwing off the calculation in
skb_reset_mac_len(). The network_header is subsequently adjust by the
outer IP gso_segment handlers, but they don't set the mac_len.
Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6
gso_segment handlers, after they modify the network_header.
Many thanks to Eric Dumazet for his help in identifying the cause of
the bug.
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:
====================
pull request: bluetooth 2018-09-13
A few Bluetooth fixes for the 4.19-rc series:
- Fixed rw_semaphore leak in hci_ldisc
- Fixed local Out-of-Band pairing data handling
Let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This contains key material in crypto_send_aes_gcm_128 and
crypto_recv_aes_gcm_128.
Introduce union tls_crypto_context, and replace the two identical
unions directly embedded in struct tls_context with it. We can then
use this union to clean up the memory in the new tls_ctx_free()
function.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There's no need to copy the key to an on-stack buffer before calling
crypto_aead_setkey().
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update 'confirmed' timestamp when ARP packet is received. It shouldn't
affect locktime logic and anyway entry can be confirmed by any higher-layer
protocol. Thus it makes sense to confirm it when ARP packet is received.
Fixes: 77d7123342dc ("neighbour: update neigh timestamps iff update is effective")
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This fix addresses https://bugzilla.kernel.org/show_bug.cgi?id=201071
Commit 5025f7f7d506 wrongly relied on __dev_change_flags to notify users of
dev flag changes in the case when dev->rtnl_link_state = RTNL_LINK_INITIALIZED.
Fix it by indicating flag changes explicitly to __dev_notify_flags.
Fixes: 5025f7f7d506 ("rtnetlink: add rtnl_link_state check in rtnl_configure_link")
Reported-By: Liam mcbirnie <liam.mcbirnie@boeing.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we delete a chain of filters, we need to notify
user-space we are deleting each filters in this chain
too.
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In inet6_rtm_getroute, since Commit 93531c674315 ("net/ipv6: separate
handling of FIB entries from dst based routes"), it has used rt->from
to dump route info instead of rt.
However for some route like cache, some of its information like flags
or gateway is not the same as that of the 'from' one. It caused 'ip
route get' to dump the wrong route information.
In Jianlin's testing, the output information even lost the expiration
time for a pmtu route cache due to the wrong fib6_flags.
So change to use rt6_info members for dst addr, src addr, flags and
gateway when it tries to dump a route entry without fibmatch set.
v1->v2:
- not use rt6i_prefsrc.
- also fix the gw dump issue.
Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Helper bpg_msg_pull_data() can allocate multiple pages while
linearizing multiple scatterlist elements into one shared page.
However, if the shared page has size > PAGE_SIZE, using
copy_page_to_iter() causes below warning.
e.g.
[ 6367.019832] WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825
page_copy_sane.part.8+0x0/0x8
To avoid above warning, use __GFP_COMP while allocating multiple
contiguous pages.
Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When __tipc_dump_start() fails with running out of memory,
we have no reason to continue, especially we should avoid
calling tipc_dump_done().
Fixes: 8f5c5fcf3533 ("tipc: call start and done ops directly in __tipc_nl_compat_dumpit()")
Reported-and-tested-by: syzbot+3f8324abccfbf8c74a9f@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a rds sock is bound, it is inserted into the bind_hash_table
which is protected by RCU. But when releasing rds sock, after it
is removed from this hash table, it is freed immediately without
respecting RCU grace period. This could cause some use-after-free
as reported by syzbot.
Mark the rds sock with SOCK_RCU_FREE before inserting it into the
bind_hash_table, so that it would be always freed after a RCU grace
period.
The other problem is in rds_find_bound(), the rds sock could be
freed in between rhashtable_lookup_fast() and rds_sock_addref(),
so we need to extend RCU read lock protection in rds_find_bound()
to close this race condition.
Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com
Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: rds-devel@oss.oracle.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oarcle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When processing icmp unreachable message for erspan tunnel, tunnel id
should be erspan_net_id instead of ipgre_net_id.
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If erspan tunnel hasn't been established, we'd better send icmp port
unreachable message after receive erspan packets.
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert pr_info to net_info_ratelimited to limit the total number of
synflood warnings.
Commit 946cedccbd73 ("tcp: Change possible SYN flooding messages")
rate limits synflood warnings to one per listener.
Workloads that open many listener sockets can still see a high rate of
log messages. Syzkaller is one frequent example.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for you net tree:
1) Remove duplicated include at the end of UDP conntrack, from Yue Haibing.
2) Restore conntrack dependency on xt_cluster, from Martin Willi.
3) Fix splat with GSO skbs from the checksum target, from Florian Westphal.
4) Rework ct timeout support, the template strategy to attach custom timeouts
is not correct since it will not work in conjunction with conntrack zones
and we have a possible free after use when removing the rule due to missing
refcounting. To fix these problems, do not use conntrack template at all
and set custom timeout on the already valid conntrack object. This
fix comes with a preparation patch to simplify timeout adjustment by
initializating the first position of the timeout array for all of the
existing trackers. Patchset from Florian Westphal.
5) Fix missing dependency on from IPv4 chain NAT type, from Florian.
6) Release chain reference counter from the flush path, from Taehee Yoo.
7) After flushing an iptables ruleset, conntrack hooks are unregistered
and entries are left stale to be cleaned up by the timeout garbage
collector. No TCP tracking is done on established flows by this time.
If ruleset is reloaded, then hooks are registered again and TCP
tracking is restored, which considers packets to be invalid. Clear
window tracking to exercise TCP flow pickup from the middle given that
history is lost for us. Again from Florian.
8) Fix crash from netlink interface with CONFIG_NF_CONNTRACK_TIMEOUT=y
and CONFIG_NF_CT_NETLINK_TIMEOUT=n.
9) Broken CT target due to returning incorrect type from
ctnl_timeout_find_get().
10) Solve conntrack clash on NF_REPEAT verdicts too, from Michal Vaner.
11) Missing conversion of hashlimit sysctl interface to new API, from
Cong Wang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case local OOB data was generated and other device initiated pairing
claiming that it has got OOB data, following crash occurred:
[ 222.847853] general protection fault: 0000 [#1] SMP PTI
[ 222.848025] CPU: 1 PID: 42 Comm: kworker/u5:0 Tainted: G C 4.18.0-custom #4
[ 222.848158] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 222.848307] Workqueue: hci0 hci_rx_work [bluetooth]
[ 222.848416] RIP: 0010:compute_ecdh_secret+0x5a/0x270 [bluetooth]
[ 222.848540] Code: 0c af f5 48 8b 3d 46 de f0 f6 ba 40 00 00 00 be c0 00 60 00 e8 b7 7b c5 f5 48 85 c0 0f 84 ea 01 00 00 48 89 c3 e8 16 0c af f5 <49> 8b 47 38 be c0 00 60 00 8b 78 f8 48 83 c7 48 e8 51 84 c5 f5 48
[ 222.848914] RSP: 0018:ffffb1664087fbc0 EFLAGS: 00010293
[ 222.849021] RAX: ffff8a5750d7dc00 RBX: ffff8a5671096780 RCX: ffffffffc08bc32a
[ 222.849111] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff8a5752003800
[ 222.849192] RBP: ffffb1664087fc60 R08: ffff8a57525280a0 R09: ffff8a5752003800
[ 222.849269] R10: ffffb1664087fc70 R11: 0000000000000093 R12: ffff8a5674396e00
[ 222.849350] R13: ffff8a574c2e79aa R14: ffff8a574c2e796a R15: 020e0e100d010101
[ 222.849429] FS: 0000000000000000(0000) GS:ffff8a5752500000(0000) knlGS:0000000000000000
[ 222.849518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 222.849586] CR2: 000055856016a038 CR3: 0000000110d2c005 CR4: 00000000000606e0
[ 222.849671] Call Trace:
[ 222.849745] ? sc_send_public_key+0x110/0x2a0 [bluetooth]
[ 222.849825] ? sc_send_public_key+0x115/0x2a0 [bluetooth]
[ 222.849925] smp_recv_cb+0x959/0x2490 [bluetooth]
[ 222.850023] ? _cond_resched+0x19/0x40
[ 222.850105] ? mutex_lock+0x12/0x40
[ 222.850202] l2cap_recv_frame+0x109d/0x3420 [bluetooth]
[ 222.850315] ? l2cap_recv_frame+0x109d/0x3420 [bluetooth]
[ 222.850426] ? __switch_to_asm+0x34/0x70
[ 222.850515] ? __switch_to_asm+0x40/0x70
[ 222.850625] ? __switch_to_asm+0x34/0x70
[ 222.850724] ? __switch_to_asm+0x40/0x70
[ 222.850786] ? __switch_to_asm+0x34/0x70
[ 222.850846] ? __switch_to_asm+0x40/0x70
[ 222.852581] ? __switch_to_asm+0x34/0x70
[ 222.854976] ? __switch_to_asm+0x40/0x70
[ 222.857475] ? __switch_to_asm+0x40/0x70
[ 222.859775] ? __switch_to_asm+0x34/0x70
[ 222.861218] ? __switch_to_asm+0x40/0x70
[ 222.862327] ? __switch_to_asm+0x34/0x70
[ 222.863758] l2cap_recv_acldata+0x266/0x3c0 [bluetooth]
[ 222.865122] hci_rx_work+0x1c9/0x430 [bluetooth]
[ 222.867144] process_one_work+0x210/0x4c0
[ 222.868248] worker_thread+0x41/0x4d0
[ 222.869420] kthread+0x141/0x160
[ 222.870694] ? process_one_work+0x4c0/0x4c0
[ 222.871668] ? kthread_create_worker_on_cpu+0x90/0x90
[ 222.872896] ret_from_fork+0x35/0x40
[ 222.874132] Modules linked in: algif_hash algif_skcipher af_alg rfcomm bnep btusb btrtl btbcm btintel snd_intel8x0 cmac intel_rapl_perf vboxvideo(C) snd_ac97_codec bluetooth ac97_bus joydev ttm snd_pcm ecdh_generic drm_kms_helper snd_timer snd input_leds drm serio_raw fb_sys_fops soundcore syscopyarea sysfillrect sysimgblt mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ahci psmouse libahci i2c_piix4 video e1000 pata_acpi
[ 222.883153] fbcon_switch: detected unhandled fb_set_par error, error code -16
[ 222.886774] fbcon_switch: detected unhandled fb_set_par error, error code -16
[ 222.890503] ---[ end trace 6504aa7a777b5316 ]---
[ 222.890541] RIP: 0010:compute_ecdh_secret+0x5a/0x270 [bluetooth]
[ 222.890551] Code: 0c af f5 48 8b 3d 46 de f0 f6 ba 40 00 00 00 be c0 00 60 00 e8 b7 7b c5 f5 48 85 c0 0f 84 ea 01 00 00 48 89 c3 e8 16 0c af f5 <49> 8b 47 38 be c0 00 60 00 8b 78 f8 48 83 c7 48 e8 51 84 c5 f5 48
[ 222.890555] RSP: 0018:ffffb1664087fbc0 EFLAGS: 00010293
[ 222.890561] RAX: ffff8a5750d7dc00 RBX: ffff8a5671096780 RCX: ffffffffc08bc32a
[ 222.890565] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff8a5752003800
[ 222.890571] RBP: ffffb1664087fc60 R08: ffff8a57525280a0 R09: ffff8a5752003800
[ 222.890576] R10: ffffb1664087fc70 R11: 0000000000000093 R12: ffff8a5674396e00
[ 222.890581] R13: ffff8a574c2e79aa R14: ffff8a574c2e796a R15: 020e0e100d010101
[ 222.890586] FS: 0000000000000000(0000) GS:ffff8a5752500000(0000) knlGS:0000000000000000
[ 222.890591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 222.890594] CR2: 000055856016a038 CR3: 0000000110d2c005 CR4: 00000000000606e0
This commit fixes a bug where invalid pointer to crypto tfm was used for
SMP SC ECDH calculation when OOB was in use. Solution is to use same
crypto tfm than when generating OOB material on generate_oob() function.
This bug was introduced in commit c0153b0b901a ("Bluetooth: let the crypto
subsystem generate the ecc privkey"). Bug was found by fuzzing kernel SMP
implementation using Synopsys Defensics.
Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
A remote device may claim that it has received our OOB data, even
though we never geneated it. Add a new flag to track whether we
actually have OOB data, and ignore the remote peer's flag if haven't
generated OOB data.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
After switching to the new procfs API, it is supposed to
retrieve the private pointer from PDE_DATA(file_inode(s->file)),
s->private is no longer referred.
Fixes: 1cd671827290 ("netfilter/x_tables: switch to proc_create_seq_private")
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
NF_REPEAT places the packet at the beginning of the iptables chain
instead of accepting or rejecting it right away. The packet however will
reach the end of the chain and continue to the end of iptables
eventually, so it needs the same handling as NF_ACCEPT and NF_DROP.
Fixes: 368982cd7d1b ("netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks")
Signed-off-by: Michal 'vorner' Vaner <michal.vaner@avast.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Compiler did not catch incorrect typing in the rcu hook assignment.
% nfct add timeout test-tcp inet tcp established 100 close 10 close_wait 10
% iptables -I OUTPUT -t raw -p tcp -j CT --timeout test-tcp
dmesg - xt_CT: Timeout policy `test-tcp' can only be used by L3 protocol number 25000
The CT target bails out with incorrect layer 3 protocol number.
Fixes: 6c1fd7dc489d ("netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout object")
Reported-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Now that cttimeout support for nft_ct is in place, these should depend
on CONFIG_NF_CONNTRACK_TIMEOUT otherwise we can crash when dumping the
policy if this option is not enabled.
[ 71.600121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[...]
[ 71.600141] CPU: 3 PID: 7612 Comm: nft Not tainted 4.18.0+ #246
[...]
[ 71.600188] Call Trace:
[ 71.600201] ? nft_ct_timeout_obj_dump+0xc6/0xf0 [nft_ct]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Doug Smythies says:
Sometimes it is desirable to temporarily disable, or clear,
the iptables rule set on a computer being controlled via a
secure shell session (SSH). While unwise on an internet facing
computer, I also do it often on non-internet accessible computers
while testing. Recently, this has become problematic, with the
SSH session being dropped upon re-load of the rule set.
The problem is that when all rules are deleted, conntrack hooks get
unregistered.
In case the rules are re-added later, its possible that tcp window
has moved far enough so that all packets are considered invalid (out of
window) until entry expires (which can take forever, default
established timeout is 5 days).
Fix this by clearing maxwin of existing tcp connections on register.
v2: don't touch entries on hook removal.
v3: remove obsolete expiry check.
Reported-by: Doug Smythies <dsmythies@telus.net>
Fixes: 4d3a57f23dec59 ("netfilter: conntrack: do not enable connection tracking unless needed")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
A kernel crash occurrs when defragmented packet is fragmented
in ip_do_fragment().
In defragment routine, skb_orphan() is called and
skb->ip_defrag_offset is set. but skb->sk and
skb->ip_defrag_offset are same union member. so that
frag->sk is not NULL.
Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
defragmented packet is fragmented.
test commands:
%iptables -t nat -I POSTROUTING -j MASQUERADE
%hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
splat looks like:
[ 261.069429] kernel BUG at net/ipv4/ip_output.c:636!
[ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
[ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
[ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
[ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
[ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
[ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
[ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
[ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
[ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
[ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
[ 261.198158] Call Trace:
[ 261.199018] ? dst_output+0x180/0x180
[ 261.205011] ? save_trace+0x300/0x300
[ 261.209018] ? ip_copy_metadata+0xb00/0xb00
[ 261.213034] ? sched_clock_local+0xd4/0x140
[ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack]
[ 261.223014] ? rt_cpu_seq_stop+0x10/0x10
[ 261.227014] ? find_held_lock+0x39/0x1c0
[ 261.233008] ip_finish_output+0x51d/0xb50
[ 261.237006] ? ip_fragment.constprop.56+0x220/0x220
[ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
[ 261.250152] ? rcu_is_watching+0x77/0x120
[ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
[ 261.261033] ? nf_hook_slow+0xb1/0x160
[ 261.265007] ip_output+0x1c7/0x710
[ 261.269005] ? ip_mc_output+0x13f0/0x13f0
[ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0
[ 261.278152] ? ip_fragment.constprop.56+0x220/0x220
[ 261.282996] ? nf_hook_slow+0xb1/0x160
[ 261.287007] raw_sendmsg+0x21f9/0x4420
[ 261.291008] ? dst_output+0x180/0x180
[ 261.297003] ? sched_clock_cpu+0x126/0x170
[ 261.301003] ? find_held_lock+0x39/0x1c0
[ 261.306155] ? stop_critical_timings+0x420/0x420
[ 261.311004] ? check_flags.part.36+0x450/0x450
[ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40
[ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40
[ 261.326142] ? cyc2ns_read_end+0x10/0x10
[ 261.330139] ? raw_bind+0x280/0x280
[ 261.334138] ? sched_clock_cpu+0x126/0x170
[ 261.338995] ? check_flags.part.36+0x450/0x450
[ 261.342991] ? __lock_acquire+0x4500/0x4500
[ 261.348994] ? inet_sendmsg+0x11c/0x500
[ 261.352989] ? dst_output+0x180/0x180
[ 261.357012] inet_sendmsg+0x11c/0x500
[ ... ]
v2:
- clear skb->sk at reassembly routine.(Eric Dumarzet)
Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
function sk_alloc_sg(). In case the number of SG entries hit
MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
current SG index to '0'. This leads to calling of function
tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
kernel crash. To fix this, set the number of SG elements to the number
of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
returns -ENOSPC.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to the documentation in msg_zerocopy.rst, the SO_ZEROCOPY
flag was introduced because send(2) ignores unknown message flags and
any legacy application which was accidentally passing the equivalent of
MSG_ZEROCOPY earlier should not see any new behaviour.
Before commit f214f915e7db ("tcp: enable MSG_ZEROCOPY"), a send(2) call
which passed the equivalent of MSG_ZEROCOPY without setting SO_ZEROCOPY
would succeed. However, after that commit, it fails with -ENOBUFS. So
it appears that the SO_ZEROCOPY flag fails to fulfill its intended
purpose. Fix it.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When nla_put*() fails after nla_nest_start(), we need
to call nla_nest_cancel() to cancel the message, otherwise
we end up calling nla_nest_end() like a success.
Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key")
Cc: Davide Caratti <dcaratti@redhat.com>
Cc: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__tipc_nl_compat_dumpit() uses a netlink_callback on stack,
so the only way to align it with other ->dumpit() call path
is calling tipc_dump_start() and tipc_dump_done() directly
inside it. Otherwise ->dumpit() would always get NULL from
cb->args[].
But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve
net pointer, the cb->skb here doesn't set skb->sk, the net pointer
is saved in msg->net instead, so introduce a helper function
__tipc_dump_start() to pass in msg->net.
Ying pointed out cb->args[0...3] are already used by other
callbacks on this call path, so we can't use cb->args[0] any
more, use cb->args[4] instead.
Fixes: 9a07efa9aea2 ("tipc: switch to rhashtable iterator")
Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes a compile warning.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sending an skb, afiucv_hs_send() bails out on various error
conditions. But currently the caller has no way of telling whether the
skb was freed or not - resulting in potentially either
a) leaked skbs from iucv_send_ctrl(), or
b) double-free's from iucv_sock_sendmsg().
As dev_queue_xmit() will always consume the skb (even on error), be
consistent and also free the skb from all other error paths. This way
callers no longer need to care about managing the skb.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Inbound packets may have any combination of flag bits set in their iucv
header. If we don't know how to handle a specific combination, drop the
skb instead of leaking it.
To clarify what error is returned in this case, replace the hard-coded
0 with the corresponding macro.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If users try to install act_tunnel_key 'set' rules with duplicate values
of 'index', the tunnel metadata are allocated, but never released. Then,
kmemleak complains as follows:
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
# echo clear > /sys/kernel/debug/kmemleak
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
Error: TC IDR already exists.
We have an error talking to the kernel
# echo scan > /sys/kernel/debug/kmemleak
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800574e6c80 (size 256):
comm "tc", pid 5617, jiffies 4298118009 (age 57.990s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff ................
81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00 .$..............
backtrace:
[<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key]
[<000000007d98fccd>] tcf_action_init_1+0x698/0xac0
[<0000000099b8f7cc>] tcf_action_init+0x15c/0x590
[<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2
[<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0
[<000000000bfe7575>] netlink_rcv_skb+0x124/0x350
[<00000000edab656f>] netlink_unicast+0x40f/0x5d0
[<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0
[<0000000063d9d490>] sock_sendmsg+0xb3/0xf0
[<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960
[<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170
[<00000000ce72e4b0>] do_syscall_64+0xa5/0x470
[<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<00000000fac1b476>] 0xffffffffffffffff
This problem theoretically happens also in case users attempt to setup a
geneve rule having wrong configuration data, or when the kernel fails to
allocate 'params_new'. Ensure that tunnel_key_init() releases the tunnel
metadata also in the above conditions.
Addresses-Coverity-ID: 1373974 ("Resource leak")
Fixes: d0f6dd8a914f4 ("net/sched: Introduce act_tunnel_key")
Fixes: 0ed5269f9e41f ("net/sched: add tunnel option support to act_tunnel_key")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before we unlock the sock in tipc_release(), we have to
detach sk->sk_socket from sk, otherwise a parallel
tipc_sk_fill_sock_diag() could stil read it after we
free this socket.
Fixes: c30b70deb5f4 ("tipc: implement socket diagnostics for AF_TIPC")
Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Must perform TXQ teardown before unregistering interfaces in
mac80211, from Toke Høiland-Jørgensen.
2) Don't allow creating mac80211_hwsim with less than one channel, from
Johannes Berg.
3) Division by zero in cfg80211, fix from Johannes Berg.
4) Fix endian issue in tipc, from Haiqing Bai.
5) BPF sockmap use-after-free fixes from Daniel Borkmann.
6) Spectre-v1 in mac80211_hwsim, from Jinbum Park.
7) Missing rhashtable_walk_exit() in tipc, from Cong Wang.
8) Revert kvzalloc() conversion of AF_PACKET, it breaks mmap() when
kvzalloc() tries to use kmalloc() pages. From Eric Dumazet.
9) Fix deadlock in hv_netvsc, from Dexuan Cui.
10) Do not restart timewait timer on RST, from Florian Westphal.
11) Fix double lwstate refcount grab in ipv6, from Alexey Kodanev.
12) Unsolicit report count handling is off-by-one, fix from Hangbin Liu.
13) Sleep-in-atomic in cadence driver, from Jia-Ju Bai.
14) Respect ttl-inherit in ip6 tunnel driver, from Hangbin Liu.
15) Use-after-free in act_ife, fix from Cong Wang.
16) Missing hold to meta module in act_ife, from Vlad Buslov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
net: phy: sfp: Handle unimplemented hwmon limits and alarms
net: sched: action_ife: take reference to meta module
act_ife: fix a potential use-after-free
net/mlx5: Fix SQ offset in QPs with small RQ
tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc: correct spelling errors for struct tipc_bc_base's comment
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
bnxt_en: Clean up unused functions.
bnxt_en: Fix firmware signaled resource change logic in open.
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
sctp: fix invalid reference to the index variable of the iterator
net/ibm/emac: wrong emac_calc_base call was used by typo
net: sched: null actions array pointer before releasing action
vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition
r8169: add support for NCube 8168 network card
ip6_tunnel: respect ttl inherit for ip6tnl
mac80211: shorten the IBSS debug messages
mac80211: don't Tx a deauth frame if the AP forbade Tx
mac80211: Fix station bandwidth setting after channel switch
mac80211: fix a race between restart and CSA flows
...
|
|
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Example warning:
[ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
[ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
[ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
[ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 646.440595] RIP: 0010:module_put+0x1cb/0x230
[ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
[ 646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
[ 646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
[ 646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
[ 646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
[ 646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
[ 646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
[ 646.508213] FS: 00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
[ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
[ 646.530595] Call Trace:
[ 646.533408] ? find_symbol_in_section+0x260/0x260
[ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
[ 646.543695] tcf_action_cleanup+0x29/0xa0
[ 646.548078] __tcf_action_put+0x5a/0xb0
[ 646.552289] ? nla_put+0x65/0xe0
[ 646.555889] __tcf_idr_release+0x48/0x60
[ 646.560187] tcf_generic_walker+0x448/0x6b0
[ 646.564764] ? tcf_action_dump_1+0x450/0x450
[ 646.569411] ? __lock_is_held+0x84/0x110
[ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
[ 646.578982] tca_action_gd+0x972/0xc40
[ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
[ 646.588471] ? mark_lock+0xcf/0x980
[ 646.592324] ? check_chain_key+0x140/0x1f0
[ 646.596832] ? debug_show_all_locks+0x240/0x240
[ 646.601839] ? memset+0x1f/0x40
[ 646.605350] ? nla_parse+0xca/0x1a0
[ 646.609217] tc_ctl_action+0x215/0x230
[ 646.613339] ? tcf_action_add+0x220/0x220
[ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
[ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.626466] netlink_rcv_skb+0x18d/0x200
[ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.634959] ? netlink_ack+0x500/0x500
[ 646.639106] netlink_unicast+0x2d0/0x370
[ 646.643409] ? netlink_attachskb+0x340/0x340
[ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
[ 646.652870] ? import_iovec+0x11e/0x1c0
[ 646.657083] netlink_sendmsg+0x3b9/0x6a0
[ 646.661388] ? netlink_unicast+0x370/0x370
[ 646.665877] ? netlink_unicast+0x370/0x370
[ 646.670351] sock_sendmsg+0x6b/0x80
[ 646.674212] ___sys_sendmsg+0x4a1/0x520
[ 646.678443] ? copy_msghdr_from_user+0x210/0x210
[ 646.683463] ? lock_downgrade+0x320/0x320
[ 646.687849] ? debug_show_all_locks+0x240/0x240
[ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
[ 646.697418] ? _raw_spin_unlock+0x24/0x30
[ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
[ 646.706619] ? __pmd_alloc+0x320/0x320
[ 646.710738] ? debug_show_all_locks+0x240/0x240
[ 646.715649] ? restore_nameidata+0x7b/0xa0
[ 646.720117] ? check_chain_key+0x140/0x1f0
[ 646.724590] ? check_chain_key+0x140/0x1f0
[ 646.729070] ? __fget_light+0xbc/0xd0
[ 646.733121] ? __sys_sendmsg+0xd7/0x150
[ 646.737329] __sys_sendmsg+0xd7/0x150
[ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
[ 646.746003] ? up_read+0x53/0x90
[ 646.749601] ? __do_page_fault+0x484/0x780
[ 646.754105] ? do_syscall_64+0x1e/0x2c0
[ 646.758320] do_syscall_64+0x72/0x2c0
[ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 646.767776] RIP: 0033:0x7f4163872150
[ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
[ 646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
[ 646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
[ 646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
[ 646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
[ 646.837360] irq event stamp: 6083
[ 646.841043] hardirqs last enabled at (6081): [<ffffffff8c220a7d>] __call_rcu+0x17d/0x500
[ 646.849882] hardirqs last disabled at (6083): [<ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 646.859775] softirqs last enabled at (5968): [<ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
[ 646.868784] softirqs last disabled at (6082): [<ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
[ 646.878845] ---[ end trace b1b8c12ffe51e657 ]---
Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Here are quite a large number of fixes, notably:
* various A-MSDU building fixes (currently only affects mt76)
* syzkaller & spectre fixes in hwsim
* TXQ vs. teardown fix that was causing crashes
* embed WMM info in reg rule, bad code here had been causing crashes
* one compilation issue with fix from Arnd (rfkill-gpio includes)
* fixes for a race and bad data during/after channel switch
* nl80211: a validation fix, attribute type & unit fixes
along with other small fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tipc_conn_queue_evt -> tipc_topsrv_queue_evt
tipc_send_work -> tipc_conn_send_work
tipc_send_to_sock -> tipc_conn_send_to_sock
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trivial fix for two spelling mistakes.
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When users set params.spp_address and get a trans, ipv6_flowlabel flag
should be applied into this trans. But even if this one is not an ipv6
trans, it should not go to apply it into all other transes of the asoc
but simply ignore it.
Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|