Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Allow DRBG testing through user-space af_alg
- Add tcrypt speed testing support for keyed hashes
- Add type-safe init/exit hooks for ahash
Algorithms:
- Mark arc4 as obsolete and pending for future removal
- Mark anubis, khazad, sead and tea as obsolete
- Improve boot-time xor benchmark
- Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity
Drivers:
- Fixes and enhancement for XTS in caam
- Add support for XIP8001B hwrng in xiphera-trng
- Add RNG and hash support in sun8i-ce/sun8i-ss
- Allow imx-rngc to be used by kernel entropy pool
- Use crypto engine in omap-sham
- Add support for Ingenic X1830 with ingenic"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
X.509: Fix modular build of public_key_sm2
crypto: xor - Remove unused variable count in do_xor_speed
X.509: fix error return value on the failed path
crypto: bcm - Verify GCM/CCM key length in setkey
crypto: qat - drop input parameter from adf_enable_aer()
crypto: qat - fix function parameters descriptions
crypto: atmel-tdes - use semicolons rather than commas to separate statements
crypto: drivers - use semicolons rather than commas to separate statements
hwrng: mxc-rnga - use semicolons rather than commas to separate statements
hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
hwrng: stm32 - use semicolons rather than commas to separate statements
crypto: xor - use ktime for template benchmarking
crypto: xor - defer load time benchmark to a later time
crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
crypto: hisilicon/zip - fix the return value when device is busy
crypto: hisilicon/zip - fix zero length input in GZIP decompress
crypto: hisilicon/zip - fix the uncleared debug registers
lib/mpi: Fix unused variable warnings
crypto: x86/poly1305 - Remove assignments with no effect
hwrng: npcm - modify readl to readb
...
|
|
Extend the user-space RNG interface:
1. Add entropy input via ALG_SET_DRBG_ENTROPY setsockopt option;
2. Add additional data input via sendmsg syscall.
This allows DRBG to be tested with test vectors, for example for the
purpose of CAVP testing, which otherwise isn't possible.
To prevent erroneous use of entropy input, it is hidden under
CRYPTO_USER_API_RNG_CAVP config option and requires CAP_SYS_ADMIN to
succeed.
Signed-off-by: Elena Petrova <lenaptr@google.com>
Acked-by: Stephan Müller <smueller@chronox.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The iwd daemon uses libell which sets up the skcipher operation with
two separate control messages. As the first control message is sent
without MSG_MORE, it is interpreted as an empty request.
While libell should be fixed to use MSG_MORE where appropriate, this
patch works around the bug in the kernel so that existing binaries
continue to work.
We will print a warning however.
A separate issue is that the new kernel code no longer allows the
control message to be sent twice within the same request. This
restriction is obviously incompatible with what iwd was doing (first
setting an IV and then sending the real control message). This
patch changes the kernel so that this is explicitly allowed.
Reported-by: Caleb Jorden <caljorden@hotmail.com>
Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Pull networking updates from David Miller:
1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan.
2) Support UDP segmentation in code TSO code, from Eric Dumazet.
3) Allow flashing different flash images in cxgb4 driver, from Vishal
Kulkarni.
4) Add drop frames counter and flow status to tc flower offloading,
from Po Liu.
5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.
6) Various new indirect call avoidance, from Eric Dumazet and Brian
Vazquez.
7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
Yonghong Song.
8) Support querying and setting hardware address of a port function via
devlink, use this in mlx5, from Parav Pandit.
9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.
10) Switch qca8k driver over to phylink, from Jonathan McDowell.
11) In bpftool, show list of processes holding BPF FD references to
maps, programs, links, and btf objects. From Andrii Nakryiko.
12) Several conversions over to generic power management, from Vaibhav
Gupta.
13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
Yakunin.
14) Various https url conversions, from Alexander A. Klimov.
15) Timestamping and PHC support for mscc PHY driver, from Antoine
Tenart.
16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.
17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.
18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.
19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
drivers. From Luc Van Oostenryck.
20) XDP support for xen-netfront, from Denis Kirjanov.
21) Support receive buffer autotuning in MPTCP, from Florian Westphal.
22) Support EF100 chip in sfc driver, from Edward Cree.
23) Add XDP support to mvpp2 driver, from Matteo Croce.
24) Support MPTCP in sock_diag, from Paolo Abeni.
25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
infrastructure, from Jakub Kicinski.
26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.
27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.
28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.
29) Refactor a lot of networking socket option handling code in order to
avoid set_fs() calls, from Christoph Hellwig.
30) Add rfc4884 support to icmp code, from Willem de Bruijn.
31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.
32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.
33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.
34) Support TCP syncookies in MPTCP, from Flowian Westphal.
35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano
Brivio.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
net: thunderx: initialize VF's mailbox mutex before first usage
usb: hso: remove bogus check for EINPROGRESS
usb: hso: no complaint about kmalloc failure
hso: fix bailout in error case of probe
ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
selftests/net: relax cpu affinity requirement in msg_zerocopy test
mptcp: be careful on subflow creation
selftests: rtnetlink: make kci_test_encap() return sub-test result
selftests: rtnetlink: correct the final return value for the test
net: dsa: sja1105: use detected device id instead of DT one on mismatch
tipc: set ub->ifindex for local ipv6 address
ipv6: add ipv6_dev_find()
net: openvswitch: silence suspicious RCU usage warning
Revert "vxlan: fix tos value before xmit"
ptp: only allow phase values lower than 1 period
farsync: switch from 'pci_' to 'dma_' API
wan: wanxl: switch from 'pci_' to 'dma_' API
hv_netvsc: do not use VF device if link is down
dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
net: macb: Properly handle phylink on at91sam9x
...
|
|
Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Just check for a NULL method instead of wiring up
sock_no_{get,set}sockopt.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some user-space programs rely on crypto requests that have no
control metadata. This broke when a check was added to require
the presence of control metadata with the ctx->init flag.
This patch fixes the regression by setting ctx->init as long as
one sendmsg(2) has been made, with or without a control message.
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
AEAD does not support partial requests so we must not wake up
while ctx->more is set. In order to distinguish between the
case of no data sent yet and a zero-length request, a new init
flag has been added to ctx.
SKCIPHER has also been modified to ensure that at least a block
of data is available if there is more data to come.
Fixes: 2d97591ef43d ("crypto: af_alg - consolidation of...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The locking in af_alg_release_parent is broken as the BH socket
lock can only be taken if there is a code-path to handle the case
where the lock is owned by process-context. Instead of adding
such handling, we can fix this by changing the ref counts to
atomic_t.
This patch also modifies the main refcnt to include both normal
and nokey sockets. This way we don't have to fudge the nokey
ref count when a socket changes from nokey to normal.
Credits go to Mauricio Faria de Oliveira who diagnosed this bug
and sent a patch for it:
https://lore.kernel.org/linux-crypto/20200605161657.535043-1-mfo@canonical.com/
Reported-by: Brian Moyles <bmoyles@netflix.com>
Reported-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Fixes: 37f96694cf73 ("crypto: af_alg - Use bh_lock_sock in...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When working with bool values the true and false definitions should be used
instead of 1 and 0.
Hopefully I fixed my mailer and apologize for that.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
'PTR_ERR(p) == -E*' is a stronger condition than IS_ERR(p).
Hence, IS_ERR(p) is unneeded.
The semantic patch that generates this commit is as follows:
// <smpl>
@@
expression ptr;
constant error_code;
@@
-IS_ERR(ptr) && (PTR_ERR(ptr) == - error_code)
+PTR_ERR(ptr) == - error_code
// </smpl>
Link: http://lkml.kernel.org/r/20200106045833.1725-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Stephen Boyd <sboyd@kernel.org> [drivers/clk/clk.c]
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> [GPIO]
Acked-by: Wolfram Sang <wsa@the-dreams.de> [drivers/i2c]
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [acpi/scan.c]
Acked-by: Rob Herring <robh@kernel.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As af_alg_release_parent may be called from BH context (most notably
due to an async request that only completes after socket closure,
or as reported here because of an RCU-delayed sk_destruct call), we
must use bh_lock_sock instead of lock_sock.
Reported-by: syzbot+c2f1558d49e25cc36e5e@syzkaller.appspotmail.com
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: c840ac6af3f8 ("crypto: af_alg - Disallow bind/setkey/...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
when libkcapi test is executed using HW accelerator, cipher operation
return -74.Since af_alg_async_cb->ki_complete treat err as unsigned int,
libkcapi receive 429467222 even though it expect -ve value.
Hence its required to cast resultlen to int so that proper
error is returned to libkcapi.
AEAD one shot non-aligned test 2(libkcapi test)
./../bin/kcapi -x 10 -c "gcm(aes)" -i 7815d4b06ae50c9c56e87bd7
-k ea38ac0c9b9998c80e28fb496a2b88d9 -a
"853f98a750098bec1aa7497e979e78098155c877879556bb51ddeb6374cbaefc"
-t "c4ce58985b7203094be1d134c1b8ab0b" -q
"b03692f86d1b8b39baf2abb255197c98"
Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Add helper for simple skcipher modes.
- Add helper to register multiple templates.
- Set CRYPTO_TFM_NEED_KEY when setkey fails.
- Require neither or both of export/import in shash.
- AEAD decryption test vectors are now generated from encryption
ones.
- New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
fuzzing.
Algorithms:
- Conversions to skcipher and helper for many templates.
- Add more test vectors for nhpoly1305 and adiantum.
Drivers:
- Add crypto4xx prng support.
- Add xcbc/cmac/ecb support in caam.
- Add AES support for Exynos5433 in s5p.
- Remove sha384/sha512 from artpec7 as hardware cannot do partial
hash"
[ There is a merge of the Freescale SoC tree in order to pull in changes
required by patches to the caam/qi2 driver. ]
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
crypto: s5p - add AES support for Exynos5433
dt-bindings: crypto: document Exynos5433 SlimSSS
crypto: crypto4xx - add missing of_node_put after of_device_is_available
crypto: cavium/zip - fix collision with generic cra_driver_name
crypto: af_alg - use struct_size() in sock_kfree_s()
crypto: caam - remove redundant likely/unlikely annotation
crypto: s5p - update iv after AES-CBC op end
crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
crypto: caam - generate hash keys in-place
crypto: caam - fix DMA mapping xcbc key twice
crypto: caam - fix hash context DMA unmap size
hwrng: bcm2835 - fix probe as platform device
crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
crypto: chelsio - Fixed Traffic Stall
crypto: marvell - Remove set but not used variable 'ivsize'
crypto: ccp - Update driver messages to remove some confusion
crypto: adiantum - add 1536 and 4096-byte test vectors
crypto: nhpoly1305 - add a test vector with len % 16 != 0
crypto: arm/aes-ce - update IV after partial final CTR block
...
|
|
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.
So, change the following form:
sizeof(*sgl) + sizeof(sgl->sg[0]) * (MAX_SGL_ENTS + 1)
to :
struct_size(sgl, sg, MAX_SGL_ENTS + 1)
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
KASAN has found use-after-free in sockfs_setattr.
The existed commit 6d8c50dcb029 ("socket: close race condition between sock_close()
and sockfs_setattr()") is to fix this simillar issue, but it seems to ignore
that crypto module forgets to set the sk to NULL after af_alg_release.
KASAN report details as below:
BUG: KASAN: use-after-free in sockfs_setattr+0x120/0x150
Write of size 4 at addr ffff88837b956128 by task syz-executor0/4186
CPU: 2 PID: 4186 Comm: syz-executor0 Not tainted xxx + #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.10.2-1ubuntu1 04/01/2014
Call Trace:
dump_stack+0xca/0x13e
print_address_description+0x79/0x330
? vprintk_func+0x5e/0xf0
kasan_report+0x18a/0x2e0
? sockfs_setattr+0x120/0x150
sockfs_setattr+0x120/0x150
? sock_register+0x2d0/0x2d0
notify_change+0x90c/0xd40
? chown_common+0x2ef/0x510
chown_common+0x2ef/0x510
? chmod_common+0x3b0/0x3b0
? __lock_is_held+0xbc/0x160
? __sb_start_write+0x13d/0x2b0
? __mnt_want_write+0x19a/0x250
do_fchownat+0x15c/0x190
? __ia32_sys_chmod+0x80/0x80
? trace_hardirqs_on_thunk+0x1a/0x1c
__x64_sys_fchownat+0xbf/0x160
? lockdep_hardirqs_on+0x39a/0x5e0
do_syscall_64+0xc8/0x580
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462589
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89
ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3
48 c7 c1 bc ff ff
ff f7 d8 64 89 01 48
RSP: 002b:00007fb4b2c83c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000104
RAX: ffffffffffffffda RBX: 000000000072bfa0 RCX: 0000000000462589
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000007
RBP: 0000000000000005 R08: 0000000000001000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4b2c846bc
R13: 00000000004bc733 R14: 00000000006f5138 R15: 00000000ffffffff
Allocated by task 4185:
kasan_kmalloc+0xa0/0xd0
__kmalloc+0x14a/0x350
sk_prot_alloc+0xf6/0x290
sk_alloc+0x3d/0xc00
af_alg_accept+0x9e/0x670
hash_accept+0x4a3/0x650
__sys_accept4+0x306/0x5c0
__x64_sys_accept4+0x98/0x100
do_syscall_64+0xc8/0x580
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 4184:
__kasan_slab_free+0x12e/0x180
kfree+0xeb/0x2f0
__sk_destruct+0x4e6/0x6a0
sk_destruct+0x48/0x70
__sk_free+0xa9/0x270
sk_free+0x2a/0x30
af_alg_release+0x5c/0x70
__sock_release+0xd3/0x280
sock_close+0x1a/0x20
__fput+0x27f/0x7f0
task_work_run+0x136/0x1b0
exit_to_usermode_loop+0x1a7/0x1d0
do_syscall_64+0x461/0x580
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Syzkaller reproducer:
r0 = perf_event_open(&(0x7f0000000000)={0x0, 0x70, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x0,
0xffffffffffffffff, 0x0)
r1 = socket$alg(0x26, 0x5, 0x0)
getrusage(0x0, 0x0)
bind(r1, &(0x7f00000001c0)=@alg={0x26, 'hash\x00', 0x0, 0x0,
'sha256-ssse3\x00'}, 0x80)
r2 = accept(r1, 0x0, 0x0)
r3 = accept4$unix(r2, 0x0, 0x0, 0x0)
r4 = dup3(r3, r0, 0x0)
fchownat(r4, &(0x7f00000000c0)='\x00', 0x0, 0x0, 0x1000)
Fixes: 6d8c50dcb029 ("socket: close race condition between sock_close() and sockfs_setattr()")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk_alloc() already sets sock::sk_family to PF_ALG which is passed as the
'family' argument, so there's no need to set it again.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
af_alg_count_tsgl() iterates through a list without modifying it, so use
list_for_each_entry() rather than list_for_each_entry_safe(). Also make
the pointers 'const' to make it clearer that nothing is modified.
No actual change in behavior.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Some exported functions in af_alg.c aren't used outside of that file.
Therefore, un-export them and make them 'static'.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.
This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The wait_address argument is always directly derived from the filp
argument, so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes an allocation error-path bug in af_alg discovered by
syzkaller"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - Initialize sg_num_bytes in error code path
|
|
The RX SGL in processing is already registered with the RX SGL tracking
list to support proper cleanup. The cleanup code path uses the
sg_num_bytes variable which must therefore be always initialized, even
in the error code path.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Reported-by: syzbot+9c251bdd09f83b92ba95@syzkaller.appspotmail.com
#syz test: https://github.com/google/kmsan.git master
CC: <stable@vger.kernel.org> #4.14
Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.
Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.
But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.
[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"This adds the new overflow checking helpers and adds them to the
2-factor argument allocators. And this adds the saturating size
helpers and does a treewide replacement for the struct_size() usage.
Additionally this adds the overflow testing modules to make sure
everything works.
I'm still working on the treewide replacements for allocators with
"simple" multiplied arguments:
*alloc(a * b, ...) -> *alloc_array(a, b, ...)
and
*zalloc(a * b, ...) -> *calloc(a, b, ...)
as well as the more complex cases, but that's separable from this
portion of the series. I expect to have the rest sent before -rc1
closes; there are a lot of messy cases to clean up.
Summary:
- Introduce arithmetic overflow test helper functions (Rasmus)
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)"
* tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
treewide: Use struct_size() for devm_kmalloc() and friends
treewide: Use struct_size() for vmalloc()-family
treewide: Use struct_size() for kmalloc()-family
device: Use overflow helpers for devm_kmalloc()
mm: Use overflow helpers in kvmalloc()
mm: Use overflow helpers in kmalloc_array*()
test_overflow: Add memory allocation overflow tests
overflow.h: Add allocation size calculation helpers
test_overflow: Report test failures
test_overflow: macrofy some more, do more tests for free
lib: add runtime test of check_*_overflow functions
compiler.h: enable builtin overflow checkers and add fallback code
|
|
Replaces open-coded struct size calculations with struct_size() for
devm_*, f2fs_*, and sock_* allocations. Automatically generated (and
manually adjusted) from the following Coccinelle script:
// Direct reference to struct field.
@@
identifier alloc =~ "devm_kmalloc|devm_kzalloc|sock_kmalloc|f2fs_kmalloc|f2fs_kzalloc";
expression HANDLE;
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(HANDLE, sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
+ alloc(HANDLE, struct_size(VAR, ELEMENT, COUNT), GFP)
// mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
@@
identifier alloc =~ "devm_kmalloc|devm_kzalloc|sock_kmalloc|f2fs_kmalloc|f2fs_kzalloc";
expression HANDLE;
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(HANDLE, sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
+ alloc(HANDLE, struct_size(VAR, ELEMENT, COUNT), GFP)
// Same pattern, but can't trivially locate the trailing element name,
// or variable name.
@@
identifier alloc =~ "devm_kmalloc|devm_kzalloc|sock_kmalloc|f2fs_kmalloc|f2fs_kzalloc";
expression HANDLE;
expression GFP;
expression SOMETHING, COUNT, ELEMENT;
@@
- alloc(HANDLE, sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
+ alloc(HANDLE, CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Now that sock_poll handles a NULL ->poll or ->poll_mask there is no need
for a stub.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
syzbot reported :
BUG: KMSAN: uninit-value in alg_bind+0xe3/0xd90 crypto/af_alg.c:162
We need to check addr_len before dereferencing sa (or uaddr)
Fixes: bb30b8848c85 ("crypto: af_alg - whitelist mask and type")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Enforce the setting of keys for keyed aead/hash/skcipher
algorithms.
- Add multibuf speed tests in tcrypt.
Algorithms:
- Improve performance of sha3-generic.
- Add native sha512 support on arm64.
- Add v8.2 Crypto Extentions version of sha3/sm3 on arm64.
- Avoid hmac nesting by requiring underlying algorithm to be unkeyed.
- Add cryptd_max_cpu_qlen module parameter to cryptd.
Drivers:
- Add support for EIP97 engine in inside-secure.
- Add inline IPsec support to chelsio.
- Add RevB core support to crypto4xx.
- Fix AEAD ICV check in crypto4xx.
- Add stm32 crypto driver.
- Add support for BCM63xx platforms in bcm2835 and remove bcm63xx.
- Add Derived Key Protocol (DKP) support in caam.
- Add Samsung Exynos True RNG driver.
- Add support for Exynos5250+ SoCs in exynos PRNG driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (166 commits)
crypto: picoxcell - Fix error handling in spacc_probe()
crypto: arm64/sha512 - fix/improve new v8.2 Crypto Extensions code
crypto: arm64/sm3 - new v8.2 Crypto Extensions implementation
crypto: arm64/sha3 - new v8.2 Crypto Extensions implementation
crypto: testmgr - add new testcases for sha3
crypto: sha3-generic - export init/update/final routines
crypto: sha3-generic - simplify code
crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize
crypto: sha3-generic - fixes for alignment and big endian operation
crypto: aesni - handle zero length dst buffer
crypto: artpec6 - remove select on non-existing CRYPTO_SHA384
hwrng: bcm2835 - Remove redundant dev_err call in bcm2835_rng_probe()
crypto: stm32 - remove redundant dev_err call in stm32_cryp_probe()
crypto: axis - remove unnecessary platform_get_resource() error check
crypto: testmgr - test misuse of result in ahash
crypto: inside-secure - make function safexcel_try_push_requests static
crypto: aes-generic - fix aes-generic regression on powerpc
crypto: chelsio - Fix indentation warning
crypto: arm64/sha1-ce - get rid of literal pool
crypto: arm64/sha2-ce - move the round constant table to .rodata section
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
|
|
The user space interface allows specifying the type and mask field used
to allocate the cipher. Only a subset of the possible flags are intended
for user space. Therefore, white-list the allowed flags.
In case the user space caller uses at least one non-allowed flag, EINVAL
is returned.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- racy use of ctx->rcvused in af_alg
- algif_aead crash in chacha20poly1305
- freeing bogus pointer in pcrypt
- build error on MIPS in mpi
- memory leak in inside-secure
- memory overwrite in inside-secure
- NULL pointer dereference in inside-secure
- state corruption in inside-secure
- build error without CRYPTO_GF128MUL in chelsio
- use after free in n2"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: inside-secure - do not use areq->result for partial results
crypto: inside-secure - fix request allocations in invalidation path
crypto: inside-secure - free requests even if their handling failed
crypto: inside-secure - per request invalidation
lib/mpi: Fix umul_ppmm() for MIPS64r6
crypto: pcrypt - fix freeing pcrypt instances
crypto: n2 - cure use after free
crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t
crypto: chacha20poly1305 - validate the digest size
crypto: chelsio - select CRYPTO_GF128MUL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- fix chacha20 crash on zero-length input due to unset IV
- fix potential race conditions in mcryptd with spinlock
- only wait once at top of algif recvmsg to avoid inconsistencies
- fix potential use-after-free in algif_aead/algif_skcipher"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - fix race accessing cipher request
crypto: mcryptd - protect the per-CPU queue with a lock
crypto: af_alg - wait for data at beginning of recvmsg
crypto: skcipher - set walk.iv for zero-length inputs
|
|
Merge the crypto tree to pick up inside-secure fixes.
|
|
This variable was increased and decreased without any protection.
Result was an occasional misscount and negative wrap around resulting
in false resource allocation failures.
Fixes: 7d2c3f54e6f6 ("crypto: af_alg - remove locking in async callback")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This push fixes the following issues:
- buffer overread in RSA
- potential use after free in algif_aead.
- error path null pointer dereference in af_alg
- forbid combinations such as hmac(hmac(sha3)) which may crash
- crash in salsa20 due to incorrect API usage"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: salsa20 - fix blkcipher_walk API usage
crypto: hmac - require that the underlying hash algorithm is unkeyed
crypto: af_alg - fix NULL pointer dereference in
crypto: algif_aead - fix reference counting of null skcipher
crypto: rsa - fix buffer overread when stripping leading zeroes
|
|
The wait for data is a non-atomic operation that can sleep and therefore
potentially release the socket lock. The release of the socket lock
allows another thread to modify the context data structure. The waiting
operation for new data therefore must be called at the beginning of
recvmsg. This prevents a race condition where checks of the members of
the context data structure are performed by recvmsg while there is a
potential for modification of these values.
Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
af_alg_free_areq_sgls()
If allocating the ->tsgl member of 'struct af_alg_async_req' failed,
during cleanup we dereferenced the NULL ->tsgl pointer in
af_alg_free_areq_sgls(), because ->tsgl_entries was nonzero.
Fix it by only freeing the ->tsgl list if it is non-NULL.
This affected both algif_skcipher and algif_aead.
Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The code paths protected by the socket-lock do not use or modify the
socket in a non-atomic fashion. The actions pertaining the socket do not
even need to be handled as an atomic operation. Thus, the socket-lock
can be safely ignored.
This fixes a bug regarding scheduling in atomic as the callback function
may be invoked in interrupt context.
In addition, the sock_hold is moved before the AIO encrypt/decrypt
operation to ensure that the socket is always present. This avoids a
tiny race window where the socket is unprotected and yet used by the AIO
operation.
Finally, the release of resources for a crypto operation is moved into a
common function of af_alg_free_resources.
Cc: <stable@vger.kernel.org>
Fixes: e870456d8e7c8 ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae43 ("crypto: algif_aead - overhaul memory management")
Reported-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Tested-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
algif starts several async crypto ops and waits for their completion.
Move it over to generic code doing the same.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When two adjacent TX SGL are processed and parts of both TX SGLs
are pulled into the per-request TX SGL, the wrong per-request
TX SGL entries were updated.
This fixes a NULL pointer dereference when a cipher implementation walks
the TX SGL where some of the SGL entries were NULL.
Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory...")
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When a page is assigned to a TX SGL, call get_page to increment the
reference counter. It is possible that one page is referenced in
multiple SGLs:
- in the global TX SGL in case a previous af_alg_pull_tsgl only
reassigned parts of a page to a per-request TX SGL
- in the per-request TX SGL as assigned by af_alg_pull_tsgl
Note, multiple requests can be active at the same time whose TX SGLs all
point to different parts of the same page.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Consolidate following data structures:
skcipher_async_req, aead_async_req -> af_alg_async_req
skcipher_rsgl, aead_rsql -> af_alg_rsgl
skcipher_tsgl, aead_tsql -> af_alg_tsgl
skcipher_ctx, aead_ctx -> af_alg_ctx
Consolidate following functions:
skcipher_sndbuf, aead_sndbuf -> af_alg_sndbuf
skcipher_writable, aead_writable -> af_alg_writable
skcipher_rcvbuf, aead_rcvbuf -> af_alg_rcvbuf
skcipher_readable, aead_readable -> af_alg_readable
aead_alloc_tsgl, skcipher_alloc_tsgl -> af_alg_alloc_tsgl
aead_count_tsgl, skcipher_count_tsgl -> af_alg_count_tsgl
aead_pull_tsgl, skcipher_pull_tsgl -> af_alg_pull_tsgl
aead_free_areq_sgls, skcipher_free_areq_sgls -> af_alg_free_areq_sgls
aead_wait_for_wmem, skcipher_wait_for_wmem -> af_alg_wait_for_wmem
aead_wmem_wakeup, skcipher_wmem_wakeup -> af_alg_wmem_wakeup
aead_wait_for_data, skcipher_wait_for_data -> af_alg_wait_for_data
aead_data_wakeup, skcipher_data_wakeup -> af_alg_data_wakeup
aead_sendmsg, skcipher_sendmsg -> af_alg_sendmsg
aead_sendpage, skcipher_sendpage -> af_alg_sendpage
aead_async_cb, skcipher_async_cb -> af_alg_async_cb
aead_poll, skcipher_poll -> af_alg_poll
Split out the following common code from recvmsg:
af_alg_alloc_areq: allocation of the request data structure for the
cipher operation
af_alg_get_rsgl: creation of the RX SGL anchored in the request data
structure
The following changes to the implementation without affecting the
functionality have been applied to synchronize slightly different code
bases in algif_skcipher and algif_aead:
The wakeup in af_alg_wait_for_data is triggered when either more data
is received or the indicator that more data is to be expected is
released. The first is triggered by user space, the second is
triggered by the kernel upon finishing the processing of data
(i.e. the kernel is ready for more).
af_alg_sendmsg uses size_t in min_t calculation for obtaining len.
Return code determination is consistent with algif_skcipher. The
scope of the variable i is reduced to match algif_aead. The type of the
variable i is switched from int to unsigned int to match algif_aead.
af_alg_sendpage does not contain the superfluous err = 0 from
aead_sendpage.
af_alg_async_cb requires to store the number of output bytes in
areq->outlen before the AIO callback is triggered.
The POLLIN / POLLRDNORM is now set when either not more data is given or
the kernel is supplied with data. This is consistent to the wakeup from
sleep when the kernel waits for data.
The request data structure is extended by the field last_rsgl which
points to the last RX SGL list entry. This shall help recvmsg
implementation to chain the RX SGL to other SG(L)s if needed. It is
currently used by algif_aead which chains the tag SGL to the RX SGL
during decryption.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
crypto: af_alg - Avoid sock_graft call warning
The newly added sock_graft warning triggers in af_alg_accept.
It's harmless as we're essentially doing sock->sk = sock->sk.
The sock_graft call is actually redundant because all the work
it does is subsumed by sock_init_data. However, it was added
to placate SELinux as it uses it to initialise its internal state.
This patch avoisd the warning by making the SELinux call directly.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes the hard-coded 64-byte limit on the length
of the algorithm name through bind(2). The address length can
now exceed that. The user-space structure remains unchanged.
In order to use a longer name simply extend the salg_name array
beyond its defined 64 bytes length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
The theory lockdep comes up with is as follows:
(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:
mmap_sem must be taken before sk_lock-AF_RXRPC
(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:
sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:
sk_lock-AF_INET must be taken before mmap_sem
However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.
Fix the general case by:
(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.
(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.
Note that the child created by sk_clone_lock() inherits the parent's
kern setting.
(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().
Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.
Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:
irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()
because they follow a sock_create_kern() and accept off of that.
Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|