summaryrefslogtreecommitdiff
path: root/drivers/nvme/target
AgeCommit message (Collapse)Author
2020-10-27nvmet: fix a NULL pointer dereference when tracing the flush commandChaitanya Kulkarni
When target side trace in turned on and flush command is issued from the host it results in the following Oops. [ 856.789724] BUG: kernel NULL pointer dereference, address: 0000000000000068 [ 856.790686] #PF: supervisor read access in kernel mode [ 856.791262] #PF: error_code(0x0000) - not-present page [ 856.791863] PGD 6d7110067 P4D 6d7110067 PUD 66f0ad067 PMD 0 [ 856.792527] Oops: 0000 [#1] SMP NOPTI [ 856.792950] CPU: 15 PID: 7034 Comm: nvme Tainted: G OE 5.9.0nvme-5.9+ #71 [ 856.793790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214 [ 856.794956] RIP: 0010:trace_event_raw_event_nvmet_req_init+0x13e/0x170 [nvmet] [ 856.795734] Code: 41 5c 41 5d c3 31 d2 31 f6 e8 4e 9b b8 e0 e9 0e ff ff ff 49 8b 55 00 48 8b 38 8b 0 [ 856.797740] RSP: 0018:ffffc90001be3a60 EFLAGS: 00010246 [ 856.798375] RAX: 0000000000000000 RBX: ffff8887e7d2c01c RCX: 0000000000000000 [ 856.799234] RDX: 0000000000000020 RSI: 0000000057e70ea2 RDI: ffff8887e7d2c034 [ 856.800088] RBP: ffff88869f710578 R08: ffff888807500d40 R09: 00000000fffffffe [ 856.800951] R10: 0000000064c66670 R11: 00000000ef955201 R12: ffff8887e7d2c034 [ 856.801807] R13: ffff88869f7105c8 R14: 0000000000000040 R15: ffff88869f710440 [ 856.802667] FS: 00007f6a22bd8780(0000) GS:ffff888813a00000(0000) knlGS:0000000000000000 [ 856.803635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 856.804367] CR2: 0000000000000068 CR3: 00000006d73e0000 CR4: 00000000003506e0 [ 856.805283] Call Trace: [ 856.805613] nvmet_req_init+0x27c/0x480 [nvmet] [ 856.806200] nvme_loop_queue_rq+0xcb/0x1d0 [nvme_loop] [ 856.806862] blk_mq_dispatch_rq_list+0x123/0x7b0 [ 856.807459] ? kvm_sched_clock_read+0x14/0x30 [ 856.808025] __blk_mq_sched_dispatch_requests+0xc7/0x170 [ 856.808708] blk_mq_sched_dispatch_requests+0x30/0x60 [ 856.809372] __blk_mq_run_hw_queue+0x70/0x100 [ 856.809935] __blk_mq_delay_run_hw_queue+0x156/0x170 [ 856.810574] blk_mq_run_hw_queue+0x86/0xe0 [ 856.811104] blk_mq_sched_insert_request+0xef/0x160 [ 856.811733] blk_execute_rq+0x69/0xc0 [ 856.812212] ? blk_mq_rq_ctx_init+0xd0/0x230 [ 856.812784] nvme_execute_passthru_rq+0x57/0x130 [nvme_core] [ 856.813461] nvme_submit_user_cmd+0xeb/0x300 [nvme_core] [ 856.814099] nvme_user_cmd.isra.82+0x11e/0x1a0 [nvme_core] [ 856.814752] blkdev_ioctl+0x1dc/0x2c0 [ 856.815197] block_ioctl+0x3f/0x50 [ 856.815606] __x64_sys_ioctl+0x84/0xc0 [ 856.816074] do_syscall_64+0x33/0x40 [ 856.816533] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 856.817168] RIP: 0033:0x7f6a222ed107 [ 856.817617] Code: 44 00 00 48 8b 05 81 cd 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 8 [ 856.819901] RSP: 002b:00007ffca848f058 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 856.820846] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f6a222ed107 [ 856.821726] RDX: 00007ffca848f060 RSI: 00000000c0484e43 RDI: 0000000000000003 [ 856.822603] RBP: 0000000000000003 R08: 000000000000003f R09: 0000000000000005 [ 856.823478] R10: 00007ffca848ece0 R11: 0000000000000202 R12: 00007ffca84912d3 [ 856.824359] R13: 00007ffca848f4d0 R14: 0000000000000002 R15: 000000000067e900 [ 856.825236] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corel Move the nvmet_req_init() tracepoint after we parse the command in nvmet_req_init() so that we can get rid of the duplicate nvmet_find_namespace() call. Rename __assign_disk_name() -> __assign_req_name(). Now that we call tracepoint after parsing the command simplify the newly added __assign_req_name() which fixes this bug. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-22nvmet: don't use BLK_MQ_REQ_NOWAIT for passthruChaitanya Kulkarni
By default, we set the passthru request allocation flag such that it returns the error in the following code path and we fail the I/O when BLK_MQ_REQ_NOWAIT is used for request allocation :- nvme_alloc_request()  blk_mq_alloc_request()   blk_mq_queue_enter()    if (flag & BLK_MQ_REQ_NOWAIT)         return -EBUSY; <-- return if busy. On some controllers using BLK_MQ_REQ_NOWAIT ends up in I/O error where the controller is perfectly healthy and not in a degraded state. Block layer request allocation does allow us to wait instead of immediately returning the error when we BLK_MQ_REQ_NOWAIT flag is not used. This has shown to fix the I/O error problem reported under heavy random write workload. Remove the BLK_MQ_REQ_NOWAIT parameter for passthru request allocation which resolves this issue. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-22nvmet: cleanup nvmet_passthru_map_sg()Logan Gunthorpe
Clean up some confusing elements of nvmet_passthru_map_sg() by returning early if the request is greater than the maximum bio size. This allows us to drop the sg_cnt variable. This should not result in any functional change but makes the code clearer and more understandable. The original code allocated a truncated bio then would return EINVAL when bio_add_pc_page() filled that bio. The new code just returns EINVAL early if this would happen. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Suggested-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Cc: Christoph Hellwig <hch@lst.de> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-22nvmet: limit passthru MTDS by BIO_MAX_PAGESLogan Gunthorpe
nvmet_passthru_map_sg() only supports mapping a single BIO, not a chain so the effective maximum transfer should also be limitted by BIO_MAX_PAGES (presently this works out to 1MB). For PCI passthru devices the max_sectors would typically be more limitting than BIO_MAX_PAGES, but this may not be true for all passthru devices. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-22nvmet: fix uninitialized work for zero katozhenwei pi
When connecting a controller with a zero kato value using the following command line nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0 the warning below can be reproduced: WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90 with trace: mod_delayed_work_on+0x59/0x90 nvmet_update_cc+0xee/0x100 [nvmet] nvmet_execute_prop_set+0x72/0x80 [nvmet] nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp] nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp] ... This is caused by queuing up an uninitialized work. Althrough the keep-alive timer is disabled during allocating the controller (fixed in 0d3b6a8d213a), ka_work still has a chance to run (called by nvmet_start_ctrl). Fixes: 0d3b6a8d213a ("nvmet: Disable keep-alive timer when kato is cleared to 0h") Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07nvme-loop: don't put ctrl on nvme_init_ctrl errorChaitanya Kulkarni
The function nvme_init_ctrl() gets the ctrl reference & when it fails it does put the ctrl reference in the error unwind code. When creating loop ctrl in nvme_loop_create_ctrl() if nvme_init_ctrl() returns non zero (i.e. error) value it jumps to the "out_put_ctrl" label which calls nvme_put_ctrl(), that will lead to douple ctrl put in error unwind path. Update nvme_loop_create_ctrl() such that this patch removes the "out_put_ctrl" label, add a new "out" label after nvme_put_ctrl() in error unwind path and jump to newly added label when nvme_init_ctrl() call retuns an error. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-27nvmet-fc: fix missing check for no hostport structJames Smart
A hostport port pointer is allowed to be NULL as it is not allocated if the lldd does not support the new interfaces for NVME LS request support. The hostport free routine validates the handle but forgot to validate the hostport pointer. Validate the hostport pointer before using it to validate the handle. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-27nvmet: add passthru ZNS supportChaitanya Kulkarni
In the default passthru implementation NVMeOF target passthru ctrl is not capable of handling Zoned Namespaces (ZNS). Update the nvmet_parse_pasthru_admin_cmd() to allow NVME_ID_CNS_CS_CTRL/NVME_CIS_ZNS and NVME_ID_CNS_CS_NS/NVME_CIS_ZNS. With this addition NVMeOF Passthru allows Zoned Namespaces. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-27nvmet: handle keep-alive timer when kato is modified by a set features cmdAmit Engel
A user may modify the kato by a set features cmd. To properly deal with races or a kato value of 0 (no keep alive enabled) change nvmet_set_feat_kato to first disable the timer, then set the value and then re-enable the timer. Signed-off-by: Amit Engel <amit.engel@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-27nvmet-tcp: have queue io_work context run on sock incoming cpuMark Wunderlich
No real good need to spread queues artificially. Usually the target will serve multiple hosts, and it's better to run on the socket incoming cpu for better affinitization rather than spread queues on all online cpus. We rely on RSS to spread the work around sufficiently. Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-27nvme: lift the file open code from nvme_ctrl_get_by_pathChaitanya Kulkarni
Lift opening the file open/close code from nvme_ctrl_get_by_path into the caller, just keeping a simple nvme_ctrl_from_file() helper. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: refactored a bit, split the bug fixes into a separate prep patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2020-09-22Merge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A few NVMe fixes, and a dasd write zero fix" * tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block: nvmet: get transport reference for passthru ctrl nvme-core: get/put ctrl and transport module in nvme_dev_open/release() nvme-tcp: fix kconfig dependency warning when !CRYPTO nvme-pci: disable the write zeros command for Intel 600P/P3100 s390/dasd: Fix zero write for FBA devices
2020-09-17nvmet: get transport reference for passthru ctrlChristoph Hellwig
Grab a reference to the transport driver to ensure it can't be unloaded while a passthrough controller is active. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Reported-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2020-09-04Merge tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A bit larger than usual this week, mostly due to the NVMe fixes arriving late for -rc3 and hence didn't make last weeks pull request. - NVMe: - instance leak and io boundary fixes from Keith - fc locking fix from Christophe - various tcp/rdma reset during traffic fixes from Sagi - pci use-after-free fix from Tong - tcp target null deref fix from Ziye - Locking fix for partition removal (Christoph) - Ensure bdi->io_pages is always set (me) - Fixup for hd struct reference (Ming) - Fix for zero length bvecs (Ming) - Two small blk-iocost fixes (Tejun)" * tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block: block: allow for_each_bvec to support zero len bvec blk-stat: make q->stats->lock irqsafe blk-iocost: ioc_pd_free() shouldn't assume irq disabled block: fix locking in bdev_del_partition block: release disk reference in hd_struct_free_work block: ensure bdi->io_pages is always initialized nvme-pci: cancel nvme device request before disabling nvme: only use power of two io boundaries nvme: fix controller instance leak nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()' nvme: Fix NULL dereference for pci nvme controllers nvme-rdma: fix reset hang if controller died in the middle of a reset nvme-rdma: fix timeout handler nvme-rdma: serialize controller teardown sequences nvme-tcp: fix reset hang if controller died in the middle of a reset nvme-tcp: fix timeout handler nvme-tcp: serialize controller teardown sequences nvme: have nvme_wait_freeze_timeout return if it timed out nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
2020-08-28nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'Christophe JAILLET
The way 'spin_lock()' and 'spin_lock_irqsave()' are used is not consistent in this function. Use 'spin_lock_irqsave()' also here, as there is no guarantee that interruptions are disabled at that point, according to surrounding code. Fixes: a97ec51b37ef ("nvmet_fc: Rework target side abort handling") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pduZiye Yang
When handling commands without in-capsule data, we assign the ttag assuming we already have the queue commands array allocated (based on the queue size information in the connect data payload). However if the connect itself did not send the connect data in-capsule we have yet to allocate the queue commands,and we will assign a bogus ttag and suffer a NULL dereference when we receive the corresponding h2cdata pdu. Fix this by checking if we already allocated commands before dereferencing it when handling h2cdata, if we didn't, its for sure a connect and we should use the preallocated connect command. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-24Merge tag 'io_uring-5.9-2020-08-23' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request from Sagi: - nvme completion rework from Christoph and Chao that mostly came from a bit of divergence of how we classify errors related to pathing/retry etc. - nvmet passthru fixes from Chaitanya - minor nvmet fixes from Amit and I - mpath round-robin path selection fix from Martin - ignore noiob for zoned devices from Keith - minor nvme-fc fix from Tianjia" - BFQ cgroup leak fix (Dmitry) - block layer MAINTAINERS addition (Geert) - fix null_blk FUA checking (Hou) - get_max_io_size() size fix (Keith) - fix block page_is_mergeable() for compound pages (Matthew) - discard granularity fixes (Ming) - IO scheduler ordering fix (Ming) - misc fixes * tag 'io_uring-5.9-2020-08-23' of git://git.kernel.dk/linux-block: (31 commits) null_blk: fix passing of REQ_FUA flag in null_handle_rq nvmet: Disable keep-alive timer when kato is cleared to 0h nvme: redirect commands on dying queue nvme: just check the status code type in nvme_is_path_error nvme: refactor command completion nvme: rename and document nvme_end_request nvme: skip noiob for zoned devices nvme-pci: fix PRP pool size nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth nvme: Use spin_lock_irq() when taking the ctrl->lock nvmet: call blk_mq_free_request() directly nvmet: fix oops in pt cmd execution nvmet: add ns tear down label for pt-cmd handling nvme: multipath: round-robin: eliminate "fallback" variable nvme: multipath: round-robin: fix single non-optimized path case nvme-fc: Fix wrong return value in __nvme_fc_init_request() nvmet-passthru: Reject commands with non-sgl flags set nvmet: fix a memory leak blkcg: fix memleak for iolatency MAINTAINERS: Add missing header files to BLOCK LAYER section ...
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-21nvmet: Disable keep-alive timer when kato is cleared to 0hAmit Engel
Based on nvme spec, when keep alive timeout is set to zero the keep-alive timer should be disabled. Signed-off-by: Amit Engel <amit.engel@dell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvme: rename and document nvme_end_requestChristoph Hellwig
nvme_end_request is a bit misnamed, as it wraps around the blk_mq_complete_* API. It's semantics also are non-trivial, so give it a more descriptive name and add a comment explaining the semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvmet: call blk_mq_free_request() directlyChaitanya Kulkarni
Instead of calling blk_put_request() which calls blk_mq_free_request(), call blk_mq_free_request() directly for NVMeOF passthru. This is to mainly avoid an extra function call in the completion path nvmet_passthru_req_done(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvmet: fix oops in pt cmd executionChaitanya Kulkarni
In the existing NVMeOF Passthru core command handling on failure of nvme_alloc_request() it errors out with rq value set to NULL. In the error handling path it calls blk_put_request() without checking if rq is set to NULL or not which produces following Oops:- [ 1457.346861] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1457.347838] #PF: supervisor read access in kernel mode [ 1457.348464] #PF: error_code(0x0000) - not-present page [ 1457.349085] PGD 0 P4D 0 [ 1457.349402] Oops: 0000 [#1] SMP NOPTI [ 1457.349851] CPU: 18 PID: 10782 Comm: kworker/18:2 Tainted: G OE 5.8.0-rc4nvme-5.9+ #35 [ 1457.350951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214 [ 1457.352347] Workqueue: events nvme_loop_execute_work [nvme_loop] [ 1457.353062] RIP: 0010:blk_mq_free_request+0xe/0x110 [ 1457.353651] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8 [ 1457.355975] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282 [ 1457.356636] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 [ 1457.357526] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000 [ 1457.358416] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000 [ 1457.359317] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000 [ 1457.360424] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8 [ 1457.361322] FS: 0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000 [ 1457.362337] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1457.363058] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0 [ 1457.363973] Call Trace: [ 1457.364296] nvmet_passthru_execute_cmd+0x150/0x2c0 [nvmet] [ 1457.364990] process_one_work+0x24e/0x5a0 [ 1457.365493] ? __schedule+0x353/0x840 [ 1457.365957] worker_thread+0x3c/0x380 [ 1457.366426] ? process_one_work+0x5a0/0x5a0 [ 1457.366948] kthread+0x135/0x150 [ 1457.367362] ? kthread_create_on_node+0x60/0x60 [ 1457.367934] ret_from_fork+0x22/0x30 [ 1457.368388] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corer [ 1457.368414] ata_piix crc32c_intel virtio_pci libata virtio_ring serio_raw t10_pi virtio floppy dm_] [ 1457.380849] CR2: 0000000000000000 [ 1457.381288] ---[ end trace c6cab61bfd1f68fd ]--- [ 1457.381861] RIP: 0010:blk_mq_free_request+0xe/0x110 [ 1457.382469] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8 [ 1457.384749] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282 [ 1457.385393] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 [ 1457.386264] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000 [ 1457.387142] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000 [ 1457.388029] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000 [ 1457.388914] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8 [ 1457.389798] FS: 0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000 [ 1457.390796] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1457.391508] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0 [ 1457.392525] Kernel panic - not syncing: Fatal exception [ 1457.394138] Kernel Offset: disabled [ 1457.394677] ---[ end Kernel panic - not syncing: Fatal exception ]--- We fix this Oops by adding a new goto label out_put_req and reordering the blk_put_request call to avoid calling blk_put_request() with rq value is set to NULL. Here we also update the rest of the code accordingly. Fixes: 06b7164dfdc0 ("nvmet: add passthru code to process commands") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvmet: add ns tear down label for pt-cmd handlingChaitanya Kulkarni
In the current implementation before submitting the passthru cmd we may come across error e.g. getting ns from passthru controller, allocating a request from passthru controller, etc. For all the failure cases it only uses single goto label fail_out. In the target code, we follow the pattern to have a separate label for each error out the case when setting up multiple things before the actual action. This patch follows the same pattern and renames generic fail_out label to out_put_ns and updates the error out cases in the nvmet_passthru_execute_cmd() where it is needed. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvmet-passthru: Reject commands with non-sgl flags setLogan Gunthorpe
Any command with a non-SGL flag set (like fuse flags) should be rejected. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-21nvmet: fix a memory leakSagi Grimberg
We forgot to free new_model_number Fixes: 013b7ebe5a0d ("nvmet: make ctrl model configurable") Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-29nvme-loop: remove extra variable in create ctrlChaitanya Kulkarni
We can call the nvme_change_ctrl_state() directly and have WARN_ON_ONCE(1) call instead of having to use an extra variable which matches the name of the function. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvme-loop: set ctrl state connecting after initChaitanya Kulkarni
When creating a loop controller (ctrl) in nvme_loop_create_ctrl() -> nvme_init_ctrl() we set the ctrl state to NVME_CTRL_NEW. Prior to [1] NVME_CTRL_NEW state was allowed in nvmf_check_ready() for fabrics command type connect. Now, this fails in the following code path for fabrics connect command when creating admin queue :- nvme_loop_create_ctrl() nvme_loo_configure_admin_queue() nvmf_connect_admin_queue() __nvme_submit_sync_cmd() blk_execute_rq() nvme_loop_queue_rq() nvmf_check_ready() # echo "transport=loop,nqn=fs" > /dev/nvme-fabrics [ 6047.741327] nvmet: adding nsid 1 to subsystem fs [ 6048.756430] nvme nvme1: Connect command failed, error wo/DNR bit: 880 We need to set the ctrl state to NVME_CTRL_CONNECTING after :- nvme_loop_create_ctrl() nvme_init_ctrl() so that the above mentioned check for nvmf_check_ready() will return true. This patch sets the ctrl state to connecting after we init the ctrl in nvme_loop_create_ctrl() nvme_init_ctrl() . [1] commit aa63fa6776a7 ("nvme-fabrics: allow to queue requests for live queues") Fixes: aa63fa6776a7 ("nvme-fabrics: allow to queue requests for live queues") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: introduce the passthru Kconfig optionChaitanya Kulkarni
This patch updates KConfig file for the NVMeOF target where we add new option so that user can selectively enable/disable passthru code. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [logang@deltatee.com: fixed some of the wording in the help message] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: introduce the passthru configfs interfaceLogan Gunthorpe
When CONFIG_NVME_TARGET_PASSTHRU as 'passthru' directory will be added to each subsystem. The directory is similar to a namespace and has two attributes: device_path and enable. The user must set the path to the nvme controller's char device and write '1' to enable the subsystem to use passthru. Any given subsystem is prevented from enabling both a regular namespace and the passthru device. If one is enabled, enabling the other will produce an error. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: Add passthru enable/disable helpersLogan Gunthorpe
This patch adds helper functions which are used in the NVMeOF configfs when the user is configuring the passthru subsystem. Here we ensure that only one subsys is assigned to each nvme_ctrl by using an xarray on the cntlid. The subsystem's version number is overridden by the passed through controller's version. However, if that version is less than 1.2.1, then we bump the advertised version to that and print a warning in dmesg. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: add passthru code to process commandsLogan Gunthorpe
Add passthru command handling capability for the NVMeOF target and export passthru APIs which are used to integrate passthru code with nvmet-core. The new file passthru.c handles passthru cmd parsing and execution. In the passthru mode, we create a block layer request from the nvmet request and map the data on to the block layer request. Admin commands and features are on an allow list as there are a number of each that don't make too much sense with passthrough. We use an allow list such that new commands can be considered before being blindly passed through. In both cases, vendor specific commands are always allowed. We also reject reservation IO commands as the underlying device cannot differentiate between multiple hosts behind a fabric. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet-fc: remove redundant del_work_active flagJames Smart
The transport has a del_work_active flag to avoid duplicate scheduling of the del_work item. This is redundant with the checks that schedule_work() makes. Remove the del_work_active flag. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet-fc: check successful reference in nvmet_fc_find_target_assocJames Smart
When searching for an association based on an association id, when there is a match, the code takes a reference. However, it is not validating that the reference taking was successful. Check the status of the reference. If unsuccessful, the device is being deleted and should be ignored. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: use xarray for ctrl ns storingChaitanya Kulkarni
This patch replaces the ctrl->namespaces tracking from linked list to xarray and improves the performance when accessing one namespce :- XArray vs Default:- IOPS and BW (more the better) increase BW (~1.8%):- --------------------------------------------------- XArray :- read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=162k, BW=631MiB/s (662MB/s)(18.5GiB/30001msec) Default:- read: IOPS=156k, BW=609MiB/s (639MB/s)(17.8GiB/30001msec) read: IOPS=157k, BW=613MiB/s (643MB/s)(17.0GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) Submission latency (less the better) decrease (~8.3%):- ------------------------------------------------------- XArray:- slat (usec): min=7, max=8386, avg=11.19, stdev=5.96 slat (usec): min=7, max=441, avg=11.09, stdev=4.48 slat (usec): min=7, max=1088, avg=11.21, stdev=4.54 Default :- slat (usec): min=8, max=2826.5k, avg=23.96, stdev=3911.50 slat (usec): min=8, max=503, avg=12.52, stdev=5.07 slat (usec): min=8, max=2384, avg=12.50, stdev=5.28 CPU Usage (less the better) decrease (~5.2%):- ---------------------------------------------- XArray:- cpu : usr=1.84%, sys=18.61%, ctx=949471, majf=0, minf=250 cpu : usr=1.83%, sys=18.41%, ctx=950262, majf=0, minf=237 cpu : usr=1.82%, sys=18.82%, ctx=957224, majf=0, minf=234 Default:- cpu : usr=1.70%, sys=19.21%, ctx=858196, majf=0, minf=251 cpu : usr=1.82%, sys=19.98%, ctx=929720, majf=0, minf=227 cpu : usr=1.83%, sys=20.33%, ctx=947208, majf=0, minf=235. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet-rdma: use new shared CQ mechanismYamin Friedman
Has the driver use shared CQs providing ~10%-20% improvement when multiple disks are used. Instead of opening a CQ for each QP per controller, a CQ for each core will be provided by the RDMA core driver that will be shared between the QPs on that core reducing interrupt overhead. Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet-tcp: simplify nvmet_process_resp_listSagi Grimberg
We can make it shorter and simpler without some redundant checks. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvme-fcloop: verify wwnn and wwpn formatDongli Zhang
The nvme host and target verify the wwnn and wwpn format via nvme_fc_parse_traddr(). For instance, it is required that the length of wwnn to be either 21 ("nn-0x") or 19 (nn-). Add this verification to nvme-fcloop so that the input should always be in hex and the length of input should always be 18. Otherwise, the user may use e.g. 0x2 to create fcloop local port, while 0x0000000000000002 is required for nvme host and target. This makes the requirement of format confusing. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet: use unsigned type for u64Chaitanya Kulkarni
In function nvmet_subsys_atte_version_show() which uses the NVME_XXX() macros related to version (of type u64) get rid of the int type cast when printing subsys version and use appropriate format specifier for u64. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet: introduce flags member in nvmet_fabrics_opsMax Gurtovoy
Replace has_keyed_sgls and metadata_support booleans with a flags member that will be used for adding more features in the future. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet-tcp: remove has_keyed_sgls initializationMax Gurtovoy
Since the nvmet_tcp_ops is static, there is no need to initialize values to zero. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet-loop: remove unused 'target_ctrl' in nvme_loop_ctrlDongli Zhang
This field is never used since the introduction of nvme loopback by commit 3a85a5de29ea ("nvme-loop: add a NVMe loopback host driver"). Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08Merge tag 'v5.8-rc4' into for-5.9/driversJens Axboe
Merge in 5.8-rc4 for-5.9/block to setup for-5.9/drivers, to provide a clean base and making the life for the NVMe changes easier. Signed-off-by: Jens Axboe <axboe@kernel.dk> * tag 'v5.8-rc4': (732 commits) Linux 5.8-rc4 x86/ldt: use "pr_info_once()" instead of open-coding it badly MIPS: Do not use smp_processor_id() in preemptible code MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen .gitignore: Do not track `defconfig` from `make savedefconfig` io_uring: fix regression with always ignoring signals in io_cqring_wait() x86/ldt: Disable 16-bit segments on Xen PV x86/entry/32: Fix #MC and #DB wiring on x86_32 x86/entry/xen: Route #DB correctly on Xen PV x86/entry, selftests: Further improve user entry sanity checks x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER i2c: mlxcpld: check correct size of maximum RECV_LEN packet i2c: add Kconfig help text for slave mode i2c: slave-eeprom: update documentation i2c: eg20t: Load module automatically if ID matches i2c: designware: platdrv: Set class based on DMI i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665 mm/page_alloc: fix documentation error vmalloc: fix the owner argument for the new __vmalloc_node_range callers mm/cma.c: use exact_nid true to fix possible per-numa cma leak ...
2020-07-01block: remove the bd_queue field from struct block_deviceChristoph Hellwig
Just use bd_disk->queue instead. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-24nvme-loop: initialize tagset numa value to the value of the ctrlMax Gurtovoy
Both admin's and drive's tagsets should be set according the numa node of the controller. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24nvme: use blk_mq_complete_request_remote to avoid an indirect function callChristoph Hellwig
Use the new blk_mq_complete_request_remote helper to avoid an indirect function call in the completion fast path. Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Some followup fixes for this merge window. In particular: - Seqcount write missing preemption disable for stats (Ahmed) - blktrace fixes (Chaitanya) - Redundant initializations (Colin) - Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max, Niklas, Rikard) - loop flag bug regression fix (Martijn) - blk-mq tagging fixes (Christoph, Ming)" * tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block: umem: remove redundant initialization of variable ret pktcdvd: remove redundant initialization of variable ret nvmet: fail outstanding host posted AEN req nvme-pci: use simple suspend when a HMB is enabled nvme-fc: don't call nvme_cleanup_cmd() for AENs nvmet-tcp: constify nvmet_tcp_ops nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops nvme: do not call del_gendisk() on a disk that was never added blk-mq: fix blk_mq_all_tag_iter blk-mq: split out a __blk_mq_get_driver_tag helper blktrace: fix endianness for blk_log_remap() blktrace: fix endianness in get_pdu_int() blktrace: use errno instead of bi_status block: nr_sects_write(): Disable preemption on seqcount write block: remove the error argument to the block_bio_complete tracepoint loop: Fix wrong masking of status flags block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
2020-06-11nvmet: fail outstanding host posted AEN reqChaitanya Kulkarni
In function nvmet_async_event_process() we only process AENs iff there is an open slot on the ctrl->async_event_cmds[] && aen event list posted by the target is not empty. This keeps host posted AEN outstanding if target generated AEN list is empty. We do cleanup the target generated entries from the aen list in nvmet_ctrl_free()-> nvmet_async_events_free() but we don't process AEN posted by the host. This leads to following problem :- When processing admin sq at the time of nvmet_sq_destroy() holds an extra percpu reference(atomic value = 1), so in the following code path after switching to atomic rcu, release function (nvmet_sq_free()) is not getting called which blocks the sq->free_done in nvmet_sq_destroy() :- nvmet_sq_destroy() percpu_ref_kill_and_confirm() - __percpu_ref_switch_mode() -- __percpu_ref_switch_to_atomic() --- call_rcu() -> percpu_ref_switch_to_atomic_rcu() ---- /* calls switch callback */ - percpu_ref_put() -- percpu_ref_put_many(ref, 1) --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count))) ---- ref->release(ref); <---- Not called. This results in indefinite hang:- void nvmet_sq_destroy(struct nvmet_sq *sq) ... if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) { nvmet_async_events_process(ctrl, status); percpu_ref_put(&sq->ref); } percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq); wait_for_completion(&sq->confirm_done); wait_for_completion(&sq->free_done); <-- Hang here Which breaks the further disconnect sequence. This problem seems to be introduced after commit 64f5e9cdd711b ("nvmet: fix memory leak when removing namespaces and controllers concurrently"). This patch processes ctrl->async_event_cmds[] in the admin sq destroy() context irrespetive of aen_list. Also we get rid of the controller's aen_list processing in the nvmet_sq_destroy() context and just ignore ctrl->aen_list. This results in nvmet_async_events_process() being called from workqueue context so we adjust the code accordingly. Fixes: 64f5e9cdd711 ("nvmet: fix memory leak when removing namespaces and controllers concurrently ") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11nvmet-tcp: constify nvmet_tcp_opsMax Gurtovoy
nvmet_tcp_ops is never modified and can be made const to allow the compiler to put it in read-only memory, as done in other transports. Before: text data bss dec hex filename 16164 160 12 16336 3fd0 drivers/nvme/target/tcp.o After: text data bss dec hex filename 16277 64 12 16353 3fe1 drivers/nvme/target/tcp.o Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "A more active cycle than most of the recent past, with a few large, long discussed works this time. The RNBD block driver has been posted for nearly two years now, and flowing through RDMA due to it also introducing a new ULP. The removal of FMR has been a recurring discussion theme for a long time. And the usual smattering of features and bug fixes. Summary: - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa - Continuing driver cleanups in bnxt_re, hns - Big cleanup of mlx5 QP creation flows - More consistent use of src port and flow label when LAG is used and a mlx5 implementation - Additional set of cleanups for IB CM - 'RNBD' network block driver and target. This is a network block RDMA device specific to ionos's cloud environment. It brings strong multipath and resiliency capabilities. - Accelerated IPoIB for HFI1 - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds - Support for exchanging the new IBTA defiend ECE data during RDMA CM exchanges - Removal of the very old and insecure FMR interface from all ULPs and drivers. FRWR should be preferred for at least a decade now" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits) RDMA/cm: Spurious WARNING triggered in cm_destroy_id() RDMA/mlx5: Return ECE DC support RDMA/mlx5: Don't rely on FW to set zeros in ECE response RDMA/mlx5: Return an error if copy_to_user fails IB/hfi1: Use free_netdev() in hfi1_netdev_free() RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr() RDMA/core: Move and rename trace_cm_id_create() IB/hfi1: Fix hfi1_netdev_rx_init() error handling RDMA: Remove 'max_map_per_fmr' RDMA: Remove 'max_fmr' RDMA/core: Remove FMR device ops RDMA/rdmavt: Remove FMR memory registration RDMA/mthca: Remove FMR support for memory registration RDMA/mlx4: Remove FMR support for memory registration RDMA/i40iw: Remove FMR leftovers RDMA/bnxt_re: Remove FMR leftovers RDMA/mlx5: Remove FMR leftovers RDMA/core: Remove FMR pool API RDMA/rds: Remove FMR support for memory registration RDMA/srp: Remove support for FMR memory registration ...
2020-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz Augusto von Dentz. 2) Add GSO partial support to igc, from Sasha Neftin. 3) Several cleanups and improvements to r8169 from Heiner Kallweit. 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a device self-test. From Andrew Lunn. 5) Start moving away from custom driver versions, use the globally defined kernel version instead, from Leon Romanovsky. 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin. 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet. 8) Add sriov and vf support to hinic, from Luo bin. 9) Support Media Redundancy Protocol (MRP) in the bridging code, from Horatiu Vultur. 10) Support netmap in the nft_nat code, from Pablo Neira Ayuso. 11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina Dubroca. Also add ipv6 support for espintcp. 12) Lots of ReST conversions of the networking documentation, from Mauro Carvalho Chehab. 13) Support configuration of ethtool rxnfc flows in bcmgenet driver, from Doug Berger. 14) Allow to dump cgroup id and filter by it in inet_diag code, from Dmitry Yakunin. 15) Add infrastructure to export netlink attribute policies to userspace, from Johannes Berg. 16) Several optimizations to sch_fq scheduler, from Eric Dumazet. 17) Fallback to the default qdisc if qdisc init fails because otherwise a packet scheduler init failure will make a device inoperative. From Jesper Dangaard Brouer. 18) Several RISCV bpf jit optimizations, from Luke Nelson. 19) Correct the return type of the ->ndo_start_xmit() method in several drivers, it's netdev_tx_t but many drivers were using 'int'. From Yunjian Wang. 20) Add an ethtool interface for PHY master/slave config, from Oleksij Rempel. 21) Add BPF iterators, from Yonghang Song. 22) Add cable test infrastructure, including ethool interfaces, from Andrew Lunn. Marvell PHY driver is the first to support this facility. 23) Remove zero-length arrays all over, from Gustavo A. R. Silva. 24) Calculate and maintain an explicit frame size in XDP, from Jesper Dangaard Brouer. 25) Add CAP_BPF, from Alexei Starovoitov. 26) Support terse dumps in the packet scheduler, from Vlad Buslov. 27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei. 28) Add devm_register_netdev(), from Bartosz Golaszewski. 29) Minimize qdisc resets, from Cong Wang. 30) Get rid of kernel_getsockopt and kernel_setsockopt in order to eliminate set_fs/get_fs calls. From Christoph Hellwig. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits) selftests: net: ip_defrag: ignore EPERM net_failover: fixed rollback in net_failover_open() Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv" Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv" vmxnet3: allow rx flow hash ops only when rss is enabled hinic: add set_channels ethtool_ops support selftests/bpf: Add a default $(CXX) value tools/bpf: Don't use $(COMPILE.c) bpf, selftests: Use bpf_probe_read_kernel s390/bpf: Use bcr 0,%0 as tail call nop filler s390/bpf: Maintain 8-byte stack alignment selftests/bpf: Fix verifier test selftests/bpf: Fix sample_cnt shared between two threads bpf, selftests: Adapt cls_redirect to call csum_level helper bpf: Add csum_level helper for fixing up csum levels bpf: Fix up bpf_skb_adjust_room helper's skb csum setting sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf() crypto/chtls: IPv6 support for inline TLS Crypto/chcr: Fixes a coccinile check error Crypto/chcr: Fixes compilations warnings ...