summaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp
AgeCommit message (Collapse)Author
2021-04-28Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, target, tcmu, smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core change is using a sbitmap instead of an atomic for queue tracking" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits) scsi: target: tcm_fc: Fix a kernel-doc header scsi: target: Shorten ALUA error messages scsi: target: Fix two format specifiers scsi: target: Compare explicitly with SAM_STAT_GOOD scsi: sd: Introduce a new local variable in sd_check_events() scsi: dc395x: Open-code status_byte(u8) calls scsi: 53c700: Open-code status_byte(u8) calls scsi: smartpqi: Remove unused functions scsi: qla4xxx: Remove an unused function scsi: myrs: Remove unused functions scsi: myrb: Remove unused functions scsi: mpt3sas: Fix two kernel-doc headers scsi: fcoe: Suppress a compiler warning scsi: libfc: Fix a format specifier scsi: aacraid: Remove an unused function scsi: core: Introduce enum scsi_disposition scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case scsi: core: Rename scsi_softirq_done() into scsi_complete() scsi: core: Remove an incorrect comment scsi: core: Make the scsi_alloc_sgtables() documentation more accurate ...
2021-04-28Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: - MD changes via Song: - raid5 POWER fix - raid1 failure fix - UAF fix for md cluster - mddev_find_or_alloc() clean up - Fix NULL pointer deref with external bitmap - Performance improvement for raid10 discard requests - Fix missing information of /proc/mdstat - rsxx const qualifier removal (Arnd) - Expose allocated brd pages (Calvin) - rnbd via Gioh Kim: - Change maintainer - Change domain address of maintainers' email - Add polling IO mode and document update - Fix memory leak and some bug detected by static code analysis tools - Code refactoring - Series of floppy cleanups/fixes (Denis) - s390 dasd fixes (Julian) - kerneldoc fixes (Lee) - null_blk double free (Lv) - null_blk virtual boundary addition (Max) - Remove xsysace driver (Michal) - umem driver removal (Davidlohr) - ataflop fixes (Dan) - Revalidate disk removal (Christoph) - Bounce buffer cleanups (Christoph) - Mark lightnvm as deprecated (Christoph) - mtip32xx init cleanups (Shixin) - Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang) * tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits) async_xor: increase src_offs when dropping destination page drivers/block/null_blk/main: Fix a double free in null_init. md/raid1: properly indicate failure when ending a failed write request md-cluster: fix use-after-free issue when removing rdev nvme: introduce generic per-namespace chardev nvme: cleanup nvme_configure_apst nvme: do not try to reconfigure APST when the controller is not live nvme: add 'kato' sysfs attribute nvme: sanitize KATO setting nvmet: avoid queuing keep-alive timer if it is disabled brd: expose number of allocated pages in debugfs ataflop: fix off by one in ataflop_probe() ataflop: potential out of bounds in do_format() drbd: Fix fall-through warnings for Clang block/rnbd: Use strscpy instead of strlcpy block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name block/rnbd-clt: Remove max_segment_size block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Documentation/ABI/rnbd-clt: Add description for nr_poll_queues ...
2021-04-20block/rnbd-clt: Remove max_segment_sizeJack Wang
We always map with SZ_4K, so do not need max_segment_size. Cc: Leon Romanovsky <leonro@nvidia.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20210419073722.15351-18-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_evGioh Kim
struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so cleaned up rdma_ev function pointer in rtrs_srv_ops also is changed. Cc: Leon Romanovsky <leonro@nvidia.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20210419073722.15351-16-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20block/rnbd-clt: Support polling mode for IO latency optimizationGioh Kim
RNBD can make double-queues for irq-mode and poll-mode. For example, on 4-CPU system 8 request-queues are created, 4 for irq-mode and 4 for poll-mode. If the IO has HIPRI flag, the block-layer will call .poll function of RNBD. Then IO is sent to the poll-mode queue. Add optional nr_poll_queues argument for map_devices interface. To support polling of RNBD, RTRS client creates connections for both of irq-mode and direct-poll-mode. For example, on 4-CPU system it could've create 5 connections: con[0] => user message (softirq cq) con[1:4] => softirq cq After this patch, it can create 9 connections: con[0] => user message (softirq cq) con[1:4] => softirq cq con[5:8] => DIRECT-POLL cq Cc: Leon Romanovsky <leonro@nvidia.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20210419073722.15351-14-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT}Gioh Kim
They are defined with the same value and similar meaning, let's remove one of them, then we can remove {WAIT,NOWAIT}. Also change the type of 'wait' from 'int' to 'enum wait_type' to make it clear. Cc: Leon Romanovsky <leonro@nvidia.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20210419073722.15351-9-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session filesMd Haris Iqbal
KASAN detected the following BUG: BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Call Trace: <IRQ> dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] kasan_report+0x10/0x20 rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client] ? lockdep_hardirqs_on+0x1a8/0x290 ? process_io_rsp+0xb0/0xb0 [rtrs_client] ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib] ? add_interrupt_randomness+0x1a2/0x340 __ib_process_cq+0x97/0x100 [ib_core] ib_poll_handler+0x41/0xb0 [ib_core] irq_poll_softirq+0xe0/0x260 __do_softirq+0x127/0x672 irq_exit+0xd1/0xe0 do_IRQ+0xa3/0x1d0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xea/0x780 Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00 RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414 RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0 ? lockdep_hardirqs_on+0x1a8/0x290 ? cpuidle_enter_state+0xe5/0x780 cpuidle_enter+0x3c/0x60 do_idle+0x2fb/0x390 ? arch_cpu_idle_exit+0x40/0x40 ? schedule+0x94/0x120 cpu_startup_entry+0x19/0x1b start_kernel+0x5da/0x61b ? thread_stack_cache_init+0x6/0x6 ? load_ucode_amd_bsp+0x6f/0xc4 ? init_amd_microcode+0xa6/0xa6 ? x86_family+0x5/0x20 ? load_ucode_bsp+0x182/0x1fd secondary_startup_64+0xa4/0xb0 Allocated by task 5730: save_stack+0x19/0x80 __kasan_kmalloc.constprop.9+0xc1/0xd0 kmem_cache_alloc_trace+0x15b/0x350 alloc_sess+0xf4/0x570 [rtrs_client] rtrs_clt_open+0x3b4/0x780 [rtrs_client] find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client] rnbd_clt_map_device+0xd7/0xf50 [rnbd_client] rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client] kernfs_fop_write+0x141/0x240 vfs_write+0xf3/0x280 ksys_write+0xba/0x150 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5822: save_stack+0x19/0x80 __kasan_slab_free+0x125/0x170 kfree+0xe7/0x3f0 kobject_put+0xd3/0x240 rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client] rtrs_clt_close+0x3c/0x80 [rtrs_client] close_rtrs+0x45/0x80 [rnbd_client] rnbd_client_exit+0x10f/0x2bd [rnbd_client] __x64_sys_delete_module+0x27b/0x340 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe When rtrs_clt_close is triggered, it iterates over all the present rtrs_clt_sess and triggers close on them. However, the call to rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This is incorrect since during the initialization phase we allocate rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it. If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it may so happen that an inflight IO completion would trigger the function rtrs_clt_rdma_done, which would lead to the above UAF case. Hence close the rtrs_clt_con connections first, and then trigger the destruction of session files. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-04scsi: target: core: Add gfp_t arg to target_cmd_init_cdb()Mike Christie
tcm_loop could be used like a normal block device, so we can't use GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to target_cmd_init_cdb() and converts the users. For every driver but loop GFP_KERNEL is kept. This will also be useful in subsequent patches where loop needs to do target_submit_prep() from interrupt context to get a ref to the se_device, and so it will need to use GFP_ATOMIC. Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: srpt: Convert to new submission APIMike Christie
target_submit_cmd_map_sgls() is being removed, so convert srpt to the new submission API. srpt uses target_stop_session() to sync session shutdown with LIO core, so we use target_init_cmd()/target_submit_prep()/target_submit(), because target_init_cmd() will detect the target_stop_session() call and return an error. Link: https://lore.kernel.org/r/20210227170006.5077-6-michael.christie@oracle.com Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-02-16RDMA/rtrs-srv: Do not pass a valid pointer to PTR_ERR()Jack Wang
smatch gives the warning: drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR' Which is trying to say smatch has shown that srv is not an error pointer and thus cannot be passed to PTR_ERR. The solution is to move the list_add() down after full initilization of rtrs_srv. To avoid holding the srv_mutex too long, only hold it during the list operation as suggested by Leon. Fixes: 03e9b33a0fd6 ("RDMA/rtrs: Only allow addition of path to an already established session") Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodesNicolas Morey-Chaisemartin
The current code computes a number of channels per SRP target and spreads them equally across all online NUMA nodes. Each channel is then assigned a CPU within this node. In the case of unbalanced, or even unpopulated nodes, some channels do not get a CPU associated and thus do not get connected. This causes the SRP connection to fail. This patch solves the issue by rewriting channel computation and allocation: - Drop channel to node/CPU association as it had no real effect on locality but added unnecessary complexity. - Tweak the number of channels allocated to reduce CPU contention when possible: - Up to one channel per CPU (instead of up to 4 by node) - At least 4 channels per node, unless ch_count module parameter is used. Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/rtrs-srv-sysfs: fix missing put_deviceGioh Kim
put_device() decreases the ref-count and then the device will be cleaned-up, while at is also add missing put_device in rtrs_srv_create_once_sysfs_root_folders This patch solves a kmemleak error as below: unreferenced object 0xffff88809a7a0710 (size 8): comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s) hex dump (first 8 bytes): 62 6c 61 00 6b 6b 6b a5 bla.kkk. backtrace: [<0000000054413611>] kstrdup+0x2e/0x60 [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0 [<00000000f1a17a6b>] dev_set_name+0xab/0xe0 [<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server] [<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server] [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core] [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core] [<00000000cfc376be>] process_one_work+0x4bc/0x980 [<0000000016e5c96a>] worker_thread+0x78/0x5c0 [<00000000c20b8be0>] kthread+0x191/0x1e0 [<000000006c9c0003>] ret_from_fork+0x3a/0x50 Fixes: baa5b28b7a47 ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add") Link: https://lore.kernel.org/r/20210212134525.103456-5-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/rtrs-srv: fix memory leak by missing kobject freeGioh Kim
kmemleak reported an error as below: unreferenced object 0xffff8880674b7640 (size 64): comm "kworker/4:1H", pid 113, jiffies 4296403507 (age 507.840s) hex dump (first 32 bytes): 69 70 3a 31 39 32 2e 31 36 38 2e 31 32 32 2e 31 ip:192.168.122.1 31 30 40 69 70 3a 31 39 32 2e 31 36 38 2e 31 32 10@ip:192.168.12 backtrace: [<0000000054413611>] kstrdup+0x2e/0x60 [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0 [<00000000ca2be3ee>] kobject_init_and_add+0xb0/0x120 [<0000000062ba5e78>] rtrs_srv_create_sess_files+0x14c/0x314 [rtrs_server] [<00000000b45b7217>] rtrs_srv_info_req_done+0x5b1/0x800 [rtrs_server] [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core] [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core] [<00000000cfc376be>] process_one_work+0x4bc/0x980 [<0000000016e5c96a>] worker_thread+0x78/0x5c0 [<00000000c20b8be0>] kthread+0x191/0x1e0 [<000000006c9c0003>] ret_from_fork+0x3a/0x50 It is caused by the not-freed kobject of rtrs_srv_sess. The kobject embedded in rtrs_srv_sess has ref-counter 2 after calling process_info_req(). Therefore it must call kobject_put twice. Currently it calls kobject_put only once at rtrs_srv_destroy_sess_files because kobject_del removes the state_in_sysfs flag and then kobject_put in free_sess() is not called. This patch moves kobject_del() into free_sess() so that the kobject of rtrs_srv_sess can be freed. And also this patch adds the missing call of sysfs_remove_group() to clean-up the sysfs directory. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20210212134525.103456-4-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/rtrs: Only allow addition of path to an already established sessionMd Haris Iqbal
While adding a path from the client side to an already established session, it was possible to provide the destination IP to a different server. This is dangerous. This commit adds an extra member to the rtrs_msg_conn_req structure, named first_conn; which is supposed to notify if the connection request is the first for that session or not. On the server side, if a session does not exist but the first_conn received inside the rtrs_msg_conn_req structure is 1, the connection request is failed. This signifies that the connection request is for an already existing session, and since the server did not find one, it is an wrong connection request. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20210212134525.103456-3-jinpu.wang@cloud.ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/rtrs-srv: Fix stack-out-of-boundsJack Wang
BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib] Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565 CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G O 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib] kasan_report+0x10/0x20 _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib] ? check_chain_key+0x1d7/0x2e0 ? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib] ? lockdep_hardirqs_on+0x1a8/0x290 ? stack_depot_save+0x218/0x56e ? do_profile_hits.isra.6.cold.13+0x1d/0x1d ? check_chain_key+0x1d7/0x2e0 ? save_stack+0x4d/0x80 ? save_stack+0x19/0x80 ? __kasan_slab_free+0x125/0x170 ? kfree+0xe7/0x3b0 rdma_write_sg+0x5b0/0x950 [rtrs_server] The problem is when we send imm_wr, the type should be ib_rdma_wr, so hw driver like mlx4 can do rdma_wr(wr), so fix it by use the ib_rdma_wr as type for imm_wr. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20210212134525.103456-2-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16RDMA/ipoib: Remove racy Subnet Manager sendonly join checksChristoph Lameter
When a system receives a REREG event from the SM, then the SM information in the kernel is marked as invalid and a request is sent to the SM to update the information. The SM information is invalid in that time period. However, receiving a REREG also occurs simultaneously in user space applications that are now trying to rejoin the multicast groups. Some of those may be sendonly multicast groups which are then failing. If the SM information is invalid then ib_sa_sendonly_fullmem_support() returns false. That is wrong because it just means that we do not know yet if the potentially new SM supports sendonly joins. Sendonly join was introduced in 2015 and all the Subnet managers have supported it ever since. So there is no point in checking if a subnet manager supports it. Should an old opensm get a request for a sendonly join then the request will fail. The code that is removed here accomodated that situation and fell back to a full join. Falling back to a full join is problematic in itself. The reason to use the sendonly join was to reduce the traffic on the Infiniband fabric otherwise one could have just stayed with the regular join. So this patch may cause users of very old opensms to discover that lots of traffic needlessly crosses their IB fabrics. Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2101281845160.13303@www.lameter.com Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19IB/iser: Simplify prot_caps settingMax Gurtovoy
Reduce the number of instructions made for setting protection caps. No need to do bitwise OR with 0 since we can zero the return value in the beginning of the function. Link: https://lore.kernel.org/r/20210111145754.56727-5-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19IB/iser: Enforce iser_max_sectors to be greater than 0Max Gurtovoy
A value of 0 will casue the driver to fail establishing a valid connection to remote target. The following can be seen in the log in this case: iser: iser_connect: connecting to: 1.1.1.88:3260 iser: iser_cma_handler: address resolved (0): status 0 conn 00000000090aa4de id 00000000167d3b5a iser: iser_cma_handler: route resolved (2): status 0 conn 00000000090aa4de id 00000000167d3b5a iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 0 iser: iser_create_ib_conn_res: setting conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660 max_send_wr 4619 iser_cma_handler: established (9): status 0 conn 00000000090aa4de id 00000000167d3b5a iser: iser_connected_handler: remote qpn:1c7 my qpn:1c6 iser: iser_connected_handler: conn 00000000090aa4de: negotiated remote invalidation iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 1 scsi host10: iSCSI Initiator over iSER mlx5_core 0000:07:00.0: mlx5_cmd_check:769:(pid 616473): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f) iser: iser_create_fastreg_desc: Failed to allocate ib_fast_reg_mr err=-22 iser: iser_alloc_rx_descriptors: failed allocating rx descriptors / data buffers iser: iscsi_iser_ep_disconnect: ep 00000000d2040785 iser conn 00000000090aa4de iser: iser_conn_terminate: iser_conn 00000000090aa4de state 3 iser: iser_free_ib_conn_res: freeing conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660 iser: iser_device_try_release: device 00000000dc871b1b refcount 0 Link: https://lore.kernel.org/r/20210111145754.56727-4-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19IB/iser: Protect iscsi_max_lun module param using callbackMax Gurtovoy
Remove the check from the module_init function. Link: https://lore.kernel.org/r/20210111145754.56727-3-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19IB/iser: Remove unneeded semicolonsMax Gurtovoy
No need to add semicolon after closing bracket. Link: https://lore.kernel.org/r/20210111145754.56727-2-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18IB/isert: Simplify signature cap checkMax Gurtovoy
Use if/else clause instead of "condition ? val1 : val2" to make the code cleaner and simpler. Link: https://lore.kernel.org/r/20210110111903.486681-3-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18IB/isert: Remove unneeded semicolonMax Gurtovoy
No need to add semicolon after closing bracket. Link: https://lore.kernel.org/r/20210110111903.486681-2-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18IB/isert: Remove unneeded new linesMax Gurtovoy
The Linux convention is to have only 1 new line between functions. Link: https://lore.kernel.org/r/20210110111903.486681-1-mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs: Fix KASAN: stack-out-of-bounds bugJack Wang
When KASAN is enabled, we notice warning below: [ 483.436975] ================================================================== [ 483.437234] BUG: KASAN: stack-out-of-bounds in _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.437430] Read of size 4 at addr ffff88a195fd7d30 by task kworker/1:3/6954 [ 483.437731] CPU: 1 PID: 6954 Comm: kworker/1:3 Kdump: loaded Tainted: G O 5.4.82-pserver #5.4.82-1+feature+linux+5.4.y+dbg+20201210.1532+987e7a6~deb10 [ 483.437976] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020 [ 483.438168] Workqueue: rtrs_server_wq hb_work [rtrs_core] [ 483.438323] Call Trace: [ 483.438486] dump_stack+0x96/0xe0 [ 483.438646] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.438802] print_address_description.constprop.6+0x1b/0x220 [ 483.438966] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.439133] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.439285] __kasan_report.cold.9+0x1a/0x32 [ 483.439444] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.439597] kasan_report+0x10/0x20 [ 483.439752] _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib] [ 483.439910] ? update_sd_lb_stats+0xfb1/0xfc0 [ 483.440073] ? set_reg_wr+0x520/0x520 [mlx5_ib] [ 483.440222] ? update_group_capacity+0x340/0x340 [ 483.440377] ? find_busiest_group+0x314/0x870 [ 483.440526] ? update_sd_lb_stats+0xfc0/0xfc0 [ 483.440683] ? __bitmap_and+0x6f/0x100 [ 483.440832] ? __lock_acquire+0xa2/0x2150 [ 483.440979] ? __lock_acquire+0xa2/0x2150 [ 483.441128] ? __lock_acquire+0xa2/0x2150 [ 483.441279] ? debug_lockdep_rcu_enabled+0x23/0x60 [ 483.441430] ? lock_downgrade+0x390/0x390 [ 483.441582] ? __lock_acquire+0xa2/0x2150 [ 483.441729] ? __lock_acquire+0xa2/0x2150 [ 483.441876] ? newidle_balance+0x425/0x8f0 [ 483.442024] ? __lock_acquire+0xa2/0x2150 [ 483.442172] ? debug_lockdep_rcu_enabled+0x23/0x60 [ 483.442330] hb_work+0x15d/0x1d0 [rtrs_core] [ 483.442479] ? schedule_hb+0x50/0x50 [rtrs_core] [ 483.442627] ? lock_downgrade+0x390/0x390 [ 483.442781] ? process_one_work+0x40d/0xa50 [ 483.442931] process_one_work+0x4ee/0xa50 [ 483.443082] ? pwq_dec_nr_in_flight+0x110/0x110 [ 483.443231] ? do_raw_spin_lock+0x119/0x1d0 [ 483.443383] worker_thread+0x65/0x5c0 [ 483.443532] ? process_one_work+0xa50/0xa50 [ 483.451839] kthread+0x1e2/0x200 [ 483.451983] ? kthread_create_on_node+0xc0/0xc0 [ 483.452139] ret_from_fork+0x3a/0x50 The problem is we use wrong type when send wr, hw driver expect the type of IB_WR_RDMA_WRITE_WITH_IMM wr should be ib_rdma_wr, and doing container_of to access member. The fix is simple use ib_rdma_wr instread of ib_send_wr. Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules") Link: https://lore.kernel.org/r/20201217141915.56989-20-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Init wr_cnt as 1Jack Wang
Fix up wr_avail accounting. if wr_cnt is 0, then we do SIGNAL for first wr, in completion we add queue_depth back, which is not right in the sense of tracking for available wr. So fix it by init wr_cnt to 1. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-19-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Do not signal REG_MRJack Wang
We do not need to wait for REG_MR completion, so remove the SIGNAL flag. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-18-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Use bitmask to check sess->flagsJack Wang
We may want to add new flags, so it's better to use bitmask to check flags. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-17-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs: Do not signal for heatbeatJack Wang
For HB, there is no need to generate signal for completion. Also remove a comment accordingly. Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules") Link: https://lore.kernel.org/r/20201217141915.56989-16-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reported-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Refactor the failure cases in alloc_cltGuoqing Jiang
Make all failure cases go to the common path to avoid duplicate code. And some issued existed before. 1. clt need to be freed to avoid memory leak. 2. return ERR_PTR(-ENOMEM) if kobject_create_and_add fails, because rtrs_clt_open checks the return value of by call "IS_ERR(clt)". Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-15-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Fix missing wr_cqeJack Wang
We had a few places wr_cqe is not set, which could lead to NULL pointer deref or GPF in error case. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-14-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_stateGuoqing Jiang
Let's rename it to rtrs_clt_change_state since the previous one is killed. Also update the comment to make it more clear. Link: https://lore.kernel.org/r/20201217141915.56989-13-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Kill rtrs_clt_change_stateGuoqing Jiang
It is just a wrapper of rtrs_clt_change_state_get_old, and we can reuse rtrs_clt_change_state_get_old with add the checking of 'old_state' is valid or not. Link: https://lore.kernel.org/r/20201217141915.56989-12-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Remove unnecessary 'goto out'Guoqing Jiang
This is not needed since the label is just after the place. Link: https://lore.kernel.org/r/20201217141915.56989-11-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Kill wait_for_inflight_permitsGuoqing Jiang
Let's wait the inflight permits before free it. Link: https://lore.kernel.org/r/20201217141915.56989-10-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files}Guoqing Jiang
Since the two functions are called together, let's consolidate them in a new function rtrs_clt_destroy_sysfs_root. Link: https://lore.kernel.org/r/20201217141915.56989-9-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs: Call kobject_put in the failure pathGuoqing Jiang
Per the comment of kobject_init_and_add, we need to free the memory by call kobject_put. Fixes: 215378b838df ("RDMA/rtrs: client: sysfs interface functions") Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions") Link: https://lore.kernel.org/r/20201217141915.56989-8-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu failsGuoqing Jiang
The rtrs_iu_free is called in rtrs_iu_alloc if memory is limited, so we don't need to free the same iu again. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-7-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Set mininum limit when create QPJack Wang
Currently rtrs when create_qp use a coarse numbers (bigger in general), which leads to hardware create more resources which only waste memory with no benefits. - SERVICE con, For max_send_wr/max_recv_wr, it's 2 times SERVICE_CON_QUEUE_DEPTH + 2 - IO con For max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1 Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-6-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnectJack Wang
Remove self first to avoid deadlock, we don't want to use close_work to remove sess sysfs. Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions") Link: https://lore.kernel.org/r/20201217141915.56989-5-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-srv: Release lock before call into close_sessJack Wang
In this error case, we don't need hold mutex to call close_sess. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-4-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs: Extend ibtrs_cq_qp_createJack Wang
rtrs does not have same limit for both max_send_wr and max_recv_wr, To allow client and server set different values, export in a separate parameter for rtrs_cq_qp_create. Also fix the type accordingly, u32 should be used instead of u16. Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules") Link: https://lore.kernel.org/r/20201217141915.56989-2-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-07RDMA: Convert comma to semicolonZheng Yongjun
Replace a comma between expression statements by a semicolon. Link: https://lore.kernel.org/r/20201214134118.4349-1-zhengyongjun3@huawei.com Link: https://lore.kernel.org/r/20201214134146.4456-1-zhengyongjun3@huawei.com Link: https://lore.kernel.org/r/20201214134218.4510-1-zhengyongjun3@huawei.com Link: https://lore.kernel.org/r/20201214134243.4563-1-zhengyongjun3@huawei.com Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-16block/rnbd-clt: Does not request pdu to rtrs-cltGioh Kim
Previously the rnbd client requested the rtrs to allocate rnbd_iu just after the rtrs_iu. So the rnbd client passes the size of rnbd_iu for rtrs_clt_open() and rtrs creates an array of rnbd_iu and rtrs_iu. For IO handling, rnbd_iu exists after the request because we pass the size of rnbd_iu when setting the tag-set. Therefore we do not use the rnbd_iu allocated by rtrs for IO handling. We only use the rnbd_iu allocated by rtrs when doing session initialization. Almost all rnbd_iu allocated by rtrs are wasted. By this patch the rnbd client does not request rnbd_iu allocation to rtrs but allocate it for itself when doing session initialization. Also remove unused rtrs_permit_to_pdu from rtrs. Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "A smaller set of patches, nothing stands out as being particularly major this cycle. The biggest item would be the new HIP09 HW support from HNS, otherwise it was pretty quiet for new work here: - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4, mlx4 and mlx5 - Bug fixes and polishing for the new rts ULP - Cleanup of uverbs checking for allowed driver operations - Use sysfs_emit all over the place - Lots of bug fixes and clarity improvements for hns - hip09 support for hns - NDR and 50/100Gb signaling rates - Remove dma_virt_ops and go back to using the IB DMA wrappers - mlx5 optimizations for contiguous DMA regions" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits) RDMA/cma: Don't overwrite sgid_attr after device is released RDMA/mlx5: Fix MR cache memory leak RDMA/rxe: Use acquire/release for memory ordering RDMA/hns: Simplify AEQE process for different types of queue RDMA/hns: Fix inaccurate prints RDMA/hns: Fix incorrect symbol types RDMA/hns: Clear redundant variable initialization RDMA/hns: Fix coding style issues RDMA/hns: Remove unnecessary access right set during INIT2INIT RDMA/hns: WARN_ON if get a reserved sl from users RDMA/hns: Avoid filling sl in high 3 bits of vlan_id RDMA/hns: Do shift on traffic class when using RoCEv2 RDMA/hns: Normalization the judgment of some features RDMA/hns: Limit the length of data copied between kernel and userspace RDMA/mlx4: Remove bogus dev_base_lock usage RDMA/uverbs: Fix incorrect variable type RDMA/core: Do not indicate device ready when device enablement fails RDMA/core: Clean up cq pool mechanism RDMA/core: Update kernel documentation for ib_create_named_qp() MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address ...
2020-12-16Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major power management rework and a load of assorted minor updates. There are a few core updates (formatting fixes being the big one) but nothing major this cycle" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: mpt3sas: Update driver version to 36.100.00.00 scsi: mpt3sas: Handle trigger page after firmware update scsi: mpt3sas: Add persistent MPI trigger page scsi: mpt3sas: Add persistent SCSI sense trigger page scsi: mpt3sas: Add persistent Event trigger page scsi: mpt3sas: Add persistent Master trigger page scsi: mpt3sas: Add persistent trigger pages support scsi: mpt3sas: Sync time periodically between driver and firmware scsi: qla2xxx: Update version to 10.02.00.104-k scsi: qla2xxx: Fix device loss on 4G and older HBAs scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry scsi: qla2xxx: Fix the call trace for flush workqueue scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines scsi: qla2xxx: Handle aborts correctly for port undergoing deletion scsi: qla2xxx: Fix N2N and NVMe connect retry failure scsi: qla2xxx: Fix FW initialization error on big endian machines scsi: qla2xxx: Fix crash during driver load on big endian machines scsi: qla2xxx: Fix compilation issue in PPC systems scsi: qla2xxx: Don't check for fw_started while posting NVMe command scsi: qla2xxx: Tear down session if FW say it is down ...
2020-12-07RDMA/iser: Remove in_interrupt() usageSebastian Andrzej Siewior
iser_initialize_task_headers() uses in_interrupt() to find out if it is safe to acquire a mutex. in_interrupt() is deprecated as it is ill defined and does not provide what it suggests. Aside of that it covers only parts of the contexts in which a mutex may not be acquired. The following callchains exist: iscsi_queuecommand() *locks* iscsi_session::frwd_lock -> iscsi_prep_scsi_cmd_pdu() -> session->tt->init_task() (iscsi_iser_task_init()) -> iser_initialize_task_headers() -> iscsi_iser_task_xmit() (iscsi_transport::xmit_task) -> iscsi_iser_task_xmit_unsol_data() -> iser_send_data_out() -> iser_initialize_task_headers() iscsi_data_xmit() *locks* iscsi_session::frwd_lock -> iscsi_prep_mgmt_task() -> session->tt->init_task() (iscsi_iser_task_init()) -> iser_initialize_task_headers() -> iscsi_prep_scsi_cmd_pdu() -> session->tt->init_task() (iscsi_iser_task_init()) -> iser_initialize_task_headers() __iscsi_conn_send_pdu() caller has iscsi_session::frwd_lock -> iscsi_prep_mgmt_task() -> session->tt->init_task() (iscsi_iser_task_init()) -> iser_initialize_task_headers() -> session->tt->xmit_task() ( The only callchain that is close to be invoked in preemptible context: iscsi_xmitworker() worker -> iscsi_data_xmit() -> iscsi_xmit_task() -> conn->session->tt->xmit_task() (iscsi_iser_task_xmit() In iscsi_iser_task_xmit() there is this check: if (!task->sc) return iscsi_iser_mtask_xmit(conn, task); so it does end up in iser_initialize_task_headers() and iser_initialize_task_headers() relies on iscsi_task::sc == NULL. Remove conditional locking of iser_conn::state_mutex because there is no call chain to do so. Remove the goto label and return early now that there is no clean up needed. Link: https://lore.kernel.org/r/20201204174256.62xfcvudndt7oufl@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Max Gurtovoy <maxg@nvidia.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-rdma@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07IB: Fix kernel-doc markupsMauro Carvalho Chehab
Some functions have different names between their prototypes and the kernel-doc markup. Others need to be fixed, as kernel-doc markups should use this format: identifier - description Link: https://lore.kernel.org/r/78b98c41a5a0f4c0106433d305b143028a4168b0.1606823973.git.mchehab+huawei@kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-20RDMA/ipoib: Distribute cq completion vector betterJack Wang
Currently ipoib choose cq completion vector based on port number, when HCA only have one port, all the interface recv queue completion are bind to cq completion vector 0. To better distribute the load, use same method as __ib_alloc_cq_any to choose completion vector, with the change, each interface now use different completion vectors. Link: https://lore.kernel.org/r/20201013074342.15867-1-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-17Merge branch 'for-rc' into rdma.gitJason Gunthorpe
From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git The rc RDMA branch is needed due to dependencies on the next patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12IB/isert: Do not excplicitly check == false for boolZou Wei
It is not the kernel style, warning reported by coccicheck: ./ib_isert.c:1104:12-24: WARNING: Comparison to bool Link: https://lore.kernel.org/r/1604404674-32998-1-git-send-email-zou_wei@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>