summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2020-11-16RDMA/mlx5: Use mlx5_umem_find_best_quantized_pgoff() for WQJason Gunthorpe
This fixes a subtle bug, the WQ mailbox has only 5 bits to describe the page_offset, while mlx5_ib_get_buf_offset() is hard wired to only work with 6 bit page_offsets. Thus it did not properly reject badly aligned buffers. Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Fixes: 0fb2ed66a14c ("IB/mlx5: Add create and destroy functionality for Raw Packet QP") Link: https://lore.kernel.org/r/20201115114311.136250-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/mlx5: Use ib_umem_find_best_pgoff() for SRQJason Gunthorpe
SRQ uses a quantized and scaled page_offset, which is another variation of ib_umem_find_best_pgsz(). Add mlx5_umem_find_best_quantized_pgoff() to perform this calculation for each mailbox. A macro shows how the calculation is directly connected to the mailbox format. This new routine replaces the limited mlx5_ib_cont_pages() and mlx5_ib_get_buf_offset() pairing which would reject valid configurations rather than adjust the page_size to make it work. In turn this is much more aggressive about choosing large page sizes for these objects and when THP is enabled it will now often find a single page solution. Link: https://lore.kernel.org/r/20201115114311.136250-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/efa: Remove .create_ah callback assignmentGal Pressman
Drivers now expose two callbacks for address handle creation, one for uverbs and one for kverbs. EFA only supports uverbs so the .create_ah assignment can be removed. Fix the core code caller to check the proper function pointer. Link: https://lore.kernel.org/r/20201115103404.48829-3-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/hns: Add new PCI device ID matching for HIP09Lang Cheng
The 200G device has a new device ID 0xA228, add it to the PCI table. Link: https://lore.kernel.org/r/1605187184-26079-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/cma: Add missing error handling of listen_idLeon Romanovsky
Don't silently continue if rdma_listen() fails but destroy previously created CM_ID and return an error to the caller. Fixes: d02d1f5359e7 ("RDMA/cma: Fix deadlock destroying listen requests") Link: https://lore.kernel.org/r/20201104144008.3808124-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/restrack: Store all special QPs in restrack DBLeon Romanovsky
Special QPs (SMI and GSI) have different rules in regards of their QP numbers. While all other QP numbers are unique per-device, the QP0 and QP1 are created per-port as requested by IBTA. In multiple port devices, the number of SMI and GSI QPs with be equal to the number ports. $ rdma dev 0: ibp0s9: node_type ca fw 4.4.9999 node_guid 5254:00c0:fe12:3455 sys_image_guid 5254:00c0:fe12:3455 $ rdma link 0/1: ibp0s9/1: subnet_prefix fe80:0000:0000:0000 lid 13397 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP 0/2: ibp0s9/2: subnet_prefix fe80:0000:0000:0000 lid 13397 sm_lid 49151 lmc 0 state UNKNOWN physical_state UNKNOWN Before: $ rdma res show qp type SMI,GSI link ibp0s9/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core] link ibp0s9/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core] After: $ rdma res show qp type SMI,GSI link ibp0s9/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core] link ibp0s9/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core] link ibp0s9/2 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core] link ibp0s9/2 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core] Link: https://lore.kernel.org/r/20201104144008.3808124-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16RDMA/counter: Combine allocation and bind logicLeon Romanovsky
RDMA counters are allocated and bounded to QP immediately after that. Only after this two step process they are really usable. By combining the logic, we are ensuring that once counter is returned to the caller, it will have everything set. Link: https://lore.kernel.org/r/20201104144008.3808124-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16treewide: rename nla_strlcpy to nla_strscpy.Francis Laniel
Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new name of this function. Signed-off-by: Francis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13IB/hfi1: Fix error return code in hfi1_init_dd()Zhang Changzhong
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev") Link: https://lore.kernel.org/r/1605249747-17942-1-git-send-email-zhangchangzhong@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12IB/hfi1: switch to core handling of rx/tx byte/packet countersHeiner Kallweit
Use netdev->tstats instead of a member of hfi1_ipoib_dev_priv for storing a pointer to the per-cpu counters. This allows us to use core functionality for statistics handling. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12RDMA/hns: Fix double free of the pointer to TSQ/TPQWeihang Li
A return statement is omitted after getting HEM table, then the newly allocated pointer will be freed directly, which will cause a calltrace when the driver was removed. Fixes: d6d91e46210f ("RDMA/hns: Add support for configuring GMV table") Link: https://lore.kernel.org/r/1605180582-46504-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/umem: Use ib_dma_max_seg_size instead of dma_get_max_seg_sizeChristoph Hellwig
RDMA ULPs must not call DMA mapping APIs directly but instead use the ib_dma_* wrappers. Fixes: 0c16d9635e3a ("RDMA/umem: Move to allocate SG table from pages") Link: https://lore.kernel.org/r/20201106181941.1878556-3-hch@lst.de Reported-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RMDA/sw: Don't allow drivers using dma_virt_ops on highmem configsChristoph Hellwig
dma_virt_ops requires that all pages have a kernel virtual address. Introduce a INFINIBAND_VIRT_DMA Kconfig symbol that depends on !HIGHMEM and make all three drivers depend on the new symbol. Also remove the ARCH_DMA_ADDR_T_64BIT dependency, which has been obsolete since commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig") Fixes: 551199aca1c3 ("lib/dma-virt: Add dma_virt_ops") Link: https://lore.kernel.org/r/20201106181941.1878556-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/pvrdma: Fix missing kfree() in pvrdma_register_device()Qinglang Miao
Fix missing kfree in pvrdma_register_device() when failure from ib_device_set_netdev(). Fixes: 4b38da75e089 ("RDMA/drivers: Convert easy drivers to use ib_device_set_netdev()") Link: https://lore.kernel.org/r/20201111032202.17925-1-miaoqinglang@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMa/mthca: Work around -Wenum-conversion warningArnd Bergmann
gcc points out a suspicious mixing of enum types in a function that converts from MTHCA_OPCODE_* values to IB_WC_* values: drivers/infiniband/hw/mthca/mthca_cq.c: In function 'mthca_poll_one': drivers/infiniband/hw/mthca/mthca_cq.c:607:21: warning: implicit conversion from 'enum <anonymous>' to 'enum ib_wc_opcode' [-Wenum-conversion] 607 | entry->opcode = MTHCA_OPCODE_INVALID; Nothing seems to ever check for MTHCA_OPCODE_INVALID again, no idea if this is meaningful, but it seems harmless as it deals with an invalid input. Remove MTHCA_OPCODE_INVALID and set the ib_wc_opcode to 0xFF, which is still bogus, but at least doesn't make compiler warnings. Fixes: 2a4443a69934 ("[PATCH] IB/mthca: fill in opcode field for send completions") Link: https://lore.kernel.org/r/20201026211311.3887003-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/core: Make FD destroy callback voidLeon Romanovsky
All FD object destroy implementations return 0, so declare this callback void. Link: https://lore.kernel.org/r/20201104144556.3809085-3-leon@kernel.org Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/core: Postpone uobject cleanup on failure till FD closeLeon Romanovsky
Remove the ib_is_destroyable_retryable() concept. The idea here was to allow the drivers to forcibly clean the HW object even if they otherwise didn't want to (eg because of usecnt). This was an attempt to clean up in a world where drivers were not allowed to fail HW object destruction. Now that we are going back to allowing HW objects to fail destroy this doesn't make sense. Instead if a uobject's HW object can't be destroyed it is left on the uobject list and it is up to uverbs_destroy_ufile_hw() to clean it. Multiple passes over the uobject list allow hidden dependencies to be resolved. If that fails the HW driver is broken, throw a WARN_ON and leak the HW object memory. All the other tricky failure paths (eg on creation error unwind) have already been updated to this new model. Link: https://lore.kernel.org/r/20201104144556.3809085-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/cm: Make the local_id_table xarray non-irqJason Gunthorpe
The xarray is never mutated from an IRQ handler, only from work queues under a spinlock_irq. Thus there is no reason for it be an IRQ type xarray. This was copied over from the original IDR code, but the recent rework put the xarray inside another spinlock_irq which will unbalance the unlocking. Fixes: c206f8bad15d ("RDMA/cm: Make it clearer how concurrency works in cm_req_handler()") Link: https://lore.kernel.org/r/0-v1-808b6da3bd3f+1857-cm_xarray_no_irq_jgg@nvidia.com Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12IB/isert: Do not excplicitly check == false for boolZou Wei
It is not the kernel style, warning reported by coccicheck: ./ib_isert.c:1104:12-24: WARNING: Comparison to bool Link: https://lore.kernel.org/r/1604404674-32998-1-git-send-email-zou_wei@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12RDMA/rxe: Remove VLAN code leftovers from RXEZhu Yanjun
Since the commit fd49ddaf7e26 ("RDMA/rxe: prevent rxe creation on top of vlan interface") does not permit rxe on top of vlan device, all the stuff related with vlan should be removed. Fixes: fd49ddaf7e26 ("RDMA/rxe: prevent rxe creation on top of vlan interface") Link: https://lore.kernel.org/r/1604326422-18625-1-git-send-email-yanjunz@nvidia.com Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "A few more merge window regressions that didn't make rc1: - New validation in the DMA layer triggers wrong use of the DMA layer in rxe, siw and rdmavt - Accidental change of a hypervisor facing ABI when widening the port speed u8 to u16 in vmw_pvrdma - Memory leak on error unwind in SRP target" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/srpt: Fix typo in srpt_unregister_mad_agent docstring RDMA/vmw_pvrdma: Fix the active_speed and phys_state value IB/srpt: Fix memory leak in srpt_add_one RDMA: Fix software RDMA drivers for dma mapping error
2020-11-05RDMA/srpt: Fix typo in srpt_unregister_mad_agent docstringJason Gunthorpe
htmldocs fails with: drivers/infiniband/ulp/srpt/ib_srpt.c:630: warning: Function parameter or member 'port_cnt' not described in 'srpt_unregister_mad_agent' Fixes: 372a1786283e ("IB/srpt: Fix memory leak in srpt_add_one") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-04scsi: target: Drop sess_cmd_lock from I/O pathMike Christie
Drop the sess_cmd_lock by: - Removing the sess_cmd_list use from LIO core, because it's been moved to qla2xxx. - Removing sess_tearing_down check in the I/O path. Instead of using that bit and the sess_cmd_lock, we rely on the cmd_count percpu ref. To do this we switch to percpu_ref_kill_and_confirm/percpu_ref_tryget_live. Link: https://lore.kernel.org/r/1604257174-4524-7-git-send-email-michael.christie@oracle.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-04scsi: target: Rename cmd.bad_sector to cmd.sense_infoDavid Disseldorp
cmd.bad_sector currently gets packed into the sense INFORMATION field for TCM_LOGICAL_BLOCK_{GUARD,APP_TAG,REF_TAG}_CHECK_FAILED errors, which carry an .add_sector_info flag in the sense_detail_table to ensure this. In preparation for propagating a byte offset on COMPARE AND WRITE TCM_MISCOMPARE_VERIFY error, rename cmd.bad_sector to cmd.sense_info and sense_detail.add_sector_info to sense_detail.add_sense_info so that it better reflects the sense INFORMATION field destination. [ddiss: update previously overlooked ib_isert] Link: https://lore.kernel.org/r/20201031233211.5207-3-ddiss@suse.de Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-03Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation build warning fixes from Jonathan Corbet: "This contains a series of warning fixes from Mauro; once applied, the number of warnings from the once-noisy docs build process is nearly zero. Getting to this point has required a lot of work; once there, hopefully we can keep things that way. I have packaged this as a separate pull because it does a fair amount of reaching outside of Documentation/. The changes are all in comments and in code placement. It's all been in linux-next since last week" * tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits) docs: SafeSetID: fix a warning amdgpu: fix a few kernel-doc markup issues selftests: kselftest_harness.h: fix kernel-doc markups drm: amdgpu_dm: fix a typo gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups drm: amdgpu: kernel-doc: update some adev parameters docs: fs: api-summary.rst: get rid of kernel-doc include IB/srpt: docs: add a description for cq_size member locking/refcount: move kernel-doc markups to the proper place docs: lockdep-design: fix some warning issues MAINTAINERS: fix broken doc refs due to yaml conversion ice: docs fix a devlink info that broke a table crypto: sun8x-ce*: update entries to its documentation net: phy: remove kernel-doc duplication mm: pagemap.h: fix two kernel-doc markups blk-mq: docs: add kernel-doc description for a new struct member docs: userspace-api: add iommu.rst to the index file docs: hwmon: mp2975.rst: address some html build warnings docs: net: statistics.rst: remove a duplicated kernel-doc docs: kasan.rst: add two missing blank lines ...
2020-11-02RDMA/vmw_pvrdma: Fix the active_speed and phys_state valueAdit Ranadive
The pvrdma_port_attr structure is ABI toward the hypervisor, changing it breaks the ability to report the speed properly. Revert the change to u16. Fixes: 376ceb31ff87 ("RDMA: Fix link active_speed size") Link: https://lore.kernel.org/r/20201102225437.26557-1-aditr@vmware.com Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02IB/mlx5: Add support for NDR link speedMeir Lichtinger
The IBTA specification has new speed - NDR. That speed supports signaling rate of 100Gb. mlx5 IB driver translates link modes reported by ConnectX device to IB speed and width. Added translation of new 100Gb, 200Gb and 400Gb link modes to NDR IB type and width of x1, x2 or x4 respectively. Link: https://lore.kernel.org/r/20201026133738.1340432-3-leon@kernel.org Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02IB/core: Add support for NDR link speedMeir Lichtinger
Add new IBTA speed NDR, supporting signaling rate of 100Gb. Link: https://lore.kernel.org/r/20201026133738.1340432-2-leon@kernel.org Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/ipoib: Add 50Gb and 100Gb link speeds to ethtoolMeir Lichtinger
The IBTA specification has new speeds - HDR and NDR, supporting signaling rate of 50Gb and 100Gb respectively. ethtool support of ipoib driver translates IB speed to signaling rate. Added translation of HDR and NDR IB types to rates of 50Gb and 100Gb ethernet speed. Link: https://lore.kernel.org/r/20201026132904.1338526-1-leon@kernel.org Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/rxe,siw: Restore uverbs_cmd_mask IB_USER_VERBS_CMD_POST_SENDJason Gunthorpe
These two drivers open code the call to POST_SEND and do not use the rdma-core wrapper to do it, thus their usages was missed during the audit. Both drivers use this as a doorbell to signal the kernel to start DMA. Fixes: 628c02bf38aa ("RDMA: Remove uverbs cmds from drivers that don't use them") Link: https://lore.kernel.org/r/0-v1-4608c5610afa+fb-uverbs_cmd_post_send_fix_jgg@nvidia.com Reported-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/siw: Fix typo of EAGAIN not -EAGAIN in siw_cm_work_handler()Zhang Qilong
The rv cannot be 'EAGAIN' in the previous path, we should use '-EAGAIN' to check it. For example: Call trace: ->siw_cm_work_handler ->siw_proc_mpareq ->siw_recv_mpa_rr Link: https://lore.kernel.org/r/20201028122509.47074-1-zhangqilong3@huawei.com Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02IB/srpt: Fix memory leak in srpt_add_oneMaor Gottlieb
Failure in srpt_refresh_port() for the second port will leave MAD registered for the first one, however, the srpt_add_one() will be marked as "failed" and SRPT will leak resources for that registered but not used and released first port. Unregister the MAD agent for all ports in case of failure. Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1") Link: https://lore.kernel.org/r/20201028065051.112430-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA: Fix software RDMA drivers for dma mapping errorParav Pandit
The commit f959dcd6ddfd ("dma-direct: Fix potential NULL pointer dereference") made dma_mask as mandetory field to be setup even for dma_virt_ops based dma devices. The commit in the fixes tag omitted setting up the dma_mask on virtual devices triggering the below trace when they were combined during the merge window. Fix it by setting empty DMA MASK for software based RDMA devices. WARNING: CPU: 1 PID: 8488 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x493/0x700 CPU: 1 PID: 8488 Comm: syz-executor144 Not tainted 5.9.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:dma_map_page_attrs+0x493/0x700 kernel/dma/mapping.c:149 Trace: dma_map_single_attrs include/linux/dma-mapping.h:279 [inline] ib_dma_map_single include/rdma/ib_verbs.h:3967 [inline] ib_mad_post_receive_mads+0x23f/0xd60 drivers/infiniband/core/mad.c:2715 ib_mad_port_start drivers/infiniband/core/mad.c:2862 [inline] ib_mad_port_open drivers/infiniband/core/mad.c:3016 [inline] ib_mad_init_device+0x72b/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:680 enable_device_and_get+0x1d5/0x3c0 drivers/infiniband/core/device.c:1301 ib_register_device drivers/infiniband/core/device.c:1376 [inline] ib_register_device+0x7a7/0xa40 drivers/infiniband/core/device.c:1335 rxe_register_device+0x46d/0x570 drivers/infiniband/sw/rxe/rxe_verbs.c:1182 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:507 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x540 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x367/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2f2/0x440 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:671 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353 ___sys_sendmsg+0xf3/0x170 net/socket.c:2407 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x443699 Link: https://lore.kernel.org/r/20201030093803.278830-1-parav@nvidia.com Reported-by: syzbot+34dc2fea3478e659af01@syzkaller.appspotmail.com Fixes: e0477b34d9d1 ("RDMA: Explicitly pass in the dma_device to ib_register_device") Signed-off-by: Parav Pandit <parav@nvidia.com> Tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Zhu Yanjun <yanjunz@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Use ib_umem_find_best_pgsz() for mkc'sJason Gunthorpe
Now that all the PAS arrays or UMR XLT's for mkcs are filled using rdma_for_each_block() we can use the common ib_umem_find_best_pgsz() algorithm. Link: https://lore.kernel.org/r/20201026132314.1336717-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Split mlx5_ib_update_xlt() into ODP and non-ODP casesJason Gunthorpe
Mixing these together is just a mess, make a dedicated version, mlx5_ib_update_mr_pas(), which directly loads the whole MTT for a non-ODP MR. The split out version can trivially use a simple loop with rdma_for_each_block() which allows using the core code to compute the MR pages and avoids seeking in the SGL list after each chunk as the __mlx5_ib_populate_pas() call required. Significantly speeds loading large MTTs. Link: https://lore.kernel.org/r/20201026132314.1336717-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Split the WR setup out of mlx5_ib_update_xlt()Jason Gunthorpe
The memory allocation is quite complicated, and makes this function hard to understand. Refactor things so that a function call sets up the WR, SG, DMA mapping and buffer, further splitting that into buffer and DMA/wr. This also slightly changes the buffer allocation logic to try an order 0 page allocation (with OOM warnings on) before going to the emergency page. Link: https://lore.kernel.org/r/20201026132314.1336717-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Move xlt_emergency_page_mutex into mr.cJason Gunthorpe
This is the only user, so remove the wrappers. Link: https://lore.kernel.org/r/20201026132314.1336717-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Change mlx5_ib_populate_pas() to use rdma_for_each_block()Jason Gunthorpe
This routine converts the umem SGL into a list of fixed pages for DMA, which is exactly what rdma_umem_for_each_dma_block() is for, use the common code directly. Link: https://lore.kernel.org/r/20201026132314.1336717-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Remove npages from mlx5_ib_cont_pages()Jason Gunthorpe
Most callers don't need this, and the few that do can get it as ib_umem_num_pages(umem). Link: https://lore.kernel.org/r/20201026131936.1335664-8-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Remove ncont from mlx5_ib_cont_pages()Jason Gunthorpe
This is the same as ib_umem_num_dma_blocks(umem, 1UL << page_shift), have the callers compute it directly. Link: https://lore.kernel.org/r/20201026131936.1335664-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Remove order from mlx5_ib_cont_pages()Jason Gunthorpe
Only alloc_mr_from_cache() needs order and can trivially compute it, so lift it to the one call site and remove the NULL arguments. Link: https://lore.kernel.org/r/20201026131936.1335664-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Move mlx5_ib_cont_pages() to the creation of the mlx5_ib_mrJason Gunthorpe
For the user MR path, instead of calling this after getting the umem, call it as part of creating the struct mlx5_ib_mr and distill its output to a single page_shift stored inside the mr. This avoids passing around the tuple of its output. Based on the umem and page_shift, the output arguments can be computed using: count == ib_umem_num_pages(mr->umem) shift == mr->page_shift ncont == ib_umem_num_dma_blocks(mr->umem, 1 << mr->page_shift) order == order_base_2(ncont) And since mr->page_shift == umem_odp->page_shift then ncont == ib_umem_num_dma_blocks() == ib_umem_odp_num_pages() for ODP umems. Link: https://lore.kernel.org/r/20201026131936.1335664-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Remove mlx5_ib_mr->npagesJason Gunthorpe
This is the same value as ib_umem_num_pages(mr->umem), use that instead. Link: https://lore.kernel.org/r/20201026131936.1335664-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Fix corruption of reg_pages in mlx5_ib_rereg_user_mr()Jason Gunthorpe
reg_pages should always contain mr->npage since when the mr is finally de-reg'd it is always subtracted out. If there were any error exits then mlx5_ib_rereg_user_mr() would leave the reg_pages adjusted and this will cause it to be double subtracted eventually. The manipulation of reg_pages is inherently connected to the umem, so lift it out of set_mr_fields() and only adjust it around creating/destroying a umem. reg_pages is only used for diagnostics in sysfs. Fixes: 7d0cc6edcc70 ("IB/mlx5: Add MR cache for large UMR regions") Link: https://lore.kernel.org/r/20201026131936.1335664-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA/mlx5: Remove mlx5_ib_mr->orderJason Gunthorpe
The is only ever set to non-zero if the MR is from the cache, and if it is cached then the order is in cached_ent->order. Make it clearer that use_umr_mtt_update() only returns true for cached MRs and remove the redundant data. Link: https://lore.kernel.org/r/20201026131936.1335664-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30RDMA: Convert various random sprintf sysfs _show uses to sysfs_emitJoe Perches
Manual changes for sysfs_emit as cocci scripts can't easily convert them. Link: https://lore.kernel.org/r/ecde7791467cddb570c6f6d2c908ffbab9145cac.1602122880.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30RDMA: Manual changes for sysfs_emit and neateningJoe Perches
Make changes to use sysfs_emit in the RDMA code as cocci scripts can not be written to handle _all_ the possible variants of various sprintf family uses in sysfs show functions. While there, make the code more legible and update its style to be more like the typical kernel styles. Miscellanea: o Use intermediate pointers for dereferences o Add and use string lookup functions o return early when any intermediate call fails so normal return is at the bottom of the function o mlx4/mcg.c:sysfs_show_group: use scnprintf to format intermediate strings Link: https://lore.kernel.org/r/f5c9e4c9d8dafca1b7b70bd597ee7f8f219c31c8.1602122880.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28IB/srpt: docs: add a description for cq_size memberMauro Carvalho Chehab
Changeset c804af2c1d31 ("IB/srpt: use new shared CQ mechanism") added a new member for struct srpt_rdma_ch, but didn't add the corresponding kernel-doc markup, as repoted when doing "make htmldocs": ./drivers/infiniband/ulp/srpt/ib_srpt.h:331: warning: Function parameter or member 'cq_size' not described in 'srpt_rdma_ch' Add a description for it. Fixes: c804af2c1d31 ("IB/srpt: use new shared CQ mechanism") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Tested-by: Brendan Higgins <brendanhiggins@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Link: https://lore.kernel.org/r/df0e5f0e866b91724299ef569a2da8115e48c0cf.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28RDMA/hns: Add support for filling GMV tableWeihang Li
Add a interface to fill GMV(SGID/SMAC/VLAN) table for HIP09, all of above source address information is stored as an entry in GMV table. The users just need to provide the index to the hardware when POST SEND. Link: https://lore.kernel.org/r/1603508836-33054-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/hns: Add support for configuring GMV tableWeihang Li
HIP09 supports to store SGID/SMAC/VLAN together in a table named GMV. The driver needs to allocate memory for it and tell the information about this region to hardware. Link: https://lore.kernel.org/r/1603508836-33054-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>