summaryrefslogtreecommitdiff
path: root/include/uapi/rdma
AgeCommit message (Collapse)Author
2021-06-22Merge branch 'mlx5_realtime_ts' into rdma.git for-nextJason Gunthorpe
Aharon Landau says: ==================== In case device supports only real-time timestamp, the kernel will fail to create QP despite rdma-core requested such timestamp type. It is because device returns free-running timestamp, and the conversion from free-running to real-time is performed in the user space. This series fixes it, by returning real-time timestamp. ==================== * mlx5_realtime_ts: RDMA/mlx5: Support real-time timestamp directly from the device RDMA/mlx5: Refactor get_ts_format functions to simplify code Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-22RDMA/mlx5: Support real-time timestamp directly from the deviceAharon Landau
Currently, if the user asks for a real-time timestamp, the device will return a free-running one, and the timestamp will be translated to real-time in the user-space. When the device supports only real-time timestamp and not free-running, the creation of the QP will fail even though the user needs supported the real-time one. To prevent this, we will return the real-time timestamp directly from the device. Link: https://lore.kernel.org/r/c6cfc8e6f038575c5c2de6505830f7e74e4de80d.1623829775.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-22RDMA/core: Use flexible array for mad dataKees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally reading across neighboring array fields. Without a flexible array, this looks like an attempt to perform a memcpy() read beyond the end of the packet->mad.data array: drivers/infiniband/core/user_mad.c: memcpy(packet->msg->mad, packet->mad.data, IB_MGMT_MAD_HDR); Switch from [0] to [] to use the appropriately handled type for trailing bytes. Link: https://lore.kernel.org/r/20210616202615.1247242-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21RDMA/bnxt_re: Update ABI to pass wqe-mode to user spaceDevesh Sharma
Changing ucontext ABI response structure to pass wqe_mode to user library. A flag in comp_mask has been set to indicate presence of wqe_mode. Moved wqe-mode ABI to uapi/rdma/bnxt_re-abi.h Link: https://lore.kernel.org/r/20210616202817.1185276-1-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16RDMA/rxe: Add bind MW fields to rxe_send_wrBob Pearson
Add fields to struct rxe_send_wr in rdma_user_rxe.h to support bind MW work requests Link: https://lore.kernel.org/r/20210608042552.33275-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02Merge branch 'irdma' into rdma.git for-nextJason Gunthorpe
Shiraz Saleem says: ==================== Add Intel Ethernet Protocol Driver for RDMA (irdma) The following patch series introduces a unified Intel Ethernet Protocol Driver for RDMA (irdma) for the X722 iWARP device and a new E810 device which supports iWARP and RoCEv2. The irdma module replaces the legacy i40iw module for X722 and extends the ABI already defined for i40iw. It is backward compatible with legacy X722 rdma-core provider (libi40iw). X722 and E810 are PCI network devices that are RDMA capable. The RDMA block of this parent device is represented via an auxiliary device exported to 'irdma' using the core auxiliary bus infrastructure recently added for 5.11 kernel. The parent PCI netdev drivers 'i40e' and 'ice' register auxiliary RDMA devices with private data/ops encapsulated that bind to auxiliary drivers registered in irdma module. Currently, default is RoCEv2 for E810. Runtime support for protocol switch to iWARP will be made available via devlink in a future patch. ==================== Link: https://lore.kernel.org/r/20210602205138.889-1-shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * branch 'irdma': RDMA/irdma: Update MAINTAINERS file RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw RDMA/irdma: Add ABI definitions RDMA/irdma: Add dynamic tracing for CM RDMA/irdma: Add miscellaneous utility definitions RDMA/irdma: Add user/kernel shared libraries RDMA/irdma: Add RoCEv2 UD OP support RDMA/irdma: Implement device supported verb APIs RDMA/irdma: Add PBLE resource manager RDMA/irdma: Add connection manager RDMA/irdma: Add QoS definitions RDMA/irdma: Add privileged UDA queue implementation RDMA/irdma: Add HMC backing store setup functions RDMA/irdma: Implement HW Admin Queue OPs RDMA/irdma: Implement device initialization definitions RDMA/irdma: Register auxiliary driver and implement private channel OPs i40e: Register auxiliary devices to provide RDMA i40e: Prep i40e header for aux bus conversion ice: Register auxiliary device to provide RDMA ice: Implement iidc operations ice: Initialize RDMA support iidc: Introduce iidc.h i40e: Replace one-element array with flexible-array member
2021-06-02RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iwShiraz Saleem
Add Kconfig and Makefile to build irdma driver. Remove i40iw driver and add an alias in irdma. Remove legacy exported symbols i40e_register_client and i40e_unregister_client from i40e as they are no longer used. irdma is the replacement driver that supports X722. Link: https://lore.kernel.org/r/20210602205138.889-16-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02RDMA/irdma: Add ABI definitionsMustafa Ismail
Add ABI definitions for irdma. Link: https://lore.kernel.org/r/20210602205138.889-15-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02RDMA/irdma: Implement device supported verb APIsMustafa Ismail
Implement device supported verb APIs. The supported APIs vary based on the underlying transport the ibdev is registered as (i.e. iWARP or RoCEv2). Link: https://lore.kernel.org/r/20210602205138.889-10-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-20RDMA/mlx5: Add SQD2RTS bit to the alloc ucontext responseSergey Gorenko
The new bit in the comp_mask is needed to mark that kernel supports SQD2RTS transition for the modify QP command. Link: https://lore.kernel.org/r/7ce705fedac1b2b8e3a2f4013e04244dc5946344.1620641808.git.leonro@nvidia.com Reviewed-by: Evgenii Kochetov <evgeniik@nvidia.com> Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-27RDMA/nldev: Add copy-on-fork attribute to get sys commandGal Pressman
The new attribute indicates that the kernel copies DMA pages on fork, hence libibverbs' fork support through madvise and MADV_DONTFORK is not needed. The introduced attribute is always reported as supported since the kernel has the patch that added the copy-on-fork behavior. This allows the userspace library to identify older vs newer kernel versions. Extra care should be taken when backporting this patch as it relies on the fact that the copy-on-fork patch is merged, hence no check for support is added. Don't backport this patch unless you also have the following series: commit 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") and commit 4eae4efa2c29 ("hugetlb: do early cow when page pinned on src mm"). Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") Fixes: 4eae4efa2c29 ("hugetlb: do early cow when page pinned on src mm") Link: https://lore.kernel.org/r/20210418121025.66849-1-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/nldev: Add QP numbers to SRQ informationNeta Ostrovsky
Add QP numbers that are associated with the SRQ to the SRQ information. The QPs are displayed in a range form. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC lqpn 125-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141-156 pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC lqpn 157-172 pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC lqpn 329-344 pdn 4 pid 3586 comm ibv_srq_pingpon $ rdma res show srq lqpn 126-141 dev ibp8s0f0 srqn 4 type BASIC lqpn 126-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141 pdn 10 pid 3584 comm ibv_srq_pingpon $ rdma res show srq lqpn 127 dev ibp8s0f0 srqn 4 type BASIC lqpn 127 pdn 9 pid 3581 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/79a4bd4caec2248fd9583cccc26786af8e4414fc.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/nldev: Return SRQ informationNeta Ostrovsky
Extend the RDMA nldev return a SRQ information, like SRQ number, SRQ type, PD number, CQ number and process ID that created that SRQ. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC pdn 4 pid 3586 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/322f9210b95812799190dd4a0fb92f3a3bba0333.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/nldev: Return context informationNeta Ostrovsky
Extend the RDMA nldev return a context information, like ctx number and process ID that created that context. This functionality is helpful to find orphan contexts that are not closed for some reason. Sample output: $ rdma res show ctx dev ibp8s0f0 ctxn 0 pid 980 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 1 pid 981 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 2 pid 992 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong $ rdma res show ctx dev ibp8s0f1 dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong Link: https://lore.kernel.org/r/5c956acfeac4e9d532988575f3da7d64cb449374.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-20RDMA/mlx5: Expose private query portMark Bloch
Expose a non standard query port via IOCTL that will be used to expose port attributes that are specific to mlx5 devices. The new interface receives a port number to query and returns a structure that contains the available attributes for that port. This will be used to fill the gap between pure DEVX use cases and use cases where a kernel needs to inform userspace about various kernel driver configurations that userspace must use in order to work correctly. Flags is used to indicate which fields are valid on return. MLX5_IB_UAPI_QUERY_PORT_VPORT: The vport number of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID: The VHCA ID of the vport of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX: The vport's RX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX: The vport's TX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0: The metadata used to tag egress packets of the vport. MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID: The E-Switch owner vhca id of the vport. Link: https://lore.kernel.org/r/6e2ef13e5a266a6c037eb0105eb1564c7bb52f23.1618743394.git.leonro@nvidia.com Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13Merge branch 'mlx5_memic_ops' of ↵Jason Gunthorpe
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Maor Gottlieb says: ==================== This series from Maor extends MEMIC to support atomic operations from the host in addition to already supported regular read/write. ==================== * 'memic_ops': RDMA/mlx5: Expose UAPI to query DM RDMA/mlx5: Add support in MEMIC operations RDMA/mlx5: Add support to MODIFY_MEMIC command RDMA/mlx5: Re-organize the DM code RDMA/mlx5: Move all DM logic to separate file RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number net/mlx5: Add MEMIC operations related bits
2021-04-13RDMA/mlx5: Expose UAPI to query DMMaor Gottlieb
Expose UAPI to query MEMIC DM, this will let user space application that didn't allocate the DM but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/mlx5: Add support in MEMIC operationsMaor Gottlieb
MEMIC buffer, in addition to regular read and write operations, can support atomic operations from the host. Introduce and implement new UAPI to allocate address space for MEMIC operations such as atomic. This includes: 1. Expose new IOCTL for request mapping of MEMIC operation. 2. Hold the operations address in a list, so same operation to same DM will be allocated only once. 3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid until all addresses were unmapped. Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11RDMA/mlx5: Allow larger pages in DevX umemJason Gunthorpe
The umem DMA list calculation was locked at 4k pages due to confusion around how this API works and is used when larger pages are present. The conclusion is: - umem's cannot extend past what is mapped into the process, so creating a lage page size and referring to a sub-range is not allowed - umem's must always have a page offset of zero, except for sub PAGE_SIZE umems - The feature of umem_offset to create multiple objects inside a umem is buggy and isn't used anyplace. Thus we can assume all users of the current API have umem_offset == 0 as well Provide a new page size calculator that limits the DMA list to the VA range and enforces umem_offset == 0. Allow user space to specify the page sizes which it can accept, this bitmap must be derived from the intended use of the umem, based on per-usage HW limitations. Link: https://lore.kernel.org/r/20210304130501.1102577-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11RDMA/hns: Add support for XRC on HIP09Wenpeng Liang
The HIP09 supports XRC transport service, it greatly saves the number of QPs required to connect all processes in a large cluster. Link: https://lore.kernel.org/r/1614826558-35423-1-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-18Merge tag 'v5.11' into rdma.git for-nextJason Gunthorpe
Linux 5.11 Merged to resolve conflicts with RDMA rc commits - drivers/infiniband/sw/rxe/rxe_net.c The final logic is to call rxe_get_dev_from_net() again with the master netdev if the packet was rx'd on a vlan. To keep the elimination of the local variables requires a trivial edit to the code in -rc Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-20RDMA/uverbs: Add uverbs command for dma-buf based MR registrationJianxin Xiong
Implement a new uverbs ioctl method for memory registration with file descriptor as an extra parameter. Link: https://lore.kernel.org/r/1608067636-98073-4-git-send-email-jianxin.xiong@intel.com Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19RDMA/vmw_pvrdma: Fix network_hdr_type reported in WCBryan Tan
The PVRDMA device HW interface defines network_hdr_type according to an old definition of the internal kernel rdma_network_type enum that has since changed, resulting in the wrong rdma_network_type being reported. Fix this by explicitly defining the enum used by the PVRDMA device and adding a function to convert the pvrdma_network_type to rdma_network_type enum. Cc: stable@vger.kernel.org # 5.10+ Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Link: https://lore.kernel.org/r/1611026189-17943-1-git-send-email-bryantan@vmware.com Reviewed-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-11RDMA/rxe: Use acquire/release for memory orderingBob Pearson
Change work and completion queues to use smp_load_acquire() and smp_store_release() to synchronize between driver and users. This commit goes with a matching series of commits in the rxe user space provider. Link: https://lore.kernel.org/r/20201210174258.5234-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07RDMA/hns: Move capability flags of QP and CQ to hns-abi.hWeihang Li
These flags will be returned to the userspace through ABI, so they should be defined in hns-abi.h. Furthermore, there is no need to include hns-abi.h in every source files, it just needs to be included in the common header file. Link: https://lore.kernel.org/r/1606872560-17823-1-git-send-email-liweihang@huawei.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07Merge tag 'mlx5-next-2020-12-02' of ↵Jason Gunthorpe
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next-2020-12-02 Low level mlx5 updates required by both netdev and rdma trees: net/mlx5: Treat host PF vport as other (non eswitch manager) vport net/mlx5: Enable host PF HCA after eswitch is initialized net/mlx5: Rename peer_pf to host_pf net/mlx5: Make API mlx5_core_is_ecpf accept const pointer net/mlx5: Export steering related functions net/mlx5: Expose other function ifc bits net/mlx5: Expose IP-in-IP TX and RX capability bits net/mlx5: Update the hardware interface definition for vhca state net/mlx5: Update the list of the PCI supported devices net/mlx5: Avoid exposing driver internal command helpers net/mlx5: Add ts_cqe_to_dest_cqn related bits net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits net/mlx5: Check dr mask size against mlx5_match_param size net/mlx5: Add sampler destination type net/mlx5: Add sample offload hardware bits and structures ==================== Link: https://lore.kernel.org/r/20201203011010.213440-1-saeedm@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-26net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bitsMuhammad Sammar
Add misc4 match params to enable matching on prog_sample_fields. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-26RDMA: Check attr_mask during modify_qpJason Gunthorpe
Each driver should check that it can support the provided attr_mask during modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block modify_qp_ex because the driver didn't check RATE_LIMIT. Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16RDMA/rxe: Move the definitions for rxe_av.network_type to uAPIJason Gunthorpe
RXE was wrongly using an internal kernel enum as part of its uAPI, split this out into a dedicated uAPI enum just for RXE. It only uses the IPv4 and IPv6 values. This was exposed by changing the internal kernel enum definition which broke RXE. Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01RDMA/uverbs: Expose the new GID query API to user spaceAvihai Horon
Expose the query GID table and entry API to user space by adding two new methods and method handlers to the device object. This API provides a faster way to query a GID table using single call and will be used in libibverbs to improve current approach that requires multiple calls to open, close and read multiple sysfs files for a single GID table entry. Link: https://lore.kernel.org/r/20200923165015.2491894-5-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01RDMA/core: Introduce new GID table query APIAvihai Horon
Introduce rdma_query_gid_table which enables querying all the GID tables of a given device and copying the attributes of all valid GID entries to a provided buffer. This API provides a faster way to query a GID table using single call and will be used in libibverbs to improve current approach that requires multiple calls to open, close and read multiple sysfs files for a single GID table entry. Link: https://lore.kernel.org/r/20200923165015.2491894-4-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01RDMA/mlx5: Extend advice MR to support non faulting modeYishai Hadas
Extend advice MR to support non faulting mode, this can improve performance by increasing the populated page tables in the device. Link: https://lore.kernel.org/r/20200930163828.1336747-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-24RDMA/hns: Add support for CQE in size of 64 BytesWenpeng Liang
The new version of RoCEE supports using CQE in size of 32B or 64B. The performance of bus can be improved by using larger size of CQE. Link: https://lore.kernel.org/r/1600245806-56321-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-11RDMA/core: Added missing WR and WC opcodesBob Pearson
Add work completion opcodes to a new ib_uverbs_wc_opcode enum in ib_user_verbs.h. This plays the same role as ib_uverbs_wr_opcode documenting the opcodes in the user space API. Assigned the IB_WC_XXX opcodes in ib_verbs.h to the IB_UVERBS_WC_XXX where they are defined. This follows the same pattern as the IB_WR_XXX opcodes. This fixes an incorrect value for LSO that had crept in but is not currently being used. Added a missing IB_WR_BIND_MW opcode in ib_verbs.h. Link: https://lore.kernel.org/r/20200903224039.437391-2-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27RDMA/rxe: Fix style warningsBob Pearson
Fixed several minor checkpatch warnings in existing rxe source. Link: https://lore.kernel.org/r/20200820224638.3212-3-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18RDMA/efa: Introduce SRD RNR retryGal Pressman
This patch introduces the ability to configure SRD QPs with the RNR retry parameter when issuing a modify QP command. In addition, a capability bit was added to report support to the userspace library. Link: https://lore.kernel.org/r/20200731060420.17053-5-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/efa: User/kernel compatibility handshake mechanismGal Pressman
Introduce a mechanism that performs an handshake between the userspace provider and kernel driver which verifies that the user supports all required features in order to operate correctly. The handshake verifies the needed functionality by comparing the reported device caps and the provider caps. If the device reports a non-zero capability the appropriate comp mask is required from the userspace provider in order to allocate the context. Link: https://lore.kernel.org/r/20200722140312.3651-4-galpress@amazon.com Reviewed-by: Shadi Ammouri <sammouri@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/efa: Expose minimum SQ sizeGal Pressman
The device reports the minimum SQ size required for creation. This patch queries the min SQ size and reports it back to the userspace library. Link: https://lore.kernel.org/r/20200722140312.3651-3-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Shadi Ammouri <sammouri@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/efa: Expose maximum TX doorbell batchGal Pressman
The device reports the maximum number of bytes to be written before ringing the doorbell (zero means unlimited). This patch queries the max batch size and reports it back to the userspace library. Link: https://lore.kernel.org/r/20200722140312.3651-2-galpress@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas JahJah <firasj@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24RDMA/mlx5: Fix typo in enum namePavel Machek
Nnothing uses the enum name, so this is harmless. Fixes: 322694412400 ("IB/mlx5: Introduce driver create and destroy flow methods") Link: https://lore.kernel.org/r/20200724084112.GC31930@amd Signed-off-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20RDMA: rdma_user_ioctl.h: fix a duplicated word + clarifyRandy Dunlap
Change the repeated word "it" in a comment to "it to". Also insert a dash in the sentence to add clarity. Link: https://lore.kernel.org/r/20200719003220.21250-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/qedr: Add EDPM max size to alloc ucontext responseMichal Kalderon
User space should receive the maximum edpm size from kernel driver, similar to other edpm/ldpm related limits. Add an additional parameter to the alloc_ucontext_resp structure for the edpm maximum size. In addition, pass an indication from user-space to kernel (and not just kernel to user) that the DPM sizes are supported. This is for supporting backward-forward compatibility between driver and lib for everything related to DPM transaction and limit sizes. This should have been part of commit mentioned in Fixes tag. Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com Fixes: 93a3d05f9d68 ("RDMA/qedr: Add kernel capability flags for dpm enabled mode") Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/qedr: Add EDPM mode type for user-fw compatibilityMichal Kalderon
In older FW versions the completion flag was treated as the ack flag in edpm messages. commit ff937b916eb6 ("qed: Add EDPM mode type for user-fw compatibility") exposed the FW option of setting which mode the QP is in by adding a flag to the qedr <-> qed API. This patch adds the qedr <-> libqedr interface so that the libqedr can set the flag appropriately and qedr can pass it down to FW. Flag is added for backward compatibility with libqedr. For older libs, this flag didn't exist and therefore set to zero. Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs") Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com Signed-off-by: Yuval Bason <yuval.bason@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10RDMA/counter: Add PID category support in auto modeMark Zhang
With the "PID" category QPs have same PID will be bound to same counter; If this category is not set then QPs have different PIDs will be bound to same counter. This is implemented for 2 reasons: 1. The counter is a limited resource, while there may be dozens of applications, each of which creates several types of QPs, which means it may doesn't have enough counter. 2. The system administrator needs all QPs created by all applications with same type bound to one counter. The counter name and PID is only make sense when "PID" category are configured. This category can also be used in combine with others, e.g. QP type. Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Expose UAPI to query MRYishai Hadas
Expose UAPI to query MR, this will let user space application that didn't allocate the MR but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Introduce UAPI to query PD attributesYishai Hadas
Introduce UAPI to query PD attributes, this can be used to retrieve PD attributes by having the PD handle of the created one and owning the command FD for the ucontxet. Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Implement the query ucontext functionalityYishai Hadas
Implement the query ucontext functionality by returning the original ucontext data as part of an extra mlx5 attribute that holds the driver UAPI response. Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Expose UAPI to query ucontextYishai Hadas
Expose UAPI to query ucontext, this will let user space application that didn't allocate the ucontext but has access to by owning the matching command FD to retrieve the ucontext information. Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24RDMA: Add support to dump resource tracker in RAW formatMaor Gottlieb
Add support to get resource dump in raw format. It enable drivers to return the entire device specific QP/CQ/MR context without a need from the driver to set each field separately. The raw query returns only the device specific data, general data is still returned by using the existing queries. Example: $ rdma res show mr dev mlx5_1 mrn 2 -r -j [{"ifindex":7,"ifname":"mlx5_1", "data":[0,4,255,254,0,0,0,0,0,0,0,0,16,28,0,216,...]}] Link: https://lore.kernel.org/r/20200623113043.1228482-9-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "A more active cycle than most of the recent past, with a few large, long discussed works this time. The RNBD block driver has been posted for nearly two years now, and flowing through RDMA due to it also introducing a new ULP. The removal of FMR has been a recurring discussion theme for a long time. And the usual smattering of features and bug fixes. Summary: - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa - Continuing driver cleanups in bnxt_re, hns - Big cleanup of mlx5 QP creation flows - More consistent use of src port and flow label when LAG is used and a mlx5 implementation - Additional set of cleanups for IB CM - 'RNBD' network block driver and target. This is a network block RDMA device specific to ionos's cloud environment. It brings strong multipath and resiliency capabilities. - Accelerated IPoIB for HFI1 - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds - Support for exchanging the new IBTA defiend ECE data during RDMA CM exchanges - Removal of the very old and insecure FMR interface from all ULPs and drivers. FRWR should be preferred for at least a decade now" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits) RDMA/cm: Spurious WARNING triggered in cm_destroy_id() RDMA/mlx5: Return ECE DC support RDMA/mlx5: Don't rely on FW to set zeros in ECE response RDMA/mlx5: Return an error if copy_to_user fails IB/hfi1: Use free_netdev() in hfi1_netdev_free() RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr() RDMA/core: Move and rename trace_cm_id_create() IB/hfi1: Fix hfi1_netdev_rx_init() error handling RDMA: Remove 'max_map_per_fmr' RDMA: Remove 'max_fmr' RDMA/core: Remove FMR device ops RDMA/rdmavt: Remove FMR memory registration RDMA/mthca: Remove FMR support for memory registration RDMA/mlx4: Remove FMR support for memory registration RDMA/i40iw: Remove FMR leftovers RDMA/bnxt_re: Remove FMR leftovers RDMA/mlx5: Remove FMR leftovers RDMA/core: Remove FMR pool API RDMA/rds: Remove FMR support for memory registration RDMA/srp: Remove support for FMR memory registration ...