summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-13RDMA/rtrs-srv: More debugging info when fail to send replyGioh Kim
It does not help to debug if it only print error message without any debugging information which session and connection the error happened. Link: https://lore.kernel.org/r/20210406123639.202899-3-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/rtrs-clt: Print more info when an error happensGioh Kim
Client prints only error value and it is not enough for debugging. 1. When client receives an error from server: the client does not only print the error value but also more information of server connection. 2. When client failes to send IO: the client gets an error from RDMA layer. It also print more information of server connection. Link: https://lore.kernel.org/r/20210406123639.202899-2-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13Documentation/ABI/rtrs-clt: Add descriptions for min-latency policyGioh Kim
Describe new multipath policy min-latency of the RTRS client. Link: https://lore.kernel.org/r/20210407113444.150961-5-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/rtrs-clt: New sysfs attribute to print the latency of each pathGioh Kim
It shows the latest latency that the client checked when sending the heart-beat. Link: https://lore.kernel.org/r/20210407113444.150961-3-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/rtrs-clt: Add a minimum latency multipath policyGioh Kim
This patch adds new multipath policy: min-latency. Client checks the latency of each path when it sends the heart-beat. And it sends IO to the path with the minimum latency. Link: https://lore.kernel.org/r/20210407113444.150961-2-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13Merge branch 'mlx5_memic_ops' of ↵Jason Gunthorpe
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Maor Gottlieb says: ==================== This series from Maor extends MEMIC to support atomic operations from the host in addition to already supported regular read/write. ==================== * 'memic_ops': RDMA/mlx5: Expose UAPI to query DM RDMA/mlx5: Add support in MEMIC operations RDMA/mlx5: Add support to MODIFY_MEMIC command RDMA/mlx5: Re-organize the DM code RDMA/mlx5: Move all DM logic to separate file RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number net/mlx5: Add MEMIC operations related bits
2021-04-13RDMA/mlx5: Expose UAPI to query DMMaor Gottlieb
Expose UAPI to query MEMIC DM, this will let user space application that didn't allocate the DM but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/mlx5: Add support in MEMIC operationsMaor Gottlieb
MEMIC buffer, in addition to regular read and write operations, can support atomic operations from the host. Introduce and implement new UAPI to allocate address space for MEMIC operations such as atomic. This includes: 1. Expose new IOCTL for request mapping of MEMIC operation. 2. Hold the operations address in a list, so same operation to same DM will be allocated only once. 3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid until all addresses were unmapped. Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/mlx5: Add support to MODIFY_MEMIC commandMaor Gottlieb
Add two functions to allocate and deallocate MEMIC operations by using the MODIFY_MEMIC command. Link: https://lore.kernel.org/r/20210411122924.60230-6-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/mlx5: Re-organize the DM codeMaor Gottlieb
1. Inline the checks from check_dm_type_support() into their respective allocation functions. 2. Fix use after free when driver fails to copy the MEMIC address to the user by moving the allocation code into their respective functions, hence we avoid the explicit call to free the DM in the error flow. 3. Split mlx5_ib_dm struct to memic and icm proper typesafety throughout. Fixes: dc2316eba73f ("IB/mlx5: Fix device memory flows") Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/mlx5: Move all DM logic to separate fileMaor Gottlieb
Move all device memory related code to a separate file. Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line numberMaor Gottlieb
In order to support multiple methods declaration in the same file we should use the line number as part of the name. Link: https://lore.kernel.org/r/20210411122924.60230-3-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13net/mlx5: Add MEMIC operations related bitsMaor Gottlieb
Add the MEMIC operations bits and structures to the mlx5_ifc file. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-04-13IB/hfi1: Rework AIP and VNIC dummy netdev usageMike Marciniszyn
All other users of the dummy netdevice embed the netdev in other structures: init_dummy_netdev(&mal->dummy_dev); init_dummy_netdev(&eth->dummy_dev); init_dummy_netdev(&ar->napi_dev); init_dummy_netdev(&irq_grp->napi_ndev); init_dummy_netdev(&wil->napi_ndev); init_dummy_netdev(&trans_pcie->napi_dev); init_dummy_netdev(&dev->napi_dev); init_dummy_netdev(&bus->mux_dev); The AIP and VNIC implementation turns that model inside out and used a kfree() to free what appears to be a netdev struct when in reality, it is a struct that enbodies the rx state as well as the dummy netdev used to support napi_poll across disparate receive contexts. The relationship is infered by the odd allocation: const int netdev_size = sizeof(*dd->dummy_netdev) + sizeof(struct hfi1_netdev_priv); <snip> dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node); Correct the issue by: - Correctly naming the alloc and free functions - Renaming hfi1_netdev_priv to hfi1_netdev_rx - Replacing dd dummy_netdev with a netdev_rx pointer - Embedding the net_device in hfi1_netdev_rx - Moving the init_dummy_netdev to the alloc routine - Adjusting wrappers to fit the new model Fixes: 6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipoib packets") Link: https://lore.kernel.org/r/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/qib: Remove useless qib_read_ureg() functionJiapeng Chong
Fix the following clang warning: drivers/infiniband/hw/qib/qib_iba7322.c:803:19: warning: unused function 'qib_read_ureg' [-Wunused-function]. Link: https://lore.kernel.org/r/1618305063-29007-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13RDMA/hns: Remove unnecessary flush operation for workqueueYixian Liu
As a flush operation is implemented inside destroy_workqueue(), there is no need to do flush operation before. Fixes: bfcc681bd09d ("IB/hns: Fix the bug when free mr") Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space") Link: https://lore.kernel.org/r/1618305087-30799-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/rtrs-clt: destroy sysfs after removing session from active listGioh Kim
A session can be removed dynamically by sysfs interface "remove_path" that eventually calls rtrs_clt_remove_path_from_sysfs function. The current rtrs_clt_remove_path_from_sysfs first removes the sysfs interfaces and frees sess->stats object. Second it removes the session from the active list. Therefore some functions could access non-connected session and access the freed sess->stats object even-if they check the session status before accessing the session. For instance rtrs_clt_request and get_next_path_min_inflight check the session status and try to send IO to the session. The session status could be changed when they are trying to send IO but they could not catch the change and update the statistics information in sess->stats object, and generate use-after-free problem. (see: "RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its stats") This patch changes the rtrs_clt_remove_path_from_sysfs to remove the session from the active session list and then destroy the sysfs interfaces. Each function still should check the session status because closing or error recovery paths can change the status. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210412084002.33582-1-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/srpt: Fix error return code in srpt_cm_req_recv()Wang Wensheng
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: db7683d7deb2 ("IB/srpt: Fix login-related race conditions") Link: https://lore.kernel.org/r/20210408113132.87250-1-wangwensheng4@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12rds: ib: Remove two ib_modify_qp() callsHåkon Bugge
For some HCAs, ib_modify_qp() is an expensive operation running virtualized. For both the active and passive side, the QP returned by the CM has the state set to RTS, so no need for this excess RTS -> RTS transition. With IB Core's ability to set the RNR Retry timer, we use this interface to shave off another ib_modify_qp(). Fixes: ec16227e1414 ("RDS/IB: Infiniband transport") Link: https://lore.kernel.org/r/1617216194-12890-3-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12IB/cma: Introduce rdma_set_min_rnr_timer()Håkon Bugge
Introduce the ability for kernel ULPs to adjust the minimum RNR Retry timer. The INIT -> RTR transition executed by RDMA CM will be used for this adjustment. This avoids an additional ib_modify_qp() call. rdma_set_min_rnr_timer() must be called before the call to rdma_connect() on the active side and before the call to rdma_accept() on the passive side. The default value of RNR Retry timer is zero, which translates to 655 ms. When the receiver is not ready to accept a send messages, it encodes the RNR Retry timer value in the NAK. The requestor will then wait at least the specified time value before retrying the send. The 5-bit value to be supplied to the rdma_set_min_rnr_timer() is documented in IBTA Table 45: "Encoding for RNR NAK Timer Field". Link: https://lore.kernel.org/r/1617216194-12890-2-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/i40iw: Use DEFINE_SPINLOCK() for spinlockYe Bin
spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Link: https://lore.kernel.org/r/20210409095147.2294269-1-yebin10@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()Wang Wensheng
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://lore.kernel.org/r/20210408113137.97202-1-wangwensheng4@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12IB/hfi1: Fix error return code in parse_platform_config()Wang Wensheng
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210408113140.103032-1-wangwensheng4@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/qedr: Fix error return code in qedr_iw_connect()Wang Wensheng
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr") Link: https://lore.kernel.org/r/20210408113135.92165-1-wangwensheng4@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Correct format of block commentsWenpeng Liang
Block comments should not use a trailing */ on a separate line and every line of a block comment should start with an '*'. Link: https://lore.kernel.org/r/1617783353-48249-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Correct format of bracesWenpeng Liang
Do following cleanups about braces: - Add the necessary braces to maintain context alignment. - Fix the open '{' that is not on the same line as "switch". - Remove braces that are not necessary for single statement blocks. - Fix "else" that doesn't follow close brace '}'. Link: https://lore.kernel.org/r/1617783353-48249-6-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Remove redundant spacesWenpeng Liang
Space is not required after '(', before ')', before ',' and between '*' and symbol name of a definition. Link: https://lore.kernel.org/r/1617783353-48249-5-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Add necessary spacesWenpeng Liang
Space is required before '(' of switch statements and around '='. Link: https://lore.kernel.org/r/1617783353-48249-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Remove the redundant return statementsWenpeng Liang
The return statements at the end of a void function is meaningless. Link: https://lore.kernel.org/r/1617783353-48249-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12RDMA/core: Print the function name by __func__ instead of an fixed stringWenpeng Liang
It's better to use __func__ than a fixed string to print a function's name. Link: https://lore.kernel.org/r/1617783353-48249-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12Merge branch 'mlx5-next' of ↵Jason Gunthorpe
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== This pr contains changes from mlx5-next branch, already reviewed on netdev and rdma mailing lists, links below. 1) From Leon, Dynamically assign MSI-X vectors count Already Acked by Bjorn Helgaas. https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/ 2) Cleanup series: https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/ From Mark, E-Switch cleanups and refactoring, and the addition of single FDB mode needed HW bits. From Mikhael, Remove unused struct field From Saeed, Cleanup W=1 prototype warning From Zheng, Esw related cleanup From Tariq, User order-0 page allocation for EQs ==================== * mlx5-next: net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks net/mlx5: Dynamically assign MSI-X vectors count net/mlx5: Add dynamic MSI-X capabilities bits PCI/IOV: Add sysfs MSI-X vector assignment interface net/mlx5: Use order-0 allocations for EQs net/mlx5: Add IFC bits needed for single FDB mode net/mlx5: E-Switch, Refactor send to vport to be more generic RDMA/mlx5: Use representor E-Switch when getting netdev and metadata net/mlx5: E-Switch, Add eswitch pointer to each representor net/mlx5: E-Switch, Add match on vhca id to default send rules net/mlx5: Remove unused mlx5_core_health member recover_work net/mlx5: simplify the return expression of mlx5_esw_offloads_pair() net/mlx5: Cleanup prototype warning Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Prevent le32 from being implicitly converted to u32Lang Cheng
Replace BUILD_BUG_ON_ZERO() with BUILD_BUG_ON() to avoid sparse complaining "restricted __le32 degrades to integer". Link: https://lore.kernel.org/r/1617354454-47840-10-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Simplify the function config_eqc()Yixing Liu
Use "hr_reg_write" replace "roce_set_filed". Link: https://lore.kernel.org/r/1617354454-47840-9-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Add XRC subtype in QPC and XRC type in SRQCWenpeng Liang
A field to distuiguish basic SRQ from XRC SRQ in SRQC and a field in QPC to determine whether a QP is XRC TGT QP or XRC INI QP are missing. Fixes: 32548870d438 ("RDMA/hns: Add support for XRC on HIP09") Link: https://lore.kernel.org/r/1617354454-47840-8-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Remove unsupported QP typesWenpeng Liang
The hns ROCEE does not support UC QP currently. Link: https://lore.kernel.org/r/1617354454-47840-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete unused members in the structure hns_roce_hwYangyang Li
Some structure members in hns_roce_hw have never been used and need to be deleted. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Fixes: b156269d88e4 ("RDMA/hns: Add modify CQ support for hip08") Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1617354454-47840-6-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete redundant abnormal interrupt statusWenpeng Liang
The hardware supports only two types of abnormal interrupts. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1617354454-47840-5-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete redundant condition judgment related to eqYangyang Li
The register value related to the eq interrupt depends only on enable_flag, so the redundant condition judgment is deleted. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1617354454-47840-4-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Fix missing assignment of max_inline_dataWeihang Li
When querying QP, the ULPs should be informed of the max length of inline data supported by the hardware. Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC") Link: https://lore.kernel.org/r/1617354454-47840-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Avoid enabling RQ inline on UDWeihang Li
RQ inline is not supported on UD/GSI QP, it should be disabled in QPC. Link: https://lore.kernel.org/r/1617354454-47840-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Modify prints for mailbox and command queueWenpeng Liang
Use ratelimited print in mbox and cmq. And print mailbox operation if mailbox fails because it's useful information for the user. Link: https://lore.kernel.org/r/1617262341-37571-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Support more return types of command queueLang Cheng
Add error code definition according to the return code from firmware to help find out more detailed reasons why a command fails to be sent. Link: https://lore.kernel.org/r/1617262341-37571-3-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Enable all CMDQ contextLang Cheng
Fix error of cmd's context number calculation algorithm to enable all of 32 cmd entries and support 32 concurrent accesses. Link: https://lore.kernel.org/r/1617262341-37571-2-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/rxe: Fix missing acks from responderBob Pearson
All responder errors from request packets that do not consume a receive WQE fail to generate acks for RC QPs. This patch corrects this behavior by making the flow follow the same path as request packets that do consume a WQE after the completion. Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/ Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07RDMA/core: Make the wc status prompt message clearerYixian Liu
Local invalidate is also a kind of memory management operation, not only memory bind operation. Furthermore, as invalidate operations include local and remote, add prefix to the prompt message to make it clearer. Link: https://lore.kernel.org/r/1617698772-13871-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07RDMA/hns: Use GFP_ATOMIC under spin lockWei Yongjun
A spin lock is taken here so we should use GFP_ATOMIC. Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW") Link: https://lore.kernel.org/r/20210407154900.3486268-1-weiyongjun1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/mlx5: Reduce max order of memory allocated for xlt updatePraveen Kumar Kannoju
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to 1 MB (order-8) memory, depending on the size of the MR. This costly allocation can sometimes take very long to return (a few seconds). This causes the calling application to hang for a long time, especially when the system is fragmented. To avoid these long latency spikes, the calls the higher order allocations need to fail faster in case they are not available. In order to acheive this we need __GFP_NORETRY flag in the gfp_mask before during fetching the free pages. Allow the algorithm to automatically fall back to smaller page sizes. Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/hfi1: Remove unused functionKaike Wan
Remove the unused function sdma_iowait_schedule(). Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/hfi1: Use kzalloc() for mmu_rb_handler allocationMike Marciniszyn
The code currently assumes that the mmu_notifier struct embedded in mmu_rb_handler only contains two fields. There are now extra fields: struct mmu_notifier { struct hlist_node hlist; const struct mmu_notifier_ops *ops; struct mm_struct *mm; struct rcu_head rcu; unsigned int users; }; Given that there in no init for the mmu_notifier, a kzalloc() should be used to insure that any newly added fields are given a predictable initial value of zero. Fixes: 06e0ffa69312 ("IB/hfi1: Re-factor MMU notification code") Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Adam Goldman <adam.goldman@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/hfi1: Add additional usdma tracesMike Marciniszyn
Add traces that were vital in isolating an issue with pq waitlist in commit fa8dac396863 ("IB/hfi1: Fix another case where pq is left on waitlist") Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>