summaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp
AgeCommit message (Collapse)Author
2020-06-14treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "A more active cycle than most of the recent past, with a few large, long discussed works this time. The RNBD block driver has been posted for nearly two years now, and flowing through RDMA due to it also introducing a new ULP. The removal of FMR has been a recurring discussion theme for a long time. And the usual smattering of features and bug fixes. Summary: - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa - Continuing driver cleanups in bnxt_re, hns - Big cleanup of mlx5 QP creation flows - More consistent use of src port and flow label when LAG is used and a mlx5 implementation - Additional set of cleanups for IB CM - 'RNBD' network block driver and target. This is a network block RDMA device specific to ionos's cloud environment. It brings strong multipath and resiliency capabilities. - Accelerated IPoIB for HFI1 - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds - Support for exchanging the new IBTA defiend ECE data during RDMA CM exchanges - Removal of the very old and insecure FMR interface from all ULPs and drivers. FRWR should be preferred for at least a decade now" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits) RDMA/cm: Spurious WARNING triggered in cm_destroy_id() RDMA/mlx5: Return ECE DC support RDMA/mlx5: Don't rely on FW to set zeros in ECE response RDMA/mlx5: Return an error if copy_to_user fails IB/hfi1: Use free_netdev() in hfi1_netdev_free() RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr() RDMA/core: Move and rename trace_cm_id_create() IB/hfi1: Fix hfi1_netdev_rx_init() error handling RDMA: Remove 'max_map_per_fmr' RDMA: Remove 'max_fmr' RDMA/core: Remove FMR device ops RDMA/rdmavt: Remove FMR memory registration RDMA/mthca: Remove FMR support for memory registration RDMA/mlx4: Remove FMR support for memory registration RDMA/i40iw: Remove FMR leftovers RDMA/bnxt_re: Remove FMR leftovers RDMA/mlx5: Remove FMR leftovers RDMA/core: Remove FMR pool API RDMA/rds: Remove FMR support for memory registration RDMA/srp: Remove support for FMR memory registration ...
2020-06-02RDMA/srp: Remove support for FMR memory registrationMax Gurtovoy
FMR is not supported on most recent RDMA devices (that use fast memory registration mechanism). Also, FMR was recently removed from NFS/RDMA ULP. Link: https://lore.kernel.org/r/2-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02RDMA/iser: Remove support for FMR memory registrationIsrael Rukshin
FMR is not supported on most recent RDMA devices (that use fast memory registration mechanism). Also, FMR was recently removed from NFS/RDMA ULP. Link: https://lore.kernel.org/r/1-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/srpt: Increase max_send_sgeBart Van Assche
The ib_srpt driver limits max_send_sge to 16. Since that is a workaround for an mlx4 bug that has been fixed, increase max_send_sge. See also commit f95ccffc715b ("IB/mlx4: Use 4K pages for kernel QP's WQE buffer"). Link: https://lore.kernel.org/r/20200525172212.14413-5-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/srpt: Reduce max_recv_sge to 1Bart Van Assche
Since srpt_post_recv() always sets num_sge to 1, reduce the max_recv_sge parameter that is used at queue pair allocation time to 1. Link: https://lore.kernel.org/r/20200525172212.14413-4-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/srpt: Make debug output more detailedBart Van Assche
Since the session name by itself is not sufficient to uniquely identify a queue pair, include the queue pair number. Show the ASCII channel state name instead of the numeric value. This change makes the ib_srpt debug output more consistent. Link: https://lore.kernel.org/r/20200525172212.14413-3-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29RDMA/srp: Make the channel count configurable per targetBart Van Assche
Increase the flexibility of the SRP initiator driver by making the channel count configurable per target instead of only providing a kernel module parameter for configuring the channel count. Link: https://lore.kernel.org/r/20200525172212.14413-2-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27IB/ipoib: Fix double free of skb in case of multicast traffic in CM modeValentine Fatiev
When connected mode is set, and we have connected and datagram traffic in parallel, ipoib might crash with double free of datagram skb. The current mechanism assumes that the order in the completion queue is the same as the order of sent packets for all QPs. Order is kept only for specific QP, in case of mixed UD and CM traffic we have few QPs (one UD and few CM's) in parallel. The problem: ---------------------------------------------------------- Transmit queue: ----------------- UD skb pointer kept in queue itself, CM skb kept in spearate queue and uses transmit queue as a placeholder to count the number of total transmitted packets. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 .........127 ------------------------------------------------------------ NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ........... ------------------------------------------------------------ ^ ^ tail head Completion queue (problematic scenario) - the order not the same as in the transmit queue: 1 2 3 4 5 6 7 8 9 ------------------------------------ ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5 ------------------------------------ 1. CM1 'wc' processing - skb freed in cm separate ring. - tx_tail of transmit queue increased although UD2 is not freed. Now driver assumes UD2 index is already freed and it could be used for new transmitted skb. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 .........127 ------------------------------------------------------------ NL NL UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ........... ------------------------------------------------------------ ^ ^ ^ (Bad)tail head (Bad - Could be used for new SKB) In this case (due to heavy load) UD2 skb pointer could be replaced by new transmitted packet UD_NEW, as the driver assumes its free. At this point we will have to process two 'wc' with same index but we have only one pointer to free. During second attempt to free the same skb we will have NULL pointer exception. 2. UD2 'wc' processing - skb freed according the index we got from 'wc', but it was already overwritten by mistake. So actually the skb that was released is the skb of the new transmitted packet and not the original one. 3. UD_NEW 'wc' processing - attempt to free already freed skb. NUll pointer exception. The fix: ----------------------------------------------------------------------- The fix is to stop using the UD ring as a placeholder for CM packets, the cyclic ring variables tx_head and tx_tail will manage the UD tx_ring, a new cyclic variables global_tx_head and global_tx_tail are introduced for managing and counting the overall outstanding sent packets, then the send queue will be stopped and waken based on these variables only. Note that no locking is needed since global_tx_head is updated in the xmit flow and global_tx_tail is updated in the NAPI flow only. A previous attempt tried to use one variable to count the outstanding sent packets, but it did not work since xmit and NAPI flows can run at the same time and the counter will be updated wrongly. Thus, we use the same simple cyclic head and tail scheme that we have today for the UD tx_ring. Fixes: 2c104ea68350 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes") Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27RDMA/cma: Provide ECE reject reasonLeon Romanovsky
IBTA declares "vendor option not supported" reject reason in REJ messages if passive side doesn't want to accept proposed ECE options. Due to the fact that ECE is managed by userspace, there is a need to let users to provide such rejected reason. Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/ipoib: Remove can_sleep parameter from iboib_mcast_allocKamal Heib
can_sleep is always 0 when iboib_mcast_alloc() is called, so remove it and use GFP_ATOMIC instead of GFP_KERNEL. Link: https://lore.kernel.org/r/20200525130305.171509-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22RDMA/rtrs: Get rid of the do_next_path while_next_path macrosDanil Kipnis
The macros do_each_path/while_each_path lead to a smatch warning: drivers/infiniband/ulp/rtrs/rtrs-clt.c:1196 rtrs_clt_failover_req() warn: inconsistent indenting drivers/infiniband/ulp/rtrs/rtrs-clt.c:2890 rtrs_clt_request() warn: inconsistent indenting Also checkpatch complains: ERROR: Macros with multiple statements should be enclosed in a do - while loop The macros are used only in two places: for a normal IO path and for the failover path triggered after errors. Get rid of the macros and just use a for loop iterating over the list of paths in both places. It is easier to read and also less lines of code. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200522053924.528980-1-danil.kipnis@cloud.ionos.com Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22RDMA/rtrs: server: Use already dereferenced rtrs_sess structureMd Haris Iqbal
The rtrs_sess structure has already been extracted above from the rtrs_srv_sess structure. Use that to avoid redundant dereferencing. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20200522082833.1480551-1-haris.phnx@gmail.com Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com> Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/ipoib: Add capability to switch between datagram and connected modeGary Leshner
This is the prerequisite modification to the ipoib ulp to allow a rdma netdev to obtain the default ndo ops for init/uninit/open/close. This is accomplished by setting the netdev ops field within the callback function passed to the netdev allocation routine which in turn was passed into the rdma netdev allocation routine. This allows the rdma netdev to call back into the ulp to create the resources required for connected mode operation. Additionally as the ulp is not re-entrant, when switching modes, the number of real tx queues is set to 1 for the connected mode. For datagram mode the number of real tx queues is set to the actual number of tx queues specified at the netdev's allocation. For the internal ulp netdev the number of tx queues defaults to 1. It is up to the rdma netdev to specify the actual number it can support. When the driver does not support a rdma netdev for acceleration, (-ENOTSUPPORTED return code or the verbs function for allocation is NULL) the ipoib ulp functions are unaffected by using the internal netdev allocated by the ipoib ulp. Link: https://lore.kernel.org/r/20200511160706.173205.19086.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu sizeGary Leshner
When in connected mode ipoib sent broadcast pings which exceeded the mtu size for broadcast addresses. Add an mtu attribute to the rdma_netdev structure which ipoib sets to its mcast mtu size. The RDMA netdev uses this value to determine if the skb length is too long for the mtu specified and if it is, drops the packet and logs an error about the errant packet. Link: https://lore.kernel.org/r/20200511160655.173205.14546.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/ipoib: Increase ipoib Datagram mode MTU's upper limitKaike Wan
Currently the ipoib UD mtu is restricted to 4K bytes. Remove this limitation so that the IPOIB module can potentially use an MTU (in UD mode) that is bounded by the MTU of the underlying device. A field is added to the ib_port_attr structure to indicate the maximum physical MTU the underlying device supports. Link: https://lore.kernel.org/r/20200511160618.173205.23053.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPsGary Leshner
Adds capability to create a qpn to be recognized as an accelerated UD QP for ipoib. This is accomplished by reserving 0x81 in byte[0] of the qpn as the prefix for these qp types and reserving qpns between 0x810000 and 0x81ffff. The hfi1 capability mask already contained a flag for the VNIC netdev. This has been renamed and extended to include both VNIC and ipoib. The rvt code to allocate qps now recognizes this flag and sets 0x81 into byte[0] of the qpn. The code to allocate qpns is modified to reset the qpn numbering when it is detected that a value is located in byte[0] for a UD QP and it is a qpn being requested for net dev use. If it is a regular UD QP then it is allowable to have bits set in byte[0] of the qpn and provide the previously normal behavior. The code to free the qpn now checks for the AIP prefix value of 0x81 and removes it from the qpn before being freed so that the lower 16 bit number can be reused. This patch requires minor changes in the IB core and ipoib to facilitate the creation of accelerated UP QPs. Link: https://lore.kernel.org/r/20200511160607.173205.11757.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19rnbd/rtrs: Pass max segment size from blk user to the rdma libraryDanil Kipnis
When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined. The rtrs is a transport library and should compile independently of the block layer. The desired max segment size should be passed down by the user. Introduce max_segment_size parameter for the rtrs_clt_open() call. Fixes: f7a7a5c228d4 ("block/rnbd: client: main functionality") Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Fixes: cb80329c9434 ("RDMA/rtrs: client: private header with client structs and functions") Fixes: b5c27cdb094e ("RDMA/rtrs: public interface header to establish RDMA connections") Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19RDMA/rtrs: server: Fix some error return codeWei Yongjun
Fix to return negative error code -ENOMEM from the some error handling cases instead of 0, as done elsewhere in this function. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions") Link: https://lore.kernel.org/r/20200519091912.134358-1-weiyongjun1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19RDMA/rtrs: client: Fix function return on successGustavo A. R. Silva
Remove the if-statement and return the value contained in _err_, unconditionally. Link: https://lore.kernel.org/r/20200519163612.GA6043@embeddedor Addresses-Coverity-ID: 1493753 ("Identical code for different branches") Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19RDMA/rtrs: Fix a couple off by one bugs in rtrs_srv_rdma_done()Dan Carpenter
These > comparisons should be >= to prevent accessing one element beyond the end of the buffer. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20200519154525.GA66801@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19RDMA/rtrs: Fix some signedness bugs in error handlingDan Carpenter
The problem is that "req->sg_cnt" is an unsigned int so if "nr" is negative, it gets type promoted to a high positive value and the condition is false. This patch fixes it by handling negatives separately. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200519133223.GN2078@kadam Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/srpt: Fix disabling device managementKamal Heib
Avoid disabling device management for devices that don't support Management datagrams (MADs) by checking if the "mad_agent" pointer is initialized before calling ib_modify_port, also fix the error flow in srpt_refresh_port() to disable device management if ib_register_mad_agent() fail. Fixes: 09f8a1486dca ("RDMA/srpt: Fix handling of SR-IOV and iWARP ports") Link: https://lore.kernel.org/r/20200514114720.141139-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/srpt: Add a newline when printing parameter 'srpt_service_guid' by sysfsXiongfeng Wang
When I cat module parameter 'srpt_service_guid', it displays as follows. It is better to add a newline for easy reading. [root@hulk-202 ~]# cat /sys/module/ib_srpt/parameters/srpt_service_guid 0x0205cdfffe8346b9[root@hulk-202 ~]# Link: https://lore.kernel.org/r/1589182629-27743-1-git-send-email-wangxiongfeng2@huawei.com Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: a bit of documentationJack Wang
README with description of major sysfs entries, sysfs documentation has been moved to ABI dir as suggested by Bart. Link: https://lore.kernel.org/r/20200511135131.27580-15-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: include client and server modules into kernel compilationJack Wang
Add rtrs Makefile, Kconfig and also corresponding lines into upper layer infiniband/ulp files. Link: https://lore.kernel.org/r/20200511135131.27580-14-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: server: sysfs interface functionsJack Wang
This is the sysfs interface to rtrs sessions on server side: /sys/class/rtrs-server/<SESS-NAME>/ *** rtrs session accepted from a client peer | |- paths/<SRC@DST>/ *** established paths from a client in a session | |- disconnect | *** disconnect path | |- hca_name | *** HCA name | |- hca_port | *** HCA port | |- stats/ *** current path statistics | |- rdma Link: https://lore.kernel.org/r/20200511135131.27580-13-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: server: statistics functionsJack Wang
This introduces set of functions used on server side to account statistics of RDMA data sent/received. Link: https://lore.kernel.org/r/20200511135131.27580-12-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: server: main functionalityJack Wang
This is main functionality of rtrs-server module, which accepts set of RDMA connections (so called rtrs session), creates/destroys sysfs entries associated with rtrs session and notifies upper layer (user of RTRS API) about RDMA requests or link events. Link: https://lore.kernel.org/r/20200511135131.27580-11-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: server: private header with server structs and functionsJack Wang
This header describes main structs and functions used by rtrs-server module, mainly for accepting rtrs sessions, creating/destroying sysfs entries, accounting statistics on server side. Link: https://lore.kernel.org/r/20200511135131.27580-10-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: client: sysfs interface functionsJack Wang
This is the sysfs interface to rtrs sessions on client side: /sys/class/rtrs-client/<SESS-NAME>/ *** rtrs session created by rtrs_clt_open() API call | |- max_reconnect_attempts | *** number of reconnect attempts for session | |- add_path | *** adds another connection path into rtrs session | |- paths/<SRC@DST>/ *** established paths to server in a session | |- disconnect | *** disconnect path | |- reconnect | *** reconnect path | |- remove_path | *** remove current path | |- state | *** retrieve current path state | |- hca_port | *** HCA port number | |- hca_name | *** HCA name | |- stats/ *** current path statistics | |- cpu_migration |- rdma |- reconnects |- reset_all Link: https://lore.kernel.org/r/20200511135131.27580-9-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: client: statistics functionsJack Wang
This introduces set of functions used on client side to account statistics of RDMA data sent/received, amount of IOs inflight, latency, cpu migrations, etc. Almost all statistics are collected using percpu variables. Link: https://lore.kernel.org/r/20200511135131.27580-8-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: client: main functionalityJack Wang
This is main functionality of rtrs-client module, which manages set of RDMA connections for each rtrs session, does multipathing, load balancing and failover of RDMA requests. Link: https://lore.kernel.org/r/20200511135131.27580-7-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: client: private header with client structs and functionsJack Wang
This header describes main structs and functions used by rtrs-client module, mainly for managing rtrs sessions, creating/destroying sysfs entries, accounting statistics on client side. Link: https://lore.kernel.org/r/20200511135131.27580-6-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: core: lib functions shared between client and server modulesJack Wang
This is a set of library functions existing as a rtrs-core module, used by client and server modules. Mainly these functions wrap IB and RDMA calls and provide a bit higher abstraction for implementing of RTRS protocol on client or server sides. Link: https://lore.kernel.org/r/20200511135131.27580-5-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: private headers with rtrs protocol structs and helpersJack Wang
These are common private headers with rtrs protocol structures, logging, sysfs and other helper functions, which are used on both client and server sides. Link: https://lore.kernel.org/r/20200511135131.27580-4-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17RDMA/rtrs: public interface header to establish RDMA connectionsJack Wang
Introduce public header which provides set of API functions to establish RDMA connections from client to server machine using RTRS protocol, which manages RDMA connections for each session, does multipathing and load balancing. Main functions for client (active) side: rtrs_clt_open() - Creates set of RDMA connections incapsulated in IBTRS session and returns pointer on RTRS session object. rtrs_clt_close() - Closes RDMA connections associated with RTRS session. rtrs_clt_request() - Requests zero-copy RDMA transfer to/from server. Main functions for server (passive) side: rtrs_srv_open() - Starts listening for RTRS clients on specified port and invokes RTRS callbacks for incoming RDMA requests or link events. rtrs_srv_close() - Closes RTRS server context. Link: https://lore.kernel.org/r/20200511135131.27580-3-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06RDMA: Allow ib_client's to fail when add() is calledJason Gunthorpe
When a client is added it isn't allowed to fail, but all the client's have various failure paths within their add routines. This creates the very fringe condition where the client was added, failed during add and didn't set the client_data. The core code will then still call other client_data centric ops like remove(), rename(), get_nl_info(), and get_net_dev_by_params() with NULL client_data - which is confusing and unexpected. If the add() callback fails, then do not call any more client ops for the device, even remove. Remove all the now redundant checks for NULL client_data in ops callbacks. Update all the add() callbacks to return error codes appropriately. EOPNOTSUPP is used for cases where the ULP does not support the ib_device - eg because it only works with IB. Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14RDMA: Remove a few extra calls to ib_get_client_data()Jason Gunthorpe
These four places already have easy access to the client data, just use that instead. Link: https://lore.kernel.org/r/0-v1-fae83f600b4a+68-less_get_client_data%25jgg@mellanox.com Acked-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "The majority of the patches are cleanups, refactorings and clarity improvements. This cycle saw some more activity from Syzkaller, I think we are now clean on all but one of those bugs, including the long standing and obnoxious rdma_cm locking design defect. Continue to see many drivers getting cleanups, with a few new user visible features. Summary: - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1 - Lots of cleanup patches for hns - Convert more places to use refcount - Aggressively lock the RDMA CM code that syzkaller says isn't working - Work to clarify ib_cm - Use the new ib_device lifecycle model in bnxt_re - Fix mlx5's MR cache which seems to be failing more often with the new ODP code - mlx5 'dynamic uar' and 'tx steering' user interfaces" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (144 commits) RDMA/bnxt_re: make bnxt_re_ib_init static IB/qib: Delete struct qib_ivdev.qp_rnd RDMA/hns: Fix uninitialized variable bug RDMA/hns: Modify the mask of QP number for CQE of hip08 RDMA/hns: Reduce the maximum number of extend SGE per WQE RDMA/hns: Reduce PFC frames in congestion scenarios RDMA/mlx5: Add support for RDMA TX flow table net/mlx5: Add support for RDMA TX steering IB/hfi1: Call kobject_put() when kobject_init_and_add() fails IB/hfi1: Fix memory leaks in sysfs registration and unregistration IB/mlx5: Move to fully dynamic UAR mode once user space supports it IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib IB/mlx5: Extend QP creation to get uar page index from user space IB/mlx5: Extend CQ creation to get uar page index from user space IB/mlx5: Expose UAR object and its alloc/destroy commands IB/hfi1: Get rid of a warning RDMA/hns: Remove redundant judgment of qp_type RDMA/hns: Remove redundant assignment of wc->smac when polling cq RDMA/hns: Remove redundant qpc setup operations RDMA/hns: Remove meaningless prints ...
2020-03-27IB/hfi1: Get rid of a warningMauro Carvalho Chehab
The right markup for a variable is @foo, and not @foo[]. Using a wrong markup caused this warning: ./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:243: WARNING: Inline strong start-string without end-string. Link: https://lore.kernel.org/r/9dce702510505556d75a13d9641e09218a4b4a65.1584456635.git.mchehab+huawei@kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26IB/iser: Always check sig MR before putting it to the free poolSergey Gorenko
libiscsi calls the check_protection transport handler only if SCSI-Respose is received. So, the handler is never called if iSCSI task is completed for some other reason like a timeout or error handling. And this behavior looks correct. But the iSER does not handle this case properly because it puts a non-checked signature MR to the free pool. Then the error occurs at reusing the MR because it is not allowed to invalidate a signature MR without checking. This commit adds an extra check to iser_unreg_mem_fastreg(), which is a part of the task cleanup flow. Now the signature MR is checked there if it is needed. Link: https://lore.kernel.org/r/20200325151210.1548-1-sergeygo@mellanox.com Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-06RDMA/ipoib: reject unsupported coalescing paramsJakub Kicinski
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-04Merge tag 'v5.6-rc4' into rdma.git for-nextJason Gunthorpe
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-27RDMA/opa_vnic: Delete driver versionLeon Romanovsky
The default version provided by "ethtool -i" it the correct way to identify Driver version. There is no need to overwrite it. Link: https://lore.kernel.org/r/20200220071239.231800-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-27RDMA/ipoib: Don't set constant driver versionLeon Romanovsky
There is no need to set driver version in in-tree kernel code. Link: https://lore.kernel.org/r/20200220071239.231800-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-20RDMA: Replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more
2020-02-14scsi: Revert "RDMA/isert: Fix a recently introduced regression related to ↵Bart Van Assche
logout" Since commit 04060db41178 introduces soft lockups when toggling network interfaces, revert it. Link: https://marc.info/?l=target-devel&m=158157054906196 Cc: Rahul Kundu <rahul.kundu@chelsio.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Sagi Grimberg <sagi@grimberg.me> Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Fixes: 04060db41178 ("scsi: RDMA/isert: Fix a recently introduced regression related to logout") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "A very quiet cycle with few notable changes. Mostly the usual list of one or two patches to drivers changing something that isn't quite rc worthy. The subsystem seems to be seeing a larger number of rework and cleanup style patches right now, I feel that several vendors are prepping their drivers for new silicon. Summary: - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe, i40iw - Larger series doing cleanup and rework for hns and hfi1. - Some general reworking of the CM code to make it a little more understandable - Unify the different code paths connected to the uverbs FD scheme - New UAPI ioctls conversions for get context and get async fd - Trace points for CQ and CM portions of the RDMA stack - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic connected to a MR - A couple of bug fixes that came too late to make rc7" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits) RDMA/core: Make the entire API tree static RDMA/efa: Mask access flags with the correct optional range RDMA/cma: Fix unbalanced cm_id reference count during address resolve RDMA/umem: Fix ib_umem_find_best_pgsz() IB/mlx4: Fix leak in id_map_find_del IB/opa_vnic: Spelling correction of 'erorr' to 'error' IB/hfi1: Fix logical condition in msix_request_irq RDMA/cm: Remove CM message structs RDMA/cm: Use IBA functions for complex structure members RDMA/cm: Use IBA functions for simple structure members RDMA/cm: Use IBA functions for swapping get/set acessors RDMA/cm: Use IBA functions for simple get/set acessors RDMA/cm: Add SET/GET implementations to hide IBA wire format RDMA/cm: Add accessors for CM_REQ transport_type IB/mlx5: Return the administrative GUID if exists RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence IB/mlx4: Fix memory leak in add_gid error flow IB/mlx5: Expose RoCE accelerator counters RDMA/mlx5: Set relaxed ordering when requested RDMA/core: Add the core support field to METHOD_GET_CONTEXT ...
2020-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ...