summaryrefslogtreecommitdiff
path: root/include/rdma/ib_verbs.h
AgeCommit message (Collapse)Author
2015-12-23IB/core: Add ib_is_udata_clearedMatan Barak
Extending core and vendor verb commands require us to check that the unknown part of the user's given command is all zeros. Adding ib_is_udata_cleared in order to do so. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove the write-only usecnt field from struct ib_mrChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove the struct ib_phys_buf definitionChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove in-kernel support for memory windowsChristoph Hellwig
Remove the unused ib_allow_mw and ib_bind_mw functions, remove the unused IB_WR_BIND_MW and IB_WC_BIND_MW opcodes and move ib_dealloc_mw into the uverbs module. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove support for phys MRsChristoph Hellwig
We have stopped using phys MRs in the kernel a while ago, so let's remove all the cruft used to implement them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-By: Devesh Sharma<devesh.sharma@avagotech.com> [ocrdma] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: remove ib_query_mrChristoph Hellwig
This functionality has no users and was only supported by the staged out EHCA driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [core] Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB: start documenting device capabilitiesChristoph Hellwig
Just IB_DEVICE_LOCAL_DMA_LKEY and IB_DEVICE_MEM_MGT_EXTENSIONS for now as I'm most familar with those. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add rdma_network_type to wcSomnath Kotur
Providers should tell IB core the wc's network type. This is used in order to search for the proper GID in the GID table. When using HCAs that can't provide this info, IB core tries to deep examine the packet and extract the GID type by itself. We choose sgid_index and type from all the matching entries in RDMA-CM based on hint from the IP stack and we set hop_limit for the IP packet based on above hint from IP stack. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Somnath Kotur <Somnath.Kotur@Avagotech.Com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add ROCE_UDP_ENCAP (RoCE V2) typeMatan Barak
Adding RoCE v2 GID type and port type. Vendors which support this type will get their GID table populated with RoCE v2 GIDs automatically. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add gid_type to gid attributeMatan Barak
In order to support multiple GID types, we need to store the gid_type with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT GID table entries shall have a "GID type" attribute that denotes the L3 Address type". The currently supported GID is IB_GID_TYPE_IB which is also RoCE v1 GID type. This implies that gid_type should be added to roce_gid_table meta-data. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22Merge branches '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5Doug Ledford
Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/iser/iser_verbs.c
2015-12-22IB/core: Remove ib_query_deviceOr Gerlitz
The copy of the attributes present on the device is now used by all consumers except for uverbs in case of serving user-space query, where dev->query_device is called. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/core: Save the device attributes on the device structureIra Weiny
This way both the IB core and upper level drivers can access these cached device attributes rather than querying or caching them on their own. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-15Merge branch 'rdma-cq.2' of git://git.infradead.org/users/hch/rdma into ↵Doug Ledford
4.5/rdma-cq Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/srp/ib_srp.c - Conflicts with changes in ib_srp.c introduced during 4.4-rc updates
2015-12-11IB: add a proper completion queue abstractionChristoph Hellwig
This adds an abstraction that allows ULPs to simply pass a completion object and completion callback with each submitted WR and let the RDMA core handle the nitty gritty details of how to handle completion interrupts and poll the CQ. In detail there is a new ib_cqe structure which just contains the completion callback, and which can be used to get at the containing object using container_of. It is pointed to by the WR and WC as an alternative to the wr_id field, similar to how many ULPs already use the field to store a pointer using casts. A driver using the new completion callbacks allocates it's CQs using the new ib_create_cq API, which in addition to the number of CQEs and the completion vectors also takes a mode on how we poll for CQEs. Three modes are available: direct for drivers that never take CQ interrupts and just poll for them, softirq to poll from softirq context using the to be renamed blk-iopoll infrastructure which takes care of rearming and budgeting, or a workqueue for consumer who want to be called from user context. Thanks a lot to Sagi Grimberg who helped reviewing the API, wrote the current version of the workqueue code because my two previous attempts sucked too much and converted the iSER initiator to the new API. Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-07IB/core: use RCU for uverbs id lookupMike Marciniszyn
The current implementation gets a spin_lock, and at any scale with qib and hfi1 post send, the lock contention grows exponentially with the number of QPs. idr_find() is RCU compatibile, so read doesn't need the lock. Change to use rcu_read_lock() and rcu_read_unlock() in __idr_get_uobj(). kfree_rcu() is used to insure a grace period between the idr removal and actual free. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-30IB/core, cma: Make __attribute_const__ declarations sparse-friendlyBart Van Assche
Move the __attribute_const__ declarations such that sparse understands that these apply to the function itself and not to the return type. This avoids that sparse reports error messages like the following: drivers/infiniband/core/verbs.c:73:12: error: symbol 'ib_event_msg' redeclared with different type (originally declared at include/rdma/ib_verbs.h:470) - different modifiers Fixes: 2b1b5b601230 ("IB/core, cma: Nice log-friendly string helpers") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-29IB/core: Remove old fast registration APISagi Grimberg
No callers and no providers left, go ahead and remove it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/core: Introduce new fast registration APISagi Grimberg
The new fast registration verb ib_map_mr_sg receives a scatterlist and converts it to a page list under the verbs API thus hiding the specific HW mapping details away from the consumer. The provider drivers are provided with a generic helper ib_sg_to_pages that converts a scatterlist into a vector of page addresses. The drivers can still perform any HW specific page address setting by passing a set_page function pointer which will be invoked for each page address. This allows drivers to avoid keeping a shadow page vectors and convert them to HW specific translations by doing extra copies. This API will allow ULPs to remove the duplicated code of constructing a page vector from a given sg list. The send work request ib_reg_wr also shrinks as it will contain only mr, key and access flags in addition. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28Merge branch 'wr-cleanup' into k.o/for-4.4Doug Ledford
2015-10-21IB/core: Remove smac and vlan id from qp_attr and ah_attrMatan Barak
Smac and vlan id could be resolved from the GID attribute, and thus these attributes aren't needed anymore. Removing them. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21IB/core: Add netdev and gid attributes paramteres to cacheMatan Barak
Adding an ability to query the IB cache by a netdev and get the attributes of a GID. These parameters are necessary in order to successfully resolve the required GID (when the netdevice is known) and get the Ethernet L2 attributes from a GID. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-08IB: remove xrc_remote_srq_num from struct ib_send_wrChristoph Hellwig
The field is only initialized in mlx, but never used. If we want to add proper XRC support it should be done with a new struct ib_xrc_wr. This shrinks the various WR structures by another 4 bytes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Haggai Eran <haggaie@mellanox.com>
2015-10-08IB: split struct ib_send_wrChristoph Hellwig
This patch split up struct ib_send_wr so that all non-trivial verbs use their own structure which embedds struct ib_send_wr. This dramaticly shrinks the size of a WR for most common operations: sizeof(struct ib_send_wr) (old): 96 sizeof(struct ib_send_wr): 48 sizeof(struct ib_rdma_wr): 64 sizeof(struct ib_atomic_wr): 96 sizeof(struct ib_ud_wr): 88 sizeof(struct ib_fast_reg_wr): 88 sizeof(struct ib_bind_mw_wr): 96 sizeof(struct ib_sig_handover_wr): 80 And with Sagi's pending MR rework the fast registration WR will also be down to a reasonable size: sizeof(struct ib_fastreg_wr): 64 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt] Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc] Tested-by: Haggai Eran <haggaie@mellanox.com> Tested-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Steve Wise <swise@opengridcomputing.com>
2015-09-28IB/core: Add support of checksum capability reporting for RC and RAWBodong Wang
Two enum members IB_DEVICE_RC_IP_CSUM and IB_DEVICE_RAW_IP_CSUM are added to ib_device_cap_flags. Device should set these two flags if they support insertion of UDP and TCP checksum on outgoing IPv4 messages and can verify the validity of checksum for incoming IPv4 messages, for RC IPoIB and RAW over Ethernet respectively. They are similar to IB_DEVICE_UD_IP_CSUM. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-09Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull inifiniband/rdma updates from Doug Ledford: "This is a fairly sizeable set of changes. I've put them through a decent amount of testing prior to sending the pull request due to that. There are still a few fixups that I know are coming, but I wanted to go ahead and get the big, sizable chunk into your hands sooner rather than waiting for those last few fixups. Of note is the fact that this creates what is intended to be a temporary area in the drivers/staging tree specifically for some cleanups and additions that are coming for the RDMA stack. We deprecated two drivers (ipath and amso1100) and are waiting to hear back if we can deprecate another one (ehca). We also put Intel's new hfi1 driver into this area because it needs to be refactored and a transfer library created out of the factored out code, and then it and the qib driver and the soft-roce driver should all be modified to use that library. I expect drivers/staging/rdma to be around for three or four kernel releases and then to go away as all of the work is completed and final deletions of deprecated drivers are done. Summary of changes for 4.3: - Create drivers/staging/rdma - Move amso1100 driver to staging/rdma and schedule for deletion - Move ipath driver to staging/rdma and schedule for deletion - Add hfi1 driver to staging/rdma and set TODO for move to regular tree - Initial support for namespaces to be used on RDMA devices - Add RoCE GID table handling to the RDMA core caching code - Infrastructure to support handling of devices with differing read and write scatter gather capabilities - Various iSER updates - Kill off unsafe usage of global mr registrations - Update SRP driver - Misc mlx4 driver updates - Support for the mr_alloc verb - Support for a netlink interface between kernel and user space cache daemon to speed path record queries and route resolution - Ininitial support for safe hot removal of verbs devices" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits) IB/ipoib: Suppress warning for send only join failures IB/ipoib: Clean up send-only multicast joins IB/srp: Fix possible protection fault IB/core: Move SM class defines from ib_mad.h to ib_smi.h IB/core: Remove unnecessary defines from ib_mad.h IB/hfi1: Add PSM2 user space header to header_install IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY mlx5: Fix incorrect wc pkey_index assignment for GSI messages IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow IB/uverbs: reject invalid or unknown opcodes IB/cxgb4: Fix if statement in pick_local_ip6adddrs IB/sa: Fix rdma netlink message flags IB/ucma: HW Device hot-removal support IB/mlx4_ib: Disassociate support IB/uverbs: Enable device removal when there are active user space applications IB/uverbs: Explicitly pass ib_dev to uverbs commands IB/uverbs: Fix race between ib_uverbs_open and remove_one IB/uverbs: Fix reference counting usage of event files IB/core: Make ib_dealloc_pd return void IB/srp: Create an insecure all physical rkey only if needed ...
2015-08-30IB/uverbs: Enable device removal when there are active user space applicationsYishai Hadas
Enables the uverbs_remove_one to succeed despite the fact that there are running IB applications working with the given ib device. This functionality enables a HW device to be unbind/reset despite the fact that there are running user space applications using it. It exposes a new IB kernel API named 'disassociate_ucontext' which lets a driver detaching its HW resources from a given user context without crashing/terminating the application. In case a driver implemented the above API and registered with ib_uverb there will be no dependency between its device to its uverbs_device. Upon calling remove_one of ib_uverbs the call should return after disassociating the open HW resources without waiting to clients disconnecting. In case driver didn't implement this API there will be no change to current behaviour and uverbs_remove_one will return only when last client has disconnected and reference count on uverbs device became 0. In case the lower driver device was removed any application will continue working over some zombie HCA, further calls will ended with an immediate error. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Make ib_dealloc_pd return voidJason Gunthorpe
The majority of callers never check the return value, and even if they did, they can't do anything about a failure. All possible failure cases represent a bug in the caller, so just WARN_ON inside the function instead. This fixes a few random errors: net/rd/iw.c infinite loops while it fails. (racing with EBUSY?) This also lays the ground work to get rid of error return from the drivers. Most drivers do not error, the few that do are broken since it cannot be handled. Since uverbs can legitimately make use of EBUSY, open code the check. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Guarantee that a local_dma_lkey is availableJason Gunthorpe
Every single ULP requires a local_dma_lkey to do anything with a QP, so let us ensure one exists for every PD created. If the driver can supply a global local_dma_lkey then use that, otherwise ask the driver to create a local use all physical memory MR associated with the new PD. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Acked-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/mlx4: Implement ib_device callbacksMoni Shoua
get_netdev: get the net_device on the physical port of the IB transport port. In port aggregation mode it is required to return the netdev of the active port. modify_gid: note for a change in the RoCE gid cache. Handle this by writing to the harsware GID table. It is possible that indexes in cahce and hardware tables won't match so a translation is required when modifying a QP or creating an address handle. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Add RoCE GID table managementMatan Barak
RoCE GIDs are based on IP addresses configured on Ethernet net-devices which relate to the RDMA (RoCE) device port. Currently, each of the low-level drivers that support RoCE (ocrdma, mlx4) manages its own RoCE port GID table. As there's nothing which is essentially vendor specific, we generalize that, and enhance the RDMA core GID cache to do this job. In order to populate the GID table, we listen for events: (a) netdev up/down/change_addr events - if a netdev is built onto our RoCE device, we need to add/delete its IPs. This involves adding all GIDs related to this ndev, add default GIDs, etc. (b) inet events - add new GIDs (according to the IP addresses) to the table. For programming the port RoCE GID table, providers must implement the add_gid and del_gid callbacks. RoCE GID management requires us to state the associated net_device alongside the GID. This information is necessary in order to manage the GID table. For example, when a net_device is removed, its associated GIDs need to be removed as well. RoCE mandates generating a default GID for each port, based on the related net-device's IPv6 link local. In contrast to the GID based on the regular IPv6 link-local (as we generate GID per IP address), the default GID is also available when the net device is down (in order to support loopback). Locking is done as follows: The patch modify the GID table code both for new RoCE drivers implementing the add_gid/del_gid callbacks and for current RoCE and IB drivers that do not. The flows for updating the table are different, so the locking requirements are too. While updating RoCE GID table, protection against multiple writers is achieved via mutex_lock(&table->lock). Since writing to a table requires us to find an entry (possible a free entry) in the table and then modify it, this mutex protects both the find_gid and write_gid ensuring the atomicity of the action. Each entry in the GID cache is protected by rwlock. In RoCE, writing (usually results from netdev notifier) involves invoking the vendor's add_gid and del_gid callbacks, which could sleep. Therefore, an invalid flag is added for each entry. Updates for RoCE are done via a workqueue, thus sleeping is permitted. In IB, updates are done in write_lock_irq(&device->cache.lock), thus write_gid isn't allowed to sleep and add_gid/del_gid are not called. When passing net-device into/out-of the GID cache, the device is always passed held (dev_hold). The code uses a single work item for updating all RDMA devices, following a netdev or inet notifier. The patch moves the cache from being a client (which was incorrect, as the cache is part of the IB infrastructure) to being explicitly initialized/freed when a device is registered/removed. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Drop ib_alloc_fast_reg_mrSagi Grimberg
Fully replaced by a more generic and suitable ib_alloc_mr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB: Modify ib_create_mr APISagi Grimberg
Use ib_alloc_mr with specific parameters. Change the existing callers. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Get rid of redundant verb ib_destroy_mrSagi Grimberg
This was added in a thought of uniting all mr allocation and deallocation routines but the fact is we have a single deallocation routine already, ib_dereg_mr. And, move mlx5_ib_destroy_mr specific logic into mlx5_ib_dereg_mr (includes only signature stuff for now). And, fixup the only callers (iser/isert) accordingly. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Find the network device matching connection parametersYotam Kenneth
In the case of IPoIB, and maybe in other cases, the network device is managed by an upper-layer protocol (ULP). In order to expose this network device to other users of the IB device, let ULPs implement a callback that returns network device according to connection parameters. The IB device and port, together with the P_Key and the GID should be enough to uniquely identify the ULP net device. However, in current kernels there can be multiple IPoIB interfaces created with the same GID. Furthermore, such configuration may be desireable to support ipvlan-like configurations for RDMA CM with IPoIB. To resolve the device in these cases the code will also take the IP address as an additional input. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: lock client data with lists_rwsemHaggai Eran
An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will remove the device from the device list with lists_rwsem locked for write, but perform the rest of the cleanup, including the call to remove() without that lock. Mark client data that is undergoing de-registration with a new going_down flag in the client data context. Lock the client data list with lists_rwsem for write in addition to using the spinlock, so that functions calling the callback would be able to lock only lists_rwsem for read and let callbacks sleep. Since ib_unregister_client() now marks the client data context, no need for remove() to search the context again, so pass the client data directly to remove() callbacks. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28RDMA/Core: remove rdma_cap_read_multi_sge() helperSteve Wise
This functionality already exists via the max_sge_rd device capability. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-05core: Remove the ib_reg_phys_mr() and ib_rereg_phys_mr() verbsChuck Lever
The verbs are obsolete. The ib_rereg_phys_mr() verb is not used by kernel ULPs, and the last ib_reg_phys_mr() call site in the kernel tree has now been removed. Two staging tree call sites remain in the Lustre client. The Lustre team has been notified of the deprecation of reg_phys_mr. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2015-07-14IB: Add rdma_cap_ib_switch helper and use where appropriateHal Rosenstock
Persuant to Liran's comments on node_type on linux-rdma mailing list: In an effort to reform the RDMA core and ULPs to minimize use of node_type in struct ib_device, an additional bit is added to struct ib_device for is_switch (IB switch). This is needed to be initialized by any IB switch device driver. This is a NEW requirement on such device drivers which are all "out of tree". In addition, an ib_switch helper was added to ib_verbs.h based on the is_switch device bit rather than node_type (although those should be consistent). The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs) as well as (IPoIB and SRP) ULPs are updated where appropriate to use this new helper. In some cases, the helper is now used under the covers of using rdma_[start end]_port rather than the open coding previously used. Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Add OPA MAD core capability flagIra Weiny
Add OPA MAD support flags to the core capability immutable flags. In addition add the rdma_cap_opa_mad helper function for core functions to use to detect OPA MAD support. OPA MADs share a common header with IBTA MADs but with some differences for increased performance. Sharing a common header with IBTA MADs allows us to share most of the MAD processing code when dealing with OPA MADs in addition to supporting some IBTA MADs on OPA devices. OPA MADs differ in the following ways: 1) MADs are variable size up to 2K IBTA defined MADs remain fixed at 256 bytes 2) OPA SMPs must carry valid PKeys 3) OPA SMP packets are a different format The MAD stack will use this new functionality to determine if OPA MAD processing should occur on individual device ports. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/mad: Add support for additional MAD info to/from driversIra Weiny
In order to support alternate sized MADs (and variable sized MADs on OPA devices) add in/out MAD size parameters to the process_mad core call. In addition, add an out_mad_pkey_index to communicate the pkey index the driver wishes the MAD stack to use when sending OPA MAD responses. The out MAD size and the out MAD PKey index are required by the MAD stack to generate responses on OPA devices. Furthermore, the in and out MAD parameters are made generic by specifying them as ib_mad_hdr rather than ib_mad. Drivers are modified as needed and are protected by BUG_ON flags if the MAD sizes passed to them is incorrect. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Add ability for drivers to report an alternate MAD size.Ira Weiny
Add max MAD size to the device immutable data set and have all drivers that support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core. Verify MAD size data in both the MAD core and when reading the immutable data. OPA drivers will report alternate MAD sizes in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Pass hardware specific data in query_deviceMatan Barak
Vendors should be able to pass vendor specific data to/from user-space via query_device uverb. In order to do this, we need to pass the vendors' specific udata. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Add timestamp_mask and hca_core_clock to query_deviceMatan Barak
In order to expose timestamp we need to expose two new attributes in query_device to be used for CQ completion time-stamping: timestamp_mask - how many bits are valid in the timestamp, where timestamp values could be 64bits the most. hca_core_clock - timestamp is given in HW cycles, the frequency in KHZ units of the HCA, necessary in order to convert cycles to seconds. This is added both to ib_query_device and its respective uverbs counterpart. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Add CQ creation time-stamping flagMatan Barak
Add CQ creation flag which dictates that the created CQ will report completion time-stamp value in the WC. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Change ib_create_cq to use struct ib_cq_init_attrMatan Barak
Currently, ib_create_cq uses cqe and comp_vecotr instead of the extendible ib_cq_init_attr struct. Earlier patches already changed the vendors to work with ib_cq_init_attr. This patch changes the consumers too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12IB/core: Change provider's API of create_cq to be extendibleMatan Barak
Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2 Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-10IB/core: Don't advertise SA in RoCE port capabilitiesMoni Shoua
The Subnet Administrator (SA) is not a component of the RoCE spec. Therefore, it should not be a capability of a RoCE port. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-02IB/core cleanup: Add const to args - agent_send_responseIra Weiny
In order to support constant callers of agent_send_response we add const specifiers to the its pointer arguments. Adjust the call tree accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-02IB/core cleanup: Add const on args - device->process_madIra Weiny
The process_mad device function declares some parameters as "in". Make those parameters const and adjust the call tree under process_mad in the various drivers accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>