summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core/cma.c
AgeCommit message (Collapse)Author
2011-11-06Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
2011-11-01Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', ↵Roland Dreier
'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next
2011-10-31infiniband: Fix up module files that need to include module.hPaul Gortmaker
They had been getting it implicitly via device.h but we can't rely on that for the future, due to a pending cleanup so fix it now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-13RDMA/cma: Support XRC QPsSean Hefty
Allow users to connect XRC QPs through the rdma_cm. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13RDMA/cm: Define new RDMA port space specific to IBSean Hefty
Add RDMA_PS_IB. XRC QP types will use the IB port space when operating over the RDMA CM. For the 'IP protocol' field value, we select 0x3F, which is listed as being for 'any local network'. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06RDMA/iwcm: Propagate ird/ord values upwardsKumar Sanghvi
Update struct iw_cm_event to support propagating the ird/ord values upwards to the application. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06RDMA/cma: Check for NULL conn_param in rdma_acceptHefty, Sean
Check that conn_param is not null before dereferencing it when processing rdma_accept(). This is necessary to prevent a possible system crash, which can be caused by user space. Problem found by code inspection. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06RDMA/cma: Fix crash in cma_req_handlerHefty, Sean
The RDMA CM uses the local qp_type to determine how to process an incoming request. This can result in an incoming REQ being treated as a SIDR REQ and vice versa. Fix this by switching off the event type instead, and for good measure verify that the listener supports the incoming connection request. This problem showed up when a user space application mismatched the QP types between a client and server app. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18RDMA/cma: Don't allow IPoIB port space for IBoEMoni Shoua
This patch fixes a kernel crash in cma_set_qkey(). When the link layer is Ethernet, it is wrong to use IPoIB port space since no IPoIB interface is available. Specifically, setting the Q_Key when port space is RDMA_PS_IPOIB requires MGID calculation and an SA query, which doesn't make sense over Ethernet. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18RDMA/cma: Avoid assigning an IS_ERR value to cm_id pointer in CMA id objectJack Morgenstein
Avoid assigning an IS_ERR value to the cm_id pointer. This fixes a few anomalies in the error flow due to confusion about checking for NULL vs IS_ERR, and eliminates the need to test for the IS_ERR value every time we wish to determine if the cma_id object has a cm device associated with it. Also, eliminate the now-unnecessary procedure cma_has_cm_dev (we can check directly for the existence of the device pointer -- for a non-NULL check, makes no difference if it is the iwarp or the ib pointer). Finally, make a few code changes here to improve coding consistency. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25RDMA/cma: Save PID of ID's ownerNir Muchtar
Save the PID associated with an RDMA CM ID for reporting via netlink. Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25RDMA/cma: Add support for netlink statistics exportNir Muchtar
Add callbacks and data types for statistics export of all current devices/ids. The schema for RDMA CM is a series of netlink messages. Each one contains an rdma_cm_stat struct. Additionally, two netlink attributes are created for the addresses for each message (if applicable). Their types used are: RDMA_NL_RDMA_CM_ATTR_SRC_ADDR (The source address for this ID) RDMA_NL_RDMA_CM_ATTR_DST_ADDR (The destination address for this ID) sockaddr_* structs are encapsulated within these attributes. In other words, every transaction contains a series of messages like: -------message 1------- struct rdma_cm_id_stats { __u32 qp_num; __u32 bound_dev_if; __u32 port_space; __s32 pid; __u8 cm_state; __u8 node_type; __u8 port_num; __u8 reserved; } RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute - contains the source address RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute - contains the destination address -------end 1------- -------message 2------- struct rdma_cm_id_stats RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute -------end 2------- Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25RDMA/cma: Pass QP type into rdma_create_id()Sean Hefty
The RDMA CM currently infers the QP type from the port space selected by the user. In the future (eg with RDMA_PS_IB or XRC), there may not be a 1-1 correspondence between port space and QP type. For netlink export of RDMA CM state, we want to export the QP type to userspace, so it is cleaner to explicitly associate a QP type to an ID. Modify rdma_create_id() to allow the user to specify the QP type, and use it to make our selections of datagram versus connected mode. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>Nir Muchtar
Move cma.c's internal definition of enum cma_state to enum rdma_cm_state in an exported header so that it can be exported via RDMA netlink. Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09RDMA/cma: Add an ID_REUSEADDR optionHefty, Sean
Lustre requires that clients bind to a privileged port number before connecting to a remote server. On larger clusters (typically more than about 1000 nodes), the number of privileged ports is exhausted, resulting in lustre being unusable. To handle this, we add support for reusable addresses to the rdma_cm. This mimics the behavior of the socket option SO_REUSEADDR. A user may set an rdma_cm_id to reuse an address before calling rdma_bind_addr() (explicitly or implicitly). If set, other rdma_cm_id's may be bound to the same address, provided that they all have reuse enabled, and there are no active listens. If rdma_listen() is called on an rdma_cm_id that has reuse enabled, it will only succeed if there are no other id's bound to that same address. The reuse option is exported to user space. The behavior of the kernel reuse implementation was verified against that given by sockets. This patch is derived from a path by Ira Weiny <weiny2@llnl.gov> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09RDMA/cma: Fix handling of IPv6 addressing in cma_use_portHefty, Sean
cma_use_port() assumes that the sockaddr is an IPv4 address. Since IPv6 addressing is supported (and also to support other address families) make the code more generic in its address handling. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-15RDMA/cma: Replace global lock in rdma_destroy_id() with id-specific oneSean Hefty
rdma_destroy_id currently uses the global rdma cm 'lock' to test if an rdma_cm_id has been bound to a device. This prevents an active address resolution callback handler from assigning a device to the rdma_cm_id after rdma_destroy_id checks for one. Instead, we can replace the use of the global lock around the check to the rdma_cm_id device pointer by setting the id state to destroying, then flushing all active callbacks. The latter is accomplished by acquiring and releasing the handler_mutex. Any active handler will complete first, and any newly scheduled handlers will find the rdma_cm_id in an invalid state. In addition to optimizing the current locking scheme, the use of the rdma_cm_id mutex is a more intuitive synchronization mechanism than that of the global lock. These changes are based on feedback from Doug Ledford <dledford@redhat.com> while he was trying to debug a crash in the rdma cm destroy path. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-15RDMA/cma: Fix crash in request handlersSean Hefty
Doug Ledford and Red Hat reported a crash when running the rdma_cm on a real-time OS. The crash has the following call trace: cm_process_work cma_req_handler cma_disable_callback rdma_create_id kzalloc init_completion cma_get_net_info cma_save_net_info cma_any_addr cma_zero_addr rdma_translate_ip rdma_copy_addr cma_acquire_dev rdma_addr_get_sgid ib_find_cached_gid cma_attach_to_dev ucma_event_handler kzalloc ib_copy_ah_attr_to_user cma_comp [ preempted ] cma_write copy_from_user ucma_destroy_id copy_from_user _ucma_find_context ucma_put_ctx ucma_free_ctx rdma_destroy_id cma_exch cma_cancel_operation rdma_node_get_transport rt_mutex_slowunlock bad_area_nosemaphore oops_enter They were able to reproduce the crash multiple times with the following details: Crash seems to always happen on the: mutex_unlock(&conn_id->handler_mutex); as conn_id looks to have been freed during this code path. An examination of the code shows that a race exists in the request handlers. When a new connection request is received, the rdma_cm allocates a new connection identifier. This identifier has a single reference count on it. If a user calls rdma_destroy_id() from another thread after receiving a callback, rdma_destroy_id will proceed to destroy the id and free the associated memory. However, the request handlers may still be in the process of running. When control returns to the request handlers, they can attempt to access the newly created identifiers. Fix this by holding a reference on the newly created rdma_cm_id until the request handler is through accessing it. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Doug Ledford <dledford@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2010-10-25IB/core: Add VLAN support for IBoEEli Cohen
Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-13RDMA/cm: Add RDMA CM support for IBoE devicesEli Cohen
Add support for IBoE device binding and IP --> GID resolution. Path resolving and multicast joining are implemented within cma.c by filling in the responses and running callbacks in the CMA work queue. IP --> GID resolution always yields IPv6 link local addresses; remote GIDs are derived from the destination MAC address of the remote port. Multicast GIDs are always mapped to multicast MACs as is done in IPv6. (IPv4 multicast is enabled by translating IPv4 multicast addresses to IPv6 multicast as described in <http://www.mail-archive.com/ipng@sunroof.eng.sun.com/msg02134.html>.) Some helper functions are added to ib_addr.h. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-15Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', ↵Roland Dreier
'masked-atomics', 'misc', 'mthca' and 'nes' into for-next
2010-05-15IB/core: Use kmemdup() instead of kmalloc()+memcpy()Julia Lawall
Use kmemdup when some other buffer is immediately copied into the allocated region. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; statement S; @@ - to = \(kmalloc\|kzalloc\)(size,flag); + to = kmemdup(from,size,flag); if (to==NULL || ...) S - memcpy(to, from, size); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21RDMA/cma: Randomize local port allocationTetsuo Handa
Randomize local port allocation in the way sctp_get_port_local() does. Update rover at the end of loop since we're likely to pick a valid port on the first try. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Check correct variable for allocation failure RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp() RDMA/cm: Set num_paths when manually assigning path records IB/cm: Fix device_create() return value check
2010-04-07RDMA/cm: Set num_paths when manually assigning path recordsSean Hefty
When manually assigning the path records to use for a connection, save the number of paths that were set. Otherwise, checks against num_path will show 0, even though path record data is available. This was discovered by manually setting the path records from user space, then querying the kernel to see if the correct path records were assigned, only to discover that the kernel returned 0 path records to the query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-02-10RDMA/cm: Revert association of an RDMA device when binding to loopbackSean Hefty
Revert the following change from commit 6f8372b6 ("RDMA/cm: fix loopback address support") The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non- zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdma_bind_addr, no device is associated with the rdma_cm_id. Fix this. It turns out that important apps such as Open MPI depend on rdma_bind_addr() NOT associating any RDMA device when binding to a loopback address. Open MPI is being updated to deal with this, but at least until a new Open MPI release is available, maintain the previous behavior: allow rdma_bind_addr() to succeed, but do not bind to a device. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-01-06IB/addr: Correct CONFIG_IPv6 to CONFIG_IPV6Robert P. J. Day
Correct misspelled "CONFIG_IPv6" that was introduced in commit d14714df ("IB/addr: Fix IPv6 routing lookup"). The config variable should be all uppercase. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> [ This was my fault when I munged the original patch. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19IB/addr: Fix IPv6 routing lookupSean Hefty
Include link scope as part of address resolution. Combine local and remote address resolution into a single, simpler code path. Fix error checking in the IPv6 routing lookups. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fix up cma_check_linklocal() for !IPV6 case. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19RDMA/cm: fix loopback address supportSean Hefty
The RDMA CM is intended to support the use of a loopback address when establishing a connection; however, the behavior of the CM when loopback addresses are used is confusing and does not always work, depending on whether loopback was specified by the server, the client, or both. The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non- zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdam_bind_addr, no device is associated with the rdma_cm_id. Fix this. If a loopback address is specified by the client as the destination address for a connection, it will fail to establish a connection. This is true even if the server is listing across all addresses or on the loopback address itself. The issue is that the server tries to translate the IP address carried in the REQ message to a local net_device address, which fails. The translation is not needed in this case, since the REQ carries the actual HW address that should be used. Finally, cleanup loopback support to be more transport neutral. Replace separate calls to get/set the sgid and dgid from the device address to a single call that behaves correctly depending on the format of the device address. And support both IPv4 and IPv6 address formats. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fixed RDS build by s/ib_addr_get/rdma_addr_get/ - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19IB/addr: Store net_device type instead of translating to RDMA transportSean Hefty
The struct rdma_dev_addr stores net_device address information: the source device address, destination hardware address, and broadcast address. For consistency, store the net_device type rather than converting it to the rdma_node_type. The type indicates the format of the various hardware addresses, which is what we're concerned with, and not the RDMA node type that the address may map to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19RDMA/cma: Replace net_device pointer with indexSean Hefty
Provide the device interface when resolving route information to ensure that the correct outbound device is used. This will also simplify processing of sin6_scope_id for IPv6 support. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthrope@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19RDMA/cma: Fix AF_INET6 support in multicast joiningJason Gunthorpe
If joining to an AF_INET6 address, we need to map the address to a MGID in the same way as the IP stack. The old code would just fall through to the IPv4 case and generate garbage. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19RDMA/cma: Correct detection of SA Created MGIDJason Gunthorpe
RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs, but the detection for the prefix was buggy; fix it up. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-23RDMA: Add __init/__exit macros to addr.c and cma.cPeter Huewe
Add __init and __exit annotations to the module_init/module_exit functions from drivers/infiniband/core/addr.c and cma.c. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08RDMA/cma: Create cm id even when IB port is downYossi Etigin
When doing rdma_resolve_addr(), if the relevant IB port is down, the function fails and the cm_id is not bound to the correct device. Therefore, application does not have a device handle and cannot wait for the port to become active. The function fails because the underlying IPoIB interface is not joined to the broadcast group and therefore the SA does not have a multicast record to take a Q_Key from. The fix is to use lazy Q_Key resolution - cma_set_qkey() will set id_priv->qkey if it was not set, and will be called just before the Q_Key is really required. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-01RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groupsYossi Etigin
When joining an IPoIB multicast group, use the same rate as in the broadcast group. Otherwise, if the RDMA CM creates this group before IPoIB does, it might get a different rate. This will cause IPoIB to fail joining to the same group later on, because IPoIB uses strict rate selection. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-24RDMA/cma: Add IPv6 supportAleksey Senin
Handle AF_INET6 cases where required, and use struct sockaddr_storage wherever an IPv6 address might be stored. Signed-off-by: Aleksey Senin <aleksey@alst60.(none)> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-08-04RDMA/cma: Remove padding arrays by using struct sockaddr_storageRoland Dreier
There are a few places where the RDMA CM code handles IPv6 by doing struct sockaddr addr; u8 pad[sizeof(struct sockaddr_in6) - sizeof(struct sockaddr)]; This is fragile and ugly; handle this in a better way with just struct sockaddr_storage addr; [ Also roll in patch from Aleksey Senin <alekseys@voltaire.com> to switch to struct sockaddr_storage and get rid of padding arrays in struct rdma_addr. ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT eventAmir Vadai
Consumers that want to re-use their QPs in new connections need to know when the QP has exited the timewait state. Report the timewait event through the rdma_cm. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE eventOr Gerlitz
Add an RDMA_CM_EVENT_ADDR_CHANGE event can be used by rdma-cm consumers that wish to have their RDMA sessions always use the same links (eg <hca/port>) as the IP stack does. In the current code, this does not happen when bonding is used and fail-over happened but the IB link used by an already existing session is operating fine. Use the netevent notification for sensing that a change has happened in the IP stack, then scan the rdma-cm ID list to see if there is an ID that is "misaligned" with respect to the IP stack, and deliver RDMA_CM_EVENT_ADDR_CHANGE for this ID. The consumer can act on the event or just ignore it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cma: Simplify locking needed for serialization of callbacksOr Gerlitz
The RDMA CM has some logic in place to make sure that callbacks on a given CM ID are delivered to the consumer in a serialized manner. Specifically it has code to protect against a device removal racing with a running callback function. This patch simplifies this logic by using a mutex per ID instead of a wait queue and atomic variable. This means that cma_disable_remove() now is more properly named to cma_disable_callback(), and cma_enable_remove() can now be removed because it just would become a trivial wrapper around mutex_unlock(). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addrOr Gerlitz
Keep a pointer to the local (src) netdevice in struct rdma_dev_addr, and copy it in as part of rdma_copy_addr(). Use rdma_translate_ip() in cma_new_conn_id() to reduce some code duplication and also make sure the src_dev member gets set. In a high-availability configuration the netdevice pointer can be used by the RDMA CM to align RDMA sessions to use the same links as the IP stack does under fail-over and route change cases. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cma: Add missing newlines to printk()sRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Sean Hefty <sean.hefty@intel.com>
2008-07-14RDMA: Fix license textSean Hefty
The license text for several files references a third software license that was inadvertently copied in. Update the license to what was intended. This update was based on a request from HP. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16RDMA/iwcm: Test rdma_create_id() for IS_ERR rather than 0Julia Lawall
The function rdma_create_id() always returns either a valid pointer or a value made with ERR_PTR, so its result should be tested with IS_ERR, not with a test for 0. The problem was found using the following semantic match. (http://www.emn.fr/x-info/coccinelle/) //<smpl> @a@ expression E, E1; statement S,S1; position p; @@ E = rdma_create_id(...) ... when != E = E1 if@p (E) S else S1 @n@ position a.p; expression E,E1; statement S,S1; @@ E = NULL ... when != E = E1 if@p (E) S else S1 @depends on !n@ expression E; statement S,S1; position a.p; @@ * if@p (E) S else S1 //</smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-03-30trivial endianness annotations: infiniband coreAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14RDMA/cma: Do not issue MRA if user rejects connection requestSean Hefty
There's an undesirable interaction with issuing MRA requests to increase connection timeouts and the listen backlog. When the rdma_cm receives a connection request, it queues an MRA with the ib_cm. (The ib_cm will send an MRA if it receives a duplicate REQ.) The rdma_cm will then create a new rdma_cm_id and give that to the user, which in this case is the rdma_user_cm. If the listen backlog maintained in the rdma_user_cm is full, it destroys the rdma_cm_id, which in turns destroys the ib_cm_id. The ib_cm_id generates a REJ because the state of the ib_cm_id has changed to MRA sent, versus REQ received. When the backlog is full, we just want to drop the REQ so that it is retried later. Fix this by deferring queuing the MRA until after the user of the rdma_cm has examined the connection request. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-28[NETNS]: Add namespace parameter to ip_dev_find.Denis V. Lunev
in_dev_find() need a namespace to pass it to fib_get_table(), so add an argument. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4] drivers/infiniband: Use ipv4_is_<type>Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>