summaryrefslogtreecommitdiff
path: root/include/linux/mlx5
AgeCommit message (Collapse)Author
2016-05-20Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "Primary 4.7 merge window changes - Updates to the new Intel X722 iWARP driver - Updates to the hfi1 driver - Fixes for the iw_cxgb4 driver - Misc core fixes - Generic RDMA READ/WRITE API addition - SRP updates - Misc ipoib updates - Minor mlx5 updates" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (148 commits) IB/mlx5: Fire the CQ completion handler from tasklet net/mlx5_core: Use tasklet for user-space CQ completion events IB/core: Do not require CAP_NET_ADMIN for packet sniffing IB/mlx4: Fix unaligned access in send_reply_to_slave IB/mlx5: Report Scatter FCS device capability when supported IB/mlx5: Add Scatter FCS support for Raw Packet QP IB/core: Add Scatter FCS create flag IB/core: Add Raw Scatter FCS device capability IB/core: Add extended device capability flags i40iw: pass hw_stats by reference rather than by value i40iw: Remove unnecessary synchronize_irq() before free_irq() i40iw: constify i40iw_vf_cqp_ops structure IB/mlx5: Add UARs write-combining and non-cached mapping IB/mlx5: Allow mapping the free running counter on PROT_EXEC IB/mlx4: Use list_for_each_entry_safe IB/SA: Use correct free function IB/core: Fix a potential array overrun in CMA and SA agent IB/core: Remove unnecessary check in ibnl_rcv_msg IB/IWPM: Fix a potential skb leak RDMA/nes: replace custom print_hex_dump() ...
2016-05-18net/mlx5_core: Use tasklet for user-space CQ completion eventsMatan Barak
Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx5 Ethernet napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-16net/mlx5_core: Flow counters infrastructureAmir Vadai
If a counter has the aging flag set when created, it is added to a list of counters that will be queried periodically from a workqueue. query result and last use timestamp are cached. add/del counter must be very efficient since thousands of such operations might be issued in a second. There is only a single reference to counters without aging, therefore no need for locks. But, counters with aging enabled are stored in a list. In order to make code as lockless as possible, all the list manipulation and access to hardware is done from a single context - the periodic counters query thread. The hardware supports multiple counters per FTE, however currently we are using one counter for each FTE. Signed-off-by: Amir Vadai <amirva@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16net/mlx5_core: Introduce flow steering destination of type counterAmir Vadai
When adding a flow steering rule with a counter, need to supply a destination of type MLX5_FLOW_DESTINATION_TYPE_COUNTER, with a pointer to a struct mlx5_fc. Also, MLX5_FLOW_CONTEXT_ACTION_COUNT bit should be set in the action. Signed-off-by: Amir Vadai <amirva@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16net/mlx5_core: Firmware commands to support flow countersAmir Vadai
Getting packet/byte statistics on flows is done through flow counters. Implement the firmware commands to alloc, free and query flow counters. Signed-off-by: Amir Vadai <amirva@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12net/mlx5: Update mlx5_ifc hardware featuresSaeed Mahameed
Adding the needed mlx5_ifc hardware bits and structs for the following features: * Add vport to steering commands for SRIOV ACL support * Add mlcr, pcmr and mcia registers for dump module EEPROM * Add support for FCS, beacon led and disable_link bits to hca caps * Add CQE period mode bit in CQ context for CQE based CQ moderation support * Add umr SQ bit for fragmented memory registration * Add needed bits and caps for Striding RQ support In-order to avoid possible future conflicts between rdma and net-next we added all expected updates to this file for this release. If more changes will be submitted, we plan to do it only through one of the subsystems, probably net-next. All updated bits in this patch will be later used in the up-coming submissions to net-next and rdma trees. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-12net/mlx5: Fix mlx5 ifc cmd_hca_cap bad offsetsTariq Toukan
All reserved fields after early_vf_enable are off by 1, since early_vf_enable was not explicitly declared as array of size 1. Reserved field before cqe_zip had a wrong size, it should be 0x80 + 0x3f. Fixes: b0844444590e ("net/mlx5_core: Introduce access function to read internal timer ") Fixes: b4ff3a36d3e4 ("net/mlx5: Use offset based reserved field names in the IFC header file") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-11net/mlx5e: CQE compressionTariq Toukan
CQE compression feature is meant to save PCIe bandwidth by compressing few CQEs into smaller amount of bytes on PCIe. CQE compression can be selectively enabled per CQ. By default is disabled for now and will be enabled later on. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04net/mlx5: Flow steering, Add vport ACL supportMohamad Haj Yahia
Update the relevant flow steering device structs and commands to support vport. Update the flow steering core API to receive vport number. Add ingress and egress ACL flow table name spaces. Add ACL flow table support: * ACL (Access Control List) flow table is a table that contains only allow/drop steering rules. * We have two types of ACL flow tables - ingress and egress. * ACLs handle traffic sent from/to E-Switch FDB table, Ingress refers to traffic sent from Vport to E-Switch and Egress refers to traffic sent from E-Switch to vport. * Ingress ACL flow table allow/drop rules is checked against traffic sent from VF. * Egress ACL flow table allow/drop rules is checked against traffic sent to VF. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/ipv4/ip_gre.c Minor conflicts between tunnel bug fixes in net and ipv6 tunnel cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Final set of -rc fixes for 4.6. I've collected up a number of patches that are all pretty small with the exception of only a couple. The hfi1 driver has a number of important patches, and it is what really drives the line count of this pull request up. These are all small and I've got this kernel built and running in the test lab (I have most of the hardware, I think nes is the only thing in this patch set that I can't say I've personally tested and have up and running). Summary: - A number of collected fixes for oopses, memory corruptions, deadlocks, etc. All of these fixes are small (many only 5-10 lines), obvious, and tested. - Fix for the security issue related to the use of write for bi-directional communications" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: RDMA/nes: don't leak skb if carrier down IB/security: Restrict use of the write() interface IB/hfi1: Use kernel default llseek for ui device IB/hfi1: Don't attempt to free resources if initialization failed IB/hfi1: Fix missing lock/unlock in verbs drain callback IB/rdmavt: Fix send scheduling IB/hfi1: Prevent unpinning of wrong pages IB/hfi1: Fix deadlock caused by locking with wrong scope IB/hfi1: Prevent NULL pointer deferences in caching code MAINTAINERS: Update iser/isert maintainer contact info IB/mlx5: Expose correct max_sge_rd limit RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips iw_cxgb4: handle draining an idle qp iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping IB/core: Don't drain non-existent rq queue-pair IB/core: Fix oops in ib_cache_gid_set_default_gid
2016-04-29net/mlx5: Initializing CPU reverse mappingMaor Gottlieb
Allocating CPU rmap and add entry for each IRQ. CPU rmap is used in aRFS to get the RX queue number of the RX completion interrupts. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29net/mlx5: Add user chosen levels when allocating flow tablesMaor Gottlieb
Currently, consumers of the flow steering infrastructure can't choose their own flow table levels and are limited to one flow table per level. This just waste levels. Instead, we introduce here the possibility to use multiple flow tables in a level. The user is free to connect these flow tables, while following the rule (FTEs in FT of level x could only point to FTs of level y where y > x). In addition this patch switch the order of the create/destroy flow tables of the NIC(vlan and main). Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29net/mlx5: Introduce modify flow rule destinationMaor Gottlieb
This API is used for modifying the flow rule destination. This is needed for modifying the pointed flow table by the traffic type classifier rules to point on the aRFS tables. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28Merge branch 'k.o/for-4.6-rc' into testing/4.6Doug Ledford
2016-04-28IB/mlx5: Expose correct max_sge_rd limitSagi Grimberg
mlx5 devices (Connect-IB, ConnectX-4, ConnectX-4-LX) has a limitation where rdma read work queue entries cannot exceed 512 bytes. A rdma_read wqe needs to fit in 512 bytes: - wqe control segment (16 bytes) - rdma segment (16 bytes) - scatter elements (16 bytes each) So max_sge_rd should be: (512 - 16 - 16) / 16 = 30. Cc: linux-stable@vger.kernel.org Reported-by: Christoph Hellwig <hch@lst.de> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@grimberg.me> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor overlapping changes in the conflicts. In the macsec case, the change of the default ID macro name overlapped with the 64-bit netlink attribute alignment fixes in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Fix checksum handling for non-stripped vlan packetsSaeed Mahameed
Now as rx-vlan offload can be disabled, packets can be received with vlan tag not stripped, which means is_first_ethertype_ip will return false, for that we need to check if the hardware reported csum OK so we will report CHECKSUM_UNNECESSARY for those packets. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Add ethtool support for rxvlan-offload (vlan stripping)Gal Pressman
Use ethtool -K <interface> rxvlan <on/off> to enable/disable C-TAG vlan stripping by hardware. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Add ethtool support for dump module EEPROMGal Pressman
Add query MCIA, PMLP registers infrastructure and commands. Add ethtool support for get_module_info() and get_module_eeprom() callbacks. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Add ethtool support for interface identify (LED blinking)Gal Pressman
Add the needed hardware command and mlx5_ifc structs for managing LED control. Add set_phys_id ethtool callback to support ethtool -p flag. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Add support for RXALL netdev featureEran Ben Elisha
Introduce new access register named Ports Check Mask Register (PCMR) to control all HW checks on port. With this register, the driver can enable/disable Hardware FCS validation. When RXALL is enabled/disabled using ndo_set_features, enable/disable fcs check at HW. User can change HW configuration using rx-all flag at ethtool. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Add link down events counterGal Pressman
Expose link_down_events counter through ethtool -S. This counter is read from PPort statistics, then proccessed and stored as a special handling software counter. This counter is stored along software counters since it is the only PPort counter that it's size is not 64 bits. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26net/mlx5e: Statistics handling refactoringGal Pressman
Redesign ethtool statistics handling and reporting in the driver: 1. Move counters to a separate file (en_stats.h). 2. Remove unnecessary dependencies between stats and strings. 3. Use counter descriptors which hold a name and offset for each counter, and will be used to decide which counters will be exposed. For example when adding a new software counter to ethtool, instead of: 1. Add to stats struct. 2. Add to strings struct in the same order. 3. Change macro defining number of software counters. The only thing needed is to link the new counter to a counter descriptor. VPort counters are a set of hardware traffic counters created automatically for each virtual port opened. PPort counters are a set of counters describing per physical port performance statistics. These counters are gathered from hardware register and divided to groups according to different protocols. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5: Add pci shutdown callbackMajd Dibbiny
This patch introduces kexec support for mlx5. When switching kernels, kexec() calls shutdown, which unloads the driver and cleans its resources. In addition, remove unregister netdev from shutdown flow. This will allow a clean shutdown, even if some netdev clients did not release their reference from this netdev. Releasing The HW resources only is enough as the kernel is shutting down Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Use vport MTU rather than physical port MTUSaeed Mahameed
Set and report vport MTU rather than physical MTU, Driver will set both vport and physical port mtu and will rely on the query of vport mtu. SRIOV VFs have to report their MTU to their vport manager (PF), and this will allow them to work with any MTU they need without failing the request. Also for some cases where the PF is not a port owner, PF can work with MTU less than the physical port mtu if set physical port mtu didn't take effect. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Device's mtu field is u16 and not intSaeed Mahameed
For set/query MTU port firmware commands the MTU field is 16 bits, here I changed all the "int mtu" parameters of the functions wrapping those firmware commands to be u16. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21net/mlx5e: Support RX multi-packet WQE (Striding RQ)Tariq Toukan
Introduce the feature of multi-packet WQE (RX Work Queue Element) referred to as (MPWQE or Striding RQ), in which WQEs are larger and serve multiple packets each. Every WQE consists of many strides of the same size, every received packet is aligned to a beginning of a stride and is written to consecutive strides within a WQE. In the regular approach, each regular WQE is big enough to be capable of serving one received packet of any size up to MTU or 64K in case of device LRO is enabled, making it very wasteful when dealing with small packets or device LRO is enabled. For its flexibility, MPWQE allows a better memory utilization (implying improvements in CPU utilization and packet rate) as packets consume strides according to their size, preserving the rest of the WQE to be available for other packets. MPWQE default configuration: Num of WQEs = 16 Strides Per WQE = 2048 Stride Size = 64 byte The default WQEs memory footprint went from 1024*mtu (~1.5MB) to 16 * 2048 * 64 = 2MB per ring. However, HW LRO can now be supported at no additional cost in memory footprint, and hence we turn it on by default and get an even better performance. Performance tested on ConnectX4-Lx 50G. To isolate the feature under test, the numbers below were measured with HW LRO turned off. We verified that the performance just improves when LRO is turned back on. * Netperf single TCP stream: - BW raised by 10-15% for representative packet sizes: default, 64B, 1024B, 1478B, 65536B. * Netperf multi TCP stream: - No degradation, line rate reached. * Pktgen: packet rate raised by 2-10% for traffic of different message sizes: 64B, 128B, 256B, 1024B, and 1500B. * Pktgen: packet loss in bursts of small messages (64byte), single stream: - | num packets | packets loss before | packets loss after | 2K | ~ 1K | 0 | 8K | ~ 6K | 0 | 16K | ~13K | 0 | 32K | ~28K | 0 | 64K | ~57K | ~24K As expected as the driver can receive as many small packets (<=64B) as the number of total strides in the ring (default = 2048 * 16) vs. 1024 (default ring size regardless of packets size) before this feature. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21net/mlx5: Introduce device queue countersTariq Toukan
A queue counter can collect several statistics for one or more hardware queues (QPs, RQs, etc ..) that the counter is attached to. For Ethernet it will provide an "out of buffer" counter which collects the number of all packets that are dropped due to lack of software buffers. Here we add device commands to alloc/query/dealloc queue counters. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15net/mlx5: Update mlx5_ifc hardware featuresSaeed Mahameed
Adding the needed mlx5_ifc hardware bits and structs for the following feature: * Add vport to steering commands for SRIOV ACL support * Add mlcr, pcmr and mcia registers for dump module EEPROM * Add support for FCS, baeacon led and disable_link bits to hca caps * Add CQE period mode bit in CQ context for CQE based CQ moderation support * Add umr SQ bit for fragmented memory registration * Add needed bits and caps for Striding RQ support Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15net/mlx5: Fix mlx5 ifc cmd_hca_cap bad offsetsTariq Toukan
All reserved fields after early_vf_enable are off by 1, since early_vf_enable was not explicitly declared as array of size 1. Reserved field before cqe_zip had a wrong size, it should be 0x80 + 0x3f. Fixes: b0844444590e ("net/mlx5_core: Introduce access function to read internal timer ") Fixes: b4ff3a36d3e4 ("net/mlx5: Use offset based reserved field names in the IFC header file") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-22Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull more rdma updates from Doug Ledford: "Round two of 4.6 merge window patches. This is a monster pull request. I held off on the hfi1 driver updates (the hfi1 driver is intimately tied to the qib driver and the new rdmavt software library that was created to help both of them) in my first pull request. The hfi1/qib/rdmavt update is probably 90% of this pull request. The hfi1 driver is being left in staging so that it can be fixed up in regards to the API that Al and yourself didn't like. Intel has agreed to do the work, but in the meantime, this clears out 300+ patches in the backlog queue and brings my tree and their tree closer to sync. This also includes about 10 patches to the core and a few to mlx5 to create an infrastructure for configuring SRIOV ports on IB devices. That series includes one patch to the net core that we sent to netdev@ and Dave Miller with each of the three revisions to the series. We didn't get any response to the patch, so we took that as implicit approval. Finally, this series includes Intel's new iWARP driver for their x722 cards. It's not nearly the beast as the hfi1 driver. It also has a linux-next merge issue, but that has been resolved and it now passes just fine. Summary: - A few minor core fixups needed for the next patch series - The IB SRIOV series. This has bounced around for several versions. Of note is the fact that the first patch in this series effects the net core. It was directed to netdev and DaveM for each iteration of the series (three versions total). Dave did not object, but did not respond either. I've taken this as permission to move forward with the series. - The new Intel X722 iWARP driver - A huge set of updates to the Intel hfi1 driver. Of particular interest here is that we have left the driver in staging since it still has an API that people object to. Intel is working on a fix, but getting these patches in now helps keep me sane as the upstream and Intel's trees were over 300 patches apart" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (362 commits) IB/ipoib: Allow mcast packets from other VFs IB/mlx5: Implement callbacks for manipulating VFs net/mlx5_core: Implement modify HCA vport command net/mlx5_core: Add VF param when querying vport counter IB/ipoib: Add ndo operations for configuring VFs IB/core: Add interfaces to control VF attributes IB/core: Support accessing SA in virtualized environment IB/core: Add subnet prefix to port info IB/mlx5: Fix decision on using MAD_IFC net/core: Add support for configuring VF GUIDs IB/{core, ulp} Support above 32 possible device capability flags IB/core: Replace setting the zero values in ib_uverbs_ex_query_device net/mlx5_core: Introduce offload arithmetic hardware capabilities net/mlx5_core: Refactor device capability function net/mlx5_core: Fix caching ATOMIC endian mode capability ib_srpt: fix a WARN_ON() message i40iw: Replace the obsolete crypto hash interface with shash IB/hfi1: Add SDMA cache eviction algorithm IB/hfi1: Switch to using the pin query function IB/hfi1: Specify mm when releasing pages ...
2016-03-21IB/mlx5: Implement callbacks for manipulating VFsEli Cohen
Implement the IB defined callbacks used to manipulate the policy for the link state, set GUIDs or get statistics information. This functionality is added into a new file that will be used to add any SRIOV related functionality to the mlx5 IB layer. The following callbacks have been added: mlx5_ib_get_vf_config mlx5_ib_set_vf_link_state mlx5_ib_get_vf_stats mlx5_ib_set_vf_guid In addition, publish whether this device is based on a virtual function. In mlx5 supported devices, virtual functions are implemented as vHCAs. vHCAs have their own QP number space so it is possible that two vHCAs will use a QP with the same number at the same time. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-21net/mlx5_core: Implement modify HCA vport commandEli Cohen
Implement the modify HCA vport commands used to modify the parameters of virtual HCA's ports. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-21net/mlx5_core: Add VF param when querying vport counterEli Cohen
Add a vf parameter to mlx5_core_query_vport_counter so we can call it to query counters of virtual functions. Also update current users of the API. PFs may call mlx5_core_query_vport_counter with other_vport set to indicate that they are querying a virtual function. The virtual function to be queried is given by the vf parameter. Virtual function numbering is zero based so the first VF is 0 and so on. When a PF queries its own function, the other_vport parameter is cleared. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-21net/mlx5_core: Introduce offload arithmetic hardware capabilitiesSagi Grimberg
Define the necessary hardware structures for the offload arithmetic capabilities and read/cache them on driver load. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-21net/mlx5_core: Refactor device capability functionLeon Romanovsky
Device capability function was called similar in all places. It was called twice for every queried parameter, while the difference between calls was in HCA capability mode only. The change proposed unify these calls into one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...
2016-03-18Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "Initial roundup of 4.6 merge window patches. This is the first of two pull requests. It is the smaller request, but touches for more different things (this is everything but what is in or going into staging). The pull request for the code in staging/rdma is on hold until after we decide what to do on the write/writev API issue and may be partially deferred until 4.7 as a result. Summary: - cxgb4 updates - nes updates - unification of iwarp portmapper code to core - add drain_cq API - various ib_core updates - minor ipoib updates - minor mlx4 updates - more significant mlx5 updates (including a minor merge conflict with net-next tree...merge is simple to resolve and Stephen's resolution was confirmed by Mellanox) - trivial net/9p rdma conversion - ocrdma RoCEv2 update - srpt updates" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (85 commits) iwpm: crash fix for large connections test iw_cxgb3: support for iWARP port mapping iw_cxgb4: remove port mapper related code iw_nes: remove port mapper related code iwcm: common code for port mapper net/9p: convert to new CQ API IB/mlx5: Add support for don't trap rules net/mlx5_core: Introduce forward to next priority action net/mlx5_core: Create anchor of last flow table iser: Accept arbitrary sg lists mapping if the device supports it mlx5: Add arbitrary sg list support IB/core: Add arbitrary sg_list support IB/mlx5: Expose correct max_fast_reg_page_list_len IB/mlx5: Make coding style more consistent IB/mlx5: Convert UMR CQ to new CQ API IB/ocrdma: Skip using unneeded intermediate variable IB/ocrdma: Skip using unneeded intermediate variable IB/ocrdma: Delete unnecessary variable initialisations in 11 functions IB/core: Documentation fix in the MAD header file IB/core: trivial prink cleanup. ...
2016-03-16Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6Doug Ledford
2016-03-10IB/mlx5: Add support for don't trap rulesMaor Gottlieb
Each bypass flow steering priority will be split into two priorities: 1. Priority for don't trap rules. 2. Priority for normal rules. When user creates a flow using IB_FLOW_ATTR_FLAGS_DONT_TRAP flag, the driver creates two flow rules, one used for receiving the traffic and the other one for forwarding the packet to continue matching in lower or equal priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10net/mlx5_core: Introduce forward to next priority actionMaor Gottlieb
Add support to create flow rule that forward packets to the first flow table in the next priority (next priority could be the first priority in the next namespace or the next priority in the same namespace). This feature could be used for DONT_TRAP rules or rules that only want to mark the packet with flow tag. In order to do it optimally, each flow table has list of all rules that point to this flow table, when a flow table is destroyed/created, we update the list head correspondingly. This kind of rule is created when destination is NULL and action is MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10net/mlx5_core: Create anchor of last flow tableMaor Gottlieb
Create an empty flow table in the end of NIC rx namesapce. Adding this flow table simplify the implementation of "forward to next prio" rules. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02net/mlx5e: Fix ethtool RX hash func configuration changeTariq Toukan
We should modify TIRs explicitly to apply the new RSS configuration. The light ndo close/open calls do not "refresh" them. Fixes: 2d75b2bc8a8c ('net/mlx5e: Add ethtool RSS configuration options') Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5: Fix global UAR mappingMoshe Lazer
Avoid double mapping of io mapped memory, Device page may be mapped to non-cached(NC) or to write-combining(WC). The code before this fix tries to map it both to WC and NC contrary to what stated in Intel's software developer manual. Here we remove the global WC mapping of all UARS "dev->priv.bf_mapping", since UAR mapping should be decided per UAR (e.g we want different mappings for EQs, CQs vs QPs). Caller will now have to choose whether to map via write-combining API or not. mlx5e SQs will choose write-combining in order to perform BlueFlame writes. Fixes: 88a85f99e51f ('TX latency optimization to save DMA reads') Signed-off-by: Moshe Lazer <moshel@mellanox.com> Reviewed-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5: Make command timeout way shorterOr Gerlitz
The command timeout is terribly long, whole two hours. Make it 60s so if things do go wrong, the user gets feedback in relatively short time, so they can take corrective actions and/or investigate using tools and such. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01IB/mlx5: Add memory windows allocation supportMatan Barak
This patch adds user-space support for memory windows allocation and deallocation. It also exposes the supported types via query_device_caps verb. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-01net/mlx5: Refactor mlx5_core_mr to mkeyMatan Barak
Mlx5's mkey mechanism is also used for memory windows. The current code base uses MR (memory region) naming, which is inaccurate. Changing MR to mkey in order to represent its different usages more accurately. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-01IB/mlx5: Add support for setting source QP numberHaggai Eran
In order to create multiple GSI QPs, we need to set the source QP number to one on all these QPs. Add the necessary definitions and infrastructure to do that. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>