Age | Commit message | Author
2019-08-02 | arm64: dts: fsl: ls1028a: Enable eth port1 on the ls1028a QDS board | Claudiu Manoil
LS1028a has one Ethernet management interface. On the QDS board, the MDIO signals are multiplexed to either on-board AR8035 PHY device or to 4 PCIe slots allowing for SGMII cards. To enable the Ethernet ENETC Port 1, which can only be connected to a RGMII PHY, the multiplexer needs to be configured to route the MDIO to the AR8035 PHY. The MDIO/MDC routing is controlled by bits 7:4 of FPGA board config register 0x54, and value 0 selects the on-board RGMII PHY. The FPGA board config registers are accessible on the i2c bus, at address 0x66. The PF3 MDIO PCIe integrated endpoint device allows for centralized access to the MDIO bus. Add the corresponding devicetree node and set it to be the MDIO bus parent. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
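The mux selection described above amounts to a read-modify-write of one bit field. The following is a minimal sketch of that field arithmetic only; the macro and function names are hypothetical, and the i2c access itself is left out. The register offset, i2c address, bit range and the value 0 come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the commit message; all names here are hypothetical. */
#define QDS_FPGA_I2C_ADDR   0x66   /* i2c address of the FPGA board config registers */
#define QDS_FPGA_BRDCFG_REG 0x54   /* board config register that holds the mux field */
#define QDS_MDIO_MUX_SHIFT  4
#define QDS_MDIO_MUX_MASK   0xf0u  /* bits 7:4 */
#define QDS_MDIO_MUX_RGMII  0x0u   /* field value 0 selects the on-board RGMII PHY */

/* Return reg_val with only the MDIO mux field (bits 7:4) replaced by sel. */
static inline uint8_t qds_mdio_mux_set(uint8_t reg_val, uint8_t sel)
{
    assert(sel <= 0xf);
    return (uint8_t)((reg_val & ~QDS_MDIO_MUX_MASK) |
                     ((unsigned int)sel << QDS_MDIO_MUX_SHIFT));
}

/* Extract the current MDIO mux selection from a register value. */
static inline uint8_t qds_mdio_mux_get(uint8_t reg_val)
{
    return (uint8_t)((reg_val & QDS_MDIO_MUX_MASK) >> QDS_MDIO_MUX_SHIFT);
}
```

Routing MDIO to the AR8035 would then mean writing qds_mdio_mux_set(old, QDS_MDIO_MUX_RGMII) back to register 0x54; the lower four bits are preserved.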
2019-08-02 | dt-bindings: net: fsl: enetc: Add bindings for the central MDIO PCIe endpoint | Claudiu Manoil
The on-chip PCIe root complex that integrates the ENETC ethernet controllers also integrates a PCIe endpoint for the MDIO controller providing for centralized control of the ENETC mdio bus. Add bindings for this "central" MDIO Integrated PCIe Endpoint. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | enetc: Add mdio bus driver for the PCIe MDIO endpoint | Claudiu Manoil
ENETC ports can manage the MDIO bus via local register interface. However there's also a centralized way to manage the MDIO bus, via the MDIO PCIe endpoint device integrated by the same root complex that also integrates the ENETC ports (eth controllers). Depending on board design and use case, centralized access to MDIO may be better than using local ENETC port registers. For instance, on the LS1028A QDS board where MDIO muxing is required. Also, the LS1028A on-chip switch doesn't have a local MDIO register interface. The current patch registers the above PCIe endpoint as a separate MDIO bus and provides a driver for it by re-using the code used for local MDIO access. It also allows the ENETC port PHYs to be managed by this driver if the local "mdio" node is missing from the ENETC port node. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | enetc: Clean up makefile | Claudiu Manoil
Clean up overcomplicated makefile to make it more maintainable. Basically, there's a set of common objects shared between the PF and VF driver modules. This can be implemented in a simpler way, without conditionals, less repetition, allowing also for easier updates in the future. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | enetc: Clean up local mdio bus allocation | Claudiu Manoil
What's needed is basically a pointer to the mdio registers. This is one way to store it inside bus->priv allocated space, without upsetting sparse. Reworked accessors to avoid __iomem casting. Used devm_* variant to further clean up the init error / remove paths. Fixes following sparse warning: warning: incorrect type in assignment (different address spaces) expected void *priv got struct enetc_mdio_regs [noderef] <asn:2>*[assigned] regs Fixes: ebfcb23d62ab ("enetc: Add ENETC PF level external MDIO support") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | Merge branch 'net-dsa-mv88e6xxx-add-support-for-MV88E6220' | David S. Miller
Hubert Feurstein says: ==================== net: dsa: mv88e6xxx: add support for MV88E6220 This patch series adds support for the MV88E6220 chip to the mv88e6xxx driver. The MV88E6220 is almost the same as MV88E6250 except that the ports 2-4 are not routed to pins. Furthermore, PTP support is added to the MV88E6250 family. v2: - insert all 6220 entries in correct numerical order - introduce invalid_port_mask - move ptp_cc_mult* to ptp_ops and restored original ptp_adjfine code - added Andrews Reviewed-By to patch 2 and 4 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: dsa: mv88e6xxx: add PTP support for MV88E6250 family | Hubert Feurstein
This adds PTP support for the MV88E6250 family. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: dsa: mv88e6xxx: order ptp structs numerically ascending | Hubert Feurstein
As it is done for all the other structs within this driver. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: dsa: mv88e6xxx: setup message port is not supported in the 6250 family | Hubert Feurstein
The MV88E6250 family doesn't support the MV88E6XXX_PORT_CTL1_MESSAGE_PORT bit. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: dsa: mv88e6xxx: introduce invalid_port_mask in mv88e6xxx_info | Hubert Feurstein
With this it is possible to mark certain chip ports as invalid. This is required for example for the MV88E6220 (which is in general a MV88E6250 with 7 ports) but the ports 2-4 are not routed to pins. If a user configures an invalid port, an error is returned. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
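A port validity check against such a mask could look roughly like the sketch below. The struct, the function name and the choice of -EINVAL as the error code are assumptions made for illustration; only the idea of an invalid_port_mask and the MV88E6220 port layout (7 ports, ports 2-4 unusable) come from the commit messages.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced stand-in for the per-chip info structure. */
struct chip_info {
    unsigned int num_ports;          /* total ports of the chip family member */
    unsigned int invalid_port_mask;  /* bit N set: port N is not routed to pins */
};

/* Return 0 if the port may be configured, a negative errno otherwise. */
static inline int chip_check_port(const struct chip_info *info, unsigned int port)
{
    assert(info->num_ports <= 32);   /* the mask is modelled as one 32-bit word */

    if (port >= info->num_ports)
        return -EINVAL;              /* port does not exist on this chip */
    if (info->invalid_port_mask & (1u << port))
        return -EINVAL;              /* port exists in silicon but is masked out */
    return 0;
}
```

For an MV88E6220-like chip the mask would be 0x1c (bits 2, 3 and 4), which leaves ports 0, 1, 5 and 6 usable.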
2019-08-02 | dt-bindings: net: dsa: marvell: add 6220 model to the 6250 family | Hubert Feurstein
The MV88E6220 is part of the MV88E6250 family. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: dsa: mv88e6xxx: add support for MV88E6220 | Hubert Feurstein
The MV88E6220 is almost the same as MV88E6250 except that the ports 2-4 are not routed to pins. So the usable ports are 0, 1, 5 and 6. Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | Merge branch 'net-phy-Add-AST2600-MDIO-support' | David S. Miller
Andrew Jeffery says: ==================== net: phy: Add AST2600 MDIO support v2 of the ASPEED MDIO series addresses comments from Rob on the devicetree bindings and Andrew on the driver itself. v1 of the series can be found here: http://patchwork.ozlabs.org/cover/1138140/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: ftgmac100: Select ASPEED MDIO driver for the AST2600 | Andrew Jeffery
Ensures we can talk to a PHY via MDIO on the AST2600, as the MDIO controller is now separate from the MAC. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: ftgmac100: Add support for DT phy-handle property | Andrew Jeffery
phy-handle is necessary for the AST2600 which separates the MDIO controllers from the MAC. I've tried to minimise the intrusion of supporting the AST2600 to the FTGMAC100 by leaving in place the existing MDIO support for the embedded MDIO interface. The AST2400 and AST2500 continue to be supported this way, as it avoids breaking/reworking existing devicetrees. The AST2600 support by contrast requires the presence of the phy-handle property in the MAC devicetree node to specify the appropriate PHY to associate with the MAC. In the event that someone wants to specify the MDIO bus topology under the MAC node on an AST2400 or AST2500, the current auto-probe approach is done conditional on the absence of an "mdio" child node of the MAC. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | net: phy: Add mdio-aspeed | Andrew Jeffery
The AST2600 design separates the MDIO controllers from the MAC, which is where they were placed in the AST2400 and AST2500. Further, the register interface is reworked again, so now we have three possible different interface implementations, however this driver only supports the interface provided by the AST2600. The AST2400 and AST2500 will continue to be supported by the MDIO support embedded in the FTGMAC100 driver. The hardware supports both C22 and C45 mode, but for the moment only C22 support is implemented. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02 | dt-bindings: net: Add aspeed, ast2600-mdio binding | Andrew Jeffery
The AST2600 splits out the MDIO bus controller from the MAC into its own IP block and rearranges the register layout. Add a new binding to describe the new hardware. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | tipc: reduce risk of wakeup queue starvation | Jon Maloy
In commit 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") we allowed senders to add exactly one list of extra buffers to the link backlog queues during link congestion (aka "oversubscription"). However, the criteria for when to stop adding wakeup messages to the input queue when the overload abates is inaccurate, and may cause starvation problems during very high load. Currently, we stop adding wakeup messages after 10 total failed attempts where we find that there is no space left in the backlog queue for a certain importance level. The counter for this is accumulated across all levels, which may lead the algorithm to leave the loop prematurely, although there may still be plenty of space available at some levels. The result is sometimes that messages near the wakeup queue tail are not added to the input queue as they should be. We now introduce a more exact algorithm, where we keep adding wakeup messages to a level as long as the backlog queue has free slots for the corresponding level, and stop at the moment there are no more such slots or when there are no more wakeup messages to dequeue. Fixes: 365ad35 ("tipc: reduce risk of user starvation during link congestion") Reported-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
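The per-level accounting described above can be sketched as follows. This is a simplified model over plain arrays, not the TIPC code: the number of importance levels, the struct layout and all names are assumptions. It shows the key point that each level has its own budget of free backlog slots, so an exhausted level no longer stops wakeups for other levels.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_LEVELS 5  /* assumed number of importance levels */

/* Hypothetical per-level backlog state. */
struct backlog {
    int limit;  /* maximum number of queued buffers for this level */
    int len;    /* buffers currently queued for this level */
};

/*
 * Walk the wakeup queue in order. wakeup_imp[i] is the importance level of
 * the i-th pending wakeup message. moved[i] is set to 1 if that message is
 * moved to the input queue, 0 if it stays on the wakeup queue.
 * Returns the number of messages moved.
 */
static inline size_t prepare_wakeup(const struct backlog bl[NUM_LEVELS],
                                    const int *wakeup_imp, size_t n,
                                    unsigned char *moved)
{
    int avail[NUM_LEVELS];
    size_t count = 0;

    /* Free backlog slots are computed once per level, not shared. */
    for (int imp = 0; imp < NUM_LEVELS; imp++)
        avail[imp] = bl[imp].limit - bl[imp].len;

    for (size_t i = 0; i < n; i++) {
        int imp = wakeup_imp[i];

        assert(imp >= 0 && imp < NUM_LEVELS);
        moved[i] = 0;
        if (avail[imp] <= 0)
            continue;       /* this level is full; other levels may not be */
        avail[imp]--;
        moved[i] = 1;
        count++;
    }
    return count;
}
```

Messages near the tail of the wakeup queue are still considered even when earlier messages of a different level could not be moved.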
2019-08-01 | fm10k: reduce scope of the ring variable | Jacob Keller
Reduce the scope of the ring local variable in the fm10k_assign_l2_accel function. This was detected by cppcheck and resolves the following warning produced by that tool: [fm10k_netdev.c:1447]: (style) The scope of the variable 'ring' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of the result local variable | Jacob Keller
Reduce the scope of the result local variable in the fm10k_iov_msg_lport_state_pf function. This was detected by cppcheck and resolves the following warning produced by that tool: [fm10k_pf.c:1435]: (style) The scope of the variable 'result' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of the local msg variable | Jacob Keller
The msg variable in the fm10k_mbx_validate_msg_size and fm10k_sm_mbx_transmit functions is only used within the do {} loop scope. Reduce its scope only to where it is used. This was detected by cppcheck, and resolves the following warnings produced by that tool: [fm10k_mbx.c:299]: (style) The scope of the variable 'msg' can be reduced. [fm10k_mbx.c:2004]: (style) The scope of the variable 'msg' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
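The kind of change this series of fm10k commits makes can be illustrated with a small, made-up function. Nothing below is fm10k code; the header layout (length in the upper 16 bits) and the function name are hypothetical. The point is only that a variable used solely inside a do {} loop is declared inside that loop body, which is what silences the cppcheck style warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count headers whose (hypothetical) length field exceeds max_len. */
static inline size_t count_oversized(const uint32_t *hdrs, size_t n,
                                     uint32_t max_len)
{
    size_t bad = 0;
    size_t i = 0;

    if (n == 0)
        return 0;

    do {
        /* 'msg' is declared at its point of use instead of function scope. */
        uint32_t msg = hdrs[i];

        if ((msg >> 16) > max_len)
            bad++;
    } while (++i < n);

    return bad;
}
```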
2019-08-01 | fm10k: reduce the scope of the local i variable | Jacob Keller
Reduce the scope of the local loop variable in the fm10k_check_hang_subtask function. This was detected by cppcheck and resolves the following warning produced by that tool: [driver/fm10k_pci.c:852]: (style) The scope of the variable 'i' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of the err variable | Jacob Keller
Reduce the scope of the local variable err in the fm10k_detach_subtask function. This was detected by cppcheck and resolves the following warning produced by that tool: [fm10k_pci.c:403]: (style) The scope of the variable 'err' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of the tx_buffer variable | Jacob Keller
The tx_buffer local variable in the function fm10k_clean_tx_ring is not used except inside a smaller block scope. Reduce the scope to its point of use. This was detected by cppcheck and resolves the following style warning produced by that tool: [fm10k_netdev.c:179]: (style) The scope of the variable 'tx_buffer' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of the q_idx local variable | Jacob Keller
Reduce the scope of the q_idx local variable in the fm10k_cache_ring_qos function. This was detected by cppcheck and resolves the following style warning produced by that tool: [fm10k_main.c:2016]: (style) The scope of the variable 'q_idx' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of local err variable | Jacob Keller
Reduce the scope of the local err variable in the fm10k_iov_alloc_data function. This was detected by cppcheck and resolves the following style warning produced by that tool: [fm10k_iov.c:426]: (style) The scope of the variable 'err' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce the scope of qv local variable | Jacob Keller
Reduce the scope of the qv vector pointer local variable in the fm10k_set_coalesce function. This was detected by cppcheck and resolves the following style warning produced by that tool: [fm10k_ethtool.c:658]: (style) The scope of the variable 'qv' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce scope of *p local variable | Jacob Keller
Reduce the scope of the char *p local variable to only the block where it is used. This was detected by cppcheck and resolves the following style warning produced by that tool: [fm10k_ethtool.c:229]: (style) The scope of the variable 'p' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | fm10k: reduce scope of the err variable | Jacob Keller
Reduce the scope of the err local variable in the fm10k_dcbnl_ieee_setets function. This was detected using cppcheck, and resolves the following style warning: [fm10k_dcbnl.c:37]: (style) The scope of the variable 'err' can be reduced. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-01 | Merge branch 'net-dsa-mv88e6xxx-avoid-some-redundant-VTU-operations' | David S. Miller
Vivien Didelot says: ==================== net: dsa: mv88e6xxx: avoid some redundant VTU operations The mv88e6xxx driver currently uses a mv88e6xxx_vtu_get wrapper to get a single entry and uses a boolean to eventually initialize a fresh one. However the fresh entry is only needed in one place and mv88e6xxx_vtu_getnext is simple enough to call it directly. Doing so makes the code easier to read, especially for the return code expected by switchdev to honor software VLANs. In addition to not loading the VTU again when an entry is already correctly programmed, this also allows to avoid programming the broadcast entries again when updating a port's membership, from e.g. tagged to untagged. This patch series removes the mv88e6xxx_vtu_get wrapper in favor of direct calls to mv88e6xxx_vtu_getnext, and also renames the _mv88e6xxx_port_vlan_add and _mv88e6xxx_port_vlan_del helpers using an old underscore prefix convention. In case the port's membership is already correctly programmed in hardware, the following debug message may be printed: [ 745.989884] mv88e6085 2188000.ethernet-1:00: p4: already a member of VLAN 42 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | net: dsa: mv88e6xxx: call vtu_getnext directly in vlan_add | Vivien Didelot
Wrapping mv88e6xxx_vtu_getnext makes the code harder to read, and _mv88e6xxx_port_vlan_add is the only function that needs to prepare a new VLAN entry. To simplify things, remove the mv88e6xxx_vtu_get wrapper and make the VLAN lookup explicit in _mv88e6xxx_port_vlan_add. This rework also avoids programming the broadcast entries again when changing a port's membership, e.g. from tagged to untagged. At the same time, rename the helper, which uses an old underscore prefix convention. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | net: dsa: mv88e6xxx: call vtu_getnext directly in vlan_del | Vivien Didelot
Wrapping mv88e6xxx_vtu_getnext makes the code harder to read. Call mv88e6xxx_vtu_getnext explicitly in _mv88e6xxx_port_vlan_del and make explicit the return value expected by switchdev in case of software VLANs. At the same time, rename the helper, which uses an old underscore prefix convention. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | net: dsa: mv88e6xxx: call vtu_getnext directly in db load/purge | Vivien Didelot
mv88e6xxx_vtu_getnext is simple enough to call directly in the mv88e6xxx_port_db_load_purge function, which also makes explicit the return code expected by switchdev for software VLANs when a hardware VLAN does not exist. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | net: dsa: mv88e6xxx: explicit entry passed to vtu_getnext | Vivien Didelot
mv88e6xxx_vtu_getnext interprets two members of the input mv88e6xxx_vtu_entry structure: the (excluded) vid member to start the iteration from, and the valid member specifying whether the VID must be written or not (only required once at the start of a loop). Make the assignment of these two fields explicit right before calling mv88e6xxx_vtu_getnext, as is done in the mv88e6xxx_vtu_get wrapper. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
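The "start from an excluded VID, then fetch the next entry" pattern can be modelled in plain C as below. This is a software model over a sorted array, not the driver: the struct, the function names and the end-of-table convention (vid 0xfff with valid cleared) are assumptions for illustration, and the model does not represent the hardware meaning of the valid input beyond setting it before the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VID_MAX 0xfff  /* assumed end-of-table marker in this model */

/* Hypothetical, reduced VTU entry. */
struct vtu_entry {
    uint16_t vid;
    bool valid;
};

/*
 * Model of a get-next operation: 'table' holds the programmed VIDs in
 * ascending order. On return, *e describes the first programmed entry with
 * a VID strictly greater than the e->vid passed in, or an invalid entry.
 */
static inline void vtu_getnext(const uint16_t *table, size_t n,
                               struct vtu_entry *e)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i] > e->vid) {
            e->vid = table[i];
            e->valid = true;
            return;
        }
    }
    e->vid = VID_MAX;
    e->valid = false;
}

/* Look up one VID by starting the iteration just below it. */
static inline bool vtu_lookup(const uint16_t *table, size_t n, uint16_t vid)
{
    struct vtu_entry e;

    assert(vid >= 1 && vid < VID_MAX);
    e.vid = (uint16_t)(vid - 1);  /* excluded starting point */
    e.valid = false;              /* set explicitly before the call */
    vtu_getnext(table, n, &e);

    return e.valid && e.vid == vid;
}
```

Setting both fields immediately before the call keeps the lookup readable without a wrapper.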
2019-08-01 | net: dsa: mv88e6xxx: lock mutex in vlan_prepare | Vivien Didelot
Lock the mutex in the mv88e6xxx_port_vlan_prepare function called by the DSA stack, instead of doing it in the internal mv88e6xxx_port_check_hw_vlan helper. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 | net/mlx5e: Allow dropping specific tunnel packets | Tonghao Zhang
In some cases, we don't want specific tunnel packets to reach the host, so that they do not take up host CPU (e.g. network attacks). Other tunnel packets that are not matched in hardware will still be sent to the host.

$ tc filter add dev vxlan_sys_4789 \
    protocol ip chain 0 parent ffff: prio 1 handle 1 \
    flower dst_ip 1.1.1.100 ip_proto tcp dst_port 80 \
    enc_dst_ip 2.2.2.100 enc_key_id 100 enc_dst_port 4789 \
    action tunnel_key unset pipe action drop

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: TX reporter cleanup | Aya Levin
Remove redundant include files. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: Set tx reporter only on successful creation | Aya Levin
When failing to create tx reporter, don't set the reporter's pointer. Creating a reporter is not mandatory for driver load, avoid garbage/error pointer. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: Fix mlx5e_tx_reporter_create return value | Aya Levin
Return an error when failing to create a reporter in devlink. Since NET_DEVLINK is mandatory for MLX5_CORE in Kconfig, the returned pointer can't be NULL and can only hold an error in the bad path. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
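The two reporter fixes rely on the kernel convention of encoding an error code in a pointer. Below is a userspace sketch of that convention, assuming simplified re-implementations of the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers; the struct and function names are hypothetical and the code is not taken from mlx5. It shows the caller storing the reporter pointer only on success and propagating the error otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095  /* error codes occupy the top 4095 pointer values */

/* Simplified userspace versions of the kernel helpers. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical types standing in for a reporter and the driver private data. */
struct reporter { int id; };
struct priv { struct reporter *tx_reporter; };

/* Either returns a usable reporter or an encoded error, never NULL. */
static inline struct reporter *reporter_create(struct reporter *storage, int fail)
{
    if (fail)
        return err_ptr(-ENOMEM);
    storage->id = 1;
    return storage;
}

/* Store the reporter only when creation succeeded; report the error otherwise. */
static inline int tx_reporter_setup(struct priv *priv, struct reporter *storage,
                                    int fail)
{
    struct reporter *r = reporter_create(storage, fail);

    if (is_err(r))
        return (int)ptr_err(r);  /* priv->tx_reporter is left untouched */

    priv->tx_reporter = r;
    return 0;
}
```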
2019-08-01 | net/mlx5e: Rx, checksum handling refactoring | Saeed Mahameed
Move vlan checksum fixup flow into mlx5e_skb_padding_csum(), which is supposed to fixup SKB checksum if needed. And rename mlx5e_skb_padding_csum() to mlx5e_skb_csum_fixup(). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: Tx, Soften inline mode VLAN dependencies | Tariq Toukan
If capable, use zero inline mode in TX WQE for non-VLAN packets. For VLAN ones, keep the enforcement of at least L2 inline mode, unless the WQE VLAN insertion offload cap is on.

Performance: Tested single core packet rate of 64Bytes.
NIC: ConnectX-5
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz

pktgen:
Before: 12.46 Mpps
After: 14.65 Mpps (+17.5%)

XDP_TX: The MPWQE flow is not affected, as it already has this optimization. So we test with priv-flag xdp_tx_mpwqe: off.
Before: 9.90 Mpps
After: 10.20 Mpps (+3%)

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Tested-by: Noam Stolero <noams@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: XDP, Slight enhancement for WQE fetch function | Tariq Toukan
Instead of passing an output param, let function return the WQE pointer. In addition, pass &pi so it gets its value in the function, and save the redundant assignment that comes after it. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: XDP, Close TX MPWQE session when no room for inline packet left | Shay Agroskin
In MPWQE mode, when transmitting packets with XDP, a packet that is smaller than a certain size (set to 256 bytes) would be sent inline within its WQE TX descriptor (mem-copied), in case the hardware tx queue is congested beyond a pre-defined water-mark.

If a MPWQE cannot contain an additional inline packet, we close this MPWQE session, and send the packet inlined within the next MPWQE. To save some MPWQE session close+open operations, we don't open MPWQE sessions that are contiguously smaller than certain size (set to the HW MPWQE maximum size). If there isn't enough contiguous room in the send queue, we fill it with NOPs and wrap the send queue index around. This way, qualified packets are always sent inline.

Perf tests: Tested packet rate for UDP 64Byte multi-stream over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

XDP_TX:

With 24 channels:
| ------ | bounced packets | inlined packets | inline ratio |
| before | 113.6Mpps       | 96.3Mpps        | 84%          |
| after  | 115Mpps         | 99.5Mpps        | 86%          |

With one channel:
| ------ | bounced packets | inlined packets | inline ratio |
| before | 6.7Mpps         | 0pps            | 0%           |
| after  | 6.8Mpps         | 0pps            | 0%           |

As we can see, there is improvement in both inline ratio and overall packet rate for 24 channels. Also, we see no degradation for the one-channel case.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: Tx, Strict the room needed for SQ edge NOPs | Tariq Toukan
We use NOPs to populate the WQ fragment edge if the WQE does not fit in frag, to avoid WQEs crossing a page boundary (or wrap-around the WQ). The upper bound on the needed number of NOPs is one WQEBB less than the largest possible WQE, for otherwise the WQE would certainly fit. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
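The bound stated above follows from the condition under which NOPs are emitted at all. The sketch below models only the wrap-around edge of a power-of-two queue, not page-fragment edges, and every name in it (including the maximum WQE size of 16 WQEBBs) is an assumption for illustration. NOPs are needed only when the contiguous room is smaller than the WQE, so their count is at most one WQEBB less than the largest WQE.

```c
#include <assert.h>

#define MAX_WQE_BBS 16u  /* hypothetical largest WQE size, in WQEBBs */

/*
 * pc:      free-running producer counter, in WQEBBs
 * wq_size: queue size in WQEBBs, must be a power of two
 * wqe_bbs: size of the WQE about to be posted, in WQEBBs
 *
 * Returns how many NOP WQEBBs must be posted so that the WQE does not
 * cross the end of the queue; 0 if it fits in the contiguous room.
 */
static inline unsigned int sq_edge_nops(unsigned int pc, unsigned int wq_size,
                                        unsigned int wqe_bbs)
{
    unsigned int pi, contig;

    assert(wq_size != 0 && (wq_size & (wq_size - 1)) == 0);
    assert(wqe_bbs >= 1 && wqe_bbs <= MAX_WQE_BBS);

    pi = pc & (wq_size - 1);   /* producer index inside the queue */
    contig = wq_size - pi;     /* WQEBBs left before the edge */

    /* NOPs are only needed when contig < wqe_bbs, hence nops <= wqe_bbs - 1. */
    return wqe_bbs > contig ? contig : 0;
}
```

Reserving MAX_WQE_BBS - 1 WQEBBs for edge NOPs is therefore sufficient in this model.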
2019-08-01 | net/mlx5: Add flow counter pool | Gavi Teitz
Add a pool of flow counters, based on flow counter bulks, removing the need to allocate a new counter via a costly FW command during the flow creation process. The time it takes to acquire/release a flow counter is cut from ~50 [us] to ~50 [ns]. The pool is part of the mlx5 driver instance, and provides flow counters for aging flows. mlx5_fc_create() was modified to provide counters for aging flows from the pool by default, and mlx5_destroy_fc() was modified to release counters back to the pool for later reuse. If bulk allocation is not supported or fails, and for non-aging flows, the fallback behavior is to allocate and free individual counters. The pool is comprised of three lists of flow counter bulks, one of fully used bulks, one of partially used bulks, and one of unused bulks. Counters are provided from the partially used bulks first, to help limit bulk fragmentation. The pool maintains a threshold, and strives to maintain the amount of available counters below it. The pool is increased in size when a counter acquisition request is made and there are no available counters, and it is decreased in size when the last counter in a bulk is released and there are more available counters than the threshold. All pool size changes are done in the context of the acquiring/releasing process. The value of the threshold is directly correlated to the amount of used counters the pool is providing, while constrained by a hard maximum, and is recalculated every time a bulk is allocated/freed. This ensures that the pool only consumes large amounts of memory for available counters if the pool is being used heavily. When fully populated and at the hard maximum, the buffer of available counters consumes ~40 [MB]. Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
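The threshold policy described above can be sketched with two small helpers. The function names and the ratio parameter are hypothetical; the commit message only states that the threshold grows with the number of used counters, is capped by a hard maximum, and that a bulk is freed when its last counter is released while more counters are available than the threshold.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Threshold of available counters the pool tries to stay below: proportional
 * to the number of counters in use, capped by hard_max. 'ratio' is an
 * assumed tuning parameter (used counters per allowed available counter).
 */
static inline unsigned int fc_pool_threshold(unsigned int used,
                                             unsigned int ratio,
                                             unsigned int hard_max)
{
    unsigned int t;

    assert(ratio != 0);
    t = used / ratio;
    return t < hard_max ? t : hard_max;
}

/*
 * Decide whether to shrink the pool after a release: only when the bulk has
 * become completely unused and the pool holds more available counters than
 * the threshold allows.
 */
static inline bool fc_pool_should_free_bulk(unsigned int available,
                                            unsigned int threshold,
                                            unsigned int bulk_free,
                                            unsigned int bulk_len)
{
    return bulk_free == bulk_len && available > threshold;
}
```

With this shape, a lightly used pool frees unused bulks quickly, while a heavily used pool keeps a larger buffer of ready counters.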
2019-08-01 | net/mlx5: Add flow counter bulk infrastructure | Gavi Teitz
Add infrastructure to track bulks of flow counters, providing the means to allocate and deallocate bulks, and to acquire and release individual counters from the bulks. Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
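One simple way to track which counters of a bulk are in use is a bitmask, sketched below. This is an illustrative model, not the mlx5 implementation: the bulk length of 32, the struct layout and the function names are all assumptions. Acquiring and releasing a counter touches only the mask, which is the reason a bulk-based scheme can avoid a firmware command per counter.

```c
#include <assert.h>
#include <stdint.h>

#define BULK_LEN 32  /* assumed number of counters per bulk */

/* Hypothetical bulk: counters base_id .. base_id + BULK_LEN - 1. */
struct fc_bulk {
    uint32_t base_id;
    uint32_t free_mask;  /* bit i set: counter base_id + i is free */
};

static inline void fc_bulk_init(struct fc_bulk *b, uint32_t base_id)
{
    b->base_id = base_id;
    b->free_mask = UINT32_MAX;  /* all counters start out free */
}

/* Return the id of a free counter and mark it used, or -1 if the bulk is full. */
static inline int64_t fc_bulk_acquire(struct fc_bulk *b)
{
    for (unsigned int i = 0; i < BULK_LEN; i++) {
        uint32_t bit = (uint32_t)1 << i;

        if (b->free_mask & bit) {
            b->free_mask &= ~bit;
            return (int64_t)b->base_id + i;
        }
    }
    return -1;
}

/* Return a counter to the bulk; -1 for foreign ids or double release. */
static inline int fc_bulk_release(struct fc_bulk *b, uint32_t id)
{
    uint32_t bit;

    if (id < b->base_id || id - b->base_id >= BULK_LEN)
        return -1;
    bit = (uint32_t)1 << (id - b->base_id);
    if (b->free_mask & bit)
        return -1;
    b->free_mask |= bit;
    return 0;
}
```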
2019-08-01 | net/mlx5: E-Switch, add ingress rate support | Eli Cohen
Use the scheduling elements to implement an ingress rate limiter on an eswitch port's ingress traffic. Since the ingress of an eswitch port is the egress of the VF port, we control eswitch ingress by controlling VF egress. Configuration is done using the ports' representor net devices. Please note that burst size configuration is not supported by ConnectX-5 and earlier-generation devices.

Configuration examples:
tc:
    tc filter add dev enp59s0f0_0 root protocol ip matchall action police rate 1mbit burst 20k
ovs:
    ovs-vsctl set interface eth0 ingress_policing_rate=1000

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux | Saeed Mahameed
Misc updates from mlx5-next branch. 1) Eli improves the handling of the support for QoS element type 2) Gavi refactors and prepares mlx5 flow counters for bulk allocation support 3) Parav refactors and improves E-Switch load/unload flows 4) Saeed, two misc cleanups Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5: E-switch, Tidy up eswitch config sequence | Parav Pandit
Currently, for PF and ECPF vports, representors are created before their eswitch hardware ports are initialized, in the flow below.

mlx5_eswitch_enable()
esw_offloads_init()
esw_offloads_load_all_reps()
[..]
esw_enable_vport()

However, for VFs, vports are initialized before their respective netdev representors are created in event handling context.

Similarly, while disabling the eswitch, hardware vports are disabled first, followed by destroying their representors. Here the underlying vport gets destroyed while its user-facing netdevice can still exist, on which the user can continue to perform more offload operations.

Instead, it is more accurate to do:

enable_eswitch switchdev mode:
1. perform FDB tables initialization
2. initialize hw vport
3. create and publish representor for this vport

disable_eswitch switchdev mode:
1. destroy user facing representor for the vport
2. disable hw vport
3. perform FDB tables cleanup

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
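The proposed ordering is the usual "tear down in reverse order of setup" pattern. The sketch below records the steps in a trace instead of touching hardware; the enum, the trace structure and the failure-injection parameter are hypothetical and exist only to make the ordering observable. It also shows an error path on enable that unwinds only the steps already completed, which is an assumption about how such a sequence would typically be written.

```c
#include <assert.h>

enum step {
    FDB_INIT, VPORT_INIT, REP_CREATE,       /* enable order */
    REP_DESTROY, VPORT_DISABLE, FDB_CLEANUP /* disable order */
};

/* Records the steps performed, in order. */
struct trace {
    enum step steps[8];
    int n;
};

/* Perform one step, or fail without side effects if s == fail_at. */
static inline int do_step(struct trace *t, enum step s, int fail_at)
{
    if ((int)s == fail_at)
        return -1;
    assert(t->n < 8);
    t->steps[t->n++] = s;
    return 0;
}

/* Enable: FDB tables first, then the hw vport, then the representor. */
static inline int eswitch_enable(struct trace *t, int fail_at)
{
    int err;

    err = do_step(t, FDB_INIT, fail_at);
    if (err)
        return err;
    err = do_step(t, VPORT_INIT, fail_at);
    if (err)
        goto err_fdb;
    err = do_step(t, REP_CREATE, fail_at);
    if (err)
        goto err_vport;
    return 0;

err_vport:
    do_step(t, VPORT_DISABLE, -1);
err_fdb:
    do_step(t, FDB_CLEANUP, -1);
    return err;
}

/* Disable: exact reverse, so no representor outlives its hw vport. */
static inline void eswitch_disable(struct trace *t)
{
    do_step(t, REP_DESTROY, -1);
    do_step(t, VPORT_DISABLE, -1);
    do_step(t, FDB_CLEANUP, -1);
}
```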
2019-08-01 | net/mlx5: E-Switch, Remove redundant mc_promisc NULL check | Parav Pandit
The mc_promisc pointer points to an instance of struct esw_mc_addr allocated as part of the esw structure. Hence it cannot be NULL. Remove the redundant check and assign the pointer where it is actually used. While at it, add a comment around the legacy mode fields and move mc_promisc close to the other legacy mode structures to improve code readability. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>