summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-11ice: add low level PTP clock access functionsJacob Keller
Add the ice_ptp_hw.c file and some associated definitions to the ice driver folder. This file contains basic low level definitions for functions that interact with the device hardware. For now, only E810-based devices are supported. The ice hardware supports 2 major variants which have different PHYs with different procedures necessary for interacting with the device clock. Because the device captures timestamps in the PHY, each PHY has its own internal timer. The timers are synchronized in hardware by first preparing the source timer and the PHY timer shadow registers, and then issuing a synchronization command. This ensures that both the source timer and PHY timers are programmed simultaneously. The timers themselves are all driven from the same oscillator source. The functions in ice_ptp_hw.c abstract over the differences between how the PHYs in E810 are programmed vs how the PHYs in E822 devices are programmed. This series only implements E810 support, but E822 support will be added in a future change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11ice: add support for set/get of driver-stored firmware parametersJacob Keller
Depending on the device configuration, the ice hardware may share the PTP hardware clock timer between multiple PFs. Each PF is informed by firmware during initialization of the PTP timer association. When bringing up PTP, only the PFs which own the timer shall allocate a PTP hardware clock. Other PFs associated with that timer must report the correct PTP clock index in order to allow userspace software the ability to know which ports are connected to the same clock. To support this, the firmware has driver shared parameters. These parameters enable one PF to write the clock index into firmware, and have other PFs read the associated value out. This enables the driver to have only a single PF allocate and control the device timer registers, while other PFs associated with that timer can report the correct clock in the ETHTOOL_GET_TS_INFO report. Add support for the necessary admin queue commands to enable reading and writing of the driver shared parameters. This will be used in a future change to enable sharing the PTP clock index between PF drivers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11ice: process 1588 PTP capabilities during initializationJacob Keller
The device firmware reports PTP clock capabilities to each PF during initialization. This includes various information for both the overall device and the individual function, including For functions: * whether this function has timesync enabled * whether this function owns one of the 2 possible clock timers, and which one * which timer the function is associated with * the clock frequency, if the device supports multiple clock frequencies * The GPIO pin association for the timer owned by this PF, if any For the device: * Which PF owns timer 0, if any * Which PF owns timer 1, if any * whether timer 0 is enabled * whether timer 1 is enabled Extract the bits from the capabilities information reported by firmware and store them in the device and function capability structures.o This information will be used in a future change to have the function driver enable PTP hardware clock support. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11ice: add support for sideband messagesJacob Keller
In order to support certain device features, including enabling the PTP hardware clock, the ice driver needs to control some registers on the device PHY. These registers are accessed by sending sideband messages. For some hardware, these messages must be sent over the device admin queue, while other hardware has a dedicated control queue for the sideband messages. Add the neighbor device message structure for sending a message to the neighboring device. Where supported, initialize the sideband control queue and handle cleanup. Add a wrapper function for sending sideband control queue messages that read or write a neighboring device register. Because some devices send sideband messages over the AdminQ, also increase the length of the admin queue to allow more messages to be queued up. This is important because the sideband messages add additional pressure on the AQ usage. This support will be used in following patches to enable support for CONFIG_1588_PTP_CLOCK. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-10Merge branch 'ipa-mem-2'David S. Miller
Alex Elder says: ==================== net: ipa: memory region rework, part 2 This is the second portion of a set of patches updating the IPA memory region code. In this portion (part 2), the focus is on adjusting the code so that it no longer assumes the memory region descriptor array is indexed by the region identifier. This brings with it some related cleanup. Three loops are changed so their loop index variable is an unsigned rather than an enumerated type. A set of functions is changed so a region identifier (rather than a memory region descriptor pointer) is passed as argument, to simplify their call sites. This isn't entirely related or required, but I think it improves the code. A validation function for filter and route table memory regions is changed to take memory region IDs, rather than determining which region to validate based on a set of Boolean flags. Finally, ipa_mem_find() is created to abstract getting a memory descriptor based on its ID, and it is used everywhere rather than indexing the array. With that implemented, all of the memory regions can be defined by arrays of entries defined without providing index designators. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: don't index mem data array by IDAlex Elder
Finally the code handles the IPA memory region array in the configuration data without assuming it is indexed by region ID. Get rid of the array index designators where these arrays are initialized. As a result, there's no more need to define an explicitly undefined memory region ID, so get rid of that. Change ipa_mem_find() so it no longer assumes the ipa->mem[] array is indexed by memory region ID. Instead, have it search the array for the entry having the requested memory ID, and return the address of the descriptor if found. Otherwise return NULL. Stop allowing memory regions to be defined with zero size and zero canary value. Check for this condition in ipa_mem_valid_one(). As a result, it is not necessary to check for this case in ipa_mem_config(). Finally, there is no need for IPA_MEM_UNDEFINED to be defined any more, so get rid of it. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: introduce ipa_mem_find()Alex Elder
Introduce a new function that abstracts finding information about a region in IPA-local memory, given its memory region ID. For now it simply uses the region ID as an index into the IPA memory array. If the region is not defined, ipa_mem_find() returns a null pointer. Update all code that accesses the ipa->mem[] array directly to use ipa_mem_find() instead. The return value must be checked for null when optional memory regions are sought. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: pass memory id to ipa_table_valid_one()Alex Elder
Stop passing most of the Boolean flags to ipa_table_valid_one(), and just pass a memory region ID to it instead. We still need to indicate whether we're operating on a routing or filter table. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: pass mem_id to ipa_table_reset_add()Alex Elder
Pass a memory region ID rather than the address of a memory region descriptor to ipa_table_reset_add() to simplify callers. Similarly, pass memory region IDs to ipa_table_init_add(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: pass mem ID to ipa_mem_zero_region_add()Alex Elder
Pass a memory region ID rather than the address of a memory region descriptor to ipa_mem_zero_region_add() to simplify callers. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: pass mem_id to ipa_filter_reset_table()Alex Elder
Pass a memory region ID rather than the address of a memory region descriptor to ipa_filter_reset_table(), to simplify callers. We can eliminate the check for a zero region size in this function because ipa_table_reset_add() checks that before adding anything to the transaction. Note that here and in subsequent commits there is no need to check whether a memory region exists, because we will have already verified that during initialization. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: clean up header memory validationAlex Elder
Do some general cleanup in ipa_cmd_header_valid(): - Delay assigning the mem variable until just before it's used. - Assign the maximum offset and size values together. - Improve comments explaining the single range of memory being made up of a modem portion and an AP portion. - Record the offset of the combined range in a local variable. - Do the initial size assignment right after assigning the offset. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ipa: don't assume mem array indexed by IDAlex Elder
Change ipa_mem_valid() to iterate over the entries using a u32 index variable rather than using a memory region ID. Use the ID found inside the memory descriptor rather than the loop index. Change ipa_mem_size_valid() to iterate over the entries but without assuming the array index is the memory region ID. "Empty" entries will have zero size; and we'll temporarily assume such entries have zero offset as well (they all do, currently). Similarly, don't assume the mem[] array is indexed by ID in ipa_mem_config(). There, "empty" entries will have a zero canary count, so no special assumptions are needed to handle them correctly. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10ibmvnic: Allow device probe if the device is not ready at bootCristobal Forno
Allow the device to be initialized at a later time if it is not available at boot. The device will be allowed to probe but will be given a "down" state. After completing device probe and registering the net device, the driver will await an interrupt signal from its partner device, indicating that it is ready for boot. The driver will schedule a work event to perform the necessary procedure and begin operation. Co-developed-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10Merge branch 'marvell-prestera-lag'David S. Miller
Vadym Kochan says: ==================== net: marvell: prestera: add LAG support The following features are supported: - LAG basic operations - create/delete LAG - add/remove a member to LAG - enable/disable member in LAG - LAG Bridge support - LAG VLAN support - LAG FDB support Limitations: - Only HASH lag tx type is supported - The Hash parameters are not configurable. They are applied during the LAG creation stage. - Enslaving a port to the LAG device that already has an upper device is not supported. Changes extracted from: https://lkml.org/lkml/2021/2/3/877 and marked with "v2". v2: There are 2 additional preparation patches which simplifies the netdev topology handling. 1) Initialize 'lag' with NULL in prestera_lag_create() [suggested by Vladimir Oltean] 2) Use -ENOSPC in prestera_lag_port_add() if max lag [suggested by Vladimir Oltean] numbers were reached. 3) Do not propagate netdev events to prestera_switchdev [suggested by Vladimir Oltean] but call bridge specific funcs. It simplifies the code. 4) Check on info->link_up in prestera_netdev_port_lower_event() [suggested by Vladimir Oltean] 5) Return -EOPNOTSUPP in prestera_netdev_port_event() in case [suggested by Vladimir Oltean] LAG hashing mode is not supported. 6) Do not pass "lower" netdev to bridge join/leave functions. [suggested by Vladimir Oltean] It is not need as offloading settings applied on particular physical port. It requires to do extra upper dev lookup in case port is in the LAG which is in the bridge on vlans add/del. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: add LAG supportSerhiy Boiko
The following features are supported: - LAG basic operations - create/delete LAG - add/remove a member to LAG - enable/disable member in LAG - LAG Bridge support - LAG VLAN support - LAG FDB support Limitations: - Only HASH lag tx type is supported - The Hash parameters are not configurable. They are applied during the LAG creation stage. - Enslaving a port to the LAG device that already has an upper device is not supported. Co-developed-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Co-developed-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: do not propagate netdev events to prestera_switchdev.cVadym Kochan
Replace prestera_bridge_port_event(...) by prestera_bridge_port_join(...) and prestera_bridge_port_leave(). It simplifies the code by reading netdev event specific handling only once in prestera_main.c Signed-off-by: Vadym Kochan <vkochan@marvell.com> CC: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: move netdev topology validation to prestera_mainVadym Kochan
Move handling of PRECHANGEUPPER event from prestera_switchdev to prestera_main which is responsible for basic netdev events handling and routing them to related module. Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10soc: qcom: ipa: Remove superfluous error message around platform_get_irq()Tan Zhongjun
The platform_get_irq() prints error message telling that interrupt is missing,hence there is no need to duplicated that message in the drivers. Signed-off-by: Tan Zhongjun <tanzhongjun@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10dccp: tfrc: fix doc warnings in tfrc_equation.cBaokun Li
Add description for `tfrc_invert_loss_event_rate` to fix the W=1 warnings: net/dccp/ccids/lib/tfrc_equation.c:695: warning: Function parameter or member 'loss_event_rate' not described in 'tfrc_invert_loss_event_rate' Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10atm: Use list_for_each_entry() to simplify code in resources.cWang Hai
Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10ibmvnic: Use list_for_each_entry() to simplify code in ibmvnic.cWang Hai
Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: x25: Use list_for_each_entry() to simplify code in x25_route.cWang Hai
Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10mt76: mt7615: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: mido: mdio-mux-bcm-iproc: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code and avoid a null-ptr-deref by checking 'res' in it. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: stmmac: Fix mixed enum type warningWong Vee Khee
The commit 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure") introduced the following coverity warning: "Parse warning (PW.MIXED_ENUM_TYPE)" "1. mixed_enum_type: enumerated type mixed with another type" This is due to both "lo_state" and "lp_sate" which their datatype are enum stmmac_fpe_state type, and being assigned with "FPE_EVENT_UNKNOWN" which is a macro-defined of 0. Fixed this by assigned both these variables with the correct enum value. Fixes: 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure") Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10fjes: check return value after calling platform_get_resource()Yang Yingliang
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: axienet: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: w5100: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10Merge branch 'ixp4xxx_hss-cleanups'David S. Miller
Peng Li says: ==================== net: ixp4xx_hss: clean up some code style issues This patchset clean up some code style issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: add braces {} to all arms of the statementPeng Li
Braces {} should be used on all arms of this statement. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: fix the comments style issuePeng Li
Networking block comments don't use an empty /* line, use /* Comment... Block comments use * on subsequent lines. Block comments use a trailing */ on a separate line. This patch fixes the comments style issues. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: remove redundant spacesPeng Li
According to the chackpatch.pl, space prohibited after that open parenthesis '('. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: add some required spacesPeng Li
Add space required before the open parenthesis '('. Add space required after that close brace '}'. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: move out assignment in if conditionPeng Li
Should not use assignment in if condition. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: fix the code style issue about "foo* bar"Peng Li
Fix the checkpatch error as "foo* bar" and should be "foo *bar", and "(foo*)" should be "(foo *)". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: add blank line after declarationsPeng Li
This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ixp4xx_hss: remove redundant blank linesPeng Li
This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10tipc:subscr.c: fix a spelling mistakegushengxian
Fix a spelling mistake. Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10tipc: socket.c: fix the use of copular verbgushengxian
Fix the use of copular verb. Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10node.c: fix the use of indefinite articlegushengxian
Fix the use of indefinite article. Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10af_unix: remove the repeated word "and"gushengxian
Remove the repeated word "and". Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10vsock/vmci: remove the repeated word "be"gushengxian
Remove the repeated word "be". Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10Merge tag 'mlx5-updates-2021-06-09' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-06-09 Introduce steering header insert/remove and switchdev bridge offloads 1) From Yevgeny, Steering header insert/remove support ConnectX supports offloading of various encapsulations and decapsulations (e.g. VXLAN), which are performed by 'Packet Reformat' action. Starting with ConnectX-6 DX, a new reformat type is supported - INSERT_HEADER. This reformat allows inserting an arbitrary size buffer at a selected location in the packet on RX flows. The insert/remove header support are needed as a prerequisite for the bridge offloads vlan pop/push supprt, see below. 2) From Vlad, Support for bridge offloads for switchdev mode This change implements bridge offloads with VLAN-support that works on top of mlx5 representors in switchdev mode. HIGH-LEVEL OVERVIEW Hardware supported by mlx5 driver doesn't provide dynamic learning or aging functionality and requires the driver to emulate all switch-like behavior in software. As such, all packets by default go through miss path, appear on representor and get to software bridge, if it is the upper device of the representor. This causes bridge to process packet in software, learn the MAC address to FDB and send SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers. Upon reception of SWITCHDEV_FDB_ADD_TO_DEVICE notification mlx5 bridge offloads the FDB to hardware and sends back SWITCHDEV_FDB_ADD_TO_BRIDGE notification to prevent such entries from being aged out by kernel bridge. Leaving aging to kernel bridge would result deletion of offloaded dynamic FDB entries every aging_time period due to packets being processed by hardware and, consecutively, 'used' timestamp for FDB entry not being updated. Hardware aging is emulated in driver by running periodic workqueue task that manually updates the rules according to their hardware counter: - If hardware counter has changed since last update, the handler updates 'used' timestamp in kernel bridge dynamic entry by sending SWITCHDEV_FDB_ADD_TO_BRIDGE notification for the entry. - If FDB entry wasn't updated for user-controllable aging_time period, then the FDB entry is unoffloaded from hardware and corresponding SWITCHDEV_FDB_DEL_TO_BRIDGE notification is sent to kernel bridge. The mlx5 bridge offload implementation fully supports port VLAN objects, including PVID (vlan push) and "Egress Untagged" (vlan pop). SOFTWARE ARCHITECTURE Mlx5_eswitch is extended with pointer to new mlx5_esw_bridge_offloads structure which has a linked list of mlx5_esw_bridge objects. Struct mlx5_esw_bridge is the main switch object in mlx5 that holds all data for offloaded FDB entries and metadata for bridge ports and their vlans. The mlx5_esw_bridge object is created when first representor of eswitch vport is added to bridge and deleted when the last representor is detached from it. Bridge FDB entries are saved in linked list (to iterate over all FDB entries in aging workqueue task) and also in hashtable for quick lookup by MAC+VLAN tuple. Bridge FDB entries are saved in linked list (to iterate over all FDB entries in aging workqueue task) and in hashtable for quick lookup by MAC+VLAN tuple. Port metadata is stored in struct mlx5_esw_bridge_port that is saved in xarray to allow quick lookup by vport number. Part of the port metadata is the set of port vlans that are represented by mlx5_esw_bridge_vlan structure. The vlan structure points to all FDBs on vlan/port via fdb_list linked list. Simplified diagram of mlx5 bridge objects: +------------------+ | mxl5_eswitch | | | | br_offloads | +--------+---------+ | +--------v-------------------+ | mlx5_esw_bridge_offloads | | | +--> bridges | | +-------+--------------------+ | | | | | +---v---------------+ | | mlx5_esw_bridge | | | | | | vports | | | | | | fdb_ht | | +---+---------------+ | | | +---v---------------+ +------+ mlx5_esw_bridge | | | +-------------------------+ vports | | | | | | fdb_ht +------------------------------------------+ | +-------------------+ | | | | | | +----------------------+ +---------------------------+ | +-> mlx5_esw_bridge_port | +--> mlx5_esw_bridge_fdb_entry <-+ | | | +----------------------+ | +--+------------------------+ | | | vlans +--+-> mlx5_esw_bridge_vlan | | | | | | | | | | | +--v------------------------+ | | +----------------------+ | | fdb_list +--+ | mlx5_esw_bridge_fdb_entry <-+ | | +-------^--------------+ +--+------------------------+ | | +----------------------+ | | | | +-> mlx5_esw_bridge_port | | +-----------------------+ | | | | | | vlans | | -----------------------+ | | | +-> mlx5_esw_bridge_vlan | | +----------------------+ | | +---------------------------+ | | fdb_list +-----> mlx5_esw_bridge_fdb_entry <-+ +-------^--------------+ +--+------------------------+ | | +-----------------------+ HARDWARE REPRESENTATION In order to adhere to kernel software datapath model bridge offloads must come after TC and NF FDBs. However, since netfilter offload in mlx5 is implemented with unmanaged tables, its miss path is not automatically connected to next priority and requires the code to manually connect with slow table. To keep bridge offloads encapsulated and not mix it with eswitch offloads new FDB_TC_MISS priority is created between FDB_FT_OFFLOAD and FDB_SLOW_PATH which allows bridge offloads to be created without exposing its internal tables to any other modules since miss path of managed TC-miss table is automatically wired to next priority. The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB namespace. The new priority is between tc-miss and slow path priorities. Priority consist of two levels: the ingress table that is global per eswitch and matches incoming packets by src_mac/vid and redirects them to next level (egress table) that is chosen according to ingress port bridge membership and matches on dst_mac/vid in order to redirect packet to vport according to the following diagram: + | +---------v----------+ | | | FDB_TC_OFFLOAD | | | +---------+----------+ | | +---------v----------+ | | | FDB_FT_OFFLOAD | | | +---------+----------+ | | +---------v----------+ | | | FDB_TC_MISS | | | +---------+----------+ | +--------------------------------------+ | | | | +------+ | | | | | +------v--------+ FDB_BR_OFFLOAD | | | INGRESS_TABLE | | | +------+---+----+ | | | | match | | | +---------+ | | | | | +-------+ | | +-------v-------+ match | | | | | | EGRESS_TABLE +------------> vport | | | +-------+-------+ | | | | | | | +-------+ | | miss | | | +------+------+ | | | | +--------------------------------------+ | | +---------v----------+ | | | FDB_SLOW_PATH | | | +---------+----------+ | v PATCHES OVERVIEW 1-3 - Miscellaneous refactorings and infrastructure changes. 4 - Mlx5 bridge offload infrastructure and dedicated fs_core namespace/tables implementation. 5 - FDB entry offload. 6 - Dynamic FDB entry aging. 7-10 - VLAN filtering offload. 11 - Tracepoints for main mlx5 bridge offload events (FDB entry offload/unoffload, VLAN add/delete, etc.) ==================== Signed-off-by: David S. Miller <davem@davemloft.net> --
2021-06-10net: davinci_emac: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code and avoid a null-ptr-deref by checking 'res' in it. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: ethernet: ti: cpsw: Use devm_platform_get_and_ioremap_resource()Yang Yingliang
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: phy: probe for C45 PHYs that return PHY ID of zero in C22 spaceWong Vee Khee
PHY devices such as the Marvell Alaska 88E2110 does not return a valid PHY ID when probed using Clause-22. The current implementation treats PHY ID of zero as a non-error and valid PHY ID, and causing the PHY device failed to bind to the Marvell driver. For such devices, we do an additional probe in the Clause-45 space, if a valid PHY ID is returned, we then proceed to attach the PHY device to the matching PHY ID driver. Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10netlink: simplify NLMSG_DATA with NLMSG_HDRLENChen Li
The NLMSG_LENGTH(0) may confuse the API users, NLMSG_HDRLEN is much more clear. Besides, some code style problems are also fixed. Signed-off-by: Chen Li <chenli@uniontech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09net/mlx5: Bridge, add tracepointsVlad Buslov
Move private bridge structures to dedicated headers that is accessible to bridge tracepoint header. Implemented following tracepoints: - Initialize FDB entry. - Refresh FDB entry. - Cleanup FDB entry. - Create VLAN. - Cleanup VLAN. - Attach port to bridge. - Detach port from bridge. Usage example: ># cd /sys/kernel/debug/tracing ># echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event ># cat trace ... kworker/u20:1-96 [001] .... 231.892503: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 lastuse=4294895695 Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09net/mlx5: Bridge, filter tagged packets that didn't match tagged fgVlad Buslov
With support for pvid vlans in mlx5 bridge it is possible to have rules in untagged flow group when vlan filtering is enabled. However, such rules can also match tagged packets that didn't match anything in tagged flow group. Filter such packets by introducing additional flow group between tagged and untagged groups. When filtering is enabled on the bridge create additional flow in vlan filtering flow group and matches tagged packets with specified source MAC address and redirects them to new "skip" table. The skip table is new lowest-level empty table that is used to skip all further processing on packet in bridge priority. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>