summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-23net/mlx5: simplify the return expression of mlx5_ec_init()Qinglang Miao
Simplify the return expression. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: Use kfree() to free fd->g in accel_fs_tcp_create_groups()Denis Efremov
Memory ft->g in accel_fs_tcp_create_groups() is allocaed with kcalloc(). It's excessive to free ft->g with kvfree(). Use kfree() instead. Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: IPsec: Use kvfree() for memory allocated with kvzalloc()Denis Efremov
Variables flow_group_in, spec in rx_fs_create() are allocated with kvzalloc(). It's incorrect to free them with kfree(). Use kvfree() instead. Fixes: 5e466345291a ("net/mlx5e: IPsec: Add IPsec steering in local NIC RX") Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: Keep direct reference to mlx5_core_dev in tc ctAriel Levkovich
Keep and use a direct reference to the mlx5 core device in all of tc_ct code instead of accessing it via a pointer to mlx5 eswitch in order to support nic mode ct offload for VF devices that don't have a valid eswitch pointer set. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: TC: Remove unused parameter from mlx5_tc_ct_add_no_trk_match()Saeed Mahameed
priv is never used in this function Fixes: 7e36feeb0467 ("net/mlx5e: CT: Don't offload tuple rewrites for established tuples") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: CT: Use the same counter for both directionsOz Shlomo
A connection is represented by two 5-tuple entries, one for each direction. Currently, each direction allocates its own hw counter, which is inefficient as ct aging is managed per connection. Share the counter that was allocated for the original direction with the reverse direction. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: Support CT offload for tc nic flowsAriel Levkovich
Adding support to perform CT related tc actions and matching on CT states for nic flows. The ct flows management and handling will be done using a new instance of the ct database that is declared in this patch to keep it separate from the eswitch ct flows database. Offloading and unoffloading ct flows will be done using the existing ct offload api by providing it the relevant ct database reference in each mode. In addition, refactoring the tc ct api is introduced to make it agnostic to the flow type and perform the resource allocations and rule insertion to the proper steering domain in the device. In the initialization call, the api requests and stores in the ct database instance all the relevant information that distinguishes between nic flows and esw flows, such as chains database, steering namespace and mod hdr table. This way the operations of adding and removing ct flows to the device can later performed agnostically to the flow type. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5e: rework ct offload init messagesAriel Levkovich
The changes are: - Use mlx5_core print macros instead of netdev_warn since netdev is not always initialized at that stage. - Print a warning message in case the issue is with lack of support for CT offload without indicating an error. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5e: Add tc chains offload support for nic flowsAriel Levkovich
Allow adding nic tc flow rules with goto chain action. Connecting the nic flows to the mlx5 chains infrastructure in previous patches allows us to support the creation of chained flow tables and rules that direct to another chain for further packet processing. This is a required preparation to support CT offloads for nic tc flows. We allow the creation of 256 different chains for nic flows since we have 8 bits available for the chain restore tag in case of a miss. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5: Refactor tc flow attributes structureAriel Levkovich
In order to support chains and connection tracking offload for nic flows, there's a need to introduce a common flow attributes struct so that these features can be agnostic and have access to a single attributes struct, regardless of the flow type. Therefore, a new tc flow attributes format is introduced to allow access to attributes that are common to eswitch and nic flows. The common attributes will always get allocated for the new flows, regardless of their type, while the type specific attributes are separated into different structs and will be allocated based on the flow type to avoid memory waste. When allocating the flow attributes the caller provides the flow steering namespace and according the namespace type the additional space for the extra, type specific, attributes is determined and added to the total attribute allocation size. In addition, the attributes that are going to be common to both flow types are moved to the common attributes struct. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-23net/mlx5e: Split nic tc flow allocation and creationAriel Levkovich
For future support of CT offload with nic tc flows, where the flow rule is not created immediately but rather following a future event, the patch is splitting the nic rule creation and deletion into 2 parts: 1. Creating/Deleting and setting the rule attributes. 2. Creating/Deleting the flow table and flow rule itself. This way the attributes can be prepared and stored in the flow handle when the tc flow is created but the rule can actually be created at any point in the future, using these pre allocated attributes. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5e: Tc nic flows to use mlx5_chains flow tablesAriel Levkovich
Change nic tc flows offload path to use the chains and prios infrastructure for the flow table creation as a preparation to support tc multi chains and priorities for nic flows. Adding an instance of the table chaining database to the nic tc struct and perform the root table creation and desctuction via the chains api while keeping the limit of a single chain (0) in nic tc mode. This will be extendable to supporting multiple chains in the following patches. The flow table sizes and default miss table parameters that are provided to the chains creation api are kept the same. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5: Allow ft level ignore for nic rx tablesAriel Levkovich
Allow setting a flow table with a lower level as a rule destination in nic rx tables. This is required in order to support table chaining of tc nic flows. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23net/mlx5: Refactor multi chains and prios supportAriel Levkovich
Decouple the chains infrastructure from eswitch and make it generic to support other steering namespaces. The change defines an agnostic data structure to keep all the relevant information for maintaining flow table chaining in any steering namespace. Each namespace that requires table chaining will be required to allocate such data structure. The chains creation code will receive the steering namespace and flow table parameters from the caller so it will operate agnosticly when creating the required resources to maintain the table chaining function while Parts of the code that are relevant to eswitch specific functionality are moved to eswitch files. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-23Merge branch 'net-bridge-mcast-IGMPv3-MLDv2-fast-path-part-2'David S. Miller
Nikolay Aleksandrov says: ==================== net: bridge: mcast: IGMPv3/MLDv2 fast-path (part 2) This is the second part of the IGMPv3/MLDv2 support which adds support for the fast-path. In order to be able to handle source entries we add mdb support for S,G entries (i.e. we add source address support to br_ip), that requires to extend the current mdb netlink API, fortunately we just add another attribute which will contain nested future mdb attributes, then we use it to add support for S,G user- add, del and dump. The lookup sequence is simple: when IGMPv3/MLDv2 are enabled do the S,G lookup first and if it fails fallback to *,G. The more complex part is when we begin handling source lists and auto-installing S,G entries and *,G filter mode transitions. We have the following cases: 1) *,G INCLUDE -> EXCLUDE transition: we need to install the port in all of *,G's installed S,G entries for proper replication (except the ones explicitly blocked), this is also necessary when adding a new *,G EXCLUDE port group 2) *,G EXCLUDE -> INCLUDE transition: we need to remove the port from all of *,G's installed S,G entries, this is also necessary when removing a *,G port group 3) New S,G port entry: we need to install all current *,G EXCLUDE ports 4) Remove S,G port entry: if all other port groups were auto-installed we can safely remove them and delete the whole S,G entry Currently we compute these operations from the available ports, their source lists and their filter mode. In the future we can extend the port group structure and reduce the running time of these ops. Also one current limitation is that host-joined S,G entries are not supported. I.e. one cannot add "dev bridge port bridge" mdb S,G entries. The host join is currently considered an EXCLUDE {} join, so it's reflected in all of *,G's installed S,G entries. If an S,G,port entry is added as temporary then the kernel can take it over if a source shows up from a report, permanent entries are skipped. In order to properly handle blocked sources we add a new port group blocked flag to avoid forwarding to that port group in the S,G. Finally when forwarding we use the port group filter mode (if it's INCLUDE and the port group is from a *,G then don't replicate to it, respectively if it's EXCLUDE then forward) and the blocked flag (obviously if it's set - skip that port unless it's a router port) to decide if the port should be skipped. Another limitation is that we can't do some of the above transitions without small traffic drop while installing/removing entries. That will be taken care of when we add atomic swap of port replication lists later. Patch break down: patches 1-3: prepare the mdb code for better extack support which is used in future patches to return a more meaningful error patches 4-6: add the source address field to struct br_ip, and do minor cleanups around it patches 7-8: extend the mdb netlink API so we can send new mdb attributes and uses the new API for S,G entry add/del/dump support patch 9: takes care of S,G entries when doing a lookup (first S,G then *,G lookup) patch 10: adds a new port group field and attribute for origin protocol we use the already available RTPROT_ definitions, currently user-space entries are added as RTPROT_STATIC and kernel entries are added as RTPROT_KERNEL, we may allow user-space to set custom values later (e.g. for FRR, clag) patch 11: adds an internal S,G,port rhashtable to speed up filter mode transitions patch 12: initial automatic install of S,G entries based on port groups' source lists patch 13: handles port group modes on transitions or when new port group entries are added patch 14: self-explanatory - adds support for blocked port group entries needed to stop forwarding to particular S,G,port entries patch 15: handles host-join/leave state changes, treats host-joins as EXCLUDE {} groups (reflected in all *,G's S,G entries) patch 16: finally adds the fast-path filter mode and block flag support Here're the sets that will come next (in order): - iproute2 support for IGMPv3/MLDv2 - selftests for all mode transitions and group flags - explicit host tracking for proper fast-leave support - atomic port replication lists (these are also needed for broadcast forwarding optimizations) - mode transition optimization and removal of open-coded sorted lists Not implemented yet: - Host IGMPv3/MLDv2 filter support (currently we handle only join/leave as before) - Proper other querier source timer and value updates - IGMPv3/v2 MLDv2/v1 compat (I have a few rough patches for this one) v2: fix build with CONFIG_BATMAN_ADV_MCAST in patch 6 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: when forwarding handle filter mode and blocked flagNikolay Aleksandrov
We need to avoid forwarding to ports in MCAST_INCLUDE filter mode when the mdst entry is a *,G or when the port has the blocked flag. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: handle host stateNikolay Aleksandrov
Since host joins are considered as EXCLUDE {} joins we need to reflect that in all of *,G ports' S,G entries. Since the S,Gs can have host_joined == true only set automatically we can safely set it to false when removing all automatically added entries upon S,G delete. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: add support for blocked port groupsNikolay Aleksandrov
When excluding S,G entries we need a way to block a particular S,G,port. The new port group flag is managed based on the source's timer as per RFCs 3376 and 3810. When a source expires and its port group is in EXCLUDE mode, it will be blocked. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: handle port group filter modesNikolay Aleksandrov
We need to handle group filter mode transitions and initial state. To change a port group's INCLUDE -> EXCLUDE mode (or when we have added a new port group in EXCLUDE mode) we need to add that port to all of *,G ports' S,G entries for proper replication. When the EXCLUDE state is changed from IGMPv3 report, br_multicast_fwd_filter_exclude() must be called after the source list processing because the assumption is that all of the group's S,G entries will be created before transitioning to EXCLUDE mode, i.e. most importantly its blocked entries will already be added so it will not get automatically added to them. The transition EXCLUDE -> INCLUDE happens only when a port group timer expires, it requires us to remove that port from all of *,G ports' S,G entries where it was automatically added previously. Finally when we are adding a new S,G entry we must add all of *,G's EXCLUDE ports to it. In order to distinguish automatically added *,G EXCLUDE ports we have a new port group flag - MDB_PG_FLAGS_STAR_EXCL. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: install S,G entries automatically based on reportsNikolay Aleksandrov
This patch adds support for automatic install of S,G mdb entries based on the port group's source list and the source entry's timer. Once installed the S,G will be used when forwarding packets if the approprate multicast/mld versions are set. A new source flag called BR_SGRP_F_INSTALLED denotes if the source has a forwarding mdb entry installed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: add sg_port rhashtableNikolay Aleksandrov
To speedup S,G forward handling we need to be able to quickly find out if a port is a member of an S,G group. To do that add a global S,G port rhashtable with key: source addr, group addr, protocol, vid (all br_ip fields) and port pointer. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: add rt_protocol field to the port group structNikolay Aleksandrov
We need to be able to differentiate between pg entries created by user-space and the kernel when we start generating S,G entries for IGMPv3/MLDv2's fast path. User-space entries are created by default as RTPROT_STATIC and the kernel entries are RTPROT_KERNEL. Later we can allow user-space to provide the entry rt_protocol so we can differentiate between who added the entries specifically (e.g. clag, admin, frr etc). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: when igmpv3/mldv2 are enabled lookup (S,G) first, then (*,G)Nikolay Aleksandrov
If (S,G) entries are enabled (igmpv3/mldv2) then look them up first. If there isn't a present (S,G) entry then try to find (*,G). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: add support for add/del/dump of entries with sourceNikolay Aleksandrov
Add new mdb attributes (MDBE_ATTR_SOURCE for setting, MDBA_MDB_EATTR_SOURCE for dumping) to allow add/del and dump of mdb entries with a source address (S,G). New S,G entries are created with filter mode of MCAST_INCLUDE. The same attributes are used for IPv4 and IPv6, they're validated and parsed based on their protocol. S,G host joined entries which are added by user are not allowed yet. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: add support to extend add/del commandsNikolay Aleksandrov
Since the MDB add/del code expects an exact struct br_mdb_entry we can't really add any extensions, thus add a new nested attribute at the level of MDBA_SET_ENTRY called MDBA_SET_ENTRY_ATTRS which will be used to pass all new options via netlink attributes. This patch doesn't change anything functionally since the new attribute is not used yet, only parsed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: rename br_ip's u member to dstNikolay Aleksandrov
Since now we have src in br_ip, u no longer makes sense so rename it to dst. No functional changes. v2: fix build with CONFIG_BATMAN_ADV_MCAST CC: Marek Lindner <mareklindner@neomailbox.ch> CC: Simon Wunderlich <sw@simonwunderlich.de> CC: Antonio Quartulli <a@unstable.cc> CC: Sven Eckelmann <sven@narfation.org> CC: b.a.t.m.a.n@lists.open-mesh.org Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: use br_ip's src for src groups and querier addressNikolay Aleksandrov
Now that we have src and dst in br_ip it is logical to use the src field for the cases where we need to work with a source address such as querier source address and group source address. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: add src field to br_ipNikolay Aleksandrov
Add a new src field to struct br_ip which will be used to lookup S, G entries. When SSM option is added we will enable full br_ip lookups. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: use extack in br_mdb_add() and br_mdb_add_group()Nikolay Aleksandrov
Pass and use extack all the way down to br_mdb_add_group(). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: move all port and bridge checks to br_mdb_addNikolay Aleksandrov
To avoid doing duplicate device checks and searches (the same were done in br_mdb_add and __br_mdb_add) pass the already found port to __br_mdb_add and pull the bridge's netif_running and enabled multicast checks to br_mdb_add. This would also simplify the future extack errors. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: use extack in br_mdb_parse()Nikolay Aleksandrov
We can drop the pr_info() calls and just use extack to return a meaningful error to user-space when br_mdb_parse() fails. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: realtek: Remove set but not used variableZheng Yongjun
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/realtek/8139cp.c: In function cp_tx_timeout: drivers/net/ethernet/realtek/8139cp.c:1242:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] `rc` is never used, so remove it. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23hinic: improve the comments of function headerLuo bin
Fix the warnings about function header comments when building hinic driver with "W=1" option. Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-09-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 95 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 4211 insertions(+), 2040 deletions(-). The main changes are: 1) Full multi function support in libbpf, from Andrii. 2) Refactoring of function argument checks, from Lorenz. 3) Make bpf_tail_call compatible with functions (subprograms), from Maciej. 4) Program metadata support, from YiFei. 5) bpf iterator optimizations, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23tools resolve_btfids: Always force HOSTARCHJiri Olsa
Seth reported problem with cross builds, that fail on resolve_btfids build, because we are trying to build it on cross build arch. Fixing this by always forcing the host arch. Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923185735.3048198-2-jolsa@kernel.org
2020-09-23bpf: Check CONFIG_BPF option for resolve_btfidsJiri Olsa
Currently all the resolve_btfids 'users' are under CONFIG_BPF code, so if we have CONFIG_BPF disabled, resolve_btfids will fail, because there's no data to resolve. Disabling resolve_btfids if there's CONFIG_BPF disabled, so we won't fail such builds. Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923185735.3048198-1-jolsa@kernel.org
2020-09-23Merge tag 'linux-can-next-for-5.10-20200923' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2020-09-23 this is a pull request of 20 patches for net-next. The complete series target the flexcan driver and is created by Joakim Zhang and me. The first six patches are cleanup (sort include files alphabetically, remove stray empty line, get rid of long lines) and adding more registers and documentation (registers and wakeup interrupt). Then in two patches the transceiver regulator is made optional, and a check for maximum transceiver bitrate is added. Then the ECC support for HW thats supports this is added. The next three patches improve suspend and low power mode handling. Followed by six patches that add CAN-FD support and CAN-FD related features. The last two patches add support for the flexcan IP core on the imx8qm and lx2160ar1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23Merge branch 's390-qeth-next'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: updates 2020-09-23 please apply the following patch series for qeth to netdev's net-next tree. This brings all sorts of cleanups. Highlights are more code sharing in the init/teardown paths, and more fine-grained rollback on errors during initialization (instead of a full-blown teardown). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: remove forward declarations in L2 codeJulian Wiedmann
Shuffle some code around (primarily all the discipline-related stuff) to get rid of all the unnecessary forward declarations. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: consolidate teardown codeJulian Wiedmann
Clarify which discipline-specific steps are needed to roll back after error in qeth_l?_set_online(), and which are common to roll back from qeth_hardsetup_card(). Some steps (cancelling the RX modeset, draining the TX queues) are only necessary if the netdev was potentially UP before, so move them to the common qeth_set_offline(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: consolidate online codeJulian Wiedmann
Move duplicated code from the disciplines into the core path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: cancel cmds earlier during teardownJulian Wiedmann
Originators of cmd IO typically hold the rtnl or conf_mutex to protect against a concurrent teardown. Since qeth_set_offline() already holds the conf_mutex, the main reason why we still care about cancelling pending cmds is so that they release the rtnl when we need it ourselves. So move this step a little earlier into the teardown sequence. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: tighten ucast IP lockingJulian Wiedmann
The programming of ucast IPs via qeth_l3_modify_ip() is driven independently from any of our typical locking mechanisms (eg. detaching the netdevice, or holding the conf_mutex). So when we inspect the card state to check whether the required cmd IO should be deferred, there is no protection against concurrent state changes. But by slightly re-ordering the teardown sequence, we can rely on the ip_lock to sufficiently serialize things: 1. when running concurrently to qeth_l3_set_online(), any instance of qeth_l3_modify_ip() that aquires the ip_lock _after_ qeth_l3_recover_ip() will observe the state as CARD_STATE_SOFTSETUP and not defer the IO. 2. when running concurrently to qeth_l3_set_offline(), any instance of qeth_l3_modify_ip() that aquires the ip_lock _after_ qeth_l3_clear_ip_htable() will observe the state as CARD_STATE_DOWN and defer the IO. These guarantees in mind, we can now drop the conf_mutex from the qeth_l3_modify_rxip_vipa() wrapper. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: replace deprecated simple_stroul()Julian Wiedmann
Convert the remaining occurences in sysfs code to kstrtouint(). While at it move some input parsing out of locked sections, replace an open-coded clamp() and remove some unnecessary run-time checks for ipatoe->mask_bits that are already enforced when creating the object. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: clean up string ops in qeth_l3_parse_ipatoe()Julian Wiedmann
Indicate the max number of to-be-parsed characters, and avoid copying the address sub-string. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: relax locking for ipato config dataJulian Wiedmann
card->ipato is currently protected by the conf_mutex. But most users also hold the ip_lock - in particular qeth_l3_add_ip(). So slightly expand the sections under ip_lock in a few places (to effectively cover a few error & no-op cases), and then drop the conf_mutex where it's no longer needed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23s390/qeth: don't init refcount twice for mcast IPsJulian Wiedmann
mcast IP objects are allocated within qeth_l3_add_mcast_rtnl(), with .ref_counter already set to 1 via qeth_l3_init_ipaddr(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23bpf: Explicitly size compatible_reg_typesLorenz Bauer
Arrays with designated initializers have an implicit length of the highest initialized value plus one. I used this to ensure that newly added entries in enum bpf_reg_type get a NULL entry in compatible_reg_types. This is difficult to understand since it requires knowledge of the peculiarities of designated initializers. Use __BPF_ARG_TYPE_MAX to size the array instead. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923160156.80814-1-lmb@cloudflare.com
2020-09-23net: microchip: Make `lan743x_pm_suspend` function return right valueZheng Yongjun
drivers/net/ethernet/microchip/lan743x_main.c: In function lan743x_pm_suspend: `ret` is set but not used. In fact, `pci_prepare_to_sleep` function value should be the right value of `lan743x_pm_suspend` function, therefore, fix it. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22Merge tag 'mlx5-updates-2020-09-21' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-09-21 Multi packet TX descriptor support for SKBs. This series introduces some refactoring of the regular TX data path in mlx5 and adds the Enhanced TX MPWQE feature support. MPWQE stands for multi-packet work queue element, and it can serve multiple packets, reducing the PCI bandwidth spent on control traffic. It should improve performance in scenarios where PCI is the bottleneck, and xmit_more is signaled by the kernel. The refactoring done in this series also improves the packet rate on its own. MPWQE is already implemented in the XDP tx path, this series adds the support of MPWQE for regular kernel SKB tx path. MPWQE is supported from ConnectX-5 and onward, for legacy devices we need to keep backward compatibility for regular (Single packet) WQE descriptor. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE per SKB. Prior to the final patch "net/mlx5e: Enhanced TX MPWQE for SKBs" that adds the actual support, Maxim did some refactoring to the tx data path to split it into stages and smaller helper functions that can be utilized and reused for both legacy and new MPWQE feature. Performance testing: UDP performance is improved in a single stream pktgen test: Packet rate: 16.86 Mpps (±0.15 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 434 -> 329 Cycles per packet: 158 -> 123 Instructions per cycle: 2.75 -> 2.67 TCP and XDP_TX single stream tests show no performance difference. MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>