summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-29net: stmmac: option to enable PHY WOL with PMT enabledLing Pei Lee
The current stmmac driver WOL implementation will enable MAC WOL if MAC HW PMT feature is on. Else, the driver will check for PHY WOL support. There is another case where MAC HW PMT is enabled but the platform still goes for the PHY WOL option. E.g, Intel platform are designed for PHY WOL but not MAC WOL although HW MAC PMT features are enabled. Introduce use_phy_wol platform data to select PHY WOL instead of depending on HW PMT features. Set use_phy_wol will disable the plat->pmt which currently used to determine the system to wake up by MAC WOL or PHY WOL. Signed-off-by: Ling Pei Lee <pei.lee.ling@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge branch 'ndo_dflt_fdb-print'David S. Miller
Vladimir Oltean says: ==================== Trivial print improvements in ndo_dflt_fdb_{add,del} These are some changes brought to the informational messages printed in the default .ndo_fdb_add and .ndo_fdb_del method implementations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}Vladimir Oltean
"Static" is a loaded word, and probably not what the author meant when the code was written. In particular, this looks weird: $ bridge fdb add dev swp0 00:01:02:03:04:05 local # totally fine, but $ bridge fdb add dev swp0 00:01:02:03:04:05 static [ 2020.708298] swp0: FDB only supports static addresses # hmm what? By looking at the implementation which uses dev_uc_add/dev_uc_del it is absolutely clear that only local addresses are supported, and the proper Network Unreachability Detection state is being used for this purpose (user space indeed sets NUD_PERMANENT when local addresses are meant). So it is just the message that is wrong, fix it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: use netdev_info in ndo_dflt_fdb_{add,del}Vladimir Oltean
Use the more modern printk helper for network interfaces, which also contains information about the associated struct device, and results in overall shorter line lengths compared to printing an open-coded dev->name. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29ptp: Set lookup cookie when creating a PTP PPS source.Jonathan Lemon
When creating a PTP device, the configuration block allows creation of an associated PPS device. However, there isn't any way to associate the two devices after creation. Set the PPS cookie, so pps_lookup_dev(ptp) performs correctly. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge branch 'inet-sk_error-tracers'David S. Miller
Alexander Aring says: ==================== net: sock: add tracers for inet socket errors this patch series introduce tracers for sk_error_report socket callback calls. The use-case is that a user space application can monitor them and making an own heuristic about bad peer connections even over a socket lifetime. To make a specific example it could be use in the Linux cluster world to fence a "bad" behaving node. For now it's okay to only trace inet sockets. Other socket families can introduce their own tracers easily. Example output with trace-cmd: <idle>-0 [003] 201.799437: inet_sk_error_report: family=AF_INET protocol=IPPROTO_TCP sport=21064 dport=38941 saddr=192.168.122.57 daddr=192.168.122.251 saddrv6=::ffff:192.168.122.57 daddrv6=::ffff:192.168.122.251 error=104 - Alex changes since v2: - change "sk.sk_error_report(&ipc->sk);" to "sk_error_report(&ipc->sk);" in net/qrtr/qrtr.c ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: sock: add trace for socket errorsAlexander Aring
This patch will add tracers to trace inet socket errors only. A user space monitor application can track connection errors indepedent from socket lifetime and do additional handling. For example a cluster manager can fence a node if errors occurs in a specific heuristic. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: sock: introduce sk_error_reportAlexander Aring
This patch introduces a function wrapper to call the sk_error_report callback. That will prepare to add additional handling whenever sk_error_report is called, for example to trace socket errors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge branch 'dsa-rx-filtering'David S. Miller
Vladimir Oltean says: ==================== RX filtering in DSA This is my fourth stab (identical to the third one except sent as non-RFC) at creating a list of unicast and multicast addresses that the DSA CPU ports must trap. I am reusing a lot of Tobias's work which he submitted here: https://patchwork.kernel.org/project/netdevbpf/cover/20210116012515.3152-1-tobias@waldekranz.com/ My additions to Tobias' work come in the form of taking some care that additions and removals of host addresses are properly balanced, so that we can do reference counting on them for cross-chip setups and multiple bridges spanning the same switch (I am working on an NXP board where both are real requirements). During the last attempted submission of multiple CPU ports for DSA: https://patchwork.kernel.org/project/netdevbpf/cover/20210410133454.4768-1-ansuelsmth@gmail.com/ it became clear that the concept of multiple CPU ports would not be compatible with the idea of address learning on those CPU ports (when those CPU ports are statically assigned to user ports, not in a LAG) unless the switch supports complete FDB isolation, which most switches do not. So DSA needs to manage in software all addresses that are installed on the CPU port(s), which is what this patch set does. Compared to all earlier attempts, this series does not fiddle with how DSA operates the ports in standalone mode at all, just when bridged. We need to sort that out properly, then any optimization that comes in standalone mode (i.e. IFF_UNICAST_FLT) can come later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: replay the local bridge FDB entries pointing to the bridge dev tooVladimir Oltean
When we join a bridge that already has some local addresses pointing to itself, we do not get those notifications. Similarly, when we leave that bridge, we do not get notifications for the deletion of those entries. The only switchdev notifications we get are those of entries added while the DSA port is enslaved to the bridge. This makes use cases such as the following work properly (with the number of additions and removals properly balanced): ip link add br0 type bridge ip link add br1 type bridge ip link set br0 address 00:01:02:03:04:05 ip link set br1 address 00:01:02:03:04:05 ip link set swp0 up ip link set swp1 up ip link set swp0 master br0 ip link set swp1 master br1 ip link set br0 up ip link set br1 up ip link del br1 # 00:01:02:03:04:05 still installed on the CPU port ip link del br0 # 00:01:02:03:04:05 finally removed from the CPU port Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are ↵Vladimir Oltean
on the same dev When (a) "dev" is a bridge port which the DSA switch tree offloads, but is otherwise not a dsa slave (such as a LAG netdev), or (b) "dev" is the bridge net device itself then strange things happen to the dev_hold/dev_put pair: dsa_schedule_work() will still be called with a DSA port that offloads that netdev, but dev_hold() will be called on the non-DSA netdev. Then the "if" condition in dsa_slave_switchdev_event_work() does not pass, because "dev" is not a DSA netdev, so dev_put() is not called. This results in the simple fact that we have a reference counting mismatch on the "dev" net device. This can be seen when we add support for host addresses installed on the bridge net device. ip link add br1 type bridge ip link set br1 address 00:01:02:03:04:05 ip link set swp0 master br1 ip link del br1 [ 968.512278] unregister_netdevice: waiting for br1 to become free. Usage count = 5 It seems foolish to do penny pinching and not add the net_device pointer in the dsa_switchdev_event_work structure, so let's finally do that. As an added bonus, when we start offloading local entries pointing towards the bridge, these will now properly appear as 'offloaded' in 'bridge fdb' (this was not possible before, because 'dev' was assumed to only be a DSA net device): 00:01:02:03:04:05 dev br0 vlan 1 offload master br0 permanent 00:01:02:03:04:05 dev br0 offload master br0 permanent Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: include fdb entries pointing to bridge in the host fdb listVladimir Oltean
The bridge supports a legacy way of adding local (non-forwarded) FDB entries, which works on an individual port basis: bridge fdb add dev swp0 00:01:02:03:04:05 master local As well as a new way, added by Roopa Prabhu in commit 3741873b4f73 ("bridge: allow adding of fdb entries pointing to the bridge device"): bridge fdb add dev br0 00:01:02:03:04:05 self local The two commands are functionally equivalent, except that the first one produces an entry with fdb->dst == swp0, and the other an entry with fdb->dst == NULL. The confusing part, though, is that even if fdb->dst is swp0 for the 'local on port' entry, that destination is not used. Nonetheless, the idea is that the bridge has reference counting for local entries, and local entries pointing towards the bridge are still 'as local' as local entries for a port. The bridge adds the MAC addresses of the interfaces automatically as FDB entries with is_local=1. For the MAC address of the ports, fdb->dst will be equal to the port, and for the MAC address of the bridge, fdb->dst will point towards the bridge (i.e. be NULL). Therefore, if the MAC address of the bridge is not inherited from either of the physical ports, then we must explicitly catch local FDB entries emitted towards the br0, otherwise we'll miss the MAC address of the bridge (and, of course, any entry with 'bridge add dev br0 ... self local'). Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: include bridge addresses which are local in the host fdb listTobias Waldekranz
The bridge automatically creates local (not forwarded) fdb entries pointing towards physical ports with their interface MAC addresses. For switchdev, the significance of these fdb entries is the exact opposite of that of non-local entries: instead of sending these frame outwards, we must send them inwards (towards the host). NOTE: The bridge's own MAC address is also "local". If that address is not shared with any port, the bridge's MAC is not be added by this functionality - but the following commit takes care of that case. NOTE 2: We mark these addresses as host-filtered regardless of the value of ds->assisted_learning_on_cpu_port. This is because, as opposed to the speculative logic done for dynamic address learning on foreign interfaces, the local FDB entries are rather fixed, so there isn't any risk of them migrating from one bridge port to another. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: sync static FDB entries on foreign interfaces to hardwareVladimir Oltean
DSA is able to install FDB entries towards the CPU port for addresses which were dynamically learnt by the software bridge on foreign interfaces that are in the same bridge with a DSA switch interface. Since this behavior is opportunistic, it is guarded by the "assisted_learning_on_cpu_port" property which can be enabled by drivers and is not done automatically (since certain switches may support address learning of packets coming from the CPU port). But if those FDB entries added on the foreign interfaces are static (added by the user) instead of dynamically learnt, currently DSA does not do anything (and arguably it should). Because static FDB entries are not supposed to move on their own, there is no downside in reusing the "assisted_learning_on_cpu_port" logic to sync static FDB entries to the DSA CPU port unconditionally, even if assisted_learning_on_cpu_port is not requested by the driver. For example, this situation: br0 / \ swp0 dummy0 $ bridge fdb add 02:00:de:ad:00:01 dev dummy0 vlan 1 master static Results in DSA adding an entry in the hardware FDB, pointing this address towards the CPU port. The same is true for entries added to the bridge itself, e.g: $ bridge fdb add 02:00:de:ad:00:01 dev br0 vlan 1 self local (except that right now, DSA still ignores 'local' FDB entries, this will be changed in a later patch) Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: install the host MDB and FDB entries in the master's RX filterVladimir Oltean
If the DSA master implements strict address filtering, then the unicast and multicast addresses kept by the DSA CPU ports should be synchronized with the address lists of the DSA master. Note that we want the synchronization of the master's address lists even if the DSA switch doesn't support unicast/multicast database operations, on the premises that the packets will be flooded to the CPU in that case, and we should still instruct the master to receive them. This is why we do the dev_uc_add() etc first, even if dsa_port_notify() returns -EOPNOTSUPP. In turn, dev_uc_add() and friends return error only if memory allocation fails, so it is probably ok to check and propagate that error code and not just ignore it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the FDB addresses at the cross-chip notifier levelVladimir Oltean
The same concerns expressed for host MDB entries are valid for host FDBs just as well: - in the case of multiple bridges spanning the same switch chip, deleting a host FDB entry that belongs to one bridge will result in breakage to the other bridge - not deleting FDB entries across DSA links means that the switch's hardware tables will eventually run out, given enough wear&tear So do the same thing and introduce reference counting for CPU ports and DSA links using the same data structures as we have for MDB entries. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce a separate cross-chip notifier type for host FDBsVladimir Oltean
DSA treats some bridge FDB entries by trapping them to the CPU port. Currently, the only class of such entries are FDB addresses learnt by the software bridge on a foreign interface. However there are many more to be added: - FDB entries with the is_local flag (for termination) added by the bridge on the user ports (typically containing the MAC address of the bridge port) - FDB entries pointing towards the bridge net device (for termination). Typically these contain the MAC address of the bridge net device. - Static FDB entries installed on a foreign interface that is in the same bridge with a DSA user port. The reason why a separate cross-chip notifier for host FDBs is justified compared to normal FDBs is the same as in the case of host MDBs: the cross-chip notifier matching function in switch.c should avoid installing these entries on routing ports that route towards the targeted switch, but not towards the CPU. This is required in order to have proper support for H-like multi-chip topologies. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the MDB entries at the cross-chip notifier levelVladimir Oltean
Ever since the cross-chip notifiers were introduced, the design was meant to be simplistic and just get the job done without worrying too much about dangling resources left behind. For example, somebody installs an MDB entry on sw0p0 in this daisy chain topology. It gets installed using ds->ops->port_mdb_add() on sw0p0, sw1p4 and sw2p4. | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] Then the same person deletes that MDB entry. The cross-chip notifier for deletion only matches sw0p0: | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ ] Why? Because the DSA links are 'trunk' ports, if we just go ahead and delete the MDB from sw1p4 and sw2p4 directly, we might delete those multicast entries when they are still needed. Just consider the fact that somebody does: - add a multicast MAC address towards sw0p0 [ via the cross-chip notifiers it gets installed on the DSA links too ] - add the same multicast MAC address towards sw0p1 (another port of that same switch) - delete the same multicast MAC address from sw0p0. At this point, if we deleted the MAC address from the DSA links, it would be flooded, even though there is still an entry on switch 0 which needs it not to. So that is why deletions only match the targeted source port and nothing on DSA links. Of course, dangling resources means that the hardware tables will eventually run out given enough additions/removals, but hey, at least it's simple. But there is a bigger concern which needs to be addressed, and that is our support for SWITCHDEV_OBJ_ID_HOST_MDB. DSA simply translates such an object into a dsa_port_host_mdb_add() which ends up as ds->ops->port_mdb_add() on the upstream port, and a similar thing happens on deletion: dsa_port_host_mdb_del() will trigger ds->ops->port_mdb_del() on the upstream port. When there are 2 VLAN-unaware bridges spanning the same switch (which is a use case DSA proudly supports), each bridge will install its own SWITCHDEV_OBJ_ID_HOST_MDB entries. But upon deletion, DSA goes ahead and emits a DSA_NOTIFIER_MDB_DEL for dp->cpu_dp, which is shared between the user ports enslaved to br0 and the user ports enslaved to br1. Not good. The host-trapped multicast addresses installed by br1 will be deleted when any state changes in br0 (IGMP timers expire, or ports leave, etc). To avoid this, we could of course go the route of the zero-sum game and delete the DSA_NOTIFIER_MDB_DEL call for dp->cpu_dp. But the better design is to just admit that on shared ports like DSA links and CPU ports, we should be reference counting calls, even if this consumes some dynamic memory which DSA has traditionally avoided. On the flip side, the hardware tables of switches are limited in size, so it would be good if the OS managed them properly instead of having them eventually overflow. To address the memory usage concern, we only apply the refcounting of MDB entries on ports that are really shared (CPU ports and DSA links) and not on user ports. In a typical single-switch setup, this means only the CPU port (and the host MDB entries are not that many, really). The name of the newly introduced data structures (dsa_mac_addr) is chosen in such a way that will be reusable for host FDB entries (next patch). With this change, we can finally have the same matching logic for the MDB additions and deletions, as well as for their host-trapped variants. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce a separate cross-chip notifier type for host MDBsVladimir Oltean
Commit abd49535c380 ("net: dsa: execute dsa_switch_mdb_add only for routing port in cross-chip topologies") does a surprisingly good job even for the SWITCHDEV_OBJ_ID_HOST_MDB use case, where DSA simply translates a switchdev object received on dp into a cross-chip notifier for dp->cpu_dp. To visualize how that works, imagine the daisy chain topology below and consider a SWITCHDEV_OBJ_ID_HOST_MDB object emitted on sw2p0. How does the cross-chip notifier know to match on all the right ports (sw0p4, the dedicated CPU port, sw1p4, an upstream DSA link, and sw2p4, another upstream DSA link)? | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] The answer is simple: the dedicated CPU port of sw2p0 is sw0p4, and dsa_routing_port returns the upstream port for all switches. That is fine, but there are other topologies where this does not work as well. There are trees with "H" topologies in the wild, where there are 2 or more switches with DSA links between them, but every switch has its dedicated CPU port. For these topologies, it seems stupid for the neighbor switches to install an MDB entry on the routing port, since these multicast addresses are fundamentally different than the usual ones we support (and that is the justification for this patch, to introduce the concept of a termination plane multicast MAC address, as opposed to a forwarding plane multicast MAC address). For example, when a SWITCHDEV_OBJ_ID_HOST_MDB would get added to sw0p0, without this patch, it would get treated as a regular port MDB on sw0p2 and it would match on the ports below (including the sw1p3 routing port). | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ x ] [ ] [ ] [ ] With the patch, the host MDB notifier on sw0p0 matches only on the local switch, which is what we want for a termination plane address. | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ ] [ ] [ ] [ ] Name this new matching function "dsa_switch_host_address_match" since we will be reusing it soon for host FDB entries as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce dsa_is_upstream_port and dsa_switch_is_upstream_ofVladimir Oltean
In preparation for the new cross-chip notifiers for host addresses, let's introduce some more topology helpers which we are going to use to discern switches that are in our path towards the dedicated CPU port from switches that aren't. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_delVladimir Oltean
We want to add reference counting for FDB entries in cross-chip topologies, and in order for that to have any chance of working and not be unbalanced (leading to entries which are never deleted), we need to ensure that higher layers are sane, because if they aren't, it's garbage in, garbage out. For example, if we add a bridge FDB entry twice, the bridge properly errors out: $ bridge fdb add dev swp0 00:01:02:03:04:07 master static $ bridge fdb add dev swp0 00:01:02:03:04:07 master static RTNETLINK answers: File exists However, the same thing cannot be said about the bridge bypass operations: $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ echo $? 0 But one 'bridge fdb del' is enough to remove the entry, no matter how many times it was added. The bridge bypass operations are impossible to maintain in these circumstances and lack of support for reference counting the cross-chip notifiers is holding us back from making further progress, so just drop support for them. The only way left for users to install static bridge FDB entries is the proper one, using the "master static" flags. With this change, rtnl_fdb_add() falls back to calling ndo_dflt_fdb_add() which uses the duplicate-exclusive variant of dev_uc_add(): dev_uc_add_excl(). Because DSA does not (yet) declare IFF_UNICAST_FLT, this results in us going to promiscuous mode: $ bridge fdb add dev swp0 00:01:02:03:04:05 [ 28.206743] device swp0 entered promiscuous mode $ bridge fdb add dev swp0 00:01:02:03:04:05 RTNETLINK answers: File exists So even if it does not completely fail, there is at least some indication that it is behaving differently from before, and closer to user space expectations, I would argue (the lack of a "local|static" specifier defaults to "local", or "host-only", so dev_uc_add() is a reasonable default implementation). If the generic implementation of .ndo_fdb_add provided by Vlad Yasevich is a proof of anything, it only proves that the implementation provided by DSA was always wrong, by not looking at "ndm->ndm_state & NUD_NOARP" (the "static" flag which means that the FDB entry points outwards) and "ndm->ndm_state & NUD_PERMANENT" (the "local" flag which means that the FDB entry points towards the host). It all used to mean the same thing to DSA. Update the documentation so that the users are not confused about what's going on. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: bridge: allow br_fdb_replay to be called for the bridge deviceVladimir Oltean
When a port joins a bridge which already has local FDB entries pointing to the bridge device itself, we would like to offload those, so allow the "dev" argument to be equal to the bridge too. The code already does what we need in that case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: bridge: switchdev: send FDB notifications for host addressesTobias Waldekranz
Treat addresses added to the bridge itself in the same way as regular ports and send out a notification so that drivers may sync it down to the hardware FDB. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: bridge: use READ_ONCE() and WRITE_ONCE() compiler barriers for fdb->dstVladimir Oltean
Annotate the writer side of fdb->dst: - fdb_create() - br_fdb_update() - fdb_add_entry() - br_fdb_external_learn_add() with WRITE_ONCE() and the reader side: - br_fdb_test_addr() - br_fdb_update() - fdb_fill_info() - fdb_add_entry() - fdb_delete_by_addr_and_port() - br_fdb_external_learn_add() - br_switchdev_fdb_notify() with compiler barriers such that the readers do not attempt to reload fdb->dst multiple times, leading to potentially different destination ports when the fdb entry is updated concurrently. This is especially important in read-side sections where fdb->dst is used more than once, but let's convert all accesses for the sake of uniformity. Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'do_once_lite'David S. Miller
Tanner Love says: ==================== net: update netdev_rx_csum_fault() print dump only once First patch implements DO_ONCE_LITE to abstract uses of the ".data.once" trick. It is defined in its own, new header file -- rather than alongside the existing DO_ONCE in include/linux/once.h -- because include/linux/once.h includes include/linux/jump_label.h, and this causes the build to break for some architectures if include/linux/once.h is included in include/linux/printk.h or include/asm-generic/bug.h. Second patch uses DO_ONCE_LITE in netdev_rx_csum_fault to print dump only once. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: update netdev_rx_csum_fault() print dump only onceTanner Love
Printing this stack dump multiple times does not provide additional useful information, and consumes time in the data path. Printing once is sufficient. Changes v2: Format indentation properly Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28once: implement DO_ONCE_LITE for non-fast-path "do once" functionalityTanner Love
Certain uses of "do once" functionality reside outside of fast path, and so do not require jump label patching via static keys, making existing DO_ONCE undesirable in such cases. Replace uses of __section(".data.once") with DO_ONCE_LITE(_IF)? This patch changes the return values of xfs_printk_once, printk_once, and printk_deferred_once. Before, they returned whether the print was performed, but now, they always return true. This is okay because the return values of the following macros are entirely ignored throughout the kernel: - xfs_printk_once - xfs_warn_once - xfs_notice_once - xfs_info_once - printk_once - pr_emerg_once - pr_alert_once - pr_crit_once - pr_err_once - pr_warn_once - pr_notice_once - pr_info_once - pr_devel_once - pr_debug_once - printk_deferred_once - orc_warn Changes v3: - Expand commit message to explain why changing return values of xfs_printk_once, printk_once, printk_deferred_once is benign v2: - Fix i386 build warnings Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: sparx5: Do not use mac_addr uninitialized in mchp_sparx5_probe()Nathan Chancellor
Clang warns: drivers/net/ethernet/microchip/sparx5/sparx5_main.c:760:29: warning: variable 'mac_addr' is uninitialized when used here [-Wuninitialized] if (of_get_mac_address(np, mac_addr)) { ^~~~~~~~ drivers/net/ethernet/microchip/sparx5/sparx5_main.c:669:14: note: initialize the variable 'mac_addr' to silence this warning u8 *mac_addr; ^ = NULL 1 warning generated. mac_addr is only used to store the value retrieved from of_get_mac_address(), which is then copied into the base_mac member of the sparx5 struct using ether_addr_copy(). It is easier to just use the base_mac address directly, which avoids the warning and the extra copy. Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Link: https://github.com/ClangBuiltLinux/linux/issues/1413 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: dsa: sja1105: fix dynamic access to L2 Address Lookup table for SJA1110Vladimir Oltean
The SJA1105P/Q/R/S and SJA1110 may have the same layout for the command to read/write/search for L2 Address Lookup entries, but as explained in the comments at the beginning of the sja1105_dynamic_config.c file, the command portion of the buffer is at the end, and we need to obtain a pointer to it by adding the length of the entry to the buffer. Alas, the length of an L2 Address Lookup entry is larger in SJA1110 than it is for SJA1105P/Q/R/S, so we need to create a common helper to access the command buffer, and this receives as argument the length of the entry buffer. Fixes: 3e77e59bf8cf ("net: dsa: sja1105: add support for the SJA1110 switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: mrp: Update the Test frames for MRAHoratiu Vultur
According to the standard IEC 62439-2, in case the node behaves as MRA and needs to send Test frames on ring ports, then these Test frames need to have an Option TLV and a Sub-Option TLV which has the type AUTO_MGR. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'for-net-next-2021-06-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for QCA_ROME device (0cf3:e500) and RTL8822CE - Update management interface revision to 21 - Use of incluse language - Proper handling of HCI_LE_Advertising_Set_Terminated event - Recovery handing of HCI ncmd=0 - Various memory fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messagesAndreas Roeseler
This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842 and adds functionality to respond to ICMPV6 PROBE requests. Add icmp_build_probe function to construct PROBE requests for both ICMPV4 and ICMPV6. Modify icmpv6_rcv to detect ICMPV6 PROBE messages and call the icmpv6_echo_reply handler. Modify icmpv6_echo_reply to build a PROBE response message based on the queried interface. This patch has been tested using a branch of the iputils git repo which can be found here: https://github.com/Juniper-Clinic-2020/iputils/tree/probe-request Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: sparx5: fix error return code in sparx5_register_notifier_blocks()Yang Yingliang
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: d6fce5141929 ("net: sparx5: add switching support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: sparx5: fix return value check in sparx5_create_targets()Yang Yingliang
In case of error, the function devm_ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: sparx5: check return value after calling platform_get_resource()Yang Yingliang
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'mlx5-updates-2021-06-26' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-06-26 This series provides small updates to mlx5 driver. 1) Increase hairpin buffer size 2) Improve peroformance in SF allocation 3) Add IPsec support to uplink representor 4) Add stats for number of deleted kTLS TX offloaded connections 5) Add support for flow sampler in SW steering ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'bridge-replay-helpers'David S. Miller
Vladimir Oltean says: ==================== Cleanup for the bridge replay helpers This patch series brings some improvements to the logic added to the bridge and DSA to handle LAG interfaces sandwiched between a bridge and a DSA switch port. br0 / \ / \ bond0 swp2 / \ / \ swp0 swp1 In particular, it ensures that the switchdev object additions and deletions are well balanced per physical port. This is important for future work in the area of offloading local bridge FDB entries to hardware in the context of DSA requesting a replay of those entries at bridge join time (this will be submitted in a future patch series). Due to some difficulty ensuring that the deletion of local FDB entries pointing towards the bridge device itself is notified to switchdev in time (before the switchdev port disconnects from the bridge), this is potentially still not the final form in which the replay helpers will exist. I'm thinking about moving from the pull mode (in which DSA requests the replay) to a push mode (in which the bridge initiates the replay). Nonetheless, these preliminary changes are needed either way. The patch series also addresses some feedback from Nikolai which is long overdue by now (sorry). Switchdev driver maintainers were deliberately omitted due to the trivial nature of the driver changes (just a function prototype). Changes in v2: - fix build issue in patch 4 (function prototype mismatch) - move switchdev object unsync to the NETDEV_PRECHANGEUPPER code path ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: dsa: replay a deletion of switchdev objects for ports leaving a bridged LAGVladimir Oltean
When a DSA switch port leaves a bonding interface that is under a bridge, there might be dangling switchdev objects on that port left behind, because the bridge is not aware that its lower interface (the bond) changed state in any way. Call the bridge replay helpers with adding=false before changing dp->bridge_dev to NULL, because we need to simulate to dsa_slave_port_obj_del() that these notifications were emitted by the bridge. We add this hook to the NETDEV_PRECHANGEUPPER event handler, because we are calling into switchdev (and the __switchdev_handle_port_obj_del fanout helpers expect the upper/lower adjacency lists to still be valid) and PRECHANGEUPPER is the last moment in time when they still are. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: dsa: refactor the prechangeupper sanity checks into a dedicated functionVladimir Oltean
We need to add more logic to the DSA NETDEV_PRECHANGEUPPER event handler, more exactly we need to request an unsync of switchdev objects. In order to fit more code, refactor the existing logic into a helper. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: allow the switchdev replay functions to be called for deletionVladimir Oltean
When a switchdev port leaves a LAG that is a bridge port, the switchdev objects and port attributes offloaded to that port are not removed: ip link add br0 type bridge ip link add bond0 type bond mode 802.3ad ip link set swp0 master bond0 ip link set bond0 master br0 bridge vlan add dev bond0 vid 100 ip link set swp0 nomaster VLAN 100 will remain installed on swp0 despite it going into standalone mode, because as far as the bridge is concerned, nothing ever happened to its bridge port. Let's extend the bridge vlan, fdb and mdb replay functions to take a 'bool adding' argument, and make DSA and ocelot call the replay functions with 'adding' as false from the switchdev unsync path, for the switch port that leaves the bridge. Note that this patch in itself does not salvage anything, because in the current pull mode of operation, DSA still needs to call the replay helpers with adding=false. This will be done in another patch. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: constify variables in the replay helpersVladimir Oltean
Some of the arguments and local variables for the newly added switchdev replay helpers can be const, so let's make them so. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: ignore switchdev events for LAG ports which didn't request replayVladimir Oltean
There is a slight inconvenience in the switchdev replay helpers added recently, and this is when: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 bridge vlan add dev bond0 vid 100 ip link set swp0 master bond0 ip link set swp1 master bond0 Since the underlying driver (currently only DSA) asks for a replay of VLANs when swp0 and swp1 join the LAG because it is bridged, what will happen is that DSA will try to react twice on the VLAN event for swp0. This is not really a huge problem right now, because most drivers accept duplicates since the bridge itself does, but it will become a problem when we add support for replaying switchdev object deletions. Let's fix this by adding a blank void *ctx in the replay helpers, which will be passed on by the bridge in the switchdev notifications. If the context is NULL, everything is the same as before. But if the context is populated with a valid pointer, the underlying switchdev driver (currently DSA) can use the pointer to 'see through' the bridge port (which in the example above is bond0) and 'know' that the event is only for a particular physical port offloading that bridge port, and not for all of them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: switchdev: add a context void pointer to struct switchdev_notifier_infoVladimir Oltean
In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: ocelot: delete call to br_fdb_replayVladimir Oltean
Not using this driver, I did not realize it doesn't react to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE notifications, but it implements just the bridge bypass operations (.ndo_fdb_{add,del}). So the call to br_fdb_replay just produces notifications that are ignored, delete it for now. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: include the is_local bit in br_fdb_replayVladimir Oltean
Since commit 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications"), the bridge emits SWITCHDEV_FDB_ADD_TO_DEVICE events with the is_local flag populated (but we ignore it nonetheless). We would like DSA to start treating this bit, but it is still not populated by the replay helper, so add it there too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28xdp: Move the rxq_info.mem clearing to unreg_mem_model()Jakub Kicinski
xdp_rxq_info_unreg() implicitly calls xdp_rxq_info_unreg_mem_model(). This may well be confusing to the driver authors, and lead to double free if they call xdp_rxq_info_unreg_mem_model() before xdp_rxq_info_unreg() (when mem model type == MEM_TYPE_PAGE_POOL). In fact error path of mvpp2_rxq_init() seems to currently do exactly that. The double free will result in refcount underflow in page_pool_destroy(). Make the interface a little more programmer friendly by clearing type and id so that xdp_rxq_info_unreg_mem_model() can be called multiple times. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210625221612.2637086-1-kuba@kernel.org
2021-06-28Merge branch 'bnxt_en-ptp'David S. Miller
Michael Chan says: ==================== bnxt_en: Add hardware PTP timestamping support on 575XX devices Add PTP RX and TX hardware timestamp support on 575XX devices. These devices use the two-step method to implement the IEEE-1588 timestamping support. v2: Add spinlock to serialize access to the timecounter. Use .do_aux_work() for the periodic timer reading and to get the TX timestamp from the firmware. Propagate error code from ptp_clock_register(). Make the 64-bit timer access safe on 32-bit CPUs. Read PHC using direct register access. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Enable hardware PTP supportMichael Chan
Call bnxt_ptp_init() to initialize and register with the clock driver to enable PTP support. Call bnxt_ptp_free() to unregister and clean up during shutdown. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Transmit and retrieve packet timestampsPavan Chebbi
Setup the TXBD to enable TX timestamp if requested. At TX packet DMA completion, if we requested TX timestamp on that packet, we defer to .do_aux_work() to obtain the TX timestamp from the firmware before we free the TX SKB. v2: Use .do_aux_work() to get the TX timestamp from firmware. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>