Age  Commit message  Author
2021-06-28net: sparx5: fix error return code in sparx5_register_notifier_blocks()Yang Yingliang
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: d6fce5141929 ("net: sparx5: add switching support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
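The bug class fixed here can be sketched in plain C: an error path jumps to the exit label without first assigning a negative errno, so the caller sees 0. This is only an illustration; the names register_blocks_buggy(), register_blocks_fixed() and alloc_queue() are hypothetical and are not the sparx5 code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a step that can fail by returning NULL. */
static void *alloc_queue(int should_fail)
{
	return should_fail ? NULL : malloc(16);
}

/* Buggy shape: err is still 0 when the allocation fails. */
static int register_blocks_buggy(int should_fail)
{
	int err = 0;
	void *queue = alloc_queue(should_fail);

	if (!queue)
		goto out;	/* err was never set, so 0 is returned */
	free(queue);
out:
	return err;
}

/* Fixed shape: assign a negative errno before leaving on the error path. */
static int register_blocks_fixed(int should_fail)
{
	int err = 0;
	void *queue = alloc_queue(should_fail);

	if (!queue) {
		err = -ENOMEM;
		goto out;
	}
	free(queue);
out:
	return err;
}
```

With a forced failure, the buggy variant reports success while the fixed variant returns -ENOMEM.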
2021-06-28net: sparx5: fix return value check in sparx5_create_targets()Yang Yingliang
In case of error, the function devm_ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
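A userspace sketch of why the IS_ERR() test misses this failure: error pointers occupy the top MAX_ERRNO addresses, and NULL is not among them. The helpers is_err() and map_region() below are hypothetical re-creations for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creation of the IS_ERR() idea: the top MAX_ERRNO
 * addresses encode negative errno values. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping function that, like devm_ioremap(), reports
 * failure with NULL rather than with an encoded error pointer. */
static void *map_region(bool fail)
{
	static char region[64];

	return fail ? NULL : region;
}

/* Wrong check: NULL is not in the error-pointer range. */
static bool failed_wrong(const void *p)
{
	return is_err(p);
}

/* Right check for a NULL-returning function. */
static bool failed_right(const void *p)
{
	return p == NULL;
}
```

The wrong check lets a failed mapping through as if it were valid, which is the situation the NULL test repairs.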
2021-06-28net: sparx5: check return value after calling platform_get_resource()Yang Yingliang
If platform_get_resource() returns NULL, dereferencing the result causes a null-ptr-deref, so we need to check the return value. Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'mlx5-updates-2021-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller
Saeed Mahameed says:
====================
mlx5-updates-2021-06-26
This series provides small updates to mlx5 driver.
1) Increase hairpin buffer size
2) Improve performance in SF allocation
3) Add IPsec support to uplink representor
4) Add stats for number of deleted kTLS TX offloaded connections
5) Add support for flow sampler in SW steering
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'bridge-replay-helpers'David S. Miller
Vladimir Oltean says:
====================
Cleanup for the bridge replay helpers
This patch series brings some improvements to the logic added to the bridge and DSA to handle LAG interfaces sandwiched between a bridge and a DSA switch port.
        br0
        /  \
       /    \
     bond0  swp2
     /  \
    /    \
  swp0  swp1
In particular, it ensures that the switchdev object additions and deletions are well balanced per physical port. This is important for future work in the area of offloading local bridge FDB entries to hardware in the context of DSA requesting a replay of those entries at bridge join time (this will be submitted in a future patch series).
Due to some difficulty ensuring that the deletion of local FDB entries pointing towards the bridge device itself is notified to switchdev in time (before the switchdev port disconnects from the bridge), this is potentially still not the final form in which the replay helpers will exist. I'm thinking about moving from the pull mode (in which DSA requests the replay) to a push mode (in which the bridge initiates the replay). Nonetheless, these preliminary changes are needed either way.
The patch series also addresses some feedback from Nikolai which is long overdue by now (sorry).
Switchdev driver maintainers were deliberately omitted due to the trivial nature of the driver changes (just a function prototype).
Changes in v2:
- fix build issue in patch 4 (function prototype mismatch)
- move switchdev object unsync to the NETDEV_PRECHANGEUPPER code path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: dsa: replay a deletion of switchdev objects for ports leaving a bridged LAGVladimir Oltean
When a DSA switch port leaves a bonding interface that is under a bridge, there might be dangling switchdev objects on that port left behind, because the bridge is not aware that its lower interface (the bond) changed state in any way. Call the bridge replay helpers with adding=false before changing dp->bridge_dev to NULL, because we need to simulate to dsa_slave_port_obj_del() that these notifications were emitted by the bridge. We add this hook to the NETDEV_PRECHANGEUPPER event handler, because we are calling into switchdev (and the __switchdev_handle_port_obj_del fanout helpers expect the upper/lower adjacency lists to still be valid) and PRECHANGEUPPER is the last moment in time when they still are. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: dsa: refactor the prechangeupper sanity checks into a dedicated functionVladimir Oltean
We need to add more logic to the DSA NETDEV_PRECHANGEUPPER event handler, more exactly we need to request an unsync of switchdev objects. In order to fit more code, refactor the existing logic into a helper. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: allow the switchdev replay functions to be called for deletionVladimir Oltean
When a switchdev port leaves a LAG that is a bridge port, the switchdev objects and port attributes offloaded to that port are not removed:
  ip link add br0 type bridge
  ip link add bond0 type bond mode 802.3ad
  ip link set swp0 master bond0
  ip link set bond0 master br0
  bridge vlan add dev bond0 vid 100
  ip link set swp0 nomaster
VLAN 100 will remain installed on swp0 despite it going into standalone mode, because as far as the bridge is concerned, nothing ever happened to its bridge port.
Let's extend the bridge vlan, fdb and mdb replay functions to take a 'bool adding' argument, and make DSA and ocelot call the replay functions with 'adding' as false from the switchdev unsync path, for the switch port that leaves the bridge.
Note that this patch in itself does not salvage anything, because in the current pull mode of operation, DSA still needs to call the replay helpers with adding=false. This will be done in another patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: constify variables in the replay helpersVladimir Oltean
Some of the arguments and local variables for the newly added switchdev replay helpers can be const, so let's make them so. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: ignore switchdev events for LAG ports which didn't request replayVladimir Oltean
There is a slight inconvenience in the switchdev replay helpers added recently, and this is when:
  ip link add br0 type bridge
  ip link add bond0 type bond
  ip link set bond0 master br0
  bridge vlan add dev bond0 vid 100
  ip link set swp0 master bond0
  ip link set swp1 master bond0
Since the underlying driver (currently only DSA) asks for a replay of VLANs when swp0 and swp1 join the LAG because it is bridged, what will happen is that DSA will try to react twice on the VLAN event for swp0. This is not really a huge problem right now, because most drivers accept duplicates since the bridge itself does, but it will become a problem when we add support for replaying switchdev object deletions.
Let's fix this by adding a blank void *ctx in the replay helpers, which will be passed on by the bridge in the switchdev notifications. If the context is NULL, everything is the same as before. But if the context is populated with a valid pointer, the underlying switchdev driver (currently DSA) can use the pointer to 'see through' the bridge port (which in the example above is bond0) and 'know' that the event is only for a particular physical port offloading that bridge port, and not for all of them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: switchdev: add a context void pointer to struct switchdev_notifier_infoVladimir Oltean
In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
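The context check described above can be sketched as a small filter: a NULL context means every port in the LAG reacts, while a non-NULL context restricts the event to the one matching port. The struct and function names below are hypothetical simplifications, not the switchdev API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct port {
	int id;
	int vlan_adds;	/* how many VLAN-add events this port acted on */
};

struct notifier_info {
	const void *ctx;	/* NULL: event is for every port in the LAG */
};

/* A port acts on the event when no context is given, or when the
 * context identifies exactly this port. */
static bool port_should_handle(const struct port *p,
			       const struct notifier_info *info)
{
	return info->ctx == NULL || info->ctx == p;
}

/* Deliver one VLAN-add event to all ports offloading the same LAG. */
static void deliver_vlan_add(struct port *ports, size_t n,
			     const struct notifier_info *info)
{
	for (size_t i = 0; i < n; i++)
		if (port_should_handle(&ports[i], info))
			ports[i].vlan_adds++;
}
```

A bridge-initiated event (ctx NULL) reaches both ports once, while a replay requested for the second port is seen only by that port, so the first port is not notified twice.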
2021-06-28net: ocelot: delete call to br_fdb_replayVladimir Oltean
Not using this driver, I did not realize it doesn't react to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE notifications, but it implements just the bridge bypass operations (.ndo_fdb_{add,del}). So the call to br_fdb_replay just produces notifications that are ignored, delete it for now. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: bridge: include the is_local bit in br_fdb_replayVladimir Oltean
Since commit 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications"), the bridge emits SWITCHDEV_FDB_ADD_TO_DEVICE events with the is_local flag populated (but we ignore it nonetheless). We would like DSA to start treating this bit, but it is still not populated by the replay helper, so add it there too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'bnxt_en-ptp'David S. Miller
Michael Chan says: ==================== bnxt_en: Add hardware PTP timestamping support on 575XX devices Add PTP RX and TX hardware timestamp support on 575XX devices. These devices use the two-step method to implement the IEEE-1588 timestamping support. v2: Add spinlock to serialize access to the timecounter. Use .do_aux_work() for the periodic timer reading and to get the TX timestamp from the firmware. Propagate error code from ptp_clock_register(). Make the 64-bit timer access safe on 32-bit CPUs. Read PHC using direct register access. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Enable hardware PTP supportMichael Chan
Call bnxt_ptp_init() to initialize and register with the clock driver to enable PTP support. Call bnxt_ptp_free() to unregister and clean up during shutdown. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Transmit and retrieve packet timestampsPavan Chebbi
Setup the TXBD to enable TX timestamp if requested. At TX packet DMA completion, if we requested TX timestamp on that packet, we defer to .do_aux_work() to obtain the TX timestamp from the firmware before we free the TX SKB. v2: Use .do_aux_work() to get the TX timestamp from firmware. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Get the RX packet timestampPavan Chebbi
If the RX packet is timestamped by the hardware, the RX completion record will contain the lower 32 bits of the timestamp. This needs to be combined with the upper 16 bits of the periodic timestamp that we get from the timer. The previous snapshot in ptp->old_time is used to make sure that the snapshot is not ahead of the RX timestamp and we adjust for wrap-around if needed. v2: Make ptp->old_time read access safe on 32-bit CPUs. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
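A minimal sketch of the combination step, assuming a 48-bit counter split into a 16-bit high part and a 32-bit low part as the message describes. The helper name extend_rx_ts() and the mask names are hypothetical, and wrap of the full 48-bit counter is ignored here.

```c
#include <stdint.h>

#define LO_TIMER_MASK 0x0000ffffffffULL	/* low 32 bits of the 48-bit clock */
#define HI_TIMER_MASK 0xffff00000000ULL	/* high 16 bits of the 48-bit clock */

/* Extend a 32-bit packet timestamp to 48 bits using an older
 * snapshot of the full counter. */
static uint64_t extend_rx_ts(uint64_t old_time, uint32_t pkt_ts)
{
	uint64_t ts = (old_time & HI_TIMER_MASK) | pkt_ts;

	/* The snapshot predates the packet, so a smaller low word means
	 * the low 32 bits wrapped after the snapshot was taken. */
	if (pkt_ts < (old_time & LO_TIMER_MASK))
		ts += LO_TIMER_MASK + 1;
	return ts;
}
```

Using the older snapshot is what makes the comparison meaningful: because it cannot be ahead of the packet, a packet low word below the snapshot low word can only be explained by a wrap.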
2021-06-28bnxt_en: Get the full 48-bit hardware timestamp periodicallyPavan Chebbi
From the bnxt_timer(), read the 48-bit hardware running clock periodically and store it in ptp->current_time. The previous snapshot of the clock will be stored in ptp->old_time. The old_time snapshot will be used in the next patches to compute the RX packet timestamps. v2: Use .do_aux_work() to read the timer periodically. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Add PTP clock APIs, ioctls, and ethtool methodsMichael Chan
Add the clock APIs to set/get/adjust the hw clock, and the related ioctls and ethtool methods. v2: Propagate error code from ptp_clock_register(). Add spinlock to serialize access to the timecounter. The timecounter is accessed in process context and the RX datapath. Read the PHC using direct registers. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Get PTP hardware capability from firmwareMichael Chan
Store PTP hardware info in a structure if hardware and firmware support PTP. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bnxt_en: Update firmware interface to 1.10.2.47Michael Chan
Adding the PTP related firmware interface is the main change. There is also a name change for admin_mtu, requiring code fixup. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'hns3-next'David S. Miller
Guangbin Huang says: ==================== net: hns3: add new debugfs commands This series adds three new debugfs commands for the HNS3 ethernet driver. change log: V1 -> V2: 1. remove patch "net: hns3: add support for link diagnosis info in debugfs" and use ethtool extended link state to implement similar function according to Jakub Kicinski's opinion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: hns3: add support for dumping MAC umv counter in debugfsJian Shen
This patch adds support for dumping the MAC umv counter in debugfs, which will be helpful for debugging. The display style is below:
  $ cat umv_info
  num_alloc_vport  : 2
  max_umv_size     : 256
  wanted_umv_size  : 256
  priv_umv_size    : 85
  share_umv_size   : 86
  vport(0) used_umv_num : 1
  vport(1) used_umv_num : 1
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: hns3: add support for FD counter in debugfsJian Shen
Previously, the flow director counter was not enabled. To make it easier to check whether flow director rules are hit, enable the flow director counter for each function, and add a debugfs query interface to query the counters for each function. The debugfs command is below:
  cat fd_counter
  func_id  hit_times
  pf       0
  vf0      0
  vf1      0
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'tipc-next'David S. Miller
Menglong Dong says: ==================== net: tipc: fix FB_MTU eat two pages and do some code cleanup In the first patch, FB_MTU is redefined to make sure data size will not exceed PAGE_SIZE. Besides, I removed the alignment for buf_size in tipc_buf_acquire, because skb_alloc_fclone will do the alignment job. In the second patch, I removed align() in msg.c and replace it with ALIGN(). Changes since V5: - remove blank line after Fixes in commit log in the first patch Changes since V4: - remove ONE_PAGE_SKB_SZ and replace it with one_page_mtu in the first patch. - fix some code style problems for the second patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: tipc: replace align() with ALIGN in msg.cMenglong Dong
The function align() which is defined in msg.c is redundant, replace it with ALIGN() and introduce a BUF_ALIGN(). Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: tipc: fix FB_MTU eat two pagesMenglong Dong
FB_MTU is used in 'tipc_msg_build()' to alloc smaller skb when memory allocation fails, which can avoid unnecessary sending failures.
The value of FB_MTU now is 3744, and the data size will be:
  (3744 + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + \
    SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3))
which is larger than one page (4096), and two pages will be allocated.
To avoid it, replace '3744' with a calculation:
  (PAGE_SIZE - SKB_DATA_ALIGN(BUF_OVERHEAD) - \
    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
What's more, alloc_skb_fclone() will call SKB_DATA_ALIGN for data size, and it's not necessary to make alignment for buf_size in tipc_buf_acquire(). So, just remove it.
Fixes: 4c94cc2d3d57 ("tipc: fall back to smaller MTU if allocation of local send skb fails")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
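The arithmetic can be checked with a small sketch. Every numeric value below other than 3744 and 4096 is an assumption chosen for illustration (cache line size, the size of struct skb_shared_info, and the headroom and tailroom constants all depend on kernel configuration and architecture), and data_align() is a hypothetical stand-in for SKB_DATA_ALIGN().

```c
/* All sizes below are assumptions for illustration; the real values
 * depend on the kernel configuration and architecture. */
#define PAGE_SIZE_ASSUMED    4096u
#define CACHE_LINE_ASSUMED   64u
#define SHINFO_SIZE_ASSUMED  320u	/* sizeof(struct skb_shared_info) */
#define BUF_HEADROOM_ASSUMED 176u
#define BUF_TAILROOM_ASSUMED 16u
#define BUF_OVERHEAD_ASSUMED (BUF_HEADROOM_ASSUMED + BUF_TAILROOM_ASSUMED)

/* Round x up to the next multiple of the assumed cache line size. */
static unsigned int data_align(unsigned int x)
{
	return (x + CACHE_LINE_ASSUMED - 1) & ~(CACHE_LINE_ASSUMED - 1);
}

/* Data size for the old fixed FB_MTU of 3744. */
static unsigned int old_alloc_size(void)
{
	return 3744u + data_align(SHINFO_SIZE_ASSUMED) +
	       data_align(BUF_OVERHEAD_ASSUMED + 3u);
}

/* FB_MTU derived from the page size, following the formula above. */
static unsigned int new_fb_mtu(void)
{
	return PAGE_SIZE_ASSUMED - data_align(BUF_OVERHEAD_ASSUMED) -
	       data_align(SHINFO_SIZE_ASSUMED);
}
```

Under these assumed sizes the old value needs 4320 bytes, which spills into a second page, while the derived FB_MTU of 3584 plus both aligned overheads lands exactly on 4096.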
2021-06-28Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2021-06-28
1) Remove an unneeded error assignment in esp4_gro_receive(). From Yang Li.
2) Add a new byseq state hashtable to find acquire states faster. From Sabrina Dubroca.
3) Remove some unnecessary variables in pfkey_create(). From zuoqilin.
4) Remove the unused description from xfrm_type struct. From Florian Westphal.
5) Fix a spelling mistake in the comment of xfrm_state_ok(). From gushengxian.
6) Replace hdr_off indirections by a small helper function. From Florian Westphal.
7) Remove xfrm4_output_finish and xfrm6_output_finish declarations, they are not used anymore. From Antony Antony.
8) Remove xfrm replay indirections. From Florian Westphal.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'mac80211-next-for-net-next-2021-06-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-nextDavid S. Miller
Johannes Berg says:
====================
Lots of changes:
* aggregation handling improvements for some drivers
* hidden AP discovery on 6 GHz and other HE 6 GHz improvements
* minstrel improvements for no-ack frames
* deferred rate control for TXQs to improve reaction times
* virtual time-based airtime scheduler
* along with various little cleanups/fixups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28mptcp: fix 'masking a bool' warningMatthieu Baerts
Dan Carpenter reported an issue introduced in commit fde56eea01f9 ("mptcp: refine mptcp_cleanup_rbuf") where a new boolean (ack_pending) is masked with 0x9. Discarding bits by storing the value in a boolean was not the intention. This variable should not have a 'bool' type: we should keep the 'u8' to allow this comparison. Fixes: fde56eea01f9 ("mptcp: refine mptcp_cleanup_rbuf") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
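The effect of the wrong type can be reproduced in a few lines of standard C. Converting to bool collapses every non-zero value to 1, so the mask test no longer sees the original bits. The function names are hypothetical; only the 0x9 mask value comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACK_FLAGS_MASK 0x9u	/* mask value quoted in the commit message */

/* With bool, any non-zero value collapses to 1, so the mask test
 * succeeds even for bits that are outside the mask. */
static unsigned int masked_as_bool(uint8_t pending)
{
	bool ack_pending = pending;

	return ack_pending & ACK_FLAGS_MASK;
}

/* With u8, the original bits survive and the mask works as intended. */
static unsigned int masked_as_u8(uint8_t pending)
{
	uint8_t ack_pending = pending;

	return ack_pending & ACK_FLAGS_MASK;
}
```

For an input of 0x2, which has no bit in common with 0x9, the bool variant still reports a match while the u8 variant correctly reports none.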
2021-06-28Merge branch 'reset-mac'David S. Miller
Guillaume Nault says:
====================
net: reset MAC header consistently across L3 virtual devices
Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode), reset the MAC header pointer after they parsed the outer headers. This accurately reflects the fact that the decapsulated packet is a pure L3 packet, as that makes the MAC header 0 bytes long (the MAC and network header pointers are equal).
However, many L3 devices only adjust the network header after decapsulation and leave the MAC header pointer at its original value. This can confuse other parts of the networking stack, like TC, which then considers the outer headers as one big MAC header.
This patch series makes the following L3 tunnels behave like VXLAN-GPE: bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
The case of gre is a bit special. It already resets the MAC header pointer in collect_md mode, so only the classical mode needs to be adjusted. However, gre also has a special case that expects the MAC header pointer to keep pointing to the outer header even after decapsulation. Therefore, patch 4 keeps an exception for this case.
Ideally, we'd centralise the call to skb_reset_mac_header() in ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2), sit (patch 3) and gre (patch 4). That's unfortunately not feasible currently, because of the gre special case discussed above that precludes us from resetting the MAC header unconditionally.
The original motivation is to redirect bareudp packets to Ethernet devices (as described in patch 1). The rest of this series aims at bringing consistency across all L3 devices (apart from gre's special case unfortunately).
Note: the gtp patch results from pure code inspection and has been compile tested only.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
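The difference between the two decapsulation styles can be modelled with two offsets. This is a heavily reduced sketch: struct pkt and the helper names are hypothetical and stand in for the header offsets a socket buffer keeps, not for the real sk_buff API.

```c
#include <stdint.h>

/* Hypothetical, reduced model of the header offsets kept in a socket
 * buffer; offsets are counted from the start of the buffer. */
struct pkt {
	uint16_t mac_header;
	uint16_t network_header;
};

/* Length of the MAC header as seen by the rest of the stack. */
static unsigned int mac_header_len(const struct pkt *p)
{
	return (unsigned int)(p->network_header - p->mac_header);
}

/* Decapsulation that only moves the network header forward. */
static void decap_network_only(struct pkt *p, uint16_t outer_len)
{
	p->network_header = (uint16_t)(p->network_header + outer_len);
}

/* Decapsulation that also resets the MAC header to the same spot. */
static void decap_and_reset_mac(struct pkt *p, uint16_t outer_len)
{
	p->network_header = (uint16_t)(p->network_header + outer_len);
	p->mac_header = p->network_header;
}
```

Starting from a 14-byte MAC header and stripping 28 bytes of outer headers, the first style leaves an apparent 42-byte MAC header, while the second leaves a 0-byte one, which is what lets later stages treat the packet as pure L3.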
2021-06-28gtp: reset mac_header after decapGuillaume Nault
For consistency with other L3 tunnel devices, reset the mac_header pointer after decapsulation. This makes the mac_header 0 bytes long, thus making it clear that this skb has no mac_header. Compile tested only. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devicesGuillaume Nault
Reset the mac_header pointer even when the tunnel transports only L3 data (in the ARPHRD_ETHER case, this is already done by eth_type_trans). This prevents other parts of the stack from mistakenly accessing the outer header after the packet has been decapsulated.
In practice, this allows pushing an Ethernet header to ipip6, ip6ip6, mplsip6 or ip6gre packets and redirecting them to an Ethernet device:
  $ tc filter add dev ip6tnl0 ingress matchall       \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0
Without this patch, push_eth refuses to add an ethernet header because the skb appears to already have a MAC header.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28gre: let mac_header point to outer header only when necessaryGuillaume Nault
Commit e271c7b4420d ("gre: do not keep the GRE header around in collect medata mode") reset the mac_header for the collect_md case. Let's extend this behaviour to classical gre devices as well.
ipgre_header_parse() seems to be the only case that requires mac_header to point to the outer header. We can detect this case accurately by checking ->header_ops. For all other cases, we can reset mac_header.
This allows pushing an Ethernet header to ipgre packets and redirecting them to an Ethernet device:
  $ tc filter add dev gre0 ingress matchall          \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0
Before this patch, this worked only for collect_md gre devices. Now this works for regular gre devices as well. Only the special case of gre devices that use ipgre_header_ops isn't supported.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28sit: allow redirecting ip6ip, ipip and mplsip packets to eth devicesGuillaume Nault
Even though sit transports L3 data (IPv6, IPv4 or MPLS) packets, it needs to reset the mac_header pointer, so that other parts of the stack don't mistakenly access the outer header after the packet has been decapsulated. There are two rx handlers to modify: ipip6_rcv() for the ip6ip mode and sit_tunnel_rcv() which is used to re-implement the ipip and mplsip modes of ipip.ko.
This allows pushing an Ethernet header to sit packets and redirecting them to an Ethernet device:
  $ tc filter add dev sit0 ingress matchall          \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0
Without this patch, push_eth refuses to add an ethernet header because the skb appears to already have a MAC header.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28ipip: allow redirecting ipip and mplsip packets to eth devicesGuillaume Nault
Even though ipip transports IPv4 or MPLS packets, it needs to reset the mac_header pointer, so that other parts of the stack don't mistakenly access the outer header after the packet has been decapsulated.
This allows pushing an Ethernet header to ipip or mplsip packets and redirecting them to an Ethernet device:
  $ tc filter add dev ipip0 ingress matchall         \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0
Without this patch, push_eth refuses to add an ethernet header because the skb appears to already have a MAC header.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bareudp: allow redirecting bareudp packets to eth devicesGuillaume Nault
Even though bareudp transports L3 data (typically IP or MPLS), it needs to reset the mac_header pointer, so that other parts of the stack don't mistakenly access the outer header after the packet has been decapsulated.
This allows pushing an Ethernet header to bareudp packets and redirecting them to an Ethernet device:
  $ tc filter add dev bareudp0 ingress matchall      \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0
Without this patch, push_eth refuses to add an ethernet header because the skb appears to already have a MAC header.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-26net/mlx5e: Add IPsec support to uplink representorRaed Salem
Add the xfrm xdo and ipsec_init/cleanup to uplink representor to support IPsec in SRIOV switchdev mode. Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Huy Nguyen <huyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-26net/mlx5e: kTLS, Add stats for number of deleted kTLS TX offloaded connectionsTariq Toukan
Expose ethtool SW counter for the number of kTLS device-offloaded TX connections that are finished and deleted. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-26net/mlx5: SF, Improve performance in SF allocationEli Cohen
Avoid second traversal on the SF table by recording the first free entry and using it in case the looked up entry was not found in the table. Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
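The single-pass idea can be sketched as a lookup that remembers the first free slot while it scans, so no second traversal is needed when the lookup misses. The table layout and the name find_or_alloc() are hypothetical; the actual SF table code and its return conventions are not shown in the message.

```c
#define TABLE_SIZE 8

/* Hypothetical table entry; the real SF table layout is not shown
 * in the commit message. */
struct entry {
	int in_use;
	unsigned int id;
};

/* Return the index holding id, or claim the first free slot for it.
 * Returns -1 when id is absent and the table is full. */
static int find_or_alloc(struct entry *table, unsigned int id)
{
	int first_free = -1;

	for (int i = 0; i < TABLE_SIZE; i++) {
		if (!table[i].in_use) {
			if (first_free < 0)
				first_free = i;	/* remember, keep scanning */
			continue;
		}
		if (table[i].id == id)
			return i;
	}
	if (first_free < 0)
		return -1;
	table[first_free].in_use = 1;
	table[first_free].id = id;
	return first_free;
}
```

The scan cannot stop at the first free slot, because the id being looked up may still appear later in the table; recording the slot and continuing keeps the work to one pass.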
2021-06-26net/mlx5: Increase hairpin buffer sizeAriel Levkovich
The max packet size a hairpin queue is able to handle is determined by the total hairpin buffer size divided by 4. Currently the buffer size is set to 32KB which makes the max packet size to be 8KB and doesn't support jumbo frames of size 9KB. This change increases the buffer size to 64KB to increase the max frame size and support 9KB frames. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
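The sizing rule stated above is simple enough to check directly. The helper names are hypothetical; the divide-by-4 rule and the 32KB, 64KB and 9KB figures are taken from the message.

```c
/* Max packet size for a hairpin queue, per the rule in the commit
 * message: total buffer size divided by 4. */
static unsigned int hairpin_max_pkt(unsigned int buf_size)
{
	return buf_size / 4;
}

/* Nonzero when a frame of frame_size bytes fits in the queue. */
static int hairpin_fits(unsigned int buf_size, unsigned int frame_size)
{
	return frame_size <= hairpin_max_pkt(buf_size);
}
```

A 32KB buffer yields an 8KB limit and rejects a 9KB jumbo frame; a 64KB buffer yields a 16KB limit and accepts it.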
2021-06-26net/mlx5: DR, Add support for flow sampler offloadYevgeny Kliteynik
Add SW steering support for sFlow / flow sampler action. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-26net/mlx5: Compare sampler flow destination ID in fs_coreYevgeny Kliteynik
When comparing sampler flow destinations, in fs_core, consider sampler ID as well. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-25Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueDavid S. Miller
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-06-25
This series contains updates to ice driver only.
Jesse adds support for tracepoints to aid in debugging.
Maciej adds PTP auxiliary pin support.
Victor removes the VSI info from the old aggregator when moving the VSI to another aggregator.
Tony removes an unnecessary VSI assignment.
Christophe Jaillet fixes a memory leak for failed allocation in ice_pf_dcb_cfg().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25net/smc: Ensure correct state of the socket in send pathGuvenc Gulce
When smc_sendmsg() is called before the SMC socket initialization has completed, smc_tx_sendmsg() will access un-initialized fields of the SMC socket which results in a null-pointer dereference. Fix this by checking the socket state first in smc_tx_sendmsg(). Fixes: e0e4b8fa5338 ("net/smc: Add SMC statistics support") Reported-by: syzbot+5dda108b672b54141857@syzkaller.appspotmail.com Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25Merge tag 'wireless-drivers-next-2021-06-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-nextDavid S. Miller
Kalle Valo says:
====================
wireless-drivers-next patches for v5.14
Second, and most likely the last, set of patches for v5.14. mt76 and iwlwifi have most patches in this round, but rtw88 also has some new features. Nothing special really standing out.
mt76
* mt7915 MSI support
* disable ASPM on mt7915
* mt7915 tx status reporting
* mt7921 decap offload
rtw88
* beacon filter support
* path diversity support
* firmware crash information via devcoredump
* quirks for disabling pci capabilities
mt7601u
* add USB ID for a XiaoDu WiFi Dongle
ath11k
* enable support for QCN9074 PCI devices
brcmfmac
* support parse country code map from DeviceTree
iwlwifi
* support for new hardware
* support for BIOS control of 11ax enablement in Russia
* support UNII4 band enablement from BIOS
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25net: mdiobus: withdraw fwnode_mdbiobus_registerMarcin Wojtas
The newly implemented fwnode_mdbiobus_register turned out to be problematic - in case the fwnode_/of_/acpi_mdio are built as modules, a dependency cycle can be observed during the depmod phase of modules_install, e.g.:
  depmod: ERROR: Cycle detected: fwnode_mdio -> of_mdio -> fwnode_mdio
  depmod: ERROR: Found 2 modules in dependency cycles!
OR:
  depmod: ERROR: Cycle detected: acpi_mdio -> fwnode_mdio -> acpi_mdio
  depmod: ERROR: Found 2 modules in dependency cycles!
A possible solution could be to rework fwnode_mdiobus_register so that it merges the contents of acpi_mdiobus_register and of_mdiobus_register. However feasible, such a change would be very intrusive and would affect a huge number of of_mdiobus_register users.
Since there are currently 2 users of ACPI and MDIO (xgmac_mdio and mvmdio), withdraw the fwnode_mdbiobus_register and roll back to a simple 'if' condition in affected drivers.
Fixes: 62a6ef6a996f ("net: mdiobus: Introduce fwnode_mdbiobus_register()")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25ice: Fix a memory leak in an error handling path in 'ice_pf_dcb_cfg()'Christophe JAILLET
If this 'kzalloc()' fails we must free some resources as in all the other error handling paths of this function. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
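The leak pattern and its fix can be sketched in userspace with an allocation counter that makes the leak observable. The names configure(), tracked_alloc() and tracked_free() are hypothetical and are not the ice driver code.

```c
#include <stdlib.h>

/* Counts live allocations so a leak is observable in a test. */
static int live_allocs;

static void *tracked_alloc(int fail)
{
	void *p = fail ? NULL : malloc(32);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Hypothetical two-step setup: when the second allocation fails, the
 * first one must be released before returning. */
static int configure(int fail_second)
{
	int ret = 0;
	void *cfg = tracked_alloc(0);
	void *event;

	if (!cfg)
		return -1;
	event = tracked_alloc(fail_second);
	if (!event) {
		ret = -1;
		goto free_cfg;	/* a bare "return -1" here would leak cfg */
	}
	tracked_free(event);
free_cfg:
	tracked_free(cfg);
	return ret;
}
```

Routing the failure through the shared cleanup label keeps this path consistent with the other error paths, so the counter returns to zero whether or not the second allocation succeeds.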
2021-06-25ice: remove unnecessary VSI assignmentTony Nguyen
ice_get_vf_vsi() is being called twice for the same VSI. Remove the unnecessary call/assignment. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
2021-06-25ice: remove the VSI info from previous aggVictor Raj
Remove the VSI info from previous aggregator after moving the VSI to a new aggregator. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>