summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2021-03-26mld: add mc_lock for protecting per-interface mld dataTaehee Yoo
The purpose of this lock is to avoid a bottleneck in the query/report event handler logic. By previous patches, almost all mld data is protected by RTNL. So, the query and report event handler, which is data path logic acquires RTNL too. Therefore if a lot of query and report events are received, it uses RTNL for a long time. So it makes the control-plane bottleneck because of using RTNL. In order to avoid this bottleneck, mc_lock is added. mc_lock protect only per-interface mld data and per-interface mld data is used in the query/report event handler logic. So, no longer rtnl_lock is needed in the query/report event handler logic. Therefore bottleneck will be disappeared by mc_lock. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: add new workqueues for process mld eventsTaehee Yoo
When query/report packets are received, mld module processes them. But they are processed under BH context so it couldn't use sleepable functions. So, in order to switch context, the two workqueues are added which processes query and report event. In the struct inet6_dev, mc_{query | report}_queue are added so it is per-interface queue. And mc_{query | report}_work are workqueue structure. When the query or report event is received, skb is queued to proper queue and worker function is scheduled immediately. Workqueues and queues are protected by spinlock, which is mc_{query | report}_lock, and worker functions are protected by RTNL. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: convert ifmcaddr6 to RCUTaehee Yoo
The ifmcaddr6 has been protected by inet6_dev->lock(rwlock) so that the critical section is atomic context. In order to switch this context, changing locking is needed. The ifmcaddr6 actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: convert ip6_sf_list to RCUTaehee Yoo
The ip6_sf_list has been protected by mca_lock(spin_lock) so that the critical section is atomic context. In order to switch this context, changing locking is needed. The ip6_sf_list actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. But It doesn't remove mca_lock yet because ifmcaddr6 isn't converted to RCU yet. So, It's not fully converted to the sleepable context. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: convert ipv6_mc_socklist->sflist to RCUTaehee Yoo
The sflist has been protected by rwlock so that the critical section is atomic context. In order to switch this context, changing locking is needed. The sflist actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: get rid of inet6_dev->mc_lockTaehee Yoo
The purpose of mc_lock is to protect inet6_dev->mc_tomb. But mc_tomb is already protected by RTNL and all functions, which manipulate mc_tomb are called under RTNL. So, mc_lock is not needed. Furthermore, it is spinlock so the critical section is atomic. In order to reduce atomic context, it should be removed. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mld: convert from timer to delayed workTaehee Yoo
mcast.c has several timers for delaying works. Timer's expire handler is working under atomic context so it can't use sleepable things such as GFP_KERNEL, mutex, etc. In order to use sleepable APIs, it converts from timers to delayed work. But there are some critical sections, which is used by both process and BH context. So that it still uses spin_lock_bh() and rwlock. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26ethtool: fec: fix FEC_NONE checkJakub Kicinski
Dan points out we need to use the mask not the bit (which is 0). Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 42ce127d9864 ("ethtool: fec: sanitize ethtool_fecparam->fec") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: rename mptcp_pm_nl_add_addr_send_ackGeliang Tang
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: send ack for rm_addrGeliang Tang
This patch changes the sending ACK conditions for the ADD_ADDR, send an ACK packet for RM_ADDR too. In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send the ACK packet. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: drop useless addr_signal clearGeliang Tang
msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: move to next addr when subflow creation failGeliang Tang
When an invalid address was announced, the subflow couldn't be created for this address. Therefore mptcp_pm_nl_subflow_established couldn't be invoked. Then the next addresses in the local address list didn't have a chance to be announced. This patch invokes the new function mptcp_pm_add_addr_echoed when the address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check whether this address is in the anno_list. If it is, PM schedules the status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke mptcp_pm_create_subflow_or_signal_addr to deal with the next address in the local address list. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: export lookup_anno_list_by_saddrGeliang Tang
This patch exported the static function lookup_anno_list_by_saddr, and renamed it to mptcp_lookup_anno_list_by_saddr. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: move to next addr when timeoutGeliang Tang
This patch called mptcp_pm_subflow_established to move to the next address when an ADD_ADDR has been retransmitted the maximum number of times. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: drop unused subflow in mptcp_pm_subflow_establishedGeliang Tang
This patch drops the unused parameter subflow in mptcp_pm_subflow_established(). Fixes: 926bdeab5535 ("mptcp: Implement path manager interface commands") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: skip connecting the connected addressGeliang Tang
This patch added a new helper named lookup_subflow_by_daddr to find whether the destination address is in the msk's conn_list. In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check whether the announced address is already connected. If it is, skip connecting this address and send out the echo. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: drop argument port from mptcp_pm_announce_addrGeliang Tang
Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the port field of another argument 'addr' instead. Fixes: 0f5c9e3f079f ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26mptcp: clean-up the rtx pathPaolo Abeni
After the previous patch we can easily avoid invoking the workqueue to perform the retransmission, if the msk socket lock is held at rtx timer expiration. This also simplifies the relevant code. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25tcp: convert elligible sysctls to u8Eric Dumazet
Many tcp sysctls are either bools or small ints that can fit into u8. Reducing space taken by sysctls can save few cache line misses when sending/receiving data while cpu caches are empty, for example after cpu idle period. This is hard to measure with typical network performance tests, but after this patch, struct netns_ipv4 has shrunk by three cache lines. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25inet: convert tcp_early_demux and udp_early_demux to u8Eric Dumazet
For these sysctls, their dedicated helpers have to use proc_dou8vec_minmax(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25ipv4: convert ip_forward_update_priority sysctl to u8Eric Dumazet
This sysctl uses ip_fwd_update_priority() helper, so the conversion needs to change it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25ipv4: shrink netns_ipv4 with sysctl conversionsEric Dumazet
These sysctls that can fit in one byte instead of one int are converted to save space and thus reduce cache line misses. - icmp_echo_ignore_all, icmp_echo_ignore_broadcasts, - icmp_ignore_bogus_error_responses, icmp_errors_use_inbound_ifaddr - tcp_ecn, tcp_ecn_fallback - ip_default_ttl, ip_no_pmtu_disc, ip_fwd_use_pmtu - ip_nonlocal_bind, ip_autobind_reuse - ip_dynaddr, ip_early_demux, raw_l3mdev_accept - nexthop_compat_mode, fwmark_reflect Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: change netdev_unregister_timeout_secs min value to 1Dmitry Vyukov
netdev_unregister_timeout_secs=0 can lead to printing the "waiting for dev to become free" message every jiffy. This is too frequent and unnecessary. Set the min value to 1 second. Also fix the merge issue introduced by "net: make unregister netdev warning timeout configurable": it changed "refcnt != 1" to "refcnt". Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: 5aa3afe107d9 ("net: make unregister netdev warning timeout configurable") Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: ipv4: Fix some typosLu Wei
Modify "accomodate" to "accommodate" in net/ipv4/esp4.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: dsa: Fix a typo in tag_rtl4_a.cLu Wei
Modify "Apparantly" to "Apparently" in net/dsa/tag_rtl4_a.c.. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: decnet: Fix a typo in dn_nsp_in.cLu Wei
Modify "erronous" to "erroneous" in net/decnet/dn_nsp_in.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: core: Fix a typo in dev_addr_lists.cLu Wei
Modify "funciton" to "function" in net/core/dev_addr_lists.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: ceph: Fix a typo in osdmap.cLu Wei
Modify "inital" to "initial" in net/ceph/osdmap.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25net: Fix a misspell in socket.cLu Wei
s/addres/address Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25tipc: add extack messages for bearer/media failureHoang Le
Add extack error messages for -EINVAL errors when enabling bearer, getting/setting properties for a media/bearer Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25ethtool: fec: sanitize ethtool_fecparam->fecJakub Kicinski
Reject NONE on set, this mode means device does not support FEC so it's a little out of place in the set interface. This should be safe to do - user space ethtool does not allow the use of NONE on set. A few drivers treat it the same as OFF, but none use it instead of OFF. Similarly reject an empty FEC mask. The common user space tool will not send such requests and most drivers correctly reject it already. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25ethtool: fec: sanitize ethtool_fecparam->active_fecJakub Kicinski
struct ethtool_fecparam::active_fec is a GET-only field, all in-tree drivers correctly ignore it on SET. Clear the field on SET to avoid any confusion. Again, we can't reject non-zero now since ethtool user space does not zero-init the param correctly. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25ethtool: fec: sanitize ethtool_fecparam->reservedJakub Kicinski
struct ethtool_fecparam::reserved is never looked at by the core. Make sure it's actually 0. Unfortunately we can't return an error because old ethtool doesn't zero-initialize the structure for SET. On GET we can be more verbose, there are no in tree (ab)users. Fix up the kdoc on the structure. Remove the mention of FEC bypass. Seems like a niche thing to configure in the first place. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-24 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 15 day(s) which contain a total of 65 files changed, 3200 insertions(+), 738 deletions(-). The main changes are: 1) Static linking of multiple BPF ELF files, from Andrii. 2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo. 3) Spelling fixes from various folks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Various fixes, all over: 1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu. 2) Always store the rx queue mapping in veth, from Maciej Fijalkowski. 3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov. 4) Fix memory leak in octeontx2-af from Colin Ian King. 5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from Yonghong Song. 6) Fix tx ptp stats in mlx5, from Aya Levin. 7) Check correct ip version in tun decap, fropm Roi Dayan. 8) Fix rate calculation in mlx5 E-Switch code, from arav Pandit. 9) Work item memork leak in mlx5, from Shay Drory. 10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann. 11) Lack of preemptrion awareness in macvlan, from Eric Dumazet. 12) Fix data race in pxa168_eth, from Pavel Andrianov. 13) Range validate stab in red_check_params(), from Eric Dumazet. 14) Inherit vlan filtering setting properly in b53 driver, from Florian Fainelli. 15) Fix rtnl locking in igc driver, from Sasha Neftin. 16) Pause handling fixes in igc driver, from Muhammad Husaini Zulkifli. 17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits. 18) Use after free in qlcnic, from Lv Yunlong. 19) fix crash in fritzpci mISDN, from Tong Zhang. 20) Premature rx buffer reuse in igb, from Li RongQing. 21) Missing termination of ip[a driver message handler arrays, from Alex Elder. 22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25 driver, from Xie He. 23) Use after free in c_can_pci_remove(), from Tong Zhang. 24) Uninitialized variable use in nl80211, from Jarod Wilson. 25) Off by one size calc in bpf verifier, from Piotr Krysiuk. 26) Use delayed work instead of deferrable for flowtable GC, from Yinjun Zhang. 27) Fix infinite loop in NPC unmap of octeontx2 driver, from Hariprasad Kelam. 28) Fix being unable to change MTU of dwmac-sun8i devices due to lack of fifo sizes, from Corentin Labbe. 29) DMA use after free in r8169 with WoL, fom Heiner Kallweit. 30) Mismatched prototypes in isdn-capi, from Arnd Bergmann. 31) Fix psample UAPI breakage, from Ido Schimmel" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits) psample: Fix user API breakage math: Export mul_u64_u64_div_u64 ch_ktls: fix enum-conversion warning octeontx2-af: Fix memory leak of object buf ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation net: bridge: don't notify switchdev for local FDB addresses net/sched: act_ct: clear post_ct if doing ct_clear net: dsa: don't assign an error value to tag_ops isdn: capi: fix mismatched prototypes net/mlx5: SF, do not use ecpu bit for vhca state processing net/mlx5e: Fix division by 0 in mlx5e_select_queue net/mlx5e: Fix error path for ethtool set-priv-flag net/mlx5e: Offload tuple rewrite for non-CT flows net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP net/mlx5: Add back multicast stats for uplink representor net: ipconfig: ic_dev can be NULL in ic_close_devs MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one docs: networking: Fix a typo r8169: fix DMA being used after buffer free if WoL is enabled net: ipa: fix init header command validation ...
2021-03-246lowpan: Fix some typos in nhc_udp.cWang Hai
s/Orignal/Original/ s/infered/inferred/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24net/packet: Fix a typo in af_packet.cWang Hai
s/sequencially/sequentially/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24net/tls: Fix a typo in tls_device.cWang Hai
s/beggining/beginning/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24inet: use bigger hash table for IP ID generationEric Dumazet
In commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") I used a very small hash table that could be abused by patient attackers to reveal sensitive information. Switch to a dynamic sizing, depending on RAM size. Typical big hosts will now use 128x more storage (2 MB) to get a similar increase in security and reduction of hash collisions. As a bonus, use of alloc_large_system_hash() spreads allocated memory among all NUMA nodes. Fixes: 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") Reported-by: Amit Klein <aksecurity@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24net: decnet: Fixed multiple Coding Style issuesSai Kalyaan Palla
Made changes to coding style as suggested by checkpatch.pl changes are of the type: space required before the open parenthesis '(' space required after that ',' Signed-off-by: Sai Kalyaan Palla <saikalyaan63@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24net: sched: Mundane typo fixesBhaskar Chowdhury
s/procdure/procedure/ s/maintanance/maintenance/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24dsa: slave: add support for TC_SETUP_FTPablo Neira Ayuso
The dsa infrastructure provides a well-defined hierarchy of devices, pass up the call to set up the flow block to the master device. From the software dataplane, the netfilter infrastructure uses the dsa slave devices to refer to the input and output device for the given skbuff. Similarly, the flowtable definition in the ruleset refers to the dsa slave port devices. This patch adds the glue code to call ndo_setup_tc with TC_SETUP_FT with the master device via the dsa slave devices. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: support for FLOW_ACTION_PPPOE_PUSHPablo Neira Ayuso
Add a PPPoE push action if layer 2 protocol is ETH_P_PPP_SES to add PPPoE flowtable hardware offload support. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: bridge vlan hardware offload and switchdevFelix Fietkau
The switch might have already added the VLAN tag through PVID hardware offload. Keep this extra VLAN in the flowtable but skip it on egress. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: nft_flow_offload: use direct xmit if hardware offload is enabledPablo Neira Ayuso
If there is a forward path to reach an ethernet device and hardware offload is enabled, then use the direct xmit path. Moreover, store the real device in the direct xmit path info since software datapath uses dev_hard_header() to push the layer encapsulation headers while hardware offload refers to the real device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: add offload support for xmit path typesPablo Neira Ayuso
When the flow tuple xmit_type is set to FLOW_OFFLOAD_XMIT_DIRECT, the dst_cache pointer is not valid, and the h_source/h_dest/ifidx out fields need to be used. This patch also adds the FLOW_ACTION_VLAN_PUSH action to pass the VLAN tag to the driver. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: add dsa supportPablo Neira Ayuso
Replace the master ethernet device by the dsa slave port. Packets coming in from the software ingress path use the dsa slave port as input device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: add pppoe supportPablo Neira Ayuso
Add the PPPoE protocol and session id to the flow tuple using the encap fields to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24netfilter: flowtable: add bridge vlan filtering supportPablo Neira Ayuso
Add the vlan tag based when PVID is set on. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>