path: root/net/core
Age | Commit message | Author
2020-10-04 | net/core: check length before updating Ethertype in skb_mpls_{push,pop} | Guillaume Nault
Open vSwitch allows dropping a packet's Ethernet header, so skb_mpls_push() and skb_mpls_pop() might be called with ethernet=true and mac_len=0. In that case the pointer passed to skb_mod_eth_type() doesn't point to an Ethernet header and the new Ethertype is written at an unexpected location. Fix this by verifying that mac_len is big enough to contain an Ethernet header. Fixes: fa4e0f8855fc ("net/sched: fix corrupted L2 header with MPLS 'push' and 'pop' actions") Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
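The length check described in this commit can be sketched as a small userspace model. This is only an illustration, not the kernel code: `struct pkt` and `set_ethertype()` are hypothetical stand-ins for `struct sk_buff` and the guarded call to skb_mod_eth_type(); only the rule itself (leave the Ethertype alone unless mac_len covers a full Ethernet header) comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HLEN 14 /* length of a base Ethernet header in bytes */

/* Hypothetical stand-in for the few sk_buff fields the check needs. */
struct pkt {
	uint8_t data[64]; /* packet bytes, starting at the MAC header */
	size_t mac_len;   /* 0 when the Ethernet header was dropped */
};

/*
 * Rewrite the Ethertype only when the caller says the packet is Ethernet
 * and mac_len is large enough to hold an Ethernet header.
 * Returns true if the Ethertype was written.
 */
bool set_ethertype(struct pkt *p, bool ethernet, uint16_t proto)
{
	if (!ethernet || p->mac_len < ETH_HLEN)
		return false;

	/* The Ethertype occupies the last two bytes of the 14-byte header. */
	memcpy(p->data + ETH_HLEN - 2, &proto, sizeof(proto));
	return true;
}
```

With mac_len set to 0 the function leaves the buffer untouched, which is the behavior the fix is meant to guarantee.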
2020-10-04 | net: devlink: Add support for port regions | Andrew Lunn
Allow regions to be registered to a devlink port. The same netlink API is used, but the port index is provided to indicate when a region is a port region as opposed to a device region. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 | net: devlink: Add unused port flavour | Andrew Lunn
Not all ports of a switch need to be used, particularly in embedded systems. Add a port flavour for ports which physically exist in the switch but are not connected to the front panel etc., and so are unused. Having unused ports present in devlink gives a more accurate representation of the hardware. It also allows regions to be associated with such ports, making it possible, for example, to check that unused ports are correctly powered off, or to compare the probable reset defaults of unused ports with those of used ports that are experiencing issues. Actually registering unused ports and setting the flavour to unused is optional. The DSA core will register all such switch ports, but such ports are expected to be limited in number. Bigger ASICs may decide not to list unused ports. v2: Expand the description about why it is useful Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 | net/sched: act_vlan: Add {POP,PUSH}_ETH actions | Guillaume Nault
Implement TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH, to respectively pop and push a base Ethernet header at the beginning of a frame. POP_ETH is just a matter of pulling ETH_HLEN bytes. VLAN tags, if any, must be stripped before calling POP_ETH. PUSH_ETH is restricted to skbs with no mac_header, and only the MAC addresses can be configured. The Ethertype is automatically set from skb->protocol. These restrictions ensure that all of the skb's fields remain consistent, so that this action can't confuse other parts of the networking stack (like GSO). Since openvswitch already had these actions, consolidate the code in skbuff.c (like for vlan and mpls push/pop). Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 | genetlink: move to smaller ops wherever possible | Jakub Kicinski
The bulk of the genetlink users can use the smaller ops, so move them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 | devlink: add .trap_group_action_set() callback | Ioana Ciornei
Add a new devlink callback, .trap_group_action_set(), which can be used by device drivers that cannot control the action (drop, trap) on each individual trap but only on an entire trap group. If this new callback is populated, it will take precedence over the .trap_action_set() callback when the user requests a change of all the traps in a group. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
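The precedence rule in this commit can be illustrated with a simplified userspace model. Everything here is hypothetical: `struct ops` is a cut-down stand-in for the real devlink_ops, the signatures are invented for the example, and the two `count_*` functions are stub driver callbacks that only count how often they are called.

```c
#include <stddef.h>

enum trap_action { TRAP_ACTION_DROP, TRAP_ACTION_TRAP };

/* Hypothetical, simplified callback table; the real devlink_ops differs. */
struct ops {
	int (*trap_action_set)(int trap_id, enum trap_action action);
	int (*trap_group_action_set)(int group_id, enum trap_action action);
};

/* Stub driver callbacks that only count invocations. */
int per_trap_calls;
int group_calls;

int count_trap_action(int trap_id, enum trap_action action)
{
	(void)trap_id;
	(void)action;
	per_trap_calls++;
	return 0;
}

int count_group_action(int group_id, enum trap_action action)
{
	(void)group_id;
	(void)action;
	group_calls++;
	return 0;
}

/*
 * Apply an action to every trap in a group. A populated group-wide
 * callback takes precedence; otherwise each trap is set individually.
 */
int group_action_set(const struct ops *ops, int group_id,
		     const int *trap_ids, size_t n, enum trap_action action)
{
	size_t i;
	int err;

	if (ops->trap_group_action_set)
		return ops->trap_group_action_set(group_id, action);

	if (!ops->trap_action_set)
		return -1; /* neither callback is available */

	for (i = 0; i < n; i++) {
		err = ops->trap_action_set(trap_ids[i], action);
		if (err)
			return err;
	}
	return 0;
}
```

A driver that fills in both callbacks only ever sees the group call for a group-wide request; a driver that fills in just the per-trap callback sees one call per trap.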
2020-10-02 | devlink: add parser error drop packet traps | Ioana Ciornei
Add parser error drop packet traps, so that capable device drivers can register them with devlink. The new packet trap group holds any drops of packets which were marked by the device as erroneous during header parsing. Add documentation for every added packet trap and packet trap group. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 | bpf, sockmap: Add skb_adjust_room to pop bytes off ingress payload | John Fastabend
This implements a new helper skb_adjust_room() so users can push/pop extra bytes from a BPF_SK_SKB_STREAM_VERDICT program. Some protocols may include headers and other information that we may not want to include when doing a redirect from a BPF_SK_SKB_STREAM_VERDICT program. One use case is to redirect TLS packets into a receive socket that doesn't expect TLS data. In the TLS case the first 13B or so contain the protocol header. With KTLS the payload is decrypted so we should be able to redirect this to a receiving socket, but the receiving socket may not be expecting a TLS header and may discard the data. Using the above helper we can pop the header off and put an appropriate header on the payload. This allows for creating a proxy between protocols without extra hops through the stack or userspace. So in order to fix this case add skb_adjust_room() so users can strip the header. After this the user can strip the header and an unmodified receiver thread will work correctly when data is redirected into the ingress path of a sock. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/160160099197.7052.8443193973242831692.stgit@john-Precision-5820-Tower
2020-10-01 | net-sysfs: Fix inconsistent of format with argument type in net-sysfs.c | Ye Bin
Fix the following warnings: [net/core/net-sysfs.c:1161]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'int'. [net/core/net-sysfs.c:1162]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'int'. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 | pktgen: Fix inconsistent of format with argument type in pktgen.c | Ye Bin
Fix the following warnings: [net/core/pktgen.c:925]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:942]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:962]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:984]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:1149]: (warning) %d in format string (no. 1) requires 'int' but the argument type is 'unsigned int'. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
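The class of warning fixed in the net-sysfs and pktgen commits comes down to pairing each conversion specifier with the signedness of its argument. A minimal illustration follows; `format_values()` is a hypothetical helper written for this example and is unrelated to the actual kernel code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one signed and one unsigned value with matching conversions:
 * %d pairs with int, %u pairs with unsigned int.
 * Returns the number of characters snprintf() would have written.
 */
int format_values(char *buf, size_t len, int s, unsigned int u)
{
	return snprintf(buf, len, "%d %u", s, u);
}
```

Swapping the two specifiers here would trigger the same kind of static-analysis warning quoted in the commit messages.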
2020-10-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-10-01

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 8 day(s) which contain a total of 103 files changed, 7662 insertions(+), 1894 deletions(-).

Note that once bpf(/net) tree gets merged into net-next, there will be a small merge conflict in tools/lib/bpf/btf.c between commit 1245008122d7 ("libbpf: Fix native endian assumption when parsing BTF") from the bpf tree and the commit 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") from the bpf-next tree. Correct resolution would be to stick with bpf-next, it should look like:

  [...]
        /* check BTF magic */
        if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) {
                err = -EIO;
                goto err_out;
        }
        if (magic != BTF_MAGIC && magic != bswap_16(BTF_MAGIC)) {
                /* definitely not a raw BTF */
                err = -EPROTO;
                goto err_out;
        }

        /* get file size */
  [...]

The main changes are:

1) Add bpf_snprintf_btf() and bpf_seq_printf_btf() helpers to support displaying BTF-based kernel data structures out of BPF programs, from Alan Maguire.

2) Speed up RCU tasks trace grace periods by a factor of 50 & fix a few race conditions exposed by it. It was discussed to take these via BPF and networking tree to get better testing exposure, from Paul E. McKenney.

3) Support multi-attach for freplace programs, needed for incremental attachment of multiple XDP progs using libxdp dispatcher model, from Toke Høiland-Jørgensen.

4) libbpf support for appending new BTF types at the end of BTF object, allowing intrusive changes of prog's BTF (useful for future linking), from Andrii Nakryiko.

5) Several BPF helper improvements e.g. avoid atomic op in cookie generator and add a redirect helper into neighboring subsys, from Daniel Borkmann.

6) Allow map updates on sockmaps from bpf_iter context in order to migrate sockmaps from one to another, from Lorenz Bauer.

7) Fix 32 bit to 64 bit assignment from latest alu32 bounds tracking which caused a verifier issue due to type downgrade to scalar, from John Fastabend.

8) Follow-up on tail-call support in BPF subprogs which optimizes x64 JIT prologue and epilogue sections, from Maciej Fijalkowski.

9) Add an option to perf RB map to improve sharing of event entries by avoiding remove-on-close behavior. Also, add BPF_PROG_TEST_RUN for raw_tracepoint, from Song Liu.

10) Fix a crash in AF_XDP's socket_release when memory allocation for UMEMs fails, from Magnus Karlsson.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 | bpf: Fix "unresolved symbol" build error with resolve_btfids | Yonghong Song
Michal reported a build failure like the one below: BTFIDS vmlinux FAILED unresolved symbol tcp_timewait_sock make[1]: *** [/.../linux-5.9-rc7/Makefile:1176: vmlinux] Error 255 This error can be triggered when the config has CONFIG_NET enabled but CONFIG_INET disabled. In this case, there is no user of structs inet_timewait_sock and tcp_timewait_sock and hence vmlinux BTF types are not generated for these two structures. To fix the problem, let us force BTF generation for these two structures with BTF_TYPE_EMIT. Fixes: fce557bcef11 ("bpf: Make btf_sock_ids global") Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201001051339.2549085-1-yhs@fb.com
2020-09-30 | drop_monitor: Filter control packets in drop monitor | Ido Schimmel
Previously, devlink called into drop monitor in order to report hardware originated drops / exceptions. devlink intentionally filtered control packets and did not pass them to drop monitor as they were not dropped by the underlying hardware. Now drop monitor registers its probe on a generic 'devlink_trap_report' tracepoint and should therefore perform this filtering itself instead of having devlink do that. Add the trap type as metadata and have drop monitor ignore control packets. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | drop_monitor: Remove duplicate struct | Ido Schimmel
'struct net_dm_hw_metadata' is a duplicate of 'struct devlink_trap_metadata'. Remove the former and simplify the code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | drop_monitor: Remove no longer used functions | Ido Schimmel
The old probe functions that were invoked by drop monitor code are no longer called and can thus be removed. They were replaced by actual probe functions that are registered on the recently introduced 'devlink_trap_report' tracepoint. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | drop_monitor: Convert to using devlink tracepoint | Ido Schimmel
Convert drop monitor to use the recently introduced 'devlink_trap_report' tracepoint instead of having devlink call into drop monitor. This is both consistent with software originated drops ('kfree_skb' tracepoint) and also allows drop monitor to be built as a module and still report hardware originated drops. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | drop_monitor: Prepare probe functions for devlink tracepoint | Ido Schimmel
Drop monitor supports two alerting modes: Summary and packet. Prepare a probe function for each, so that they could be later registered on the devlink tracepoint by calling register_trace_devlink_trap_report(), based on the configured alerting mode. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | devlink: Add a tracepoint for trap reports | Ido Schimmel
Add a tracepoint for trap reports so that drop monitor could register its probe on it. Use trace_devlink_trap_report_enabled() to avoid wasting cycles setting the trap metadata if the tracepoint is not enabled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 | bpf: Add redirect_neigh helper as redirect drop-in | Daniel Borkmann
Add a redirect_neigh() helper as redirect() drop-in replacement for the xmit side. Main idea for the helper is to be very similar in semantics to the latter just that the skb gets injected into the neighboring subsystem in order to let the stack do the work it knows best anyway to populate the L2 addresses of the packet and then hand over to dev_queue_xmit() as redirect() does. This solves two bigger items: i) skbs don't need to go up to the stack on the host facing veth ingress side for traffic egressing the container to achieve the same for populating L2 which also has the huge advantage that ii) the skb->sk won't get orphaned in ip_rcv_core() when entering the IP routing layer on the host stack. Given that skb->sk neither gets orphaned when crossing the netns as per 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb.") the helper can then push the skbs directly to the phys device where FQ scheduler can do its work and TCP stack gets proper backpressure given we hold on to skb->sk as long as skb is still residing in queues. With the helper used in BPF data path to then push the skb to the phys device, I observed a stable/consistent TCP_STREAM improvement on veth devices for traffic going container -> host -> host -> container from ~10Gbps to ~15Gbps for a single stream in my test environment. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net
2020-09-30 | bpf, net: Rework cookie generator as per-cpu one | Daniel Borkmann
With its use in BPF, the cookie generator can be called very frequently in particular when used out of cgroup v2 hooks (e.g. connect / sendmsg) and attached to the root cgroup, for example, when used in v1/v2 mixed environments. In particular, when there's a high churn on sockets in the system there can be many parallel requests to the bpf_get_socket_cookie() and bpf_get_netns_cookie() helpers which then cause contention on the atomic counter. As similarly done in f991bd2e1421 ("fs: introduce a per-cpu last_ino allocator"), add a small helper library that both can use for the 64 bit counters. Given this can be called from different contexts, we also need to deal with potential nested calls even though in practice they are considered extremely rare. One idea as suggested by Eric Dumazet was to use a reverse counter for this situation since we don't expect 64 bit overflows anyways; that way, we can avoid bigger gaps in the 64 bit counter space compared to just batch-wise increase. Even on machines with small number of cores (e.g. 4) the cookie generation shrinks from min/max/med/avg (ns) of 22/50/40/38.9 down to 10/35/14/17.3 when run in parallel from multiple CPUs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Link: https://lore.kernel.org/bpf/8a80b8d27d3c49f9a14e1d5213c19d8be87d1dc8.1601477936.git.daniel@iogearbox.net
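The batching idea behind the per-cpu cookie generator can be sketched in userspace C11. This is a model under stated assumptions, not the kernel helper library: `struct shared_gen`, `struct local_gen`, `cookie_gen_init()`, `cookie_next()` and the batch size of 4096 are all names and values chosen for this illustration, and the caller is assumed to keep one `struct local_gen` per CPU (or thread) so that the local fields are never accessed concurrently.

```c
#include <stdatomic.h>
#include <stdint.h>

#define BATCH 4096 /* cookies reserved per refill; must be a power of two */

/* State shared by all CPUs; touched only on refill or on nested calls. */
struct shared_gen {
	atomic_uint_fast64_t forward_last; /* start of the next free batch */
	atomic_uint_fast64_t reverse_last; /* counts down for nested calls */
};

/* Per-CPU state; assumed to be used by one execution context at a time. */
struct local_gen {
	int nesting;   /* greater than 0 while a call is in progress */
	uint64_t last; /* last cookie handed out from the local batch */
};

void cookie_gen_init(struct shared_gen *g)
{
	atomic_init(&g->forward_last, 0);
	atomic_init(&g->reverse_last, INT64_MAX);
}

uint64_t cookie_next(struct shared_gen *g, struct local_gen *l)
{
	uint64_t val;

	if (++l->nesting == 1) {
		val = l->last;
		/* Local batch used up: reserve the next BATCH values. */
		if ((val & (BATCH - 1)) == 0)
			val = atomic_fetch_add(&g->forward_last, BATCH);
		l->last = ++val;
	} else {
		/* Nested call: take a value from the top of the space. */
		val = atomic_fetch_sub(&g->reverse_last, 1) - 1;
	}
	l->nesting--;
	return val;
}
```

In the common path only every 4096th call touches the shared atomic, which is where the reduced contention comes from. The reverse counter keeps nested calls from consuming a whole batch, so the forward counter space does not develop large gaps.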
2020-09-30 | bpf: Add classid helper only based on skb->sk | Daniel Borkmann
Similarly to 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks"), add a helper to retrieve cgroup v1 classid solely based on the skb->sk, so it can be used as key as part of BPF map lookups out of tc from host ns, in particular given the skb->sk is retained these days when crossing net ns thanks to 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). This is similar to bpf_skb_cgroup_id() which implements the same for v2. Kubernetes ecosystem is still operating on v1 however, hence net_cls needs to be used there until this can be dropped in with the v2 helper of bpf_skb_cgroup_id(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net
2020-09-29 | net: Add netif_rx_any_context() | Sebastian Andrzej Siewior
Quite a few drivers make conditional decisions based on in_interrupt() to invoke either netif_rx() or netif_rx_ni(). Conditionals based on in_interrupt() or other variants of preempt count checks in drivers should not exist for various reasons, and Linus clearly requested to either split the code paths or pass an argument to the common functions which provides the context. This is obviously the correct solution, but for some of the affected drivers this needs a major rewrite due to their convoluted structure. As in_interrupt() usage in drivers needs to be phased out, provide netif_rx_any_context() as a stopgap for these drivers. This confines the in_interrupt() conditional to core code, which in turn allows removing access to this check from driver code and provides one central place to do further modifications once the driver maze is cleaned up. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 | bpf: sockmap: Enable map_update_elem from bpf_iter | Lorenz Bauer
Allow passing a pointer to a BTF struct sock_common* when updating a sockmap or sockhash. Since BTF pointers can fault and therefore be NULL at runtime we need to add an additional !sk check to sock_map_update_elem. Since we may be passed a request or timewait socket we also need to check sk_fullsock. Doing this allows calling map_update_elem on sockmap from bpf_iter context, which uses BTF pointers. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200928090805.23343-2-lmb@cloudflare.com
2020-09-28 | net: core: add nested_level variable in net_device | Taehee Yoo
This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock.

netif_addr_lock() can be called recursively so spin_lock_nested() is used instead of spin_lock() and dev->lower_level is used as a parameter of spin_lock_nested(). But, dev->lower_level value can be updated while it is being used. So, lockdep would warn a possible deadlock scenario.

When a stacked interface is deleted, netif_{uc | mc}_sync() is called recursively. So, spin_lock_nested() is called recursively too. At this moment, the dev->lower_level variable is used as a parameter of it. dev->lower_level value is updated when interfaces are being unlinked/linked immediately. Thus, after unlinking, dev->lower_level shouldn't be a parameter of spin_lock_nested().

    A (macvlan)
    |
    B (vlan)
    |
    C (bridge)
    |
    D (macvlan)
    |
    E (vlan)
    |
    F (bridge)

    A->lower_level : 6
    B->lower_level : 5
    C->lower_level : 4
    D->lower_level : 3
    E->lower_level : 2
    F->lower_level : 1

When an interface 'A' is removed, it releases resources. At this moment, netif_addr_lock() would be called. Then, netdev_upper_dev_unlink() is called recursively. Then dev->lower_level is updated. There is no problem.

But, when the bridge module is removed, 'C' and 'F' interfaces are removed at once. If 'F' is removed first, a lower_level value is like below.

    A->lower_level : 5
    B->lower_level : 4
    C->lower_level : 3
    D->lower_level : 2
    E->lower_level : 1
    F->lower_level : 1

Then, 'C' is removed. At this moment, netif_addr_lock() is called recursively. The ordering is like this.

    C(3)->D(2)->E(1)->F(1)

At this moment, the lower_level value of 'E' and 'F' are the same. So, lockdep warns a possible deadlock scenario.

In order to avoid this problem, a new variable 'nested_level' is added. This value is the same as dev->lower_level - 1. But this value is updated in rtnl_unlock(). So, this variable can be used as a parameter of spin_lock_nested() safely in the rtnl context.

Test commands:

    ip link add br0 type bridge vlan_filtering 1
    ip link add vlan1 link br0 type vlan id 10
    ip link add macvlan2 link vlan1 type macvlan
    ip link add br3 type bridge vlan_filtering 1
    ip link set macvlan2 master br3
    ip link add vlan4 link br3 type vlan id 10
    ip link add macvlan5 link vlan4 type macvlan
    ip link add br6 type bridge vlan_filtering 1
    ip link set macvlan5 master br6
    ip link add vlan7 link br6 type vlan id 10
    ip link add macvlan8 link vlan7 type macvlan
    ip link set br0 up
    ip link set vlan1 up
    ip link set macvlan2 up
    ip link set br3 up
    ip link set vlan4 up
    ip link set macvlan5 up
    ip link set br6 up
    ip link set vlan7 up
    ip link set macvlan8 up
    modprobe -rv bridge

Splat looks like:

    [ 36.057436][ T744] WARNING: possible recursive locking detected
    [ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted
    [ 36.059959][ T744] --------------------------------------------
    [ 36.061391][ T744] ip/744 is trying to acquire lock:
    [ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30
    [ 36.064922][ T744]
    [ 36.064922][ T744] but task is already holding lock:
    [ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
    [ 36.068851][ T744]
    [ 36.068851][ T744] other info that might help us debug this:
    [ 36.070731][ T744] Possible unsafe locking scenario:
    [ 36.070731][ T744]
    [ 36.072497][ T744] CPU0
    [ 36.073238][ T744] ----
    [ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key);
    [ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key);
    [ 36.076590][ T744]
    [ 36.076590][ T744] *** DEADLOCK ***
    [ 36.076590][ T744]
    [ 36.078515][ T744] May be due to missing lock nesting notation
    [ 36.078515][ T744]
    [ 36.080491][ T744] 3 locks held by ip/744:
    [ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490
    [ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
    [ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80
    [ 36.088400][ T744]
    [ 36.088400][ T744] stack backtrace:
    [ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728
    [ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
    [ 36.093630][ T744] Call Trace:
    [ 36.094416][ T744] dump_stack+0x77/0x9b
    [ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40
    [ 36.096522][ T744] lock_acquire+0xb4/0x3b0
    [ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30
    [ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30
    [ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0
    [ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20
    [ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0
    [ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70
    [ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30
    [ 36.105461][ T744] dev_set_rx_mode+0x19/0x30
    [ 36.106532][ T744] dev_set_promiscuity+0x36/0x50
    [ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0
    [ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50
    [ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge]
    [ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge]
    [ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0
    [ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50
    [ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90
    [ 36.116249][ T744] dev_uc_sync+0x70/0x80
    [ 36.117244][ T744] dev_uc_add+0x50/0x60
    [ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan]
    [ 36.119470][ T744] __dev_open+0xd6/0x170
    [ 36.120470][ T744] __dev_change_flags+0x181/0x1d0
    [ 36.121644][ T744] dev_change_flags+0x23/0x60
    [ 36.122741][ T744] do_setlink+0x30a/0x11e0
    [ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40
    [ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0
    [ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40
    [ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0
    [ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0
    [ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850
    [ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40
    [ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0
    [ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0
    [ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70
    [ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60
    [ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380
    [ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70
    [ 36.139164][ T744] rtnl_newlink+0x47/0x70
    [ ... ]

Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 | net: core: introduce struct netdev_nested_priv for nested interface infrastructure | Taehee Yoo
Functions related to nested interface infrastructure such as netdev_walk_all_{ upper | lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. In the following patch, a new member variable will be added into this struct to fix the lockdep issue. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 | net: core: add __netdev_upper_dev_unlink() | Taehee Yoo
netdev_upper_dev_unlink() has to work differently depending on flags, following the same idea as __netdev_upper_dev_link(). In the following patches, new flags will be added. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26 | net: flow_dissector: avoid indirect call to DSA .flow_dissect for generic case | Vladimir Oltean
With the recent mitigations against speculative execution exploits, indirect function calls are more expensive and it would be good to avoid them where possible. In the case of DSA, most switch taggers will shift the EtherType and next headers by a fixed amount equal to that tag's length in bytes. So we can use a generic procedure to determine that, without calling into custom tagger code. However we still leave the flow_dissect method inside struct dsa_device_ops as an override for the generic function. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
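The generic procedure described in this commit can be modelled in userspace as follows. This is a sketch under assumptions, not the kernel function: `struct tagger`, `generic_dissect()` and `dissect()` are hypothetical names, the tag is assumed to sit between the source MAC address and the real EtherType, `data` is assumed to point just past the 14 bytes that were parsed as the Ethernet header, and `tag_len` is assumed to be at least 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a switch tagger. */
struct tagger {
	size_t tag_len; /* length of the switch tag in bytes */
	/* Optional override for taggers that do not fit the generic case. */
	void (*flow_dissect)(const uint8_t *data, uint16_t *proto,
			     size_t *offset);
};

/*
 * Generic case: the tag shifts the EtherType and the following headers
 * by a fixed tag_len bytes. The real EtherType therefore sits in the two
 * bytes just before data + tag_len, and the next header starts at
 * data + tag_len. The EtherType is returned in host byte order.
 */
void generic_dissect(const struct tagger *t, const uint8_t *data,
		     uint16_t *proto, size_t *offset)
{
	*offset = t->tag_len;
	*proto = (uint16_t)(data[t->tag_len - 2] << 8 | data[t->tag_len - 1]);
}

/* Use the override only when a tagger supplies one. */
void dissect(const struct tagger *t, const uint8_t *data,
	     uint16_t *proto, size_t *offset)
{
	if (t->flow_dissect)
		t->flow_dissect(data, proto, offset);
	else
		generic_dissect(t, data, proto, offset);
}
```

For taggers without an override, the only indirect call left in this model is the NULL check on the function pointer, which mirrors the motivation given in the commit message.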
2020-09-26 | net: dsa: make the .flow_dissect tagger callback return void | Vladimir Oltean
There is no tagger that returns anything other than zero, so just change the return type appropriately. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 | devlink: introduce flash update overwrite mask | Jacob Keller
Sections of device flash may contain settings or device identifying information. When performing a flash update, it is generally expected that these settings and identifiers are not overwritten. However, it may sometimes be useful to allow overwriting these fields when performing a flash update. Some examples include, 1) customizing the initial device config on first programming, such as overwriting default device identifying information, or 2) reverting a device configuration to known good state provided in the new firmware image, or 3) in case it is suspected that current firmware logic for managing the preservation of fields during an update is broken. Although some devices are able to completely separate these types of settings and fields into separate components, this is not true for all hardware. To support controlling this behavior, a new DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an nla_bitfield32 which will define what subset of fields in a component should be overwritten during an update. If no bits are specified, or if the overwrite mask is not provided, then an update should not overwrite anything, and should maintain the settings and identifiers as they are in the previous image. If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set, then the device should be configured to overwrite any of the settings in the requested component with settings found in the provided image. Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the device should be configured to overwrite any device identifiers in the requested component with the identifiers from the image. Multiple overwrite modes may be combined to indicate that a combination of the sets of fields should be overwritten. Drivers which support the new overwrite mask must set the DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the supported_flash_update_params field of their devlink_ops. 
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
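The overwrite-mask semantics described in this commit can be sketched as a small bitmask decoder. The macro names and bit positions below are hypothetical local stand-ins chosen for the example, not the UAPI definitions; the point is only the rule that an empty or absent mask preserves everything and each set bit opts one class of fields into being overwritten.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local stand-ins for the two overwrite bits named above. */
#define OVERWRITE_SETTINGS    (1u << 0)
#define OVERWRITE_IDENTIFIERS (1u << 1)

/* What a driver should preserve from the previous image. */
struct preserve_plan {
	bool keep_settings;
	bool keep_identifiers;
};

/*
 * Decode an overwrite mask. A mask of 0 (no bits set, or no mask
 * provided by the user) keeps both settings and identifiers.
 */
struct preserve_plan plan_from_mask(uint32_t overwrite_mask)
{
	struct preserve_plan p = {
		.keep_settings = !(overwrite_mask & OVERWRITE_SETTINGS),
		.keep_identifiers = !(overwrite_mask & OVERWRITE_IDENTIFIERS),
	};
	return p;
}
```

Combining both bits in one mask requests that settings and identifiers are both taken from the new image.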
2020-09-25 | devlink: convert flash_update to use params structure | Jacob Keller
The devlink core recently gained support for checking whether the driver supports a flash_update parameter, via `supported_flash_update_params`. However, parameters are specified as function arguments. Adding a new parameter still requires modifying the signature of the .flash_update callback in all drivers. Convert the .flash_update function to take a new `struct devlink_flash_update_params` instead. By using this structure, and the `supported_flash_update_params` bit field, a new parameter to flash_update can be added without requiring modification to existing drivers. As before, all parameters except file_name will require driver opt-in. Because file_name is necessary for the flash_update to make sense, no "SUPPORTED" bitflag is provided and it is always considered valid. All future additional parameters will require a new bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25devlink: check flash_update parameter support in net coreJacob Keller
When implementing .flash_update, drivers which do not support per-component update are manually checking the component parameter to verify that it is NULL. Without this check, the driver might accept an update request with a component specified even though it will not honor such a request. Instead of having each driver check this, move the logic into net/core/devlink.c, and use a new `supported_flash_update_params` field in the devlink_ops. Drivers which will support per-component update must now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in the supported_flash_update_params in their devlink_ops. This helps ensure that drivers do not forget to check for a NULL component if they do not support per-component update. This also enables a slightly better error message by enabling the core stack to set the netlink bad attribute message to indicate precisely the unsupported attribute in the message. Going forward, any new additional parameter to flash update will require a bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Cc: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
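The centralized check can be sketched in userspace as follows. This is only a model under stated assumptions: `SUPPORT_COMPONENT`, `struct demo_ops`, and `core_check_flash_params()` are hypothetical stand-ins for DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT, devlink_ops, and the logic in net/core/devlink.c, and the choice of `-EOPNOTSUPP` as the error code is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative opt-in bit (assumption); the kernel names it DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT. */
#define SUPPORT_COMPONENT (1u << 0)

/* Hypothetical stand-in for the relevant part of devlink_ops. */
struct demo_ops {
	uint32_t supported_flash_update_params;
};

/* Core-side check: reject a component unless the driver opted in. */
static int core_check_flash_params(const struct demo_ops *ops,
				   const char *component)
{
	if (component &&
	    !(ops->supported_flash_update_params & SUPPORT_COMPONENT))
		return -EOPNOTSUPP;

	return 0;
}
```

With the check in one place, a driver that never sets the bit cannot accidentally accept a component it would ignore.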
2020-09-25bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_sk_assign() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL". Meaning it specifically takes a literal NULL. ARG_PTR_TO_BTF_ID_SOCK_COMMON does not allow a literal NULL, so another ARG type is required for this purpose and another follow-up patch can be used if there is such need. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000415.3857374-1-kafai@fb.com
2020-09-25bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_tcp_*_syncookie() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000409.3856725-1-kafai@fb.com
2020-09-25bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_sk_storage_*() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64-byte packets. The stats from "kernel.bpf_stats_enabled" show no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com
2020-09-25bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
The previous patch allows the networking bpf prog to use the bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer, e.g. "struct tcp_sock *". It allows the bpf prog to read all the fields of the tcp_sock. This patch changes the bpf_sk_release() and bpf_sk_*cgroup_id() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. For example, the following will work:
	sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0);
	if (!sk)
		return;
	tp = bpf_skc_to_tcp_sock(sk);
	if (!tp) {
		bpf_sk_release(sk);
		return;
	}
	lsndtime = tp->lsndtime;
	/* Passing tp to bpf_sk_release() will also work */
	bpf_sk_release(tp);
Since PTR_TO_BTF_ID could be NULL, the helper taking ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime. A btf_id of "struct sock" may not always mean a fullsock. Regardless of whether the helper's running context can see a non-fullsock, the fullsock check/handling is pretty cheap, so it is better to keep the same verifier expectation: a helper that takes ARG_PTR_TO_BTF_ID* must be able to handle the minisock situation. In the bpf_sk_*cgroup_id() case, it will try to get a fullsock by using sk_to_full_sk() as its skb variant bpf_sk"b"_*cgroup_id() has already been doing. bpf_sk_release can already handle minisock, so nothing special has to be done. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000356.3856047-1-kafai@fb.com
2020-09-25bpf: Enable bpf_skc_to_* sock casting helper to networking prog typeMartin KaFai Lau
There is a constant need to add more fields into the bpf_tcp_sock for the bpf programs running at tc, sock_ops...etc. A current workaround could be to use bpf_probe_read_kernel(). However, other than making another helper call for reading each field and missing CO-RE, it is also not as intuitive to use as directly reading "tp->lsndtime" for example. While already having perfmon cap to do bpf_probe_read_kernel(), it will be much easier if the bpf prog can directly read from the tcp_sock. This patch tries to do that by using the existing casting-helpers bpf_skc_to_*() whose func_proto returns a btf_id. For example, the func_proto of bpf_skc_to_tcp_sock returns the btf_id of the kernel "struct tcp_sock". These helpers are also added to is_ptr_cast_function(). This ensures the returning reg (BPF_REG_0) will also carry the ref_obj_id, which keeps the ref-tracking working properly. The bpf_skc_to_* helpers are made available to most of the bpf prog types in filter.c. The bpf_skc_to_* helpers will be limited by perfmon cap. This patch adds an ARG_PTR_TO_BTF_ID_SOCK_COMMON. The helper accepting this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON]) or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON). The bpf_skc_to_*() helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will accept a pointer obtained from skb->sk. Instead of specifying both arg_type and arg_btf_id in the same func_proto, which is how the current ARG_PTR_TO_BTF_ID does it, the arg_btf_id of the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the compatible_reg_types[] in verifier.c. The reason is that the arg_btf_id is always the same. Discussion in this thread: https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/ The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is expecting a PTR_TO_BTF_ID which could be NULL. This is the same behavior as the existing helper taking ARG_PTR_TO_BTF_ID.
The _SOCK_COMMON part means the helper is also expecting the legacy SOCK_COMMON pointer. By excluding the _OR_NULL part, the bpf prog cannot call the helper with a literal NULL, which doesn't make sense in most cases. e.g. bpf_skc_to_tcp_sock(NULL) will be rejected. Every PTR_TO_*_OR_NULL reg has to be NULL-checked before it is passed into the helper, or else the bpf prog will be rejected. This behavior is nothing new and is consistent with the current expectation during bpf-prog-load. [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace ARG_PTR_TO_SOCK* of other existing helpers later such that those existing helpers can take the PTR_TO_BTF_ID returned by the bpf_skc_to_*() helpers. The only special case is bpf_sk_lookup_assign() which can accept a literal NULL ptr. It has to be handled specially in another follow up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ] [ When converting the older helpers that take ARG_PTR_TO_SOCK* in the later patch, if the kernel does not support BTF, ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON because no reg->type could have PTR_TO_BTF_ID in this case. It is not a concern for the newer-btf-only helper like the bpf_skc_to_*() here though because these helpers must require BTF vmlinux to begin with. ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200925000350.3855720-1-kafai@fb.com
2020-09-24mptcp: add sk_stop_timer_sync helperGeliang Tang
This patch adds a new helper, sk_stop_timer_sync(), which deactivates a timer like sk_stop_timer() but waits for the handler to finish. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: fix a new kernel-doc warning at dev.cMauro Carvalho Chehab
kernel-doc expects the function prototype to be just after the kernel-doc markup, as otherwise it will get it all wrong: ./net/core/dev.c:10036: warning: Excess function parameter 'dev' description in 'WAIT_REFS_MIN_MSECS' Fixes: 0e4be9e57e8c ("net: use exponential backoff in netdev_wait_allrefs") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-09-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 95 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 4211 insertions(+), 2040 deletions(-). The main changes are: 1) Full multi function support in libbpf, from Andrii. 2) Refactoring of function argument checks, from Lorenz. 3) Make bpf_tail_call compatible with functions (subprograms), from Maciej. 4) Program metadata support, from YiFei. 5) bpf iterator optimizations, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22devlink: Enhance policy to validate port type input valueParav Pandit
Use the range checking facility of nla_policy to validate the port type attribute input value. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22devlink: Enhance policy to validate eswitch mode valueParav Pandit
Use the range checking facility of nla_policy to validate the eswitch mode input attribute value. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
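What a declarative range policy buys over open-coded checks can be shown with a small userspace model. The kernel expresses this with policy macros such as NLA_POLICY_RANGE; the `struct range_policy` and `validate_range()` names below are hypothetical, and the bounds and the `-ERANGE` return value are illustrative assumptions rather than the values used by devlink.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a per-attribute policy carrying inclusive bounds. */
struct range_policy {
	uint16_t min;
	uint16_t max;
};

/* Generic validation: one routine serves every attribute that declares a range. */
static int validate_range(const struct range_policy *policy, uint16_t value)
{
	if (value < policy->min || value > policy->max)
		return -ERANGE;

	return 0;
}
```

The handler for the attribute then no longer needs its own bounds check, since out-of-range input is rejected before the handler runs.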
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing its initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21bpf: Using rcu_read_lock for bpf_sk_storage_map iteratorYonghong Song
If a bucket contains a lot of sockets, then while bpf_iter is traversing that bucket, concurrent userspace bpf_map_update_elem() and bpf program bpf_sk_storage_{get,delete}() may experience some undesirable delays, as they will compete with bpf_iter for the bucket lock. Note that the number of buckets for bpf_sk_storage_map is roughly the same as the number of cpus. So if there are lots of sockets in the system, each bucket could contain lots of sockets. Different actual use cases may experience different delays. Here, using selftest bpf_iter subtest bpf_sk_storage_map, I hacked the kernel with ktime_get_mono_fast_ns() to collect the time when a bucket was locked during bpf_iter prog traversing that bucket. This way, the maximum incurred delay was measured w.r.t. the number of elements in a bucket.
	# elems in each bucket    delay(ns)
	64                        17000
	256                       72512
	2048                      875246
The potential delays will be further increased if we have even more elements in a bucket. Using rcu_read_lock() is a reasonable compromise here. It may lose some precision, e.g., access stale sockets, but it will not hurt performance of bpf program or user space application which also tries to get/delete or update map elements. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200916224645.720172-1-yhs@fb.com
2020-09-21bpf: Allow specifying a BTF ID per argument in function protosLorenz Bauer
Function prototypes using ARG_PTR_TO_BTF_ID currently use two ways to signal which BTF IDs are acceptable. First, bpf_func_proto.btf_id is an array of IDs, one for each argument. This array is only accessed up to the highest numbered argument that uses ARG_PTR_TO_BTF_ID and may therefore be less than five arguments long. It usually points at a BTF_ID_LIST. Second, check_btf_id is a function pointer that is called by the verifier if present. It gets the actual BTF ID of the register, and the argument number we're currently checking. It turns out that the only user, check_arg_btf_id, ignores the argument and is simply used to check whether the BTF ID has a struct sock_common at its start. Replace both of these mechanisms with an explicit BTF ID for each argument in a function proto. Thanks to btf_struct_ids_match this is very flexible: check_arg_btf_id can be replaced by requiring struct sock_common. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200921121227.255763-5-lmb@cloudflare.com
2020-09-21net-sysfs: add backlog len and CPU id to softnet dataPaolo Abeni
Currently the backlog status is not exposed to user-space. Since long backlogs (or simply non-empty ones) can be a source of noticeable latency, -RT deployments need some way to inspect them. Additionally, there isn't a direct match between 'softnet_stat' lines and the related CPU - sd for offline CPUs are not dumped - so this patch also includes the CPU id in each such entry. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-20net: remove unnecessary NULL checking in napi_consume_skb()Yunsheng Lin
When budget is non-zero, skb_unref() has already handled the NULL checking. When budget is zero, dev_consume_skb_any() handles the NULL check either in __dev_kfree_skb_irq() or in dev_kfree_skb(), which also ultimately calls skb_unref(). So remove the unnecessary checking in napi_consume_skb(). Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18net: devlink: region: Pass the region ops to the snapshot functionAndrew Lunn
Pass the region to be snapshotted to the function performing the snapshot. This allows one function to operate on numerous regions. v4: Add missing kerneldoc for ICE Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18net: core: delete duplicated wordsRandy Dunlap
Drop repeated words in net/core/. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18devlink: collect flash notify params into a structShannon Nelson
The dev flash status notify function parameter lists are getting rather long, so add a struct to be filled and passed rather than continuously changing the function signatures. Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18devlink: add timeout information to status_notifyShannon Nelson
Add a timeout element to the DEVLINK_CMD_FLASH_UPDATE_STATUS netlink message for use by a userland utility to show that a particular firmware flash activity may take a long but bounded time to finish. Also add a handy helper for drivers to make use of the new timeout value. UI usage hints:
 - if non-zero, add timeout display to the end of the status line
	[component] status_msg ( Xm Ys : Am Bs )
   using the timeout value for Am Bs and updating the Xm Ys every second
 - if the timeout expires while awaiting the next update, display something like
	[component] status_msg ( timeout reached : Am Bs )
 - if new status notify messages are received, remove the timeout and start over
Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
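The UI hints above can be sketched as a small formatting routine for a userland tool. This is a hedged illustration, not the devlink utility's code: `format_flash_status()` and its parameters are hypothetical, and the caller is assumed to track elapsed seconds itself and to reset them when a new status message arrives.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one status line following the hints in the commit message.
 * elapsed and timeout are in seconds; timeout == 0 means no timeout display.
 * Returns the snprintf() result.
 */
static int format_flash_status(char *buf, size_t len, const char *component,
			       const char *status_msg, unsigned int elapsed,
			       unsigned int timeout)
{
	if (!timeout)
		return snprintf(buf, len, "[%s] %s", component, status_msg);

	if (elapsed >= timeout)
		return snprintf(buf, len,
				"[%s] %s ( timeout reached : %um %us )",
				component, status_msg,
				timeout / 60, timeout % 60);

	return snprintf(buf, len, "[%s] %s ( %um %us : %um %us )",
			component, status_msg,
			elapsed / 60, elapsed % 60,
			timeout / 60, timeout % 60);
}
```

A tool would call this once per second with an incremented `elapsed` until the next notification resets the counter.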