summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2020-02-27bpf: INET_DIAG support in bpf_sk_storageMartin KaFai Lau
This patch adds INET_DIAG support to bpf_sk_storage. 1. Although this series adds bpf_sk_storage diag capability to inet sk, bpf_sk_storage is in general applicable to all fullsock. Hence, the bpf_sk_storage logic will operate on SK_DIAG_* nlattr. The caller will pass in its specific nesting nlattr (e.g. INET_DIAG_*) as the argument. 2. The request will be like: INET_DIAG_REQ_SK_BPF_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) ...... Considering there could have multiple bpf_sk_storages in a sk, instead of reusing INET_DIAG_INFO ("ss -i"), the user can select some specific bpf_sk_storage to dump by specifying an array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD. If no SK_DIAG_BPF_STORAGE_REQ_MAP_FD is specified (i.e. an empty INET_DIAG_REQ_SK_BPF_STORAGES), it will dump all bpf_sk_storages of a sk. 3. The reply will be like: INET_DIAG_BPF_SK_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) ...... 4. Unlike other INET_DIAG info of a sk which is pretty static, the size required to dump the bpf_sk_storage(s) of a sk is dynamic as the system adding more bpf_sk_storage_map. It is hard to set a static min_dump_alloc size. Hence, this series learns it at the runtime and adjust the cb->min_dump_alloc as it iterates all sk(s) of a system. The "unsigned int *res_diag_size" in bpf_sk_storage_diag_put() is for this purpose. The next patch will update the cb->min_dump_alloc as it iterates the sk(s). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230421.1975729-1-kafai@fb.com
2020-02-27NFC: Replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: dsa: propagate resolved link config via mac_link_up()Russell King
Propagate the resolved link configuration down via DSA's phylink_mac_link_up() operation to allow split PCS/MAC to work. Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25devlink: extend devlink_trap_report() to accept cookie and passJiri Pirko
Add cookie argument to devlink_trap_report() allowing driver to pass in the user cookie. Pass on the cookie down to drop monitor code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25drop_monitor: extend by passing cookie from driverJiri Pirko
If driver passed along the cookie, push it through Netlink. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25devlink: add trap metadata type for cookieJiri Pirko
Allow driver to indicate cookie metadata for registered traps. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25flow_offload: pass action cookie through offload structuresJiri Pirko
Extend struct flow_action_entry in order to hold TC action cookie specified by user inserting the action. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24Merge tag 'mac80211-next-for-net-next-2020-02-24' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== A new set of changes: * lots of small documentation fixes, from Jérôme Pouiller * beacon protection (BIGTK) support from Jouni Malinen * some initial code for TID configuration, from Tamizh chelvam * I reverted some new API before it's actually used, because it's wrong to mix controlled port and preauth * a few other cleanups/fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24net: Special handling for IP & MPLS.Martin Varghese
Special handling is needed in bareudp module for IP & MPLS as they support more than one ethertypes. MPLS has 2 ethertypes. 0x8847 for MPLS unicast and 0x8848 for MPLS multicast. While decapsulating MPLS packet from UDP packet the tunnel destination IP address is checked to determine the ethertype. The ethertype of the packet will be set to 0x8848 if the tunnel destination IP address is a multicast IP address. The ethertype of the packet will be set to 0x8847 if the tunnel destination IP address is a unicast IP address. IP has 2 ethertypes.0x0800 for IPV4 and 0x86dd for IPv6. The version field of the IP header tunnelled will be checked to determine the ethertype. This special handling to tunnel additional ethertypes will be disabled by default and can be enabled using a flag called multiproto. This flag can be used only with ethertypes 0x8847 and 0x0800. Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24net: UDP tunnel encapsulation module for tunnelling different protocols like ↵Martin Varghese
MPLS, IP, NSH etc. The Bareudp tunnel module provides a generic L3 encapsulation tunnelling module for tunnelling different protocols like MPLS, IP,NSH etc inside a UDP tunnel. Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24devlink: add ACL generic packet trapsJiri Pirko
Add packet traps that can report packets that were dropped during ACL processing. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24mac80211: Add api to support configuring TID specific configurationTamizh chelvam
Implement drv_set_tid_config api to allow TID specific configuration and drv_reset_tid_config api to reset peer specific TID configuration. This per-TID onfiguration will be applied for all the connected stations when MAC is NULL. Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org> Link: https://lore.kernel.org/r/1579506687-18296-7-git-send-email-tamizhr@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24nl80211: Add support to configure TID specific RTSCTS configurationTamizh chelvam
This patch adds support to configure per TID RTSCTS control configuration to enable/disable through the NL80211_TID_CONFIG_ATTR_RTSCTS_CTRL attribute. Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org> Link: https://lore.kernel.org/r/1579506687-18296-5-git-send-email-tamizhr@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24nl80211: Add support to configure TID specific AMPDU configurationTamizh chelvam
This patch adds support to configure per TID AMPDU control configuration to enable/disable aggregation through the NL80211_TID_CONFIG_ATTR_AMPDU_CTRL attribute. Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org> Link: https://lore.kernel.org/r/1579506687-18296-4-git-send-email-tamizhr@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24nl80211: Add support to configure TID specific retry configurationTamizh chelvam
This patch adds support to configure per TID retry configuration through the NL80211_TID_CONFIG_ATTR_RETRY_SHORT and NL80211_TID_CONFIG_ATTR_RETRY_LONG attributes. This TID specific retry configuration will have more precedence than phy level configuration. Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org> Link: https://lore.kernel.org/r/1579506687-18296-3-git-send-email-tamizhr@codeaurora.org [rebase completely on top of my previous API changes] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24nl80211: modify TID-config APIJohannes Berg
Make some changes to the TID-config API: * use u16 in nl80211 (only, and restrict to using 8 bits for now), to avoid issues in the future if we ever want to use higher TIDs. * reject empty TIDs mask (via netlink policy) * change feature advertising to not use extended feature flags but have own mechanism for this, which simplifies the code * fix all variable names from 'tid' to 'tids' since it's a mask * change to cfg80211_ name prefixes, not ieee80211_ * fix some minor docs/spelling things. Change-Id: Ia234d464b3f914cdeab82f540e018855be580dce Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24nl80211: Add NL command to support TID speicific configurationsTamizh chelvam
Add the new NL80211_CMD_SET_TID_CONFIG command to support data TID specific configuration. Per TID configuration is passed in the nested NL80211_ATTR_TID_CONFIG attribute. This patch adds support to configure per TID noack policy through the NL80211_TID_CONFIG_ATTR_NOACK attribute. Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org> Link: https://lore.kernel.org/r/1579506687-18296-2-git-send-email-tamizhr@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: Support key configuration for Beacon protection (BIGTK)Jouni Malinen
IEEE P802.11-REVmd/D3.0 adds support for protecting Beacon frames using a new set of keys (BIGTK; key index 6..7) similarly to the way group-addressed Robust Management frames are protected (IGTK; key index 4..5). Extend cfg80211 and nl80211 to allow the new BIGTK to be configured. Add an extended feature flag to indicate driver support for the new key index values to avoid array overflows in driver implementations and also to indicate to user space when this functionality is available. Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Link: https://lore.kernel.org/r/20200222132548.20835-2-jouni@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: fix indentation errorsJérôme Pouiller
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-10-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: merge documentations of field "dev"Jérôme Pouiller
The field "dev" was documented on two places. This patch merges the comments. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-8-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: merge documentations of field "debugfsdir"Jérôme Pouiller
The field "privid" is documented twice. Comments were more or less the same. The patch merge them. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-7-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "reg_notifier"Jérôme Pouiller
The field "reg_notifier" was already documented above the definition of struct wiphy. The comment inside the definition of the struct did not bring more information. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-6-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "perm_addr"Jérôme Pouiller
The field "perm_addr" was already documented above the definition of struct wiphy. Comments were almost identical. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-5-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "_net"Jérôme Pouiller
The field "_net" was already documented above the definition of struct wiphy. Both comments were identical. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-4-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "registered"Jérôme Pouiller
Field "registered" was documented three times: twice in the documentation block of struct wiphy and once inside the struct definition. This patch keep only one comment. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-3-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "privid"Jérôme Pouiller
The field "privid" was already documented above the definition of struct wiphy. Comments were not identical, but they said more or less the same thing. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-2-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24cfg80211: drop duplicated documentation of field "probe_resp_offload"Jérôme Pouiller
The field "probe_resp_offload" was already documented above the definition of struct wiphy. Both comments were identical. Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Link: https://lore.kernel.org/r/20200221115604.594035-1-Jerome.Pouiller@silabs.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-24Revert "nl80211: add src and dst addr attributes for control port tx/rx"Johannes Berg
This reverts commit 8c3ed7aa2b9ef666195b789e9b02e28383243fa8. As Jouni points out, there's really no need for this, since the RSN pre-authentication frames are normal data frames, not port control frames (locally). We can still revert this now since it hasn't actually gone beyond -next. Fixes: 8c3ed7aa2b9e ("nl80211: add src and dst addr attributes for control port tx/rx") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200224101910.b746e263287a.I9eb15d6895515179d50964dec3550c9dc784bb93@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-02-21 The following pull-request contains BPF updates for your *net-next* tree. We've added 25 non-merge commits during the last 4 day(s) which contain a total of 33 files changed, 2433 insertions(+), 161 deletions(-). The main changes are: 1) Allow for adding TCP listen sockets into sock_map/hash so they can be used with reuseport BPF programs, from Jakub Sitnicki. 2) Add a new bpf_program__set_attach_target() helper for adding libbpf support to specify the tracepoint/function dynamically, from Eelco Chaudron. 3) Add bpf_read_branch_records() BPF helper which helps use cases like profile guided optimizations, from Daniel Xu. 4) Enable bpf_perf_event_read_value() in all tracing programs, from Song Liu. 5) Relax BTF mandatory check if only used for libbpf itself e.g. to process BTF defined maps, from Andrii Nakryiko. 6) Move BPF selftests -mcpu compilation attribute from 'probe' to 'v3' as it has been observed that former fails in envs with low memlock, from Yonghong Song. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Conflict resolution of ice_virtchnl_pf.c based upon work by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21net: Generate reuseport group ID on group creationJakub Sitnicki
Commit 736b46027eb4 ("net: Add ID (if needed) to sock_reuseport and expose reuseport_lock") has introduced lazy generation of reuseport group IDs that survive group resize. By comparing the identifier we check if BPF reuseport program is not trying to select a socket from a BPF map that belongs to a different reuseport group than the one the packet is for. Because SOCKARRAY used to be the only BPF map type that can be used with reuseport BPF, it was possible to delay the generation of reuseport group ID until a socket from the group was inserted into BPF map for the first time. Now that SOCK{MAP,HASH} can be used with reuseport BPF we have two options, either generate the reuseport ID on map update, like SOCKARRAY does, or allocate an ID from the start when reuseport group gets created. This patch takes the latter approach to keep sockmap free of calls into reuseport code. This streamlines the reuseport_id access as its lifetime now matches the longevity of reuseport object. The cost of this simplification, however, is that we allocate reuseport IDs for all SO_REUSEPORT users. Even those that don't use SOCKARRAY in their setups. With the way identifiers are currently generated, we can have at most S32_MAX reuseport groups, which hopefully is sufficient. If we ever get close to the limit, we can switch an u64 counter like sk_cookie. Another change is that we now always call into SOCKARRAY logic to unlink the socket from the map when unhashing or closing the socket. Previously we did it only when at least one socket from the group was in a BPF map. It is worth noting that this doesn't conflict with sockmap tear-down in case a socket is in a SOCK{MAP,HASH} and belongs to a reuseport group. sockmap tear-down happens first: prot->unhash `- tcp_bpf_unhash |- tcp_bpf_remove | `- while (sk_psock_link_pop(psock)) | `- sk_psock_unlink | `- sock_map_delete_from_link | `- __sock_map_delete | `- sock_map_unref | `- sk_psock_put | `- sk_psock_drop | `- rcu_assign_sk_user_data(sk, NULL) `- inet_unhash `- reuseport_detach_sock `- bpf_sk_reuseport_detach `- WRITE_ONCE(sk->sk_user_data, NULL) Suggested-by: Martin Lau <kafai@fb.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200218171023.844439-10-jakub@cloudflare.com
2020-02-21tcp_bpf: Don't let child socket inherit parent protocol ops on copyJakub Sitnicki
Prepare for cloning listening sockets that have their protocol callbacks overridden by sk_msg. Child sockets must not inherit parent callbacks that access state stored in sk_user_data owned by the parent. Restore the child socket protocol callbacks before it gets hashed and any of the callbacks can get invoked. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200218171023.844439-4-jakub@cloudflare.com
2020-02-21net, sk_msg: Clear sk_user_data pointer on clone if taggedJakub Sitnicki
sk_user_data can hold a pointer to an object that is not intended to be shared between the parent socket and the child that gets a pointer copy on clone. This is the case when sk_user_data points at reference-counted object, like struct sk_psock. One way to resolve it is to tag the pointer with a no-copy flag by repurposing its lowest bit. Based on the bit-flag value we clear the child sk_user_data pointer after cloning the parent socket. The no-copy flag is stored in the pointer itself as opposed to externally, say in socket flags, to guarantee that the pointer and the flag are copied from parent to child socket in an atomic fashion. Parent socket state is subject to change while copying, we don't hold any locks at that time. This approach relies on an assumption that sk_user_data holds a pointer to an object aligned at least 2 bytes. A manual audit of existing users of rcu_dereference_sk_user_data helper confirms our assumption. Also, an RCU-protected sk_user_data is not likely to hold a pointer to a char value or a pathological case of "struct { char c; }". To be safe, warn when the flag-bit is set when setting sk_user_data to catch any future misuses. It is worth considering why clearing sk_user_data unconditionally is not an option. There exist users, DRBD, NVMe, and Xen drivers being among them, that rely on the pointer being copied when cloning the listening socket. Potentially we could distinguish these users by checking if the listening socket has been created in kernel-space via sock_create_kern, and hence has sk_kern_sock flag set. However, this is not the case for NVMe and Xen drivers, which create sockets without marking them as belonging to the kernel. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200218171023.844439-3-jakub@cloudflare.com
2020-02-21cfg80211: remove support for adjacent channel compensationEmmanuel Grumbach
The only driver that used that was iwlwifi and it removed support for this. Remove the feature here as well. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200214232336.a530de38e511.I393bc395f6037c8cca6421ed550e3072dc248aed@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-20net: page_pool: API cleanup and commentsIlias Apalodimas
Functions starting with __ usually indicate those which are exported, but should not be called directly. Update some of those declared in the API and make it more readable. page_pool_unmap_page() and page_pool_release_page() were doing exactly the same thing calling __page_pool_clean_page(). Let's rename __page_pool_clean_page() to page_pool_release_page() and export it in order to show up on perf logs and get rid of page_pool_unmap_page(). Finally rename __page_pool_put_page() to page_pool_put_page() since we can now directly call it from drivers and rename the existing page_pool_put_page() to page_pool_put_full_page() since they do the same thing but the latter is trying to sync the full DMA area. This patch also updates netsec, mvneta and stmmac drivers which use those functions. Suggested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19net: sched: Pass ingress block to tcf_classify_ingressPaul Blakey
On ingress and cls_act qdiscs init, save the block on ingress mini_Qdisc and and pass it on to ingress classification, so it can be used for the looking up a specified chain index. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19net: sched: Introduce ingress classification functionPaul Blakey
TC multi chain configuration can cause offloaded tc chains to miss in hardware after jumping to some chain. In such cases the software should continue from the chain that missed in hardware, as the hardware may have manipulated the packet and updated some counters. Currently a single tcf classification function serves both ingress and egress. However, multi chain miss processing (get tc skb extension on hw miss, set tc skb extension on tc miss) should happen only on ingress. Refactor the code to use ingress classification function, and move setting the tc skb extension from general classification to it, as a prestep for supporting the hw miss scenario. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19net: core: add helper tcp_v6_gso_csum_prepHeiner Kallweit
Several network drivers for chips that support TSO6 share the same code for preparing the TCP header, so let's factor it out to a helper. A difference is that some drivers reset the payload_len whilst others don't do this. This value is overwritten by TSO anyway, therefore the new helper resets it in general. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18devlink: Force enclosing array on binary fmsg dataAya Levin
Add a new API for start/end binary array brackets [] to force array around binary data as required from JSON. With this restriction, re-open API to set binary fmsg data. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-17net: sched: correct flower port blockingJason Baron
tc flower rules that are based on src or dst port blocking are sometimes ineffective due to uninitialized stack data. __skb_flow_dissect() extracts ports from the skb for tc flower to match against. However, the port dissection is not done when when the FLOW_DIS_IS_FRAGMENT bit is set in key_control->flags. All callers of __skb_flow_dissect(), zero-out the key_control field except for fl_classify() as used by the flower classifier. Thus, the FLOW_DIS_IS_FRAGMENT may be set on entry to __skb_flow_dissect(), since key_control is allocated on the stack and may not be initialized. Since key_basic and key_control are present for all flow keys, let's make sure they are initialized. Fixes: 62230715fd24 ("flow_dissector: do not dissect l4 ports for fragments") Co-developed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17net: sched: don't take rtnl lock during flow_action setupVlad Buslov
Refactor tc_setup_flow_action() function not to use rtnl lock and remove 'rtnl_held' argument that is no longer needed. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17net: sched: refactor ct action helpers to require tcf_lockVlad Buslov
In order to remove rtnl lock dependency from flow_action representation translator, change rtnl_dereference() to rcu_dereference_protected() in ct action helpers that provide external access to zone and action values. This is safe to do because the functions are not called from anywhere else outside flow_action infrastructure which was modified to obtain tcf_lock when accessing action data in one of previous patches in the series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17net: sched: refactor police action helpers to require tcf_lockVlad Buslov
In order to remove rtnl lock dependency from flow_action representation translator, change rcu_dereference_bh_rtnl() to rcu_dereference_protected() in police action helpers that provide external access to rate and burst values. This is safe to do because the functions are not called from anywhere else outside flow_action infrastructure which was modified to obtain tcf_lock when accessing action data in one of previous patches in the series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17net: sched: lock action when translating it to flow_action infraVlad Buslov
In order to remove dependency on rtnl lock, take action's tcfa_lock when constructing its representation as flow_action_entry structure. Refactor tcf_sample_get_group() to assume that caller holds tcf_lock and don't take it manually. This callback is only called from flow_action infra representation translator which now calls it with tcf_lock held, so this refactoring is necessary to prevent deadlock. Allocate memory with GFP_ATOMIC flag for ip_tunnel_info copy because tcf_tunnel_info_copy() is only called from flow_action representation infra code with tcf_lock spinlock taken. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-16net/sock.h: fix all kernel-doc warningsRandy Dunlap
Fix all kernel-doc warnings for <net/sock.h>. Fixes these warnings: ../include/net/sock.h:232: warning: Function parameter or member 'skc_addrpair' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_portpair' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_ipv6only' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_net_refcnt' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_v6_daddr' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_v6_rcv_saddr' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_cookie' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_listener' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_tw_dr' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_rcv_wnd' not described in 'sock_common' ../include/net/sock.h:232: warning: Function parameter or member 'skc_tw_rcv_nxt' not described in 'sock_common' ../include/net/sock.h:498: warning: Function parameter or member 'sk_rx_skb_cache' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_wq_raw' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'tcp_rtx_queue' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_tx_skb_cache' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_route_forced_caps' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_txtime_report_errors' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_validate_xmit_skb' not described in 'sock' ../include/net/sock.h:498: warning: Function parameter or member 'sk_bpf_storage' not described in 'sock' ../include/net/sock.h:2024: warning: No description found for return value of 'sk_wmem_alloc_get' ../include/net/sock.h:2035: warning: No description found for return value of 'sk_rmem_alloc_get' ../include/net/sock.h:2046: warning: No description found for return value of 'sk_has_allocations' ../include/net/sock.h:2082: warning: No description found for return value of 'skwq_has_sleeper' ../include/net/sock.h:2244: warning: No description found for return value of 'sk_page_frag' ../include/net/sock.h:2444: warning: Function parameter or member 'tcp_rx_skb_cache_key' not described in 'DECLARE_STATIC_KEY_FALSE' ../include/net/sock.h:2444: warning: Excess function parameter 'sk' description in 'DECLARE_STATIC_KEY_FALSE' ../include/net/sock.h:2444: warning: Excess function parameter 'skb' description in 'DECLARE_STATIC_KEY_FALSE' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-16Merge tag 'mac80211-next-for-net-next-2020-02-14' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== A few big new things: * 802.11 frame encapsulation offload support * more HE (802.11ax) support, including some for 6 GHz band * powersave in hwsim, for better testing Of course as usual there are various cleanups and small fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-14Merge tag 'mac80211-for-net-2020-02-14' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a few fixes: * avoid running out of tracking space for frames that need to be reported to userspace by using more bits * fix beacon handling suppression by adding some relevant elements to the CRC calculation * fix quiet mode in action frames * fix crash in ethtool for virt_wifi and similar * add a missing policy entry * fix 160 & 80+80 bandwidth to take local capabilities into account ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-14mac80211: allow setting queue_len for drivers not using wake_tx_queueJohn Crispin
Currently a mac80211 driver can only set the txq_limit when using wake_tx_queue. Not all drivers use wake_tx_queue. This patch adds a new element to wiphy allowing a driver to set a custom tx_queue_len and the code that will apply it in case it is set. The current default is 1000 which is too low for ath11k when doing HE rates. Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20200211122605.13002-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-14mac80211: Fix setting txpower to zeroBen Greear
With multiple VIFS ath10k, and probably others, tries to find the minimum txpower for all vifs and uses that when setting txpower in the firmware. If a second vif is added and starts to scan, it's txpower is not initialized yet and it set to zero. ath10k had a patch to ignore zero values, but then it is impossible to actually set txpower to zero. So, instead initialize the txpower to INT_MIN in mac80211, and let drivers know that means the power has not been set and so should be ignored. This should fix regression in: commit 88407beb1b1462f706a1950a355fd086e1c450b6 Author: Ryan Hsu <ryanhsu@qca.qualcomm.com> Date: Tue Dec 13 14:55:19 2016 -0800 ath10k: fix incorrect txpower set by P2P_DEVICE interface Tested on ath10k 9984 with ath10k-ct firmware. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20191217183057.24586-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-02-13icmp: introduce helper for nat'd source address in network device contextJason A. Donenfeld
This introduces a helper function to be called only by network drivers that wraps calls to icmp[v6]_send in a conntrack transformation, in case NAT has been used. We don't want to pollute the non-driver path, though, so we introduce this as a helper to be called by places that actually make use of this, as suggested by Florian. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>