linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2021-03-30	net: phy: broadcom: Only advertise EEE for supported modes	Florian Fainelli
	We should not be advertising EEE for modes that we do not support, correct that oversight by looking at the PHY device supported linkmodes. Fixes: 99cec8a4dda2 ("net: phy: broadcom: Allow enabling or disabling of EEE") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	nfp: flower: ignore duplicate merge hints from FW	Yinjun Zhang
	A merge hint message needs some time to process before the merged flow actually reaches the firmware, during which we may get duplicate merge hints if there're more than one packet that hit the pre-merged flow. And processing duplicate merge hints will cost extra host_ctx's which are a limited resource. Avoid the duplicate merge by using hash table to store the sub_flows to be merged. Fixes: 8af56f40e53b ("nfp: flower: offload merge flows") Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	net: let skb_orphan_partial wake-up waiters.	Paolo Abeni
	Currently the mentioned helper can end-up freeing the socket wmem without waking-up any processes waiting for more write memory. If the partially orphaned skb is attached to an UDP (or raw) socket, the lack of wake-up can hang the user-space. Even for TCP sockets not calling the sk destructor could have bad effects on TSQ. Address the issue using skb_orphan to release the sk wmem before setting the new sock_efree destructor. Additionally bundle the whole ownership update in a new helper, so that later other potential users could avoid duplicate code. v1 -> v2: - use skb_orphan() instead of sort of open coding it (Eric) - provide an helper for the ownership change (Eric) Fixes: f6ba8d33cfbb ("netem: fix skb_orphan_partial()") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	sch_htb: fix null pointer dereference on a null new_q	Yunjian Wang
	sch_htb: fix null pointer dereference on a null new_q Currently if new_q is null, the null new_q pointer will be dereference when 'q->offload' is true. Fix this by adding a braces around htb_parent_to_leaf_offload() to avoid it. Addresses-Coverity: ("Dereference after null check") Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	net: qrtr: Fix memory leak on qrtr_tx_wait failure	Loic Poulain
	qrtr_tx_wait does not check for radix_tree_insert failure, causing the 'flow' object to be unreferenced after qrtr_tx_wait return. Fix that by releasing flow on radix_tree_insert failure. Fixes: 5fdeb0d372ab ("net: qrtr: Implement outgoing flow control") Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	net: sched: bump refcount for new action in ACT replace mode	Kumar Kartikeya Dwivedi
	Currently, action creation using ACT API in replace mode is buggy. When invoking for non-existent action index 42, tc action replace action bpf obj foo.o sec <xyz> index 42 kernel creates the action, fills up the netlink response, and then just deletes the action after notifying userspace. tc action show action bpf doesn't list the action. This happens due to the following sequence when ovr = 1 (replace mode) is enabled: tcf_idr_check_alloc is used to atomically check and either obtain reference for existing action at index, or reserve the index slot using a dummy entry (ERR_PTR(-EBUSY)). This is necessary as pointers to these actions will be held after dropping the idrinfo lock, so bumping the reference count is necessary as we need to insert the actions, and notify userspace by dumping their attributes. Finally, we drop the reference we took using the tcf_action_put_many call in tcf_action_add. However, for the case where a new action is created due to free index, its refcount remains one. This when paired with the put_many call leads to the kernel setting up the action, notifying userspace of its creation, and then tearing it down. For existing actions, the refcount is still held so they remain unaffected. Fortunately due to rtnl_lock serialization requirement, such an action with refcount == 1 will not be concurrently deleted by anything else, at best CLS API can move its refcount up and down by binding to it after it has been published from tcf_idr_insert_many. Since refcount is atleast one until put_many call, CLS API cannot delete it. Also __tcf_action_put release path already ensures deterministic outcome (either new action will be created or existing action will be reused in case CLS API tries to bind to action concurrently) due to idr lock serialization. We fix this by making refcount of newly created actions as 2 in ACT API replace mode. A relaxed store will suffice as visibility is ensured only after the tcf_idr_insert_many call. Note that in case of creation or overwriting using CLS API only (i.e. bind = 1), overwriting existing action object is not allowed, and any such request is silently ignored (without error). The refcount bump that occurs in tcf_idr_check_alloc call there for existing action will pair with tcf_exts_destroy call made from the owner module for the same action. In case of action creation, there is no existing action, so no tcf_exts_destroy callback happens. This means no code changes for CLS API. Fixes: cae422f379f3 ("net: sched: use reference counting action init") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	net/ncsi: Avoid channel_monitor hrtimer deadlock	Milton Miller
	Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed deadlock on SMP because stop calls del_timer_sync on the timer that invoked channel_monitor as its timer function. Recognise the inherent race of marking the monitor disabled before deleting the timer by just returning if enable was cleared. After a timeout (the default case -- reset to START when response received) just mark the monitor.enabled false. If the channel has an entry on the channel_queue list, or if the state is not ACTIVE or INACTIVE, then warn and mark the timer stopped and don't restart, as the locking is broken somehow. Fixes: 0795fb2021f0 ("net/ncsi: Stop monitor if channel times out or is inactive") Signed-off-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	xfrm/compat: Cleanup WARN()s that can be user-triggered	Dmitry Safonov
	Replace WARN_ONCE() that can be triggered from userspace with pr_warn_once(). Those still give user a hint what's the issue. I've left WARN()s that are not possible to trigger with current code-base and that would mean that the code has issues: - relying on current compat_msg_min[type] <= xfrm_msg_min[type] - expected 4-byte padding size difference between compat_msg_min[type] and xfrm_msg_min[type] - compat_policy[type].len <= xfrma_policy[type].len (for every type) Reported-by: syzbot+834ffd1afc7212eb8147@syzkaller.appspotmail.com Fixes: 5f3eea6b7e8f ("xfrm/compat: Attach xfrm dumps to 64=>32 bit translator") Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-29	ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx	Lv Yunlong
	In nfp_bpf_ctrl_msg_rx, if nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb). My patch adds a return when the skb was freed. Fixes: bcf0cafab44fd ("nfp: split out common control message handling code") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	Merge branch '100GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-29 This series contains updates to ice driver only. Ani does not fail on link/PHY errors during probe as this is not a fatal error to prevent the user from remedying the problem. He also corrects checking Wake on LAN support to be port number, not PF ID. Fabio increases the AdminQ timeout as some commands can take longer than the current value. Chinh fixes iSCSI to use be able to use port 860 by using information from DCBx and other QoS configuration info. Krzysztof fixes a possible race between ice_open() and ice_stop(). Bruce corrects the ordering of arguments in a memory allocation call. Dave removes DCBNL device reset bit which is blocking changes coming from DCBNL interface. Jacek adds error handling for filter allocation failure. Robert ensures memory is freed if VSI filter list issues are encountered. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	dt-bindings: net: bcm4908-enet: fix Ethernet generic properties	Rafał Miłecki
	This binding file uses $ref: ethernet-controller.yaml# so it's required to use "unevaluatedProperties" (instead of "additionalProperties") to make Ethernet properties validate. Fixes: f08b5cf1eb1f ("dt-bindings: net: bcm4908-enet: include ethernet-controller.yaml") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	dt-bindings: net: ethernet-controller: fix typo in NVMEM	Rafał Miłecki
	The correct property name is "nvmem-cell-names". This is what: 1. Was originally documented in the ethernet.txt 2. Is used in DTS files 3. Matches standard syntax for phandles 4. Linux net subsystem checks for Fixes: 9d3de3c58347 ("dt-bindings: net: Add YAML schemas for the generic Ethernet options") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	net:tipc: Fix a double free in tipc_sk_mcast_rcv	Lv Yunlong
	In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq) finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time. Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after the branch completed. My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because this skb will be freed by kfree_skb(skb) finally. Fixes: cb1b728096f54 ("tipc: eliminate race condition at multicast reception") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	cxgb4: avoid collecting SGE_QBASE regs during traffic	Rahul Lakkireddy
	Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX registers can lead to SGE missing doorbells under heavy traffic. So, only collect them when adapter is idle. Also update the regdump range to skip collecting these registers. Fixes: 80a95a80d358 ("cxgb4: collect SGE PF/VF queue map") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	net: dsa: Fix type was not set for devlink port	Maxim Kochetkov
	If PHY is not available on DSA port (described at devicetree but absent or failed to detect) then kernel prints warning after 3700 secs: [ 3707.948771] ------------[ cut here ]------------ [ 3707.948784] Type was not set for devlink port. [ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8 We should unregister the devlink port as a user port and re-register it as an unused port before executing "continue" in case of dsa_port_setup error. Fixes: 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	gianfar: Handle error code at MAC address change	Claudiu Manoil
	Handle return error code of eth_mac_addr(); Fixes: 3d23a05c75c7 ("gianfar: Enable changing mac addr when if up") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	ethernet: myri10ge: Fix a use after free in myri10ge_sw_tso	Lv Yunlong
	In myri10ge_sw_tso, the skb_list_walk_safe macro will set (curr) = (segs) and (next) = (curr)->next. If status!=0 is true, the memory pointed by curr and segs will be free by dev_kfree_skb_any(curr). But later, the segs is used by segs = segs->next and causes a uaf. As (next) = (curr)->next, my patch replaces seg->next to next. Fixes: 536577f36ff7a ("net: myri10ge: use skb_list_walk_safe helper for gso segments") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	MAINTAINERS: Add entry for Qualcomm IPC Router (QRTR) driver	Manivannan Sadhasivam
	Add MAINTAINERS entry for Qualcomm IPC Router (QRTR) driver. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	Merge tag 'linux-can-fixes-for-5.12-20210329' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-03-29 this is a pull request of 3 patches for net/master. The two patch are by Oliver Hartkopp. He fixes length check in the proto_ops::getname callback for the CAN RAW, BCM and ISOTP protocols, which were broken by the introduction of the J1939 protocol. The last patch is by me and fixes the a BUILD_BUG_ON() check which triggers on ARCH=arm with CONFIG_AEABI unset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	Merge branch 'mlxsw-ecn-marking'	David S. Miller
	Ido Schimmel says: ==================== mlxsw: spectrum: Fix ECN marking in tunnel decapsulation Patch #1 fixes a discrepancy between the software and hardware data paths with regards to ECN marking after decapsulation. See the changelog for a detailed description. Patch #2 extends the ECN decap test to cover all possible combinations of inner and outer ECN markings. The test passes over both data paths. v2: * Only set ECT(1) if inner is ECT(0) * Introduce a new helper to determine inner ECN. Share it between NVE and IP-in-IP tunnels * Extend the selftest ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	selftests: forwarding: vxlan_bridge_1d: Add more ECN decap test cases	Ido Schimmel
	Test that all possible combinations of inner and outer ECN bits result in the correct inner ECN marking according to RFC 6040 4.2. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	mlxsw: spectrum: Fix ECN marking in tunnel decapsulation	Ido Schimmel
	Cited commit changed the behavior of the software data path with regards to the ECN marking of decapsulated packets. However, the commit did not change other callers of __INET_ECN_decapsulate(), namely mlxsw. The driver is using the function in order to ensure that the hardware and software data paths act the same with regards to the ECN marking of decapsulated packets. The discrepancy was uncovered by commit 5aa3c334a449 ("selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that aligned the selftest to the new behavior. Without this patch the selftest passes when used with veth pairs, but fails when used with mlxsw netdevs. Fix this by instructing the device to propagate the ECT(1) mark from the outer header to the inner header when the inner header is ECT(0), for both NVE and IP-in-IP tunnels. A helper is added in order not to duplicate the code between both tunnel types. Fixes: b723748750ec ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	ice: Cleanup fltr list in case of allocation issues	Robert Malz
	When ice_remove_vsi_lkup_fltr is called, by calling ice_add_to_vsi_fltr_list local copy of vsi filter list is created. If any issues during creation of vsi filter list occurs it up for the caller to free already allocated memory. This patch ensures proper memory deallocation in these cases. Fixes: 80d144c9ac82 ("ice: Refactor switch rule management structures and functions") Signed-off-by: Robert Malz <robertx.malz@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: Use port number instead of PF ID for WoL	Anirudh Venkataramanan
	As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID. Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance. Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: Fix for dereference of NULL pointer	Jacek Bułatek
	Add handling of allocation fault for ice_vsi_list_map_info. Also *fi should not be NULL pointer, it is a reference to raw data field, so remove this variable and use the reference directly. Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacek Bułatek <jacekx.bulatek@intel.com> Co-developed-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: remove DCBNL_DEVRESET bit from PF state	Dave Ertman
	The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit. Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress. Fixes: b94b013eb626 ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: fix memory allocation call	Bruce Allan
	Fix the order of number of array members and member size parameters in a *calloc() call. Fixes: b3c3890489f6 ("ice: avoid unnecessary single-member variable-length structs") Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: prevent ice_open and ice_stop during reset	Krzysztof Goreczny
	There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: Recognize 860 as iSCSI port in CEE mode	Chinh T Cao
	iSCSI can use both TCP ports 860 and 3260. However, in our current implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command doesn't provide a way to communicate the protocol port number to the AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the AQ's caller layer. Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to determine which port number that iSCSI will use. Fixes: 0ebd3ff13cca ("ice: Add code for DCB initialization part 2/4") Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: Increase control queue timeout	Fabio Pricoco
	250 msec timeout is insufficient for some AQ commands. Advice from FW team was to increase the timeout. Increase to 1 second. Fixes: 7ec59eeac804 ("ice: Add support for control queues") Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	ice: Continue probe on link/PHY errors	Anirudh Venkataramanan
	An incorrect NVM update procedure can result in the driver failing probe. In this case, the recommended resolution method is to update the NVM using the right procedure. However, if the driver fails probe, the user will not be able to update the NVM. So do not fail probe on link/PHY errors. Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-29	can: uapi: can.h: mark union inside struct can_frame packed	Marc Kleine-Budde
	In commit ea7800565a12 ("can: add optional DLC element to Classical CAN frame structure") the struct can_frame::can_dlc was put into an anonymous union with another u8 variable. For various reasons some members in struct can_frame and canfd_frame including the first 8 byes of data are expected to have the same memory layout. This is enforced by a BUILD_BUG_ON check in af_can.c. Since the above mentioned commit this check fails on ARM kernels compiled with the ARM OABI (which means CONFIG_AEABI not set). In this case -mabi=apcs-gnu is passed to the compiler, which leads to a structure size boundary of 32, instead of 8 compared to CONFIG_AEABI enabled. This means the the union in struct can_frame takes 4 bytes instead of the expected 1. Rong Chen illustrates the problem with pahole in the ARM OABI case: \| struct can_frame { \| canid_t can_id; /* 0 4 / \| union { \| __u8 len; / 4 1 / \| __u8 can_dlc; / 4 1 / \| }; / 4 4 / \| __u8 __pad; / 8 1 / \| __u8 __res0; / 9 1 / \| __u8 len8_dlc; / 10 1 / \| \| / XXX 5 bytes hole, try to pack / \| \| __u8 data[8] \| __attribute__((__aligned__(8))); / 16 8 / \| \| / size: 24, cachelines: 1, members: 6 / \| / sum members: 19, holes: 1, sum holes: 5 / \| / forced alignments: 1, forced holes: 1, sum forced holes: 5 / \| / last cacheline: 24 bytes */ \| } __attribute__((__aligned__(8))); Marking the anonymous union as __attribute__((packed)) fixes the BUILD_BUG_ON problem on these compilers. Fixes: ea7800565a12 ("can: add optional DLC element to Classical CAN frame structure") Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Rong Chen <rong.a.chen@intel.com> Link: https://lore.kernel.org/linux-can/2c82ec23-3551-61b5-1bd8-178c3407ee83@hartkopp.net/ Link: https://lore.kernel.org/r/20210325125850.1620-3-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-29	can: isotp: fix msg_namelen values depending on CAN_REQUIRED_SIZE	Oliver Hartkopp
	Since commit f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size. The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Reported-by: Richard Weinberger <richard@nod.at> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t Link: https://lore.kernel.org/r/20210325125850.1620-2-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-29	can: bcm/raw: fix msg_namelen values depending on CAN_REQUIRED_SIZE	Oliver Hartkopp
	Since commit f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size. The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name. Fixes: f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t Link: https://lore.kernel.org/r/20210325125850.1620-1-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-29	xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets	Steffen Klassert
	Commit 94579ac3f6d0 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") added a XFRM_XMIT flag to avoid duplicate ESP trailer insertion on HW offload. This flag is set on the secpath that is shared amongst segments. This lead to a situation where some segments are not transformed correctly when segmentation happens at layer 3. Fix this by using private skb extensions for segmented and hw offloaded ESP packets. Fixes: 94579ac3f6d0 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-28	drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit	Lv Yunlong
	In pvc_xmit, if __skb_pad(skb, pad, false) failed, it will free the skb in the first time and goto drop. But the same skb is freed by kfree_skb(skb) in the second time in drop. Maintaining the original function unchanged, my patch adds a new label out to avoid the double free if __skb_pad() failed. Fixes: f5083d0cee08a ("drivers/net/wan/hdlc_fr: Improvements to the code of pvc_xmit") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26	bpf: Take module reference for trampoline in module	Jiri Olsa
	Currently module can be unloaded even if there's a trampoline register in it. It's easily reproduced by running in parallel: # while :; do ./test_progs -t module_attach; done # while :; do rmmod bpf_testmod; sleep 0.5; done Taking the module reference in case the trampoline's ip is within the module code. Releasing it when the trampoline's ip is unregistered. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210326105900.151466-1-jolsa@kernel.org
2021-03-26	bpf: Fix a spelling typo in bpf_atomic_alu_string disasm	Xu Kuohai
	The name string for BPF_XOR is "xor", not "or". Fix it. Fixes: 981f94c3e921 ("bpf: Add bitwise atomic instructions") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/bpf/20210325134141.8533-1-xukuohai@huawei.com
2021-03-26	bpf/selftests: Test that kernel rejects a TCP CC with an invalid license	Toke Høiland-Jørgensen
	This adds a selftest to check that the verifier rejects a TCP CC struct_ops with a non-GPL license. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210326100314.121853-2-toke@redhat.com
2021-03-26	bpf: Enforce that struct_ops programs be GPL-only	Toke Høiland-Jørgensen
	With the introduction of the struct_ops program type, it became possible to implement kernel functionality in BPF, making it viable to use BPF in place of a regular kernel module for these particular operations. Thus far, the only user of this mechanism is for implementing TCP congestion control algorithms. These are clearly marked as GPL-only when implemented as modules (as seen by the use of EXPORT_SYMBOL_GPL for tcp_register_congestion_control()), so it seems like an oversight that this was not carried over to BPF implementations. Since this is the only user of the struct_ops mechanism, just enforcing GPL-only for the struct_ops program type seems like the simplest way to fix this. Fixes: 0baf26b0fcd7 ("bpf: tcp: Support tcp_congestion_ops in bpf") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210326100314.121853-1-toke@redhat.com
2021-03-25	libbpf: Fix bail out from 'ringbuf_process_ring()' on error	Pedro Tammela
	The current code bails out with negative and positive returns. If the callback returns a positive return code, 'ring_buffer__consume()' and 'ring_buffer__poll()' will return a spurious number of records consumed, but mostly important will continue the processing loop. This patch makes positive returns from the callback a no-op. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com
2021-03-25	Merge branch '40GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-25 This series contains updates to virtchnl header file and i40e driver. Norbert removes added padding from virtchnl RSS structures as this causes issues when iterating over the arrays. Mateusz adds Asym_Pause as supported to allow these settings to be set as the hardware supports it. Eryk fixes an issue where encountering a VF reset alongside releasing VFs could cause a call trace. Arkadiusz moves TC setup before resource setup as previously it was possible to enter with a null q_vector causing a kernel oops. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	sch_red: fix off-by-one checks in red_check_params()	Eric Dumazet
	This fixes following syzbot report: UBSAN: shift-out-of-bounds in ./include/net/red.h:237:23 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 1 PID: 8418 Comm: syz-executor170 Not tainted 5.12.0-rc4-next-20210324-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327 red_set_parms include/net/red.h:237 [inline] choke_change.cold+0x3c/0xc8 net/sched/sch_choke.c:414 qdisc_create+0x475/0x12f0 net/sched/sch_api.c:1247 tc_modify_qdisc+0x4c8/0x1a50 net/sched/sch_api.c:1663 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x43f039 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdfa725168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043f039 RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000004 RBP: 0000000000403020 R08: 0000000000400488 R09: 0000000000400488 R10: 0000000000400488 R11: 0000000000000246 R12: 00000000004030b0 R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488 Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	Merge branch 'tunnel-shinfo'	David S. Miller
	Antoine Tenart says: ==================== net: do not modify the shared tunnel info when PMTU triggers an ICMP reply The series fixes an issue were a shared ip_tunnel_info is modified when PMTU triggers an ICMP reply in vxlan and geneve, making following packets in that flow to have a wrong destination address if the flow isn't updated. A detailled information is given in each of the two commits. This was tested manually with OVS and I ran the PTMU selftests with kmemleak enabled (all OK, none was skipped). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply	Antoine Tenart
	When the interface is part of a bridge or an Open vSwitch port and a packet exceed a PMTU estimate, an ICMP reply is sent to the sender. When using the external mode (collect metadata) the source and destination addresses are reversed, so that Open vSwitch can match the packet against an existing (reverse) flow. But inverting the source and destination addresses in the shared ip_tunnel_info will make following packets of the flow to use a wrong destination address (packets will be tunnelled to itself), if the flow isn't updated. Which happens with Open vSwitch, until the flow times out. Fixes this by uncloning the skb's ip_tunnel_info before inverting its source and destination addresses, so that the modification will only be made for the PTMU packet, not the following ones. Fixes: c1a800e88dbf ("geneve: Support for PMTU discovery on directly bridged links") Tested-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	vxlan: do not modify the shared tunnel info when PMTU triggers an ICMP reply	Antoine Tenart
	When the interface is part of a bridge or an Open vSwitch port and a packet exceed a PMTU estimate, an ICMP reply is sent to the sender. When using the external mode (collect metadata) the source and destination addresses are reversed, so that Open vSwitch can match the packet against an existing (reverse) flow. But inverting the source and destination addresses in the shared ip_tunnel_info will make following packets of the flow to use a wrong destination address (packets will be tunnelled to itself), if the flow isn't updated. Which happens with Open vSwitch, until the flow times out. Fixes this by uncloning the skb's ip_tunnel_info before inverting its source and destination addresses, so that the modification will only be made for the PTMU packet, not the following ones. Fixes: fc68c99577cc ("vxlan: Support for PMTU discovery on directly bridged links") Tested-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	Merge branch 'nfc-fixes'	David S. Miller
	Xiaoming Ni says: ==================== nfc: fix Resource leakage and endless loop fix Resource leakage and endless loop in net/nfc/llcp_sock.c, reported by "kiyin(尹亮)". Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 ==================== math: Export mul_u64_u64_div_u64 Fixes: f51d7bf1dbe5 ("ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation") Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	nfc: Avoid endless loops caused by repeated llcp_sock_connect()	Xiaoming Ni
	When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is LLCP_CONNECTING. In this case, llcp_sock_connect() is repeatedly invoked, nfc_llcp_sock_link() will add sk to local->connecting_sockets twice. sk->sk_node->next will point to itself, that will make an endless loop and hang-up the system. To fix it, check whether sk->sk_state is LLCP_CONNECTING in llcp_sock_connect() to avoid repeated invoking. Fixes: b4011239a08e ("NFC: llcp: Fix non blocking sockets connections") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.11 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	nfc: fix memory leak in llcp_sock_connect()	Xiaoming Ni
	In llcp_sock_connect(), use kmemdup to allocate memory for "llcp_sock->service_name". The memory is not released in the sock_unlink label of the subsequent failure branch. As a result, memory leakage occurs. fix CVE-2020-25672 Fixes: d646960f7986 ("NFC: Initial LLCP support") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.3 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25	nfc: fix refcount leak in llcp_sock_connect()	Xiaoming Ni
	nfc_llcp_local_get() is invoked in llcp_sock_connect(), but nfc_llcp_local_put() is not invoked in subsequent failure branches. As a result, refcount leakage occurs. To fix it, add calling nfc_llcp_local_put(). fix CVE-2020-25671 Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.6 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>