linux.git

Age	Commit message	Author
2021-04-11	self-tests: add veth tests	Paolo Abeni
	Add some basic veth tests, that verify the expected flags and aggregation with different setups (default, xdp, etc...) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Conflicts:
	MAINTAINERS - keep Chandrasekar
	drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine
	include/linux/bpf.h - trivial
	include/linux/ethtool.h - trivial, fix kdoc while at it
	include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped
	net/core/skmsg.c - add the sk = sk // sk = NULL around calls
	net/tipc/crypto.c - trivial
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09	Merge tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc7, including fixes from can, ipsec, mac80211, wireless, and bpf trees. No scary regressions here or in the works, but small fixes for 5.12 changes keep coming.
	Current release - regressions:
	- virtio: do not pull payload in skb->head
	- virtio: ensure mac header is set in virtio_net_hdr_to_skb()
	- Revert "net: correct sk_acceptq_is_full()"
	- mptcp: revert "mptcp: provide subflow aware release function"
	- ethernet: lan743x: fix ethernet frame cutoff issue
	- dsa: fix type was not set for devlink port
	- ethtool: remove link_mode param and derive link params from driver
	- sched: htb: fix null pointer dereference on a null new_q
	- wireless: iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()
	- wireless: iwlwifi: fw: fix notification wait locking
	- wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the rtnl dependency
	Current release - new code bugs:
	- napi: fix hangup on napi_disable for threaded napi
	- bpf: take module reference for trampoline in module
	- wireless: mt76: mt7921: fix airtime reporting and related tx hangs
	- wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending config command
	Previous releases - regressions:
	- rfkill: revert back to old userspace API by default
	- nfc: fix infinite loop, refcount & memory leaks in LLCP sockets
	- let skb_orphan_partial wake-up waiters
	- xfrm/compat: Cleanup WARN()s that can be user-triggered
	- vxlan, geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply
	- can: fix msg_namelen values depending on CAN_REQUIRED_SIZE
	- can: uapi: mark union inside struct can_frame packed
	- sched: cls: fix action overwrite reference counting
	- sched: cls: fix err handler in tcf_action_init()
	- ethernet: mlxsw: fix ECN marking in tunnel decapsulation
	- ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
	- ethernet: i40e: fix receiving of single packets in xsk zero-copy mode
	- ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic
	Previous releases - always broken:
	- bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET
	- bpf: Refcount task stack in bpf_get_task_stack
	- bpf, x86: Validate computation of branch displacements
	- ieee802154: fix many similar syzbot-found bugs
	- fix NULL dereferences in netlink attribute handling
	- reject unsupported operations on monitor interfaces
	- fix error handling in llsec_key_alloc()
	- xfrm: make ipv4 pmtu check honor ip header df
	- xfrm: make hash generation lock per network namespace
	- xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp offload
	- ethtool: fix incorrect datatype in set_eee ops
	- xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model
	- openvswitch: fix send of uninitialized stack memory in ct limit reply
	Misc:
	- udp: add get handling for UDP_GRO sockopt"
	* tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
	net: fix hangup on napi_disable for threaded napi
	net: hns3: Trivial spell fix in hns3 driver
	lan743x: fix ethernet frame cutoff issue
	net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
	net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
	net: dsa: lantiq_gswip: Don't use PHY auto polling
	net: sched: sch_teql: fix null-pointer dereference
	ipv6: report errors for iftoken via netlink extack
	net: sched: fix err handler in tcf_action_init()
	net: sched: fix action overwrite reference counting
	Revert "net: sched: bump refcount for new action in ACT replace mode"
	ice: fix memory leak of aRFS after resuming from suspend
	i40e: Fix sparse warning: missing error code 'err'
	i40e: Fix sparse error: 'vsi->netdev' could be null
	i40e: Fix sparse error: uninitialized symbol 'ring'
	i40e: Fix sparse errors in i40e_txrx.c
	i40e: Fix parameters in aq_get_phy_register()
	nl80211: fix beacon head validation
	bpf, x86: Validate computation of branch displacements for x86-32
	bpf, x86: Validate computation of branch displacements for x86-64
	...
2021-04-08	tc-testing: add simple action test to verify batch change cleanup	Vlad Buslov
	Verify cleanup of failed actions batch change where second action in batch fails after successful init of first action. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08	tc-testing: add simple action test to verify batch add cleanup	Vlad Buslov
	Verify cleanup of failed actions batch add where second action in batch fails after successful init of first action. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07	selftests: mptcp: add the net device name testcase	Geliang Tang
	This patch added a new testcase for setting the net device name. In it, pass the net device name to pm_nl_ctl to set the ifindex field of struct mptcp_pm_addr_entry. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02	selftests: mptcp: dump more info on mpjoin errors	Matthieu Baerts
	Very occasionally, MPTCP selftests fail. Yeah, I saw that at least once! Here we provide more details in case of errors with mptcp_join.sh script like it was done with mptcp_connect.sh, see commit 767389c8dd55 ("selftests: mptcp: dump more info on errors") Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02	selftests: mptcp: init nstat history	Matthieu Baerts
	Not to be impacted by packets sent between sub-tests. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02	selftests: mptcp: launch mptcp_connect with timeout	Matthieu Baerts
	'mptcp_connect' already has a timeout for poll() but in some cases, it is not enough. With "timeout" tool, we will force the command to fail if it doesn't finish on time. Thanks to that, the script will continue and display details about the current state before marking the test as failed. Displaying this state is very important to be able to understand the issue. Best to have our CI reporting the issue than just "the test hanged". Note that in mptcp_connect.sh, we were using a long timeout to validate the fact we cannot create a socket if a sysctl is set. We don't need this timeout. In diag.sh, we want to send signals to mptcp_connect instances that have been started in the netns. But we cannot send this signal to 'timeout' otherwise that will stop the timeout and messages telling us SIGUSR1 has been received will be printed. Instead of trying to find the right PID and storing them in an array, we can simply use the output of 'ip netns pids' which is all the PIDs we want to send signal to. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/160 Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
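	Illustrative sketch, not part of the commit: the selftest itself is a shell script that uses the "timeout" tool. The Python below only models the same idea, assuming a hypothetical helper named run_with_timeout and a sleeping child process as a stand-in for a hung mptcp_connect. The point is that a bounded wait lets the caller carry on and report state instead of hanging.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return (finished_in_time, returncode or None)."""
    try:
        proc = subprocess.run(cmd, timeout=seconds)
        return True, proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising, so the
        # caller is free to dump diagnostic state and mark the test failed.
        return False, None


# Hypothetical stand-in for a command that never finishes on its own.
hung_command = [sys.executable, "-c", "import time; time.sleep(30)"]

finished, returncode = run_with_timeout(hung_command, 0.5)
if not finished:
    print("command timed out; dump current state before failing the test")
```

	The design choice mirrors the commit message: a forced failure with diagnostics is more useful to a CI system than a job that simply hangs.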
2021-04-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Alexei Starovoitov says:
	====================
	pull-request: bpf-next 2021-04-01
	The following pull-request contains BPF updates for your net-next tree.
	We've added 68 non-merge commits during the last 7 day(s) which contain a total of 70 files changed, 2944 insertions(+), 1139 deletions(-).
	The main changes are:
	1) UDP support for sockmap, from Cong.
	2) Verifier merge conflict resolution fix, from Daniel.
	3) xsk selftests enhancements, from Maciej.
	4) Unstable helpers aka kernel func calling, from Martin.
	5) Batches ops for LPM map, from Pedro.
	6) Fix race in bpf_get_local_storage, from Yonghong.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Alexei Starovoitov says:
	====================
	pull-request: bpf 2021-04-01
	The following pull-request contains BPF updates for your net tree.
	We've added 11 non-merge commits during the last 8 day(s) which contain a total of 10 files changed, 151 insertions(+), 26 deletions(-).
	The main changes are:
	1) xsk creation fixes, from Ciara.
	2) bpf_get_task_stack fix, from Dave.
	3) trampoline in modules fix, from Jiri.
	4) bpf_obj_get fix for links and progs, from Lorenz.
	5) struct_ops progs must be gpl compatible fix, from Toke.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01	libbpf: Only create rx and tx XDP rings when necessary	Ciara Loftus
	Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com
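	Illustrative sketch, not the libbpf code: a small Python model of the fix described above. All names here (FakeSharedFd, Umem, create_socket, the two flag attributes) are hypothetical, and the "kernel" is reduced to an object that refuses to create the same ring twice. The sketch shows why remembering the ring setup status lets a retry succeed after an earlier call failed late.

```python
class FakeSharedFd:
    """Stand-in for the fd shared with the UMEM: each ring can exist only once."""

    def __init__(self):
        self.rings = set()

    def create_ring(self, name):
        if name in self.rings:
            raise RuntimeError(f"{name} ring already exists")
        self.rings.add(name)


class Umem:
    def __init__(self):
        self.fd = FakeSharedFd()
        # Hypothetical flags recording whether each ring was already set up.
        self.rx_ring_setup_done = False
        self.tx_ring_setup_done = False


def create_socket(umem, fail_late=False):
    """Create rings only when needed, then optionally fail in a later step."""
    if not umem.rx_ring_setup_done:
        umem.fd.create_ring("rx")
        umem.rx_ring_setup_done = True
    if not umem.tx_ring_setup_done:
        umem.fd.create_ring("tx")
        umem.tx_ring_setup_done = True
    if fail_late:
        raise RuntimeError("a later setup step failed")
    return "socket"


umem = Umem()
try:
    create_socket(umem, fail_late=True)   # first attempt fails after ring setup
except RuntimeError as err:
    print("first attempt:", err)
print("retry:", create_socket(umem))      # retry skips ring creation and succeeds
```

	Without the two flags, the retry would call create_ring() again and hit the "already exists" error, which is the failure mode the commit message describes.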
2021-04-01	libbpf: Restore umem state after socket create failure	Ciara Loftus
	If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by:
	1. ensuring the umem _save pointers are unmodified.
	2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain intact even in the event of failure.
	Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com
2021-04-01	libbpf: Ensure umem pointer is non-NULL before dereferencing	Ciara Loftus
	Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com
2021-04-01	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "It's a bit larger than I (and probably you) would like by the time we get to -rc6, but perhaps not entirely unexpected since the changes in the last merge window were larger than usual.
	x86:
	- Fixes for missing TLB flushes with TDP MMU
	- Fixes for race conditions in nested SVM
	- Fixes for lockdep splat with Xen emulation
	- Fix for kvmclock underflow
	- Fix srcdir != builddir builds
	- Other small cleanups
	ARM:
	- Fix GICv3 MMIO compatibility probing
	- Prevent guests from using the ARMv8.4 self-hosted tracing extension"
	* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
	selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0)
	KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()
	KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken
	KVM: x86: reduce pvclock_gtod_sync_lock critical sections
	KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit
	KVM: SVM: load control fields from VMCB12 before checking them
	KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
	KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
	KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap
	KVM: make: Fix out-of-source module builds
	selftests: kvm: make hardware_disable_test less verbose
	KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE
	KVM: x86: remove unused declaration of kvm_write_tsc()
	KVM: clean up the unused argument
	tools/kvm_stat: Add restart delay
	KVM: arm64: Fix CPU interface MMIO compatibility detection
	KVM: arm64: Disable guest access to trace filter controls
	KVM: arm64: Hide system instruction access to Trace registers
2021-04-01	selftests/bpf: Add a test case for loading BPF_SK_SKB_VERDICT	Cong Wang
	This adds a test case to ensure BPF_SK_SKB_VERDICT and BPF_SK_STREAM_VERDICT will never be attached at the same time. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331023237.41094-17-xiyou.wangcong@gmail.com
2021-04-01	selftests/bpf: Add a test case for udp sockmap	Cong Wang
	Add a test case to ensure redirection between two UDP sockets works. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331023237.41094-16-xiyou.wangcong@gmail.com
2021-04-01	sock_map: Introduce BPF_SK_SKB_VERDICT	Cong Wang
	Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to attach stream_verdict and skb_verdict programs to the same map. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com
2021-04-01	idr test suite: Improve reporting from idr_find_test_1	Matthew Wilcox (Oracle)
	Instead of just reporting an assertion failure, report enough information that we can start diagnosing exactly what went wrong. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01	idr test suite: Create anchor before launching throbber	Matthew Wilcox (Oracle)
	The throbber could race with creation of the anchor entry and cause the IDR to have zero entries in it, which would cause the test to fail. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01	idr test suite: Take RCU read lock in idr_find_test_1	Matthew Wilcox (Oracle)
	When run on a single CPU, this test would frequently access already-freed memory. Due to timing, this bug never showed up on multi-CPU tests. Reported-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01	radix tree test suite: Register the main thread with the RCU library	Matthew Wilcox (Oracle)
	Several test runners register individual worker threads with the RCU library, but neglect to register the main thread, which can lead to objects being freed while the main thread is in what appears to be an RCU critical section. Reported-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01	selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0)	Vitaly Kuznetsov
	Add a test for the issue when KVM_SET_CLOCK(0) call could cause TSC page value to go very big because of a signedness issue around hv_clock->system_time. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210326155551.17446-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
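	Illustrative sketch, not the kernel or selftest code: the commit message attributes the huge TSC page value to a signedness issue. Assuming the mechanism is a small negative signed 64-bit value being read as unsigned, the Python below uses ctypes to show how such a reinterpretation turns a tiny negative number into a value near 2**64. The helper name as_u64 and the sample value are hypothetical.

```python
import ctypes


def as_u64(value):
    """Reinterpret an integer as an unsigned 64-bit quantity (wraps modulo 2**64)."""
    return ctypes.c_uint64(value).value


# Hypothetical time value that went slightly negative (in nanoseconds).
slightly_negative = -5

print(as_u64(slightly_negative))   # a value just below 2**64
print(as_u64(5))                   # non-negative values are unchanged
```

	A test that checks the value is "small" after KVM_SET_CLOCK(0), as the commit does, catches exactly this kind of wraparound.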
2021-03-31	selftests/net: so_txtime multi-host support	Carlos Llamas
	SO_TXTIME hardware offload requires testing across devices, either between machines or separate network namespaces. Split up SO_TXTIME test into tx and rx modes, so traffic can be sent from one process to another. Create a veth-pair on different namespaces and bind each process to an end point via [-S]ource and [-D]estination parameters. Optional start [-t]ime parameter can be passed to synchronize the test across the hosts (with synchronized clocks). Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31	selftests: ethtool: add a netdevsim FEC test	Jakub Kicinski
	Test FEC settings, iterate over configs. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31	tools/resolve_btfids: Fix warnings	Stanislav Fomichev
	* make eprintf static, used only in main.c
	* initialize ret in eprintf
	* remove unused tmp
	v3: remove another err (Song Liu)
	v2:
	* remove unused 'int err = -1'
	Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210329223143.3659983-1-sdf@google.com
2021-03-30	selftests/bpf: Add an option for a debug shell in vmtest.sh	KP Singh
	The newly introduced -s command line option starts an interactive shell. If a command is specified, the shell is started after the command finishes executing. It's useful to have a shell especially when debugging failing tests or developing new tests. Since the user may terminate the VM forcefully, an extra "sync" is added after the execution of the command to persist any logs from the command into the log file. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210323014752.3198283-1-kpsingh@kernel.org
2021-03-30	selftests: mptcp: remove id 0 address testcases	Geliang Tang
	This patch added the testcases for removing the id 0 subflow and the id 0 address. In do_transfer, use the removing addresses number '9' for deleting the id 0 address. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	selftests: mptcp: add addr argument for del_addr	Geliang Tang
	For the id 0 address, different MPTCP connections could be using different IP addresses for id 0. This patch added an extra argument IP address for del_addr when using id 0. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	selftests: mptcp: avoid calling pm_nl_ctl with bad IDs	Matthieu Baerts
	IDs are supposed to be between 0 and 255. In pm_nl_ctl, for both the 'add' and 'get' instructions, the ID is cast to a u_int8_t. So if we give 256, we will delete ID 0. Obviously, the goal is not to delete this ID by giving 256. We could modify pm_nl_ctl and stop if the ID is negative or higher than 255 but probably better not to increase the number of lines for such things in this tool which is only used in selftests. Instead, we use it within the limits. This modification also means that we will no longer add a new ID for the 2nd entry. That's why we removed an expected entry from the dump that was introduced with commit dc8eb10e95a8 ("selftests: mptcp: add testcases for setting the address ID"). So now we delete ID 9 like before and we add entries for IDs 10 to 255 that are deleted just after. Note that this could be seen as a fix but it was not really an issue so far: we were simply playing with ID 0/1 once again. With the following commit ("selftests: mptcp: add addr argument for del_addr"), it will be different because ID 0 is going to require an address. We don't want errors when trying to delete ID 0 without the address argument. Acked-and-tested-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
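	Illustrative sketch, not part of pm_nl_ctl: the Python below uses ctypes to model what a cast to an unsigned 8-bit type does to an out-of-range ID, and contrasts it with the range check the commit message mentions but chose not to add. The helper names to_u8 and checked_id are hypothetical.

```python
import ctypes


def to_u8(requested_id):
    """Model a C-style cast to an unsigned 8-bit integer (wraps modulo 256)."""
    return ctypes.c_uint8(requested_id).value


def checked_id(requested_id):
    """Alternative the commit decided against: reject IDs outside 0..255."""
    if not 0 <= requested_id <= 255:
        raise ValueError(f"ID {requested_id} is outside 0..255")
    return requested_id


print(to_u8(255))   # stays 255
print(to_u8(256))   # wraps around to 0, so ID 0 would be targeted
```

	Because the tool is only used in selftests, the commit keeps the tool unchanged and simply stops passing IDs above 255.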
2021-03-30	tc-testing: add simple action change test	Vlad Buslov
	Use act_simple to verify that action created with 'tc actions change' command exists after command returns. The goal is to verify internal action API reference counting to ensure that the case when netlink message has NLM_F_REPLACE flag set but action with specified index doesn't exist is handled correctly. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	selftests: net: add UDP GRO forwarding self-tests	Paolo Abeni
	Create a bunch of virtual topologies and verify that NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD-enabled devices aggregate the ingress packets as expected. Additionally check that the aggregate packets are segmented correctly when landing on a socket. Also test SKB_GSO_FRAGLIST and SKB_GSO_UDP_L4 aggregation on top of UDP tunnel (vxlan).
	v1 -> v2:
	- hopefully clarify the commit message
	- moved the overlay network ipv6 range into the 'documentation' reserved range (Willem)
	Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	radix tree test suite: Fix compilation	Matthew Wilcox (Oracle)
	Commit 4bba4c4bb09a added tools/include/linux/compiler_types.h which includes linux/compiler-gcc.h. Unfortunately, we had our own (empty) compiler_types.h which overrode the one added by that commit, and so we lost the definition of __must_be_array(). Removing our empty compiler_types.h fixes the problem and reduces our divergence from the rest of the tools. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-03-30	selftests: kvm: make hardware_disable_test less verbose	Vitaly Kuznetsov
	hardware_disable_test produces 512 snippets like
	...
	main: [511] waiting semaphore
	run_test: [511] start vcpus
	run_test: [511] all threads launched
	main: [511] waiting 368us
	main: [511] killing child
	and this doesn't have much value, let's print this info with pr_debug(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210323104331.1354800-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30	tools/kvm_stat: Add restart delay	Stefan Raspl
	If this service is enabled and the system rebooted, Systemd's initial attempt to start this unit file may fail in case the kvm module is not loaded. Since we did not specify a delay for the retries, Systemd restarts with a minimum delay a number of times before giving up and disabling the service. Which means a subsequent kvm module load will have kvm running without monitoring. Adding a delay to fix this. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Message-Id: <20210325122949.1433271-1-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30	selftests: xsk: Remove unused defines	Björn Töpel
	Remove two unused defines. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-18-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove mutex and condition variable	Björn Töpel
	The usage of the condition variable is broken, and overkill. Replace it with a pthread barrier. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-17-maciej.fijalkowski@intel.com
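	Illustrative sketch, not the selftest code: the commit swaps a condition variable for a pthread barrier in a C test. The Python below models the same synchronization pattern with threading.Barrier, under the assumption (taken from the neighbouring xsk commits) that the Tx side must not start until the Rx side has finished initializing. The function and worker names are hypothetical.

```python
import threading


def run_once():
    """Start a Tx and an Rx worker; return the order of recorded events."""
    barrier = threading.Barrier(2)   # both workers must arrive before either proceeds
    events = []

    def rx_worker():
        events.append("rx ready")    # stands in for Rx socket initialization
        barrier.wait()               # announce that Rx is initialized

    def tx_worker():
        barrier.wait()               # block until Rx has reached the barrier
        events.append("tx sending")

    threads = [threading.Thread(target=tx_worker),
               threading.Thread(target=rx_worker)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events


print(run_once())
```

	A barrier needs no shared predicate and no mutex, which is why it is a simpler fit than a condition variable for a one-shot "everyone is ready" rendezvous.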
2021-03-30	selftests: xsk: Remove thread attribute	Björn Töpel
	There is really no reason to have a non-default thread stack size. Remove that. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-16-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Implement bpf_link test	Maciej Fijalkowski
	Introduce a test that is supposed to verify the persistence of BPF resources based on underlying bpf_link usage. Test will:
	1) create and bind two sockets on queue ids 0 and 1
	2) run traffic on queue id 0
	3) remove xsk sockets from queue 0 on both veth interfaces
	4) run traffic on queue id 1
	Running traffic successfully on queue id 1 means that BPF resources were not removed in step 3). In order to make it work, change the command that creates the veth pair to have 4 queue pairs by default. Introduce the arrays of xsks and umems to ifobject struct but keep pointers to single entities, so the rest of the logic around Rx/Tx can be kept as-is. For umem handling, double the size of mmapped space and split that between the two sockets. Also rename bidi_pass to a variable 'second_step' of a boolean type, as it's now also used for the test that is introduced here and it doesn't have anything in common with bi-directional testing. Drop opt_queue command line argument as it wasn't working before anyway. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-15-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove sync_mutex_tx and atomic var	Maciej Fijalkowski
	Although thread_common_ops() are called in both Tx and Rx threads, testapp_validate() will not spawn Tx thread until Rx thread signals that it has finished its initialization via condition variable. Therefore, locking in thread_common_ops is not needed and furthermore Tx thread does not have to spin on atomic variable. Note that this simplification wouldn't be possible if there would still be a common worker thread. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-13-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Refactor teardown/bidi test cases and testapp_validate	Maciej Fijalkowski
	Currently, there is a testapp_sockets() that acts like a wrapper around testapp_validate() and it is called for bidi and teardown test types. Other test types call testapp_validate() directly. Split testapp_sockets() into two separate functions so a bunch of bidi specific logic can be moved there and out of testapp_validate() itself. Introduce a function pointer to ifobject struct which will be used for assigning the Rx/Tx function that is assigned to the worker thread. Let's also have global ifobject Rx/Tx pointers so it's easier to swap the vectors on a second run of a bi-directional test. Thread creation is now easy to follow. switching_notify variable is useless, info about vector switch can be printed based on bidi_pass state. Last but not least, init/destroy synchronization variables only once, not per each test. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-12-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove Tx synchronization resources	Maciej Fijalkowski
	Tx thread needs to be started after the Rx side is fully initialized so that packets are not xmitted until xsk Rx socket is ready to be used. It can be observed that atomic variable spinning_tx is not checked from Rx side in any way, so thread_common_ops can be modified to only address the spinning_rx. This means that spinning_tx can be removed altogether. signal_tx_condition is never utilized, so simply remove it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-11-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Split worker thread	Maciej Fijalkowski
	Let's have separate Tx/Rx worker threads instead of one common thread packed with Tx/Rx specific checks. Move mmap for umem buffer space and a switch_namespace() call to thread_common_ops. This also allows for a bunch of simplifications that are the subject of the next commits. The final result will be a code base that is much easier to follow. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-10-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove thread for netns switch	Maciej Fijalkowski
	Currently, there is a dedicated thread for following remote ns operations: - grabbing the ifindex of the interface moved to remote netns - removing xdp prog from that interface With bpf_link usage in place, this can be simply omitted, so remove mentioned thread, as BPF resources will be managed by bpf_link itself, so there's no further need for creating the thread that will switch to remote netns and do the cleanup. Keep most of the logic for switching the ns, though, but make switch_namespace() return the fd so that it will be possible to close it at the process termination time. Get rid of logic around making sure that it's possible to switch ns in validate_interfaces(). Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-9-maciej.fijalkowski@intel.com
2021-03-30	libbpf: xsk: Use bpf_link	Maciej Fijalkowski
	Currently, if there are multiple xdpsock instances running on a single interface and one of the instances is terminated, the rest of them are left in an inoperable state because the XDP prog is unloaded from the interface. Consider the scenario below:
	// load xdp prog and xskmap and add entry to xskmap at idx 10
	$ sudo ./xdpsock -i ens801f0 -t -q 10
	// add entry to xskmap at idx 11
	$ sudo ./xdpsock -i ens801f0 -t -q 11
	Terminate one of the processes and the other one is unable to work because the XDP prog was unloaded from the interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. Provide backward compatibility by checking if the underlying system is bpf_link capable. Do this by looking up/creating bpf_link on the loopback device. If it failed in any way, stick with netlink-based XDP prog. Otherwise, use bpf_link-based logic. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via a set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. For the case where resources exist but they are not AF_XDP related, bail out and ask the user to remove the existing prog and then retry. Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out existing code branches based on prog_id value into separate functions that are responsible for resource initialization if prog_id was 0 and for looking up existing resources for non-zero prog_id, as that implies that an XDP program is present on the underlying net device. This in turn makes it easier to follow, especially the teardown part of both branches. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Simplify frame traversal in dumping thread	Maciej Fijalkowski
	Store offsets to each layer in separate variables rather than compute them every single time. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-6-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove inline keyword from source file	Maciej Fijalkowski
	Follow the kernel coding style guidelines and let the compiler decide about inlining. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-5-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove unused function	Maciej Fijalkowski
	Probably it was ported from xdpsock but is not used anywhere. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-4-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove struct ifaceconfigobj	Maciej Fijalkowski
	ifaceconfigobj is not really useful, it is possible to keep the functionality and simplify the code. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-3-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Don't call worker_pkt_dump() for stats test	Maciej Fijalkowski
	For TEST_TYPE_STATS, worker_pkt_validate() that places frames onto pkt_buf is not called. Therefore, when dump mode is set, don't call worker_pkt_dump() for mentioned test type, so that it won't crash on pkt_buf() access. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-2-maciej.fijalkowski@intel.com