summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2019-04-16btf: add support for VAR and DATASEC in btf_dedup()Andrii Nakryiko
This patch adds support for VAR and DATASEC in btf_dedup(). VAR/DATASEC are never deduplicated, but they need to be processed anyway as types they refer to might need to be remapped due to deduplication and compaction. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Yonghong Song <yhs@fb.com> Cc: Alexei Starovoitov <ast@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-12selftests/bpf: C based test for sysctl and strtoXAndrey Ignatov
Add C based test for a few bpf_sysctl_* helpers and bpf_strtoul. Make sure that sysctl can be identified by name and that multiple integers can be parsed from sysctl value with bpf_strtoul. net/ipv4/tcp_mem is chosen as a testing sysctl, it contains 3 unsigned longs, they all are parsed and compared (val[0] < val[1] < val[2]). Example of output: # ./test_sysctl ... Test case: C prog: deny all writes .. [PASS] Test case: C prog: deny access by name .. [PASS] Test case: C prog: read tcp_mem .. [PASS] Summary: 39 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test bpf_strtol and bpf_strtoul helpersAndrey Ignatov
Test that bpf_strtol and bpf_strtoul helpers can be used to convert provided buffer to long or unsigned long correspondingly and return both correct result and number of consumed bytes, or proper errno. Example of output: # ./test_sysctl .. Test case: bpf_strtoul one number string .. [PASS] Test case: bpf_strtoul multi number string .. [PASS] Test case: bpf_strtoul buf_len = 0, reject .. [PASS] Test case: bpf_strtoul supported base, ok .. [PASS] Test case: bpf_strtoul unsupported base, EINVAL .. [PASS] Test case: bpf_strtoul buf with spaces only, EINVAL .. [PASS] Test case: bpf_strtoul negative number, EINVAL .. [PASS] Test case: bpf_strtol negative number, ok .. [PASS] Test case: bpf_strtol hex number, ok .. [PASS] Test case: bpf_strtol max long .. [PASS] Test case: bpf_strtol overflow, ERANGE .. [PASS] Summary: 36 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test ARG_PTR_TO_LONG arg typeAndrey Ignatov
Test that verifier handles new argument types properly, including uninitialized or partially initialized value, misaligned stack access, etc. Example of output: #456/p ARG_PTR_TO_LONG uninitialized OK #457/p ARG_PTR_TO_LONG half-uninitialized OK #458/p ARG_PTR_TO_LONG misaligned OK #459/p ARG_PTR_TO_LONG size < sizeof(long) OK #460/p ARG_PTR_TO_LONG initialized OK Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Add sysctl and strtoX helpers to bpf_helpers.hAndrey Ignatov
Add bpf_sysctl_* and bpf_strtoX helpers to bpf_helpers.h. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12bpf: Sync bpf.h to tools/Andrey Ignatov
Sync bpf_strtoX related bpf UAPI changes to tools/. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test file_pos field in bpf_sysctl ctxAndrey Ignatov
Test access to file_pos field of bpf_sysctl context, both read (incl. narrow read) and write. # ./test_sysctl ... Test case: ctx:file_pos sysctl:read read ok .. [PASS] Test case: ctx:file_pos sysctl:read read ok narrow .. [PASS] Test case: ctx:file_pos sysctl:read write ok .. [PASS] ... Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test bpf_sysctl_{get,set}_new_value helpersAndrey Ignatov
Test that new value provided by user space on sysctl write can be read by bpf_sysctl_get_new_value and overridden by bpf_sysctl_set_new_value. # ./test_sysctl ... Test case: sysctl_get_new_value sysctl:read EINVAL .. [PASS] Test case: sysctl_get_new_value sysctl:write ok .. [PASS] Test case: sysctl_get_new_value sysctl:write ok long .. [PASS] Test case: sysctl_get_new_value sysctl:write E2BIG .. [PASS] Test case: sysctl_set_new_value sysctl:read EINVAL .. [PASS] Test case: sysctl_set_new_value sysctl:write ok .. [PASS] Summary: 22 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test sysctl_get_current_value helperAndrey Ignatov
Test sysctl_get_current_value on sysctl read and write, buffers with enough space and too small buffers to get E2BIG and truncated result, etc. # ./test_sysctl ... Test case: sysctl_get_current_value sysctl:read ok, gt .. [PASS] Test case: sysctl_get_current_value sysctl:read ok, eq .. [PASS] Test case: sysctl_get_current_value sysctl:read E2BIG truncated .. [PASS] Test case: sysctl_get_current_value sysctl:read EINVAL .. [PASS] Test case: sysctl_get_current_value sysctl:write ok .. [PASS] Summary: 16 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test bpf_sysctl_get_name helperAndrey Ignatov
Test w/ and w/o BPF_F_SYSCTL_BASE_NAME, buffers with enough space and too small buffers to get E2BIG and truncated result, etc. # ./test_sysctl ... Test case: sysctl_get_name sysctl_value:base ok .. [PASS] Test case: sysctl_get_name sysctl_value:base E2BIG truncated .. [PASS] Test case: sysctl_get_name sysctl:full ok .. [PASS] Test case: sysctl_get_name sysctl:full E2BIG truncated .. [PASS] Test case: sysctl_get_name sysctl:full E2BIG truncated small .. [PASS] Summary: 11 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test BPF_CGROUP_SYSCTLAndrey Ignatov
Add unit test for BPF_PROG_TYPE_CGROUP_SYSCTL program type. Test that program can allow/deny access. Test both valid and invalid accesses to ctx->write. Example of output: # ./test_sysctl Test case: sysctl wrong attach_type .. [PASS] Test case: sysctl:read allow all .. [PASS] Test case: sysctl:read deny all .. [PASS] Test case: ctx:write sysctl:read read ok .. [PASS] Test case: ctx:write sysctl:write read ok .. [PASS] Test case: ctx:write sysctl:read write reject .. [PASS] Summary: 6 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12selftests/bpf: Test sysctl section nameAndrey Ignatov
Add unit test to verify that program and attach types are properly identified for "cgroup/sysctl" section name. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12libbpf: Support sysctl hookAndrey Ignatov
Support BPF_PROG_TYPE_CGROUP_SYSCTL program in libbpf: identifying program and attach types by section name, probe. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12bpf: Sync bpf.h to tools/Andrey Ignatov
Sync BPF_PROG_TYPE_CGROUP_SYSCTL related bpf UAPI changes to tools/. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-11selftests: Add debugging options to pmtu.shDavid Ahern
pmtu.sh script runs a number of tests and dumps a summary of pass/fail. If a test fails, it is near impossible to debug why. For example: TEST: ipv6: PMTU exceptions [FAIL] There are a lot of commands run behind the scenes for this test. Which one is failing? Add a VERBOSE option to show commands that are run and any output from those commands. Add a PAUSE_ON_FAIL option to halt the script if a test fails allowing users to poke around with the setup in the failed state. In the process, rename tracing to TRACING and move declaration to top with the new variables. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-04-12 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Improve BPF verifier scalability for large programs through two optimizations: i) remove verifier states that are not useful in pruning, ii) stop walking parentage chain once first LIVE_READ is seen. Combined gives approx 20x speedup. Increase limits for accepting large programs under root, and add various stress tests, from Alexei. 2) Implement global data support in BPF. This enables static global variables for .data, .rodata and .bss sections to be properly handled which allows for more natural program development. This also opens up the possibility to optimize program workflow by compiling ELFs only once and later only rewriting section data before reload, from Daniel and with test cases and libbpf refactoring from Joe. 3) Add config option to generate BTF type info for vmlinux as part of the kernel build process. DWARF debug info is converted via pahole to BTF. Latter relies on libbpf and makes use of BTF deduplication algorithm which results in 100x savings compared to DWARF data. Resulting .BTF section is typically about 2MB in size, from Andrii. 4) Add BPF verifier support for stack access with variable offset from helpers and add various test cases along with it, from Andrey. 5) Extend bpf_skb_adjust_room() growth BPF helper to mark inner MAC header so that L2 encapsulation can be used for tc tunnels, from Alan. 6) Add support for input __sk_buff context in BPF_PROG_TEST_RUN so that users can define a subset of allowed __sk_buff fields that get fed into the test program, from Stanislav. 7) Add bpf fs multi-dimensional array tests for BTF test suite and fix up various UBSAN warnings in bpftool, from Yonghong. 8) Generate a pkg-config file for libbpf, from Luca. 9) Dump program's BTF id in bpftool, from Prashant. 10) libbpf fix to use smaller BPF log buffer size for AF_XDP's XDP program, from Magnus. 11) kallsyms related fixes for the case when symbols are not present in BPF selftests and samples, from Daniel ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11tools: add smp_* barrier variants to include infrastructureDaniel Borkmann
Add the definition for smp_rmb(), smp_wmb(), and smp_mb() to the tools include infrastructure: this patch adds the implementation for x86-64 and arm64, and have it fall back as currently is for other archs which do not have it implemented at this point. The x86-64 one uses lock + add combination for smp_mb() with address below red zone. This is on top of 09d62154f613 ("tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers"), which didn't touch smp_* barrier implementations. Magnus recently rightfully reported however that the latter on x86-64 still wrongly falls back to sfence, lfence and mfence respectively, thus fix that for applications under tools making use of these to avoid such ugly surprises. The main header under tools (include/asm/barrier.h) will in that case not select the fallback implementation. Reported-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-11selftests_bpf: add L2 encap to test_tc_tunnelAlan Maguire
Update test_tc_tunnel to verify adding inner L2 header encapsulation (an MPLS label or ethernet header) works. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11bpf: sync bpf.h to tools/ for BPF_F_ADJ_ROOM_ENCAP_L2Alan Maguire
Sync include/uapi/linux/bpf.h with tools/ equivalent to add BPF_F_ADJ_ROOM_ENCAP_L2(len) macro. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11selftests_bpf: extend test_tc_tunnel for UDP encapAlan Maguire
commit 868d523535c2 ("bpf: add bpf_skb_adjust_room encap flags") introduced support to bpf_skb_adjust_room for GSO-friendly GRE and UDP encapsulation and later introduced associated test_tc_tunnel tests. Here those tests are extended to cover UDP encapsulation also. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11net: sched: flower: use correct ht function to prevent duplicatesVlad Buslov
Implementation of function rhashtable_insert_fast() check if its internal helper function __rhashtable_insert_fast() returns non-NULL pointer and seemingly return -EEXIST in such case. However, since __rhashtable_insert_fast() is called with NULL key pointer, it never actually checks for duplicates, which means that -EEXIST is never returned to the user. Use rhashtable_lookup_insert_fast() hash table API instead. In order to verify that it works as expected and prevent the problem from happening in future, extend tc-tests with new test that verifies that no new filters with existing key can be inserted to flower classifier. Fixes: 1f17f7742eeb ("net: sched: flower: insert filter to ht before offloading it to hw") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11selftests: bpf: add selftest for __sk_buff context in BPF_PROG_TEST_RUNStanislav Fomichev
Simple test that sets cb to {1,2,3,4,5} and priority to 6, runs bpf program that fails if cb is not what we expect and increments cb[i] and priority. When the test finishes, we check that cb is now {2,3,4,5,6} and priority is 7. We also test the sanity checks: * ctx_in is provided, but ctx_size_in is zero (same for ctx_out/ctx_size_out) * unexpected non-zero fields in __sk_buff return EINVAL Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11libbpf: add support for ctx_{size, }_{in, out} in BPF_PROG_TEST_RUNStanislav Fomichev
Support recently introduced input/output context for test runs. We extend only bpf_prog_test_run_xattr. bpf_prog_test_run is unextendable and left as is. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11tools/bpftool: show btf id in program informationPrashant Bhole
Let's add a way to know whether a program has btf context. Patch adds 'btf_id' in the output of program listing. When btf_id is present, it means program has btf context. Sample output: user@test# bpftool prog list 25: xdp name xdp_prog1 tag 539ec6ce11b52f98 gpl loaded_at 2019-04-10T11:44:20+0900 uid 0 xlated 488B not jited memlock 4096B map_ids 23 btf_id 1 Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11libbpf: Fix build with gcc-8Andrey Ignatov
Reported in [1]. With gcc 8.3.0 the following error is issued: cc -Ibpf@sta -I. -I.. -I.././include -I.././include/uapi -fdiagnostics-color=always -fsanitize=address,undefined -fno-omit-frame-pointer -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -g -fPIC -g -O2 -Werror -Wall -Wno-pointer-arith -Wno-sign-compare -MD -MQ 'bpf@sta/src_libbpf.c.o' -MF 'bpf@sta/src_libbpf.c.o.d' -o 'bpf@sta/src_libbpf.c.o' -c ../src/libbpf.c ../src/libbpf.c: In function 'bpf_object__elf_collect': ../src/libbpf.c:947:18: error: 'map_def_sz' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (map_def_sz <= sizeof(struct bpf_map_def)) { ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/libbpf.c:827:18: note: 'map_def_sz' was declared here int i, map_idx, map_def_sz, nr_syms, nr_maps = 0, nr_maps_glob = 0; ^~~~~~~~~~ According to [2] -Wmaybe-uninitialized is enabled by -Wall. Same error is generated by clang's -Wconditional-uninitialized. [1] https://github.com/libbpf/libbpf/pull/29#issuecomment-481902601 [2] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html Fixes: d859900c4c56 ("bpf, libbpf: support global data/bss/rodata sections") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-10libbpf: fix crash in XDP socket part with new larger BPF_LOG_BUF_SIZEMagnus Karlsson
In commit da11b417583e ("libbpf: teach libbpf about log_level bit 2"), the BPF_LOG_BUF_SIZE was increased to 16M. The XDP socket part of libbpf allocated the log_buf on the stack, but for the new 16M buffer size this is not going to work. Change the code so it uses a 16K buffer instead. Fixes: da11b417583e ("libbpf: teach libbpf about log_level bit 2") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-10bpf, bpftool: fix a few ubsan warningsYonghong Song
The issue is reported at https://github.com/libbpf/libbpf/issues/28. Basically, per C standard, for void *memcpy(void *dest, const void *src, size_t n) if "dest" or "src" is NULL, regardless of whether "n" is 0 or not, the result of memcpy is undefined. clang ubsan reported three such instances in bpf.c with the following pattern: memcpy(dest, 0, 0). Although in practice, no known compiler will cause issues when copy size is 0. Let us still fix the issue to silence ubsan warnings. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-09bpf, selftest: add test cases for BTF Var and DataSecDaniel Borkmann
Extend test_btf with various positive and negative tests around BTF verification of kind Var and DataSec. All passing as well: # ./test_btf [...] BTF raw test[4] (global data test #1): OK BTF raw test[5] (global data test #2): OK BTF raw test[6] (global data test #3): OK BTF raw test[7] (global data test #4, unsupported linkage): OK BTF raw test[8] (global data test #5, invalid var type): OK BTF raw test[9] (global data test #6, invalid var type (fwd type)): OK BTF raw test[10] (global data test #7, invalid var type (fwd type)): OK BTF raw test[11] (global data test #8, invalid var size): OK BTF raw test[12] (global data test #9, invalid var size): OK BTF raw test[13] (global data test #10, invalid var size): OK BTF raw test[14] (global data test #11, multiple section members): OK BTF raw test[15] (global data test #12, invalid offset): OK BTF raw test[16] (global data test #13, invalid offset): OK BTF raw test[17] (global data test #14, invalid offset): OK BTF raw test[18] (global data test #15, not var kind): OK BTF raw test[19] (global data test #16, invalid var referencing sec): OK BTF raw test[20] (global data test #17, invalid var referencing var): OK BTF raw test[21] (global data test #18, invalid var loop): OK BTF raw test[22] (global data test #19, invalid var referencing var): OK BTF raw test[23] (global data test #20, invalid ptr referencing var): OK BTF raw test[24] (global data test #21, var included in struct): OK BTF raw test[25] (global data test #22, array of var): OK [...] PASS:167 SKIP:0 FAIL:0 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf, selftest: test global data/bss/rodata sectionsJoe Stringer
Add tests for libbpf relocation of static variable references into the .data, .rodata and .bss sections of the ELF, also add read-only test for .rodata. All passing: # ./test_progs [...] test_global_data:PASS:load program 0 nsec test_global_data:PASS:pass global data run 925 nsec test_global_data_number:PASS:relocate .bss reference 925 nsec test_global_data_number:PASS:relocate .data reference 925 nsec test_global_data_number:PASS:relocate .rodata reference 925 nsec test_global_data_number:PASS:relocate .bss reference 925 nsec test_global_data_number:PASS:relocate .data reference 925 nsec test_global_data_number:PASS:relocate .rodata reference 925 nsec test_global_data_number:PASS:relocate .bss reference 925 nsec test_global_data_number:PASS:relocate .bss reference 925 nsec test_global_data_number:PASS:relocate .rodata reference 925 nsec test_global_data_number:PASS:relocate .rodata reference 925 nsec test_global_data_number:PASS:relocate .rodata reference 925 nsec test_global_data_string:PASS:relocate .rodata reference 925 nsec test_global_data_string:PASS:relocate .data reference 925 nsec test_global_data_string:PASS:relocate .bss reference 925 nsec test_global_data_string:PASS:relocate .data reference 925 nsec test_global_data_string:PASS:relocate .bss reference 925 nsec test_global_data_struct:PASS:relocate .rodata reference 925 nsec test_global_data_struct:PASS:relocate .bss reference 925 nsec test_global_data_struct:PASS:relocate .rodata reference 925 nsec test_global_data_struct:PASS:relocate .data reference 925 nsec test_global_data_rdonly:PASS:test .rodata read-only map 925 nsec [...] Summary: 229 PASSED, 0 FAILED Note map helper signatures have been changed to avoid warnings when passing in const data. Joint work with Daniel Borkmann. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf, selftest: test {rd, wr}only flags and direct value accessDaniel Borkmann
Extend test_verifier with various test cases around the two kernel extensions, that is, {rd,wr}only map support as well as direct map value access. All passing, one skipped due to xskmap not present on test machine: # ./test_verifier [...] #948/p XDP pkt read, pkt_meta' <= pkt_data, bad access 1 OK #949/p XDP pkt read, pkt_meta' <= pkt_data, bad access 2 OK #950/p XDP pkt read, pkt_data <= pkt_meta', good access OK #951/p XDP pkt read, pkt_data <= pkt_meta', bad access 1 OK #952/p XDP pkt read, pkt_data <= pkt_meta', bad access 2 OK Summary: 1410 PASSED, 1 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf: bpftool support for dumping data/bss/rodata sectionsDaniel Borkmann
Add the ability to bpftool to handle BTF Var and DataSec kinds in order to dump them out of btf_dumper_type(). The value has a single object with the section name, which itself holds an array of variables it dumps. A single variable is an object by itself printed along with its name. From there further type information is dumped along with corresponding value information. Example output from .rodata: # ./bpftool m d i 150 [{ "value": { ".rodata": [{ "load_static_data.bar": 18446744073709551615 },{ "num2": 24 },{ "num5": 43947 },{ "num6": 171 },{ "str0": [97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,0,0,0,0,0,0 ] },{ "struct0": { "a": 42, "b": 4278120431, "c": 1229782938247303441 } },{ "struct2": { "a": 0, "b": 0, "c": 0 } } ] } } ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf, libbpf: add support for BTF Var and DataSecDaniel Borkmann
This adds libbpf support for BTF Var and DataSec kinds. Main point here is that libbpf needs to do some preparatory work before the whole BTF object can be loaded into the kernel, that is, fixing up of DataSec size taken from the ELF section size and non-static variable offset which needs to be taken from the ELF's string section. Upstream LLVM doesn't fix these up since at time of BTF emission it is too early in the compilation process thus this information isn't available yet, hence loader needs to take care of it. Note, deduplication handling has not been in the scope of this work and needs to be addressed in a future commit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://reviews.llvm.org/D59441 Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf, libbpf: support global data/bss/rodata sectionsDaniel Borkmann
This work adds BPF loader support for global data sections to libbpf. This allows to write BPF programs in more natural C-like way by being able to define global variables and const data. Back at LPC 2018 [0] we presented a first prototype which implemented support for global data sections by extending BPF syscall where union bpf_attr would get additional memory/size pair for each section passed during prog load in order to later add this base address into the ldimm64 instruction along with the user provided offset when accessing a variable. Consensus from LPC was that for proper upstream support, it would be more desirable to use maps instead of bpf_attr extension as this would allow for introspection of these sections as well as potential live updates of their content. This work follows this path by taking the following steps from loader side: 1) In bpf_object__elf_collect() step we pick up ".data", ".rodata", and ".bss" section information. 2) If present, in bpf_object__init_internal_map() we add maps to the obj's map array that corresponds to each of the present sections. Given section size and access properties can differ, a single entry array map is created with value size that is corresponding to the ELF section size of .data, .bss or .rodata. These internal maps are integrated into the normal map handling of libbpf such that when user traverses all obj maps, they can be differentiated from user-created ones via bpf_map__is_internal(). In later steps when we actually create these maps in the kernel via bpf_object__create_maps(), then for .data and .rodata sections their content is copied into the map through bpf_map_update_elem(). For .bss this is not necessary since array map is already zero-initialized by default. Additionally, for .rodata the map is frozen as read-only after setup, such that neither from program nor syscall side writes would be possible. 3) In bpf_program__collect_reloc() step, we record the corresponding map, insn index, and relocation type for the global data. 4) And last but not least in the actual relocation step in bpf_program__relocate(), we mark the ldimm64 instruction with src_reg = BPF_PSEUDO_MAP_VALUE where in the first imm field the map's file descriptor is stored as similarly done as in BPF_PSEUDO_MAP_FD, and in the second imm field (as ldimm64 is 2-insn wide) we store the access offset into the section. Given these maps have only single element ldimm64's off remains zero in both parts. 5) On kernel side, this special marked BPF_PSEUDO_MAP_VALUE load will then store the actual target address in order to have a 'map-lookup'-free access. That is, the actual map value base address + offset. The destination register in the verifier will then be marked as PTR_TO_MAP_VALUE, containing the fixed offset as reg->off and backing BPF map as reg->map_ptr. Meaning, it's treated as any other normal map value from verification side, only with efficient, direct value access instead of actual call to map lookup helper as in the typical case. Currently, only support for static global variables has been added, and libbpf rejects non-static global variables from loading. This can be lifted until we have proper semantics for how BPF will treat multi-object BPF loads. From BTF side, libbpf will set the value type id of the types corresponding to the ".bss", ".data" and ".rodata" names which LLVM will emit without the object name prefix. The key type will be left as zero, thus making use of the key-less BTF option in array maps. Simple example dump of program using globals vars in each section: # bpftool prog [...] 6784: sched_cls name load_static_dat tag a7e1291567277844 gpl loaded_at 2019-03-11T15:39:34+0000 uid 0 xlated 1776B jited 993B memlock 4096B map_ids 2238,2237,2235,2236,2239,2240 # bpftool map show id 2237 2237: array name test_glo.bss flags 0x0 key 4B value 64B max_entries 1 memlock 4096B # bpftool map show id 2235 2235: array name test_glo.data flags 0x0 key 4B value 64B max_entries 1 memlock 4096B # bpftool map show id 2236 2236: array name test_glo.rodata flags 0x80 key 4B value 96B max_entries 1 memlock 4096B # bpftool prog dump xlated id 6784 int load_static_data(struct __sk_buff * skb): ; int load_static_data(struct __sk_buff *skb) 0: (b7) r6 = 0 ; test_reloc(number, 0, &num0); 1: (63) *(u32 *)(r10 -4) = r6 2: (bf) r2 = r10 ; int load_static_data(struct __sk_buff *skb) 3: (07) r2 += -4 ; test_reloc(number, 0, &num0); 4: (18) r1 = map[id:2238] 6: (18) r3 = map[id:2237][0]+0 <-- direct addr in .bss area 8: (b7) r4 = 0 9: (85) call array_map_update_elem#100464 10: (b7) r1 = 1 ; test_reloc(number, 1, &num1); [...] ; test_reloc(string, 2, str2); 120: (18) r8 = map[id:2237][0]+16 <-- same here at offset +16 122: (18) r1 = map[id:2239] 124: (18) r3 = map[id:2237][0]+16 126: (b7) r4 = 0 127: (85) call array_map_update_elem#100464 128: (b7) r1 = 120 ; str1[5] = 'x'; 129: (73) *(u8 *)(r9 +5) = r1 ; test_reloc(string, 3, str1); 130: (b7) r1 = 3 131: (63) *(u32 *)(r10 -4) = r1 132: (b7) r9 = 3 133: (bf) r2 = r10 ; int load_static_data(struct __sk_buff *skb) 134: (07) r2 += -4 ; test_reloc(string, 3, str1); 135: (18) r1 = map[id:2239] 137: (18) r3 = map[id:2235][0]+16 <-- direct addr in .data area 139: (b7) r4 = 0 140: (85) call array_map_update_elem#100464 141: (b7) r1 = 111 ; __builtin_memcpy(&str2[2], "hello", sizeof("hello")); 142: (73) *(u8 *)(r8 +6) = r1 <-- further access based on .bss data 143: (b7) r1 = 108 144: (73) *(u8 *)(r8 +5) = r1 [...] For Cilium use-case in particular, this enables migrating configuration constants from Cilium daemon's generated header defines into global data sections such that expensive runtime recompilations with LLVM can be avoided altogether. Instead, the ELF file becomes effectively a "template", meaning, it is compiled only once (!) and the Cilium daemon will then rewrite relevant configuration data from the ELF's .data or .rodata sections directly instead of recompiling the program. The updated ELF is then loaded into the kernel and atomically replaces the existing program in the networking datapath. More info in [0]. Based upon recent fix in LLVM, commit c0db6b6bd444 ("[BPF] Don't fail for static variables"). [0] LPC 2018, BPF track, "ELF relocation for static data in BPF", http://vger.kernel.org/lpc-bpf2018.html#session-3 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf, libbpf: refactor relocation handlingJoe Stringer
Adjust the code for relocations slightly with no functional changes, so that upcoming patches that will introduce support for relocations into the .data, .rodata and .bss sections can be added independent of these changes. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf: sync {btf, bpf}.h uapi header from tools infrastructureDaniel Borkmann
Pull in latest changes from both headers, so we can make use of them in libbpf. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09bpf: implement lookup-free direct value access for mapsDaniel Borkmann
This generic extension to BPF maps allows for directly loading an address residing inside a BPF map value as a single BPF ldimm64 instruction! The idea is similar to what BPF_PSEUDO_MAP_FD does today, which is a special src_reg flag for ldimm64 instruction that indicates that inside the first part of the double insns's imm field is a file descriptor which the verifier then replaces as a full 64bit address of the map into both imm parts. For the newly added BPF_PSEUDO_MAP_VALUE src_reg flag, the idea is the following: the first part of the double insns's imm field is again a file descriptor corresponding to the map, and the second part of the imm field is an offset into the value. The verifier will then replace both imm parts with an address that points into the BPF map value at the given value offset for maps that support this operation. Currently supported is array map with single entry. It is possible to support more than just single map element by reusing both 16bit off fields of the insns as a map index, so full array map lookup could be expressed that way. It hasn't been implemented here due to lack of concrete use case, but could easily be done so in future in a compatible way, since both off fields right now have to be 0 and would correctly denote a map index 0. The BPF_PSEUDO_MAP_VALUE is a distinct flag as otherwise with BPF_PSEUDO_MAP_FD we could not differ offset 0 between load of map pointer versus load of map's value at offset 0, and changing BPF_PSEUDO_MAP_FD's encoding into off by one to differ between regular map pointer and map value pointer would add unnecessary complexity and increases barrier for debugability thus less suitable. Using the second part of the imm field as an offset into the value does /not/ come with limitations since maximum possible value size is in u32 universe anyway. This optimization allows for efficiently retrieving an address to a map value memory area without having to issue a helper call which needs to prepare registers according to calling convention, etc, without needing the extra NULL test, and without having to add the offset in an additional instruction to the value base pointer. The verifier then treats the destination register as PTR_TO_MAP_VALUE with constant reg->off from the user passed offset from the second imm field, and guarantees that this is within bounds of the map value. Any subsequent operations are normally treated as typical map value handling without anything extra needed from verification side. The two map operations for direct value access have been added to array map for now. In future other types could be supported as well depending on the use case. The main use case for this commit is to allow for BPF loader support for global variables that reside in .data/.rodata/.bss sections such that we can directly load the address of them with minimal additional infrastructure required. Loader support has been added in subsequent commits for libbpf library. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2019-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Off by one and bounds checking fixes in NFC, from Dan Carpenter. 2) There have been many weird regressions in r8169 since we turned ASPM support on, some are still not understood nor completely resolved. Let's turn this back off for now. From Heiner Kallweit. 3) Signess fixes for ethtool speed value handling, from Michael Zhivich. 4) Handle timestamps properly in macb driver, from Paul Thomas. 5) Two erspan fixes, it's the usual "skb ->data potentially reallocated and we're holding a stale protocol header pointer". From Lorenzo Bianconi. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: bnxt_en: Reset device on RX buffer errors. bnxt_en: Improve RX consumer index validity check. net: macb driver, check for SKBTX_HW_TSTAMP qlogic: qlcnic: fix use of SPEED_UNKNOWN ethtool constant broadcom: tg3: fix use of SPEED_UNKNOWN ethtool constant ethtool: avoid signed-unsigned comparison in ethtool_validate_speed() net: ip6_gre: fix possible use-after-free in ip6erspan_rcv net: ip_gre: fix possible use-after-free in erspan_rcv r8169: disable ASPM again MAINTAINERS: ieee802154: update documentation file pattern net: vrf: Fix ping failed when vrf mtu is set to 0 selftests: add a tc matchall test case nfc: nci: Potential off by one in ->pipes[] array NFC: nci: Add some bounds checking in nci_hci_cmd_received()
2019-04-08selftests/tpm2: Open tpm dev in unbuffered modeTadeusz Struk
In order to have control over how many bytes are read or written the device needs to be opened in unbuffered mode. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-04-08selftests/tpm2: Extend tests to cover partial readsTadeusz Struk
Three new tests added: 1. Send get random cmd, read header in 1st read, read the rest in second read - expect success 2. Send get random cmd, read only part of the response, send another get random command, read the response - expect success 3. Send get random cmd followed by another get random cmd, without reading the first response - expect the second cmd to fail with -EBUSY Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-04-08selftests: fib_tests: Add tests for ipv6 gateway with ipv4 routeDavid Ahern
Add tests for ipv6 gateway with ipv4 route. Tests include basic single path with ping to verify connectivity and multipath. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-07selftests: add a tc matchall test caseNicolas Dichtel
This is a follow up of the commit 0db6f8befc32 ("net/sched: fix ->get helper of the matchall cls"). To test it: $ cd tools/testing/selftests/tc-testing $ ln -s ../plugin-lib/nsPlugin.py plugins/20-nsPlugin.py $ ./tdc.py -n -e 2638 Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-06libbpf: Ignore -Wformat-nonliteral warningAndrey Ignatov
vsprintf() in __base_pr() uses nonliteral format string and it breaks compilation for those who provide corresponding extra CFLAGS, e.g.: https://github.com/libbpf/libbpf/issues/27 If libbpf is built with the flags from PR: libbpf.c:68:26: error: format string is not a string literal [-Werror,-Wformat-nonliteral] return vfprintf(stderr, format, args); ^~~~~~ 1 error generated. Ignore this warning since the use case in libbpf.c is legit. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-06selftests: forwarding: test for bridge mcast traffic after report and leaveNikolay Aleksandrov
This test is split in two, the first part checks if a report creates a corresponding mdb entry and if traffic is properly forwarded to it, and the second part checks if the mdb entry is deleted after a leave and if traffic is *not* forwarded to it. Since the mcast querier is enabled we should see standard mcast snooping bridge behaviour. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor comment merge conflict in mlx5. Staging driver has a fixup due to the skb->xmit_more changes in 'net-next', but was removed in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-05selftests/bpf: Test unbounded var_off stack accessAndrey Ignatov
Test the case when reg->smax_value is too small/big and can overflow, and separately min and max values outside of stack bounds. Example of output: # ./test_verifier #856/p indirect variable-offset stack access, unbounded OK #857/p indirect variable-offset stack access, max out of bound OK #858/p indirect variable-offset stack access, min out of bound OK Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-05selftests/bpf: Test indirect var_off stack access in unpriv modeAndrey Ignatov
Test that verifier rejects indirect stack access with variable offset in unprivileged mode and accepts same code in privileged mode. Since pointer arithmetics is prohibited in unprivileged mode verifier should reject the program even before it gets to helper call that uses variable offset, at the time when that variable offset is trying to be constructed. Example of output: # ./test_verifier ... #859/u indirect variable-offset stack access, priv vs unpriv OK #859/p indirect variable-offset stack access, priv vs unpriv OK Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-05selftests/bpf: Test indirect var_off stack access in raw modeAndrey Ignatov
Test that verifier rejects indirect access to uninitialized stack with variable offset. Example of output: # ./test_verifier ... #859/p indirect variable-offset stack access, uninitialized OK Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Several hash table refcount fixes in batman-adv, from Sven Eckelmann. 2) Use after free in bpf_evict_inode(), from Daniel Borkmann. 3) Fix mdio bus registration in ixgbe, from Ivan Vecera. 4) Unbounded loop in __skb_try_recv_datagram(), from Paolo Abeni. 5) ila rhashtable corruption fix from Herbert Xu. 6) Don't allow upper-devices to be added to vrf devices, from Sabrina Dubroca. 7) Add qmi_wwan device ID for Olicard 600, from Bjørn Mork. 8) Don't leave skb->next poisoned in __netif_receive_skb_list_ptype, from Alexander Lobakin. 9) Missing IDR checks in mlx5 driver, from Aditya Pakki. 10) Fix false connection termination in ktls, from Jakub Kicinski. 11) Work around some ASPM issues with r8169 by disabling rx interrupt coalescing on certain chips. From Heiner Kallweit. 12) Properly use per-cpu qstat values on NOLOCK qdiscs, from Paolo Abeni. 13) Fully initialize sockaddr_in structures in SCTP, from Xin Long. 14) Various BPF flow dissector fixes from Stanislav Fomichev. 15) Divide by zero in act_sample, from Davide Caratti. 16) Fix bridging multicast regression introduced by rhashtable conversion, from Nikolay Aleksandrov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) ibmvnic: Fix completion structure initialization ipv6: sit: reset ip header pointer in ipip6_rcv net: bridge: always clear mcast matching struct on reports and leaves libcxgb: fix incorrect ppmax calculation vlan: conditional inclusion of FCoE hooks to match netdevice.h and bnx2x sch_cake: Make sure we can write the IP header before changing DSCP bits sch_cake: Use tc_skb_protocol() helper for getting packet protocol tcp: Ensure DCTCP reacts to losses net/sched: act_sample: fix divide by zero in the traffic path net: thunderx: fix NULL pointer dereference in nicvf_open/nicvf_stop net: hns: Fix sparse: some warnings in HNS drivers net: hns: Fix WARNING when remove HNS driver with SMMU enabled net: hns: fix ICMP6 neighbor solicitation messages discard problem net: hns: Fix probabilistic memory overwrite when HNS driver initialized net: hns: Use NAPI_POLL_WEIGHT for hns driver net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw() flow_dissector: rst'ify documentation ipv6: Fix dangling pointer when ipv6 fragment net-gro: Fix GRO flush when receiving a GSO packet. flow_dissector: document BPF flow dissector environment ...
2019-04-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2019-04-04 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Batch of fixes to the existing BPF flow dissector API to support calling BPF programs from the eth_get_headlen context (support for latter is planned to be added in bpf-next), from Stanislav. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>