summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)Author
2019-03-19perf map: Remove map from 'names' tree in __maps__remove()Changbin Du
There are two trees for each map inserted by maps__insert(), so remove it from the 'names' tree in __maps__remove(). Detected with gcc's ASan. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-11-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf hist: Add missing map__put() in error caseChangbin Du
We need to map__put() before returning from failure of sample__resolve_callchain(). Detected with gcc's ASan. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 9c68ae98c6f7 ("perf callchain: Reference count maps") Link: http://lkml.kernel.org/r/20190316080556.3075-10-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf build-id: Fix memory leak in print_sdt_events()Changbin Du
Detected with gcc's ASan: Direct leak of 4356 byte(s) in 120 object(s) allocated from: #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215 #2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339 #3 0x55719af66272 in print_events util/parse-events.c:2542 #4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 #5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520 #9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 40218daea1db ("perf list: Show SDT and pre-cached events") Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf config: Fix a memory leak in collect_config()Changbin Du
Detected with gcc's ASan: Direct leak of 66 byte(s) in 5 object(s) allocated from: #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x560c8761034d in collect_config util/config.c:597 #2 0x560c8760d9cb in get_value util/config.c:169 #3 0x560c8760dfd7 in perf_parse_file util/config.c:285 #4 0x560c8760e0d2 in perf_config_from_file util/config.c:476 #5 0x560c876108fd in perf_config_set__init util/config.c:661 #6 0x560c87610c72 in perf_config_set__new util/config.c:709 #7 0x560c87610d2f in perf_config__init util/config.c:718 #8 0x560c87610e5d in perf_config util/config.c:730 #9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442 #10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 20105ca1240c ("perf config: Introduce perf_config_set class") Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf list: Don't forget to drop the reference to the allocated thread_mapChangbin Du
Detected via gcc's ASan: Direct leak of 2048 byte(s) in 64 object(s) allocated from: 6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370) 7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43 8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85 9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250 10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382 11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514 12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520 17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 89896051f8da ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf stat: Improve scalingAndi Kleen
The multiplexing scaling in perf stat mysteriously adds 0.5 to the value. This dates back to the original perf tool. Other scaling code doesn't use that strange convention. Remove the extra 0.5. Before: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,403,580 cycles (81.62%) 6,404,341 cycles (81.64%) 6,402,983 cycles (81.62%) 6,399,941 cycles (81.63%) 6,399,451 cycles (81.62%) 6,436,105 cycles (91.87%) 0.005843799 seconds time elapsed 0.002905000 seconds user 0.002902000 seconds sys After: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,422,704 cycles (81.68%) 6,401,842 cycles (81.68%) 6,398,432 cycles (81.68%) 6,397,098 cycles (81.68%) 6,396,074 cycles (81.67%) 6,434,980 cycles (91.62%) 0.005884437 seconds time elapsed 0.003580000 seconds user 0.002356000 seconds sys Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-10-andi@firstfloor.org Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf stat: Fix --no-scaleAndi Kleen
The -c option to enable multiplex scaling has been useless for quite some time because scaling is default. It's only useful as --no-scale to disable scaling. But the non scaling code path has bitrotted and doesn't print anything because perf output code relies on value run/ena information. Also even when we don't want to scale a value it's still useful to show its multiplex percentage. This patch: - Fixes help and documentation to show --no-scale instead of -c - Removes -c, only keeps the long option because -c doesn't support negatives. - Enables running/enabled even with --no-scale - And fixes some other problems in the no-scale output. Before: $ perf stat --no-scale -e cycles true Performance counter stats for 'true': <not counted> cycles 0.000984154 seconds time elapsed After: $ ./perf stat --no-scale -e cycles true Performance counter stats for 'true': 706,070 cycles 0.001219821 seconds time elapsed Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf report: Indicate JITed code better in reportAndi Kleen
Print [TID] tid %d instead of the crypted /tmp/perf-%d.map default. % cat >loop.java public class loop { public static void main(String[] args) { for (;;); } } ^D % javac loop.java % perf record java loop ^C Before: % perf report --stdio ... 56.09% java perf-34724.map [.] 0x00007fd5bd021896 19.12% java perf-34724.map [.] 0x00007fd5bd021887 9.79% java perf-34724.map [.] 0x00007fd5bd021783 8.97% java perf-34724.map [.] 0x00007fd5bd02175b After: % perf report --stdio ... 56.09% java [JIT] tid 34724 [.] 0x00007fd5bd021896 19.12% java [JIT] tid 34724 [.] 0x00007fd5bd021887 9.79% java [JIT] tid 34724 [.] 0x00007fd5bd021783 8.97% java [JIT] tid 34724 [.] 0x00007fd5bd02175b Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-7-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-r17l6py9g0sezb7mi1f286gt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf report: Show all sort keys in help outputAndi Kleen
Show all the supported sort keys in the command line help output, so that it's not needed to refer to the manpage. Before: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline, ... Please refer the man page for the complete list. After: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children sample period pid comm dso symbol parent cpu ... Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-5-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-9r3uz2ch4izoi1uln3f889co@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf record: Allow to limit number of reported perf.data filesAndi Kleen
When doing long term recording and waiting for some event to snapshot on, we often only care about the last minute or so. The --switch-output command line option supports rotating the perf.data file when the size exceeds a threshold. But the disk would still be filled with unnecessary old files. Add a new option to only keep a number of rotated files, so that the disk space usage can be limited. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Implement browsing of individual samplesAndi Kleen
Now 'perf report' can show whole time periods with 'perf script', but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number of samples for useful analysis. Also the full scripts tend to show samples of all CPUs and all threads mixed up, which can be very confusing on larger systems. Add a new --samples option to save a small random number of samples per hist entry. Use a reservoir sample technique to select a representatve number of samples. Then allow browsing the samples using 'perf script' as part of the hist entry context menu. This automatically adds the right filters, so only the thread or cpu of the sample is displayed. Then we use less' search functionality to directly jump the to the time stamp of the selected sample. It uses different menus for assembler and source display. Assembler needs xed installed and source needs debuginfo. Currently it only supports as many samples as fit on the screen due to some limitations in the slang ui code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311174605.GA29294@tassilo.jf.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support builtin perf script in scripts menuAndi Kleen
The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script with custom arguments Then we automatically select the right options to display the information in the perf.data file. For example with -b display branch contexts. It's not easily possible to check for xed's existence in advance. perf script usually gives sensible error messages when it's not available. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support time sort keyAndi Kleen
Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio ... 0.67% 277061.87300 [.] _dl_start 0.50% 277061.87300 [.] f1 0.50% 277061.87300 [.] f2 0.33% 277061.87300 [.] main 0.29% 277061.87300 [.] _dl_lookup_symbol_x 0.29% 277061.87300 [.] dl_main 0.29% 277061.87300 [.] do_lookup_x 0.17% 277061.87300 [.] _dl_debug_initialize 0.17% 277061.87300 [.] _dl_init_paths 0.08% 277061.87300 [.] check_match 0.04% 277061.87300 [.] _dl_count_modids 1.33% 277061.87400 [.] f1 1.33% 277061.87400 [.] f2 1.33% 277061.87400 [.] main 1.17% 277061.87500 [.] main 1.08% 277061.87500 [.] f1 1.08% 277061.87500 [.] f2 1.00% 277061.87600 [.] main 0.83% 277061.87600 [.] f1 0.83% 277061.87600 [.] f2 1.00% 277061.87700 [.] main Committer notes: Rename 'time' argument to hist_time() to htime to overcome this in older distros: cc1: warnings being treated as errors util/hist.c: In function 'hist_time': util/hist.c:251: error: declaration of 'time' shadows a global declaration /usr/include/time.h:186: error: shadowed declaration is here Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf session: Add process callback to reader objectJiri Olsa
Adding callback function to reader object so callers can process data in different ways. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf header: Add DIR_FORMAT feature to describe directory dataJiri Olsa
The data files layout is described by HEADER_DIR_FORMAT feature. Currently it holds only version number (1): uint64_t version; The current version holds only version value (1) means that data files: - Follow the 'data.*' name format. - Contain raw events data in standard perf format as read from kernel (and need to be sorted) Future versions are expected to describe different data files layout according to special needs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Make perf_data__size() work over directoryJiri Olsa
Make perf_data__size() return proper size for directory data, summing up all the individual file sizes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Add perf_data__update_dir() functionJiri Olsa
Add perf_data__update_dir() to update the size for every file within the perf.data directory. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Support having perf.data stored as a directoryJiri Olsa
The caller needs to set 'struct perf_data::is_dir flag and the path will be treated as a directory. The 'struct perf_data::file' is initialized and open as 'path/header' file. Add a check to the direcory interface functions to check the is_dir flag. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-2-jolsa@kernel.org [ Be consistent on how to signal failure, i.e. use -1 and let users check errno ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf probe: Fix getting the kernel mapAdrian Hunter
Since commit 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines"), perf tools has been creating more than one kernel map, however 'perf probe' assumed there could be only one. Fix by using machine__kernel_map() to get the main kernel map. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Xu Yu <xuyu@linux.alibaba.com> Fixes: 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines") Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/2ed432de-e904-85d2-5c36-5897ddc5b23b@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Parse time quantumAndi Kleen
Many workloads change over time. 'perf report' currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantisize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org [ Use NSEC_PER_[MU]SEC ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf time-utils: Add utility function to print time stamps in nanosecondsAndi Kleen
Add a utility function to print nanosecond timestamps. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support output in nanosecondsAndi Kleen
Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script: Support insn output for normal samplesAndi Kleen
perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. This adds some extra checking to ensure that we don't try to decode instructions when using perf.data from a different architecture. % perf record -a sleep 1 % perf script -F ip,sym,insn --xed ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb50 intel_bts_enable_local retq ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff810f1f79 generic_exec_single xor %eax, %eax ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx ffffffff81048610 native_apic_mem_write mov %edi, %edi ... Committer testing: Before: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" # After: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # More examples: # perf script -F ip,sym,insn --xed | grep -v native_write_msr | head ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip) ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13 ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx ffffffffa4956bf3 nmi_cpu_backtrace popq %r12 ffffffffa4171d5c smp_call_function_single pause ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d ffffffffa4797e4d menu_select cmp $0x190, %rax ffffffffa4171d5c smp_call_function_single pause ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0 ffffffffa4797f7a menu_select shr $0x3, %rax # Which matches the annotate output modulo resolving callqs: # perf annotate --stdio2 nmi_cpu_backtrace_handler Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period] nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8105a7d0 <nmi_cpu_backtrace_handler>: nmi_cpu_backtrace_handler(): nmi_trigger_cpumask_backtrace(mask, exclude_self, nmi_raise_cpu_backtrace); } static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs) { 24.45 → callq __fentry__ if (nmi_cpu_backtrace(regs)) mov %rsi,%rdi 75.55 → callq nmi_cpu_backtrace return NMI_HANDLED; movzbl %al,%eax return NMI_DONE; } ← retq # # perf annotate --stdio2 __hrtimer_next_event_base Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period] __hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8115b910 <__hrtimer_next_event_base>: __hrtimer_next_event_base(): static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base, const struct hrtimer *exclude, unsigned int active, ktime_t expires_next) { → callq __fentry__ <SNIP> 4a: add $0x1,%r14 77.31 mov 0x18(%rax),%rdx shl $0x6,%r14 sub 0x38(%rbx,%r14,1),%rdx if (expires < expires_next) { cmp %r12,%rdx ↓ jge 68 <SNIP> Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org [ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ] [ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-09Merge tag 'perf-core-for-mingo-5.1-20190307' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core changes from Arnaldo Carvalho de Melo: perf bpf: Arnaldo Carvalho de Melo: - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that tools such as 'bpftool map dump' can pretty print map keys and values. perf c2c: Jiri Olsa: - Fix report for empty NUMA node. perf diff: Jin Yao: - Support --time, --cpu, --pid and --tid filter options. perf probe: Arnaldo Carvalho de Melo: - Clarify error message about not finding kernel modules debuginfo. perf record: Jiri Olsa: - Fixup probing for max attr.precise_ip. perf trace: Arnaldo Carvalho de Melo: - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic. perf annotate: Arnaldo Carvalho de Melo: - Calculate the max instruction name, align column to that, removing the hardcoded max 6 chars and cope with instructions with names longer than that, such as vpmovmskb, vpcmpeqb, etc. kernel: Song Liu: - Consider events with attr.bpf_event set as side-band. Gustavo A. R. Silva: - Mark expected switch fall-through in perf_event_parse_addr_filter(). Libraries: Jiri Olsa: - Fix leaks and double frees on error paths. libtraceevent: Tony Jones: - Fix buffer overflow in arg_eval(). python scripting: Tony Jones: - More python3 fixes. Trivial: Yang Wei: - Remove needless extra semicolon in clang C++ glue code. Intel PT/BTS: Adrian Hunter: - Improve auxtrace address filter error message when there is no DSO. - Fix divide by zero when TSC is not available. - Further improvements to the export to sqlite/posgresql python scripts and to the GUI sqlviewer, exporting 'parent_id' so that we have enable the creation of call trees. Andi Kleen: - Generalize function to copy from thread addr space from intel-bts code. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-03-06perf data: Force perf_data__open|close zero data->file.pathJiri Olsa
Making sure the data->file.path is zeroed on perf_data__open error path and in perf_data__close, so we don't double free it in case someone call it twice. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf session: Fix double free in perf_data__closeJiri Olsa
We can't call perf_data__close and subsequently perf_session__delete, because it will call perf_data__close again and cause double free for data->file.path. $ perf report -i . incompatible file format (rerun with -v to learn more) free(): double free detected in tcache 2 Aborted (core dumped) In fact we don't need to call perf_data__close at all, because at the time the got out_close is reached, session->data is already initialized, so the perf_data__close call will be triggered from perf_session__delete. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: 2d4f27999b88 ("perf data: Add global path holder") Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf evsel: Probe for precise_ip with simple attrJiri Olsa
Currently we probe for precise_ip with user specified perf_event_attr, which might fail because of unsupported kernel features, which would get disabled during the open time anyway. Switching the probe to take place on simple hw cycles, so the following record sets proper precise_ip: # perf record -e cycles:P ls # perf evlist -v cycles:P: size: 112, ... precise_ip: 3, ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf tools: Read and store caps/max_precise in perf_pmuJiri Olsa
Read the caps/max_precise value and store it in struct perf_pmu to be used when setting the maximum precise_ip field in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf hist: Fix memory leak of srclineJiri Olsa
We can't allocate he->srcline unconditionaly, only when new hist_entry is created. Moving he->srcline allocation into hist_entry__init function. Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf hist: Add error path into hist_entry__initJiri Olsa
Adding error path into hist_entry__init to unify error handling, so every new member does not need to free everything else. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: nageswara r sastry <nasastry@in.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf thread: Generalize function to copy from thread addr space from ↵Andi Kleen
intel-bts code Add a utility function to fetch executable code. Convert one user over to it. There are more places doing that, but they do significantly different actions, so they are not easy to fit into a single library function. Committer changes: . No need to cast around, make 'buf' be a void pointer. . Rename it to thread__memcpy() to reflect the fact it is about copying a chunk of memory from a thread, i.e. from its address space. . No need to have it in a separate object file, move it to thread.[ch] . Check the return of map__load(), the original code didn't do it, but since we're moving this around, check that as well. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf annotate: Calculate the max instruction name, align column to thatArnaldo Carvalho de Melo
We were hardcoding '6' as the max instruction name, and we have lots that are longer than that, see the diff from two 'P' printed TUI annotations for a libc function that uses instructions with long names, such as 'vpmovmskb' with its 9 chars: --- __strcmp_avx2.annotation.before 2019-03-06 16:31:39.368020425 -0300 +++ __strcmp_avx2.annotation 2019-03-06 16:32:12.079450508 -0300 @@ -2,284 +2,284 @@ Event: cycles:ppp Percent endbr64 - 0.10 mov %edi,%eax + 0.10 mov %edi,%eax - xor %edx,%edx + xor %edx,%edx - 3.54 vpxor %ymm7,%ymm7,%ymm7 + 3.54 vpxor %ymm7,%ymm7,%ymm7 - or %esi,%eax + or %esi,%eax - and $0xfff,%eax + and $0xfff,%eax - cmp $0xf80,%eax + cmp $0xf80,%eax - ↓ jg 370 + ↓ jg 370 - 27.07 vmovdqu (%rdi),%ymm1 + 27.07 vmovdqu (%rdi),%ymm1 - 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0 + 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0 - 2.15 vpminub %ymm1,%ymm0,%ymm0 + 2.15 vpminub %ymm1,%ymm0,%ymm0 - 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0 + 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0 - 0.43 vpmovmskb %ymm0,%ecx + 0.43 vpmovmskb %ymm0,%ecx - 1.53 test %ecx,%ecx + 1.53 test %ecx,%ecx - ↓ je b0 + ↓ je b0 - 5.26 tzcnt %ecx,%edx + 5.26 tzcnt %ecx,%edx - 18.40 movzbl (%rdi,%rdx,1),%eax + 18.40 movzbl (%rdi,%rdx,1),%eax - 7.09 movzbl (%rsi,%rdx,1),%edx + 7.09 movzbl (%rsi,%rdx,1),%edx - 3.34 sub %edx,%eax + 3.34 sub %edx,%eax 2.37 vzeroupper ← retq nop - 50: tzcnt %ecx,%edx + 50: tzcnt %ecx,%edx - movzbl 0x20(%rdi,%rdx,1),%eax + movzbl 0x20(%rdi,%rdx,1),%eax - movzbl 0x20(%rsi,%rdx,1),%edx + movzbl 0x20(%rsi,%rdx,1),%edx - sub %edx,%eax + sub %edx,%eax vzeroupper ← retq - data16 nopw %cs:0x0(%rax,%rax,1) + data16 nopw %cs:0x0(%rax,%rax,1) Reported-by: Travis Downs <travis.downs@gmail.com> LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Lots of tooling updates - too many to list, here's a few highlights: - Various subcommand updates to 'perf trace', 'perf report', 'perf record', 'perf annotate', 'perf script', 'perf test', etc. - CPU and NUMA topology and affinity handling improvements, - HW tracing and HW support updates: - Intel PT updates - ARM CoreSight updates - vendor HW event updates - BPF updates - Tons of infrastructure updates, both on the build system and the library support side - Documentation updates. - ... and lots of other changes, see the changelog for details. Kernel side updates: - Tighten up kprobes blacklist handling, reduce the number of places where developers can install a kprobe and hang/crash the system. - Fix/enhance vma address filter handling. - Various PMU driver updates, small fixes and additions. - refcount_t conversions - BPF updates - error code propagation enhancements - misc other changes" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits) perf script python: Add Python3 support to syscall-counts-by-pid.py perf script python: Add Python3 support to syscall-counts.py perf script python: Add Python3 support to stat-cpi.py perf script python: Add Python3 support to stackcollapse.py perf script python: Add Python3 support to sctop.py perf script python: Add Python3 support to powerpc-hcalls.py perf script python: Add Python3 support to net_dropmonitor.py perf script python: Add Python3 support to mem-phys-addr.py perf script python: Add Python3 support to failed-syscalls-by-pid.py perf script python: Add Python3 support to netdev-times.py perf tools: Add perf_exe() helper to find perf binary perf script: Handle missing fields with -F +.. perf data: Add perf_data__open_dir_data function perf data: Add perf_data__(create_dir|close_dir) functions perf data: Fail check_backup in case of error perf data: Make check_backup work over directories perf tools: Add rm_rf_perf_data function perf tools: Add pattern name checking to rm_rf perf tools: Add depth checking to rm_rf perf data: Add global path holder ...
2019-03-06perf clang: Remove needless extra semicolonYang Wei
Delete a superfluous semicolon in getBPFObjectFromModule(). Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Wei <albin_yang@163.com> Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-01perf db-export: Add calls parent_id to enable creation of call treesAdrian Hunter
The call_path can be used to find the parent symbol for a call but not the exact parent call. To do that add parent_id to the call_return export. This enables the creation of a call tree from the exported data. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lkml.kernel.org/n/tip-6j7tzdxo67cox6kan7k22oo6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-01perf intel-pt: Fix divide by zero when TSC is not availableAdrian Hunter
When TSC is not available, "timeless" decoding is used but a divide by zero occurs if perf_time_to_tsc() is called. Ensure the divisor is not zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.9+ Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-01perf auxtrace: Improve address filter error message when there is no DSOAdrian Hunter
The message does not indicate the possibility that the symbol is not found because the file does not exist. Before: $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls Symbol 'strcmp' not found. Note that symbols must be functions. Failed to parse address filter: 'filter strcmp / strcpy @ foo ' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. After: $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls File 'foo' not found or has no symbols. Symbol 'strcmp' not found. Note that symbols must be functions. Failed to parse address filter: 'filter strcmp / strcpy @ foo ' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lkml.kernel.org/n/tip-dvngzxd0jkplzw1ary69dilb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-01perf time-utils: Refactor time range parsing codeJin Yao
Jiri points out that we don't need any time checking and time string parsing if the --time option is not set. That makes sense. This patch refactors the time range parsing code, move the duplicated code from perf report and perf script to time_utils and check if --time option is set before parsing the time string. This patch is no logic change expected. So the usage of --time is same as before. For example: Select the first and second 10% time slices: perf report --time 10%/1,10%/2 perf script --time 10%/1,10%/2 Select the slices from 0% to 10% and from 30% to 40%: perf report --time 0%-10%,30%-40% perf script --time 0%-10%,30%-40% Select the time slices from timestamp 3971 to 3973 perf report --time 3971,3973 perf script --time 3971,3973 Committer testing: Using the above examples, check before and after to see if it remains the same: $ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ] $ $ perf report --time 10%/1,10%/2 > /tmp/report.before.1 $ perf script --time 10%/1,10%/2 > /tmp/script.before.1 $ perf report --time 0%-10%,30%-40% > /tmp/report.before.2 $ perf script --time 0%-10%,30%-40% > /tmp/script.before.2 $ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3 $ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3 For example, the 3rd test produces this slice: $ cat /tmp/script.before.3 cat 3147 180457.375844: 2143 cycles:uppp: 7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so) cat 3147 180457.375986: 2245 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat) cat 3147 180457.376012: 2164 cycles:uppp: 7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so) cat 3147 180457.376140: 2921 cycles:uppp: 558b70f3a554 [unknown] (/usr/bin/cat) cat 3147 180457.376296: 2844 cycles:uppp: 7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so) cat 3147 180457.376431: 2717 cycles:uppp: 558b70f3b0ca [unknown] (/usr/bin/cat) cat 3147 180457.376667: 2630 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat) cat 3147 180457.376795: 2442 cycles:uppp: 7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so) cat 3147 180457.376927: 2376 cycles:uppp: ffffffff9aa00163 [unknown] ([unknown]) cat 3147 180457.376954: 2307 cycles:uppp: 7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so) cat 3147 180457.377116: 3091 cycles:uppp: 7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so) cat 3147 180457.377362: 2945 cycles:uppp: 558b70f3a3b0 [unknown] (/usr/bin/cat) cat 3147 180457.377517: 2727 cycles:uppp: 558b70f3a9aa [unknown] (/usr/bin/cat) $ Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the above chunk translates into this 'perf report' output: $ cat /tmp/report.before.3 # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 13 of event 'cycles:uppp' (time slices: 180457.375844,180457.377717) # Event count (approx.): 33552 # # Overhead Command Shared Object Symbol # ........ ....... ................ ...................... # 17.69% cat libc-2.28.so [.] malloc 14.53% cat cat [.] 0x000000000000586e 13.33% cat libc-2.28.so [.] _int_malloc 8.78% cat cat [.] 0x00000000000023b0 8.71% cat cat [.] 0x0000000000002554 8.13% cat cat [.] 0x00000000000029aa 8.10% cat cat [.] 0x00000000000030ca 7.28% cat libc-2.28.so [.] read 7.08% cat [unknown] [k] 0xffffffff9aa00163 6.39% cat libc-2.28.so [.] cfree@GLIBC_2.2.5 # # (Tip: Order by the overhead of source file name and line number: perf report -s srcline) # $ Now lets see after applying this patch, nothing should change: $ perf report --time 10%/1,10%/2 > /tmp/report.after.1 $ perf script --time 10%/1,10%/2 > /tmp/script.after.1 $ perf report --time 0%-10%,30%-40% > /tmp/report.after.2 $ perf script --time 0%-10%,30%-40% > /tmp/script.after.2 $ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3 $ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3 $ diff -u /tmp/report.before.1 /tmp/report.after.1 $ diff -u /tmp/script.before.1 /tmp/script.after.1 $ diff -u /tmp/report.before.2 /tmp/report.after.2 --- /tmp/report.before.2 2019-03-01 11:01:53.526094883 -0300 +++ /tmp/report.after.2 2019-03-01 11:09:18.231770467 -0300 @@ -352,5 +352,5 @@ # -# (Tip: Generate a script for your data: perf script -g <lang>) +# (Tip: Treat branches as callchains: perf report --branch-history) # $ diff -u /tmp/script.before.2 /tmp/script.after.2 $ diff -u /tmp/report.before.3 /tmp/report.after.3 --- /tmp/report.before.3 2019-03-01 11:03:08.890045588 -0300 +++ /tmp/report.after.3 2019-03-01 11:09:40.660224002 -0300 @@ -22,5 +22,5 @@ # -# (Tip: Order by the overhead of source file name and line number: perf report -s srcline) +# (Tip: List events using substring match: perf list <keyword>) # $ diff -u /tmp/script.before.3 /tmp/script.after.3 $ Cool, just the 'perf report' tips changed, QED. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-01tools: libbpf: add a correctly named define for map iterationJakub Kicinski
For historical reasons the helper to loop over maps in an object is called bpf_map__for_each while it really should be called bpf_object__for_each_map. Rename and add a correctly named define for backward compatibility. Switch all in-tree users to the correct name (Quentin). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-28perf probe: Clarify error message about not finding kernel modules debuginfoArnaldo Carvalho de Melo
'perf probe' supports using just the kernel module name, but that will work only when the module is loaded, or using the full pathname to the file with the DWARF debug info, but the warning was cryptic: Before: # perf probe -m cls_flower -L fl_change Failed to find the path for cls_flower: No such file or directory Error: Failed to show lines. # After: # perf probe -m cls_flower -L fl_change Module cls_flower is not loaded, please specify its full path name. Error: Failed to show lines. # perf probe -m /lib/modules/5.0.0-rc7+/kernel/net/sched/cls_flower.ko -L fl_change | head -7 <fl_change@/home/acme/git/linux/net/sched/cls_flower.c:0> 0 static int fl_change(struct net *net, struct sk_buff *in_skb, struct tcf_proto *tp, unsigned long base, u32 handle, struct nlattr **tca, void **arg, bool ovr, struct netlink_ext_ack *extack) 4 { 5 struct cls_fl_head *head = rtnl_dereference(tp->root); # The behaviour doesn't change when the module is loaded: # modprobe cls_flower # perf probe -m cls_flower -L fl_change | head -7 <fl_change@/home/acme/git/linux/net/sched/cls_flower.c:0> 0 static int fl_change(struct net *net, struct sk_buff *in_skb, struct tcf_proto *tp, unsigned long base, u32 handle, struct nlattr **tca, void **arg, bool ovr, struct netlink_ext_ack *extack) 4 { 5 struct cls_fl_head *head = rtnl_dereference(tp->root); # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marcelo Ricardo Leitner <mleitner@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-q4njvk9mshra00jacqjbzfn5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf tools: Add perf_exe() helper to find perf binaryAndi Kleen
Also convert one existing user. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224153722.27020-9-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf data: Add perf_data__open_dir_data functionJiri Olsa
Add perf_data__open_dir_data to open files inside 'struct perf_data' path directory: static int perf_data__open_dir(struct perf_data *data); Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf data: Add perf_data__(create_dir|close_dir) functionsJiri Olsa
Add perf_data__create_dir() to create nr files inside 'struct perf_data' path directory: int perf_data__create_dir(struct perf_data *data, int nr); and function to close that data: void perf_data__close_dir(struct perf_data *data); Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf data: Fail check_backup in case of errorJiri Olsa
And display the error message from removing the old data file: $ perf record ls Can't remove old data: Permission denied (perf.data.old) Perf session creation failed. $ perf record ls Can't remove old data: Unknown file found (perf.data.old) Perf session creation failed. Not sure how to make fail the rename (after we successfully remove the destination file/dir) to show the message, anyway let's have it there. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf data: Make check_backup work over directoriesJiri Olsa
Change check_backup() to call rm_rf_perf_data() instead of unlink() to work over directory paths. Also move the call earlier in the code, before we fork for file/dir, so it can backup also directory data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf tools: Add rm_rf_perf_data functionJiri Olsa
To remove perf.data including the directory, with checking on expected files and no other directories inside. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf tools: Add pattern name checking to rm_rfJiri Olsa
Add pattern argument to rm_rf_depth() (and rename it to rm_rf_depth_pat()) to specify the name pattern files need to match inside the directory. The function fails if we find different file to remove. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25perf tools: Add depth checking to rm_rfJiri Olsa
Adding depth argument to rm_rf (and renaming it to rm_rf_depth) to specify the depth we will go searching for files to remove. It will be used to specify single depth for perf.data directory removal in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190224190656.30163-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-22perf data: Add global path holderJiri Olsa
Add a 'path' member to 'struct perf_data'. It will keep the configured path for the data (const char *). The path in struct perf_data_file is now dynamically allocated (duped) from it. This scheme is useful/used in following patches where struct perf_data::path holds the 'configure' directory path and struct perf_data_file::path holds the allocated path for specific files. Also it actually makes the code little simpler. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org [ Fixup data-convert-bt.c missing conversion ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-22perf data: Move size to struct perf_data_fileJiri Olsa
We are about to add support for multiple files, so we need each file to keep its size. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>