Age | Commit message (Collapse) | Author |
|
Fix the perf-probe --functions option do not show the PLT
stub symbols (*@plt) by default.
-----
$ ./perf probe -x /usr/lib64/libc-2.33.so -F | head
a64l
abort
abs
accept
accept4
access
acct
addmntent
addseverity
adjtime
-----
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhriamat@kernel.org>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532653450.393143.12621329879630677469.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In Fedora34, libc-2.33.so has both .dynsym and .symtab sections and
most of (not all) symbols moved to .dynsym. In this case, perf only
decode the symbols in .symtab, and perf probe can not list up the
functions in the library.
To fix this issue, decode both .symtab and .dynsym sections.
Without this fix,
-----
$ ./perf probe -x /usr/lib64/libc-2.33.so -F
@plt
@plt
calloc@plt
free@plt
malloc@plt
memalign@plt
realloc@plt
-----
With this fix.
-----
$ ./perf probe -x /usr/lib64/libc-2.33.so -F
@plt
@plt
a64l
abort
abs
accept
accept4
access
acct
addmntent
-----
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532652681.393143.10163733179955267999.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix debuginfo__new() to set the build-id to dso before
dso__read_binary_type_filename() so that it can find
DSO_BINARY_TYPE__BUILDID_DEBUGINFO debuginfo correctly.
However, this may not change the result, because elfutils (libdwfl) has
its own debuginfo finder. With/without this patch, the perf probe
correctly find the debuginfo file.
This is just a failsafe and keep code's sanity (if you use
dso__read_binary_type_filename(), you must set the build-id to the dso.)
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhriamat@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532651863.393143.11692691321219235810.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick the changes in these csets:
64c2c2c62f92339b ("quota: Change quotactl_path() systcall to an fd-based one")
65ffb3d69ed3da28 ("quota: Wire up quotactl_fd syscall")
That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# perf trace -v -e quota*
event qualifier tracepoint filter: (common_pid != 158365 && common_pid != 2512) && (id == 179 || id == 443)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep quota tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
179 common quotactl sys_quotactl
443 common quotactl_fd sys_quotactl_fd
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recently bperf was added to use BPF to count perf events for various
purposes. This is an extension for the approach and targetting to
cgroup usages.
Unlike the other bperf, it doesn't share the events with other
processes but it'd reduce unnecessary events (and the overhead of
multiplexing) for each monitored cgroup within the perf session.
When --for-each-cgroup is used with --bpf-counters, it will open
cgroup-switches event per cpu internally and attach the new BPF
program to read given perf_events and to aggregate the results for
cgroups. It's only called when task is switched to a task in a
different cgroup.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Current 'perf report' fails to process a pipe input when --task or
--stat options are used. This is because they reset all the tool
callbacks and fails to find a matching event for a sample.
When pipe input is used, the event info is passed via ATTR records so it
needs to handle that operation. Otherwise the following error occurs.
Note, -14 (= -EFAULT) comes from evlist__parse_sample():
# perf record -a -o- sleep 1 | perf report -i- --stat
Can't parse sample, err = -14
0x271044 [0x38]: failed to process type: 9
Error:
failed to process sample
#
Committer testing:
Before:
$ perf record -o- sleep 1 | perf report -i- --stat
Can't parse sample, err = -14
[ perf record: Woken up 1 times to write data ]
0x1350 [0x30]: failed to process type: 9
Error:
failed to process sample
[ perf record: Captured and wrote 0.000 MB - ]
$
After:
$ perf record -o- sleep 1 | perf report -i- --stat
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
Aggregated stats:
TOTAL events: 41
COMM events: 2 ( 4.9%)
EXIT events: 1 ( 2.4%)
SAMPLE events: 9 (22.0%)
MMAP2 events: 4 ( 9.8%)
ATTR events: 1 ( 2.4%)
FINISHED_ROUND events: 1 ( 2.4%)
THREAD_MAP events: 1 ( 2.4%)
CPU_MAP events: 1 ( 2.4%)
EVENT_UPDATE events: 1 ( 2.4%)
TIME_CONV events: 1 ( 2.4%)
FEATURE events: 19 (46.3%)
cycles:uhH stats:
SAMPLE events: 9
$
Fixes: a4a4d0a7a2b20f78 ("perf report: Add --stats option to display quick data statistics")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210630043058.1131295-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
ASan reports a memory leak caused by evlist not being deleted on exit in
perf-report, perf-script and perf-data.
The problem is caused by evlist->session not being deleted, which is
allocated in perf_session__read_header, called in perf_session__new if
perf_data is in read mode.
In case of write mode, the session->evlist is filled by the caller.
This patch solves the problem by calling evlist__delete in
perf_session__delete if perf_data is in read mode.
Changes in v2:
- call evlist__delete from within perf_session__delete
v1: https://lore.kernel.org/lkml/20210621234317.235545-1-rickyman7@gmail.com/
ASan report follows:
$ ./perf script report flamegraph
=================================================================
==227640==ERROR: LeakSanitizer: detected memory leaks
<SNIP unrelated>
Indirect leak of 2704 byte(s) in 1 object(s) allocated from:
#0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
#2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26
#3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20
#4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
#5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
#6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
#7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
#11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)
Indirect leak of 568 byte(s) in 1 object(s) allocated from:
#0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
#2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24
#3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9
#4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11
#5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
#6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
#7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
#8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
#12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)
Indirect leak of 264 byte(s) in 1 object(s) allocated from:
#0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
#2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23
#3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21
#4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
#5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
#6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
#7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
#8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
#12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)
Indirect leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
#2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14
#3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
#4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
#5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
#6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
#7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
#11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)
Indirect leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207)
#1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16
#2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3
#3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9
#4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9
#5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2
#6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
#7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
#8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
#9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
#13 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)
SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s).
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624231926.212208-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In perf annotate, when 's' is pressed on a line containing source code,
it shows the message "Only available for assembly lines".
This patch gets rid of the error, moving the cursr to the next available
asm line (or the closest previous one if no asm line is found moving
forwards), before hiding source code lines.
Changes in v2:
- handle case of no asm line found in
annotate_browser__find_next_asm_line by returning NULL and
handling error in caller.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624223423.189550-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a function, for use by dlfilters, to read object code.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a function, for use by dlfilters, to return the perf_event_attr
structure.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a function, for use by dlfilters, to return source code file name and
line number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a function, for use by dlfilters, to return instruction bytes.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a function, for use by dlfilters, to resolve addresses from branch
stacks or callchains.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Users of the --dlfilter option need to include perf_dlfilter.h
in their filters. Install it to the include path.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Zhihao sent a patch but it made llvm__compile_bpf() return what
asprintf() returns on error, which is just -1, but since this function
returns -errno, fix it by returning -ENOMEM for this case instead.
Fixes: cb76371441d098 ("perf llvm: Allow passing options to llc ...")
Fixes: 5eab5a7ee032ac ("perf llvm: Display eBPF compiling command ...")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210609115945.2193194-1-chengzhihao1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and
non-timeless mode starts decoding on the fist PERF_RECORD_AUX record.
This can cause the "data has no samples!" error if the first
PERF_RECORD_AUX record comes before the first (or any relevant)
PERF_RECORD_MMAP2 record because the mmaps are required by the decoder
to access the binary data.
This change pushes the start of non-timeless decoding to the very end of
parsing the file. The PERF_RECORD_EXIT event can't be used because it
might not exist in system-wide or snapshot modes.
I have not been able to find the exact cause for the events to be
intermittently in the wrong order in the basic scenario:
perf record -e cs_etm/@tmc_etr0/u top
But it can be made to happen every time with the --delay option. This is
because "enable_on_exec" is disabled, which causes tracing to start
before the process to be launched is exec'd. For example:
perf record -e cs_etm/@tmc_etr0/u --delay=1 top
perf report -D | grep 'AUX\|MAP'
0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []
0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 []
0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 []
4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
Another scenario in which decoding from the first aux record fails is a
workload that forks. Although the aux record comes after 'bash', it
comes before 'top', which is what we are interested in. For example:
perf record -e cs_etm/@tmc_etr0/u -- bash -c top
perf report -D | grep 'AUX\|MAP'
4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash
4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 []
...
0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 []
4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
A third scenario is when the majority of time is spent in a shared
library that is not loaded at startup. For example a dynamically loaded
plugin.
Testing
=======
Testing was done by checking if any samples that are present in the
old output are missing from the new output. Timestamps must be
stripped out with awk because now they are set to the last AUX sample,
rather than the first:
./perf script $4 | awk '!($4="")' > new.script
./perf-default script $4 | awk '!($4="")' > default.script
comm -13 <(sort -u new.script) <(sort -u default.script)
Testing showed that the new output is a superset of the old. When lines
appear in the comm output, it is not because they are missing but
because [unknown] is now resolved to sensible locations. For example
last putp branch here now resolves to libtinfo, so it's not missing
from the output, but is actually improved:
Old:
top 305 [001] 1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
top 305 [001] 1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
top 305 [001] 1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown])
New:
top 305 [001] 1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
top 305 [001] 1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
top 305 [001] 1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9)
In the following two modes, decoding now works and the "data has no
samples!" error is not displayed any more:
perf record -e cs_etm/@tmc_etr0/u -- bash -c top
perf record -e cs_etm/@tmc_etr0/u --delay=1 top
In snapshot mode, there is also an improvement to decoding. Previously
samples for the 'kill' process that was used to send SIGUSR2 were
completely missing, because the process hadn't started yet. But now
there are additional samples present:
perf record -e cs_etm/@tmc_etr0/u --snapshot -a
perf script
stress 19380 [003] 161627.938153: 1000000 instructions:uH: aaaabb612fb4 [unknown] (/usr/bin/stress)
kill 19644 [000] 161627.938153: 1000000 instructions:uH: ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so)
stress 19380 [003] 161627.938153: 1000000 instructions:uH: ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so)
Also tested was the round trip of 'perf inject' followed by 'perf
report' which has the same differences and improvements.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
perf event) for processing trace data, which is needless and even might
cause logic error, e.g. it might fail to correlate perf events with Arm
SPE events correctly.
So this patch removes the condition checking for PERF_RECORD_EXIT event.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It's possible that record in Arm SPE trace is later than perf event and
vice versa. This asks to correlate the perf events and Arm SPE
synthesized events to be processed in the manner of correct timing.
To achieve the time ordering, this patch reverses the flow, it firstly
calls arm_spe_sample() and then calls arm_spe_decode(). By comparing
the timestamp value and detect the perf event is coming earlier than Arm
SPE trace data, it bails out from the decoding loop, the last record is
pushed into auxtrace stack and is deferred to generate sample. To track
the timestamp, everytime it updates timestamp for the latest record.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In current code, it assigns the arch timer counter to the synthesized
samples Arm SPE trace, thus the samples don't contain the kernel time
but only contain the raw counter value.
To fix the issue, this patch converts the timer counter to kernel time
and assigns it to sample timestamp.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When handle a perf event, Arm SPE decoder needs to decide if this perf
event is earlier or later than the samples from Arm SPE trace data; to
do comparision, it needs to use the same unit for the time.
This patch converts the event kernel time to arch timer's counter value,
thus it can be used to compare with counter value contained in Arm SPE
Timestamp packet.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
During the recording phase, "perf record" tool synthesizes event
PERF_RECORD_TIME_CONV for the hardware clock parameters and saves the
event into the data file.
Afterwards, when processing the data file, the event TIME_CONV will be
processed at the very early time and is stored into session context.
This patch extracts these parameters from the session context and saves
into the structure "spe->tc" with the type perf_tsc_conversion, so that
the parameters are ready for conversion between clock counter and time
stamp.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The callback cs_etm_find_snapshot() is invoked for snapshot mode, its
main purpose is to find the correct AUX trace data and returns "head"
and "old" (we can call "old" as "old head") to the caller, the caller
__auxtrace_mmap__read() uses these two pointers to decide the AUX trace
data size.
This patch removes cs_etm_find_snapshot() with below reasons:
- The first thing in cs_etm_find_snapshot() is to check if the head has
wrapped around, if it is not, directly bails out. The checking is
pointless, this is because the "head" and "old" pointers both are
monotonical increasing so they never wrap around.
- cs_etm_find_snapshot() adjusts the "head" and "old" pointers and
assumes the AUX ring buffer is fully filled with the hardware trace
data, so it always subtracts the difference "mm->len" from "head" to
get "old". Let's imagine the snapshot is taken in very short
interval, the tracers only fill a small chunk of the trace data into
the AUX ring buffer, in this case, it's wrongly to copy the whole the
AUX ring buffer to perf file.
- As the "head" and "old" pointers are monotonically increased, the
function __auxtrace_mmap__read() handles these two pointers properly.
It calculates the reminders for these two pointers, and the size is
clamped to be never more than "snapshot_size". We can simply reply on
the function __auxtrace_mmap__read() to calculate the correct result
for data copying, it's not necessary to add Arm CoreSight specific
callback.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20210701093537.90759-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some helper functions will be used for cgroup counting too. Move them
to a header file for sharing.
Committer notes:
Fix the build on older systems with:
- struct bpf_map_info map_info = {0};
+ struct bpf_map_info map_info = { .id = 0, };
This wasn't breaking the build in such systems as bpf_counter.c isn't
built due to:
tools/perf/util/Build:
perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
The bpf_counter.h file on the other hand is included from places that
are built everywhere.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The cgroup_is_v2() is to check if the given subsystem is mounted on
cgroup v2 or not. It'll be used by BPF cgroup code later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The read_cgroup_id() is to read a cgroup id from a file handle using
name_to_handle_at(2) for the given cgroup. It'll be used by bperf
cgroup stat later.
Committer notes:
-int read_cgroup_id(struct cgroup *cgrp)
+static inline int read_cgroup_id(struct cgroup *cgrp __maybe_unused)
To fix the build when HAVE_FILE_HANDLE is not defined.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Added callback option (-G) to support cgroups for 'perf top'.
Added condition to make sure -cgroup and --all-cgroups aren't both enabled.
Example:
$perf top -e cycles -G system.slice/docker-6b95a5eb649c0d671eba3835f0d93973d05a088f3ae8602246bde37affb1ba3e.scope -a --stdio
PerfTop: 3330 irqs/sec kernel:68.2% exact: 0.0% lost: 0/0 drop: 0/11075 [4000Hz cpu-clock], (all, 4 CPUs)
-------------------------------------------------------------------------------------------------------------------------------------------------------
27.32% [unknown] [.] 0x00007f8ab7b69352
11.44% [kernel] [k] 0xffffffff968cd657
3.12% [kernel] [k] 0xffffffff96160e96
2.63% [kernel] [k] 0xffffffff96160eb0
1.96% [kernel] [k] 0xffffffff9615fcf6
1.42% [kernel] [k] 0xffffffff964ddfc7
1.09% [kernel] [k] 0xffffffff96160e90
0.81% [kernel] [k] 0xffffffff96160eb3
0.67% [kernel] [k] 0xffffffff9615fec1
0.57% [kernel] [k] 0xffffffff961ee1d0
0.53% [unknown] [.] 0x00007f8ab7b6666c
0.53% [kernel] [k] 0xffffffff96160e64
0.52% [kernel] [k] 0xffffffff9616c303
0.51% [kernel] [k] 0xffffffffc08e7d50
...
Signed-off-by: Joshua Martinez <joshuamart@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: joshua martinez <joshuamart@google.com>
Link: http://lore.kernel.org/lkml/20210616231829.3735671-1-joshuamart@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Share the addr_location of 'addr' so that it need not be resolved more than
once.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To make it possible to use filtering with scripts, move filtering before
scripting.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Generally, it should be more efficient if filter_cpu() comes before
machine__resolve() because filter_cpu() is much less code than
machine__resolve().
Example:
$ perf record --sample-cpu -- make -C tools/perf >/dev/null
Before:
$ perf stat -- perf script -C 0 >/dev/null
Performance counter stats for 'perf script -C 0':
116.94 msec task-clock # 0.992 CPUs utilized
2 context-switches # 17.103 /sec
0 cpu-migrations # 0.000 /sec
8,187 page-faults # 70.011 K/sec
478,351,812 cycles # 4.091 GHz
564,785,464 instructions # 1.18 insn per cycle
114,341,105 branches # 977.789 M/sec
2,615,495 branch-misses # 2.29% of all branches
0.117840576 seconds time elapsed
0.085040000 seconds user
0.032396000 seconds sys
After:
$ perf stat -- perf script -C 0 >/dev/null
Performance counter stats for 'perf script -C 0':
107.45 msec task-clock # 0.992 CPUs utilized
3 context-switches # 27.919 /sec
0 cpu-migrations # 0.000 /sec
7,964 page-faults # 74.117 K/sec
438,417,260 cycles # 4.080 GHz
522,571,855 instructions # 1.19 insn per cycle
105,187,488 branches # 978.921 M/sec
2,356,261 branch-misses # 2.24% of all branches
0.108282546 seconds time elapsed
0.095935000 seconds user
0.011991000 seconds sys
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Having a verbose option will allow shell tests to provide extra failure
details when the fail or skip.
Committer notes:
Keep the 'script' variable at PATH_MAX, as its just something we'll pass
to system(), not really a "path", so being arbitrary, reduce the patch
size by not adding the three extra bytes to the 'script' variable.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes, since perf/urgent is already upstream.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick the changes in:
ea6932d70e223e02 ("net: make get_net_ns return error if NET_NS is disabled")
That don't result in any changes in the tables generated from that
header.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
Cc: Changbin Du <changbin.du@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
$(( .. )) is a bash feature but the test's interpreter is !/bin/sh,
switch the code to use expr.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
ASan reported a memory leak of BPF-related ksymbols map and dso. The
leak is caused by refount never reaching 0, due to missing __put calls
in the function machine__process_ksymbol_register.
Once the dso is inserted in the map, dso__put() should be called
(map__new2() increases the refcount to 2).
The same thing applies for the map when it's inserted into maps
(maps__insert() increases the refcount to 2).
$ sudo ./perf record -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]
=================================================================
==297735==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 6992 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
#2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
[...]
Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
#2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
#3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
[...]
Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
[...]
Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
[...]
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lore.kernel.org/lkml/20210612173751.188582-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
metricgroup__add_metric_sys_event_iter()
The error code is not set at all in the sys event iter function.
This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.
Fix by properly setting the error code.
It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.
However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.
Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The following command segfaults on my x86 broadwell:
$ ./perf stat -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
WARNING: grouped events cpus do not match, disabling group:
anon group { raw 0x10e }
anon group { raw 0x10e }
perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
Aborted (core dumped)
The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted (yet we
still attempt to verify for an evsel).
Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
duration_time").
The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however the logic in find_evsel_group() may set an
event matched against "cycles" as used, when it should not be.
This leads to a condition where an evsel is set, yet its leader is not.
Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.
There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/
Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a Thinkpad T450S
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now the boot-time tracing supports kprobes events and that must be
written in bootconfig file in the following format.
ftrace.event.kprobes.<EVENT_NAME>.probes = <PROBE-DEF>
'perf probe' already supports --definition (-D) action to show probe
definitions, but the format is for tracefs:
[p|r][:EVENT_NAME] <PROBE-DEF>
This patch adds the --bootconfig option for -D action so that it outputs
the probe definitions in bootconfig format. E.g.
$ perf probe --bootconfig -D "path_lookupat:7 err:s32 s:string"
ftrace.event.kprobes.path_lookupat_L7.probe = 'path_lookupat.isra.0+309 err_s32=%ax:s32 s_string=+0(%r13):string'
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282412351.452340.14871995440005640114.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cleanup synthesize_probe_trace_command() to simplify the code path.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282411361.452340.16886399333622147122.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf probe' internally checks the probe target is in the text area in
post-process (after analyzing debuginfo). But it fails if the probe
target is in the "inittext".
This is a good limitation for the online kernel because such functions
have gone after booting. However, for using it for boot-time tracing,
user may want to put a probe on init functions.
This skips the post checking process if the target is offline kenrel so
that user can get the probe definition on the init functions.
Without this patch:
$ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
Probe point 'do_mount_root:10' not found.
Error: Failed to add events.
With this patch:
$ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
p:probe/do_mount_root_L10 mount_block_root+300
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282410293.452340.13347006295826431632.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If the test is run on a hypervisor then the cycles event may not be
counted, skip the test in this situation. Fail the test if cycles are
not counted in the subsequent bpf counter run.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Provide additional context for when the stat bpf counters test skips.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "auxtrace_info" and "auxtrace" functions are not set in "tool" member of
"annotate". As a result, perf annotate does not support parsing itrace data.
Before:
# perf record -e arm_spe_0/branch_filter=1/ -a sleep 1
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 20.874 MB perf.data ]
# perf annotate --stdio
Error:
The perf.data data has no samples!
Solution:
1. Add itrace options in help,
2. Set hook functions of "id_index", "auxtrace_info" and "auxtrace" in perf_tool.
After:
# perf record --all-user -e arm_spe_0/branch_filter=1/ ls
Couldn't synthesize bpf events.
perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.010 MB perf.data ]
# perf annotate --stdio
Percent | Source code & Disassembly of libc-2.28.so for branch-miss (1 samples, percent: local period)
------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000066180 <__getdelim@@GLIBC_2.17>:
0.00 : 66180: stp x29, x30, [sp, #-96]!
0.00 : 66184: cmp x0, #0x0
0.00 : 66188: ccmp x1, #0x0, #0x4, ne // ne = any
0.00 : 6618c: mov x29, sp
0.00 : 66190: stp x24, x25, [sp, #56]
0.00 : 66194: stp x26, x27, [sp, #72]
0.00 : 66198: str x28, [sp, #88]
0.00 : 6619c: b.eq 66450 <__getdelim@@GLIBC_2.17+0x2d0> // b.none
0.00 : 661a0: stp x22, x23, [x29, #40]
0.00 : 661a4: mov x22, x1
0.00 : 661a8: ldr w1, [x3]
0.00 : 661ac: mov w23, w2
0.00 : 661b0: stp x20, x21, [x29, #24]
0.00 : 661b4: mov x20, x3
0.00 : 661b8: mov x21, x0
0.00 : 661bc: tbnz w1, #15, 66360 <__getdelim@@GLIBC_2.17+0x1e0>
0.00 : 661c0: ldr x0, [x3, #136]
0.00 : 661c4: ldr x2, [x0, #8]
0.00 : 661c8: str x19, [x29, #16]
0.00 : 661cc: mrs x19, tpidr_el0
0.00 : 661d0: sub x19, x19, #0x700
0.00 : 661d4: cmp x2, x19
0.00 : 661d8: b.eq 663f0 <__getdelim@@GLIBC_2.17+0x270> // b.none
0.00 : 661dc: mov w1, #0x1 // #1
0.00 : 661e0: ldaxr w2, [x0]
0.00 : 661e4: cmp w2, #0x0
0.00 : 661e8: b.ne 661f4 <__getdelim@@GLIBC_2.17+0x74> // b.any
0.00 : 661ec: stxr w3, w1, [x0]
0.00 : 661f0: cbnz w3, 661e0 <__getdelim@@GLIBC_2.17+0x60>
0.00 : 661f4: b.ne 66448 <__getdelim@@GLIBC_2.17+0x2c8> // b.any
0.00 : 661f8: ldr x0, [x20, #136]
0.00 : 661fc: ldr w1, [x20]
0.00 : 66200: ldr w2, [x0, #4]
0.00 : 66204: str x19, [x0, #8]
0.00 : 66208: add w2, w2, #0x1
0.00 : 6620c: str w2, [x0, #4]
0.00 : 66210: tbnz w1, #5, 66388 <__getdelim@@GLIBC_2.17+0x208>
0.00 : 66214: ldr x19, [x29, #16]
0.00 : 66218: ldr x0, [x21]
0.00 : 6621c: cbz x0, 66228 <__getdelim@@GLIBC_2.17+0xa8>
0.00 : 66220: ldr x0, [x22]
0.00 : 66224: cbnz x0, 6623c <__getdelim@@GLIBC_2.17+0xbc>
0.00 : 66228: mov x0, #0x78 // #120
0.00 : 6622c: str x0, [x22]
0.00 : 66230: bl 20710 <malloc@plt>
0.00 : 66234: str x0, [x21]
0.00 : 66238: cbz x0, 66428 <__getdelim@@GLIBC_2.17+0x2a8>
0.00 : 6623c: ldr x27, [x20, #8]
0.00 : 66240: str x19, [x29, #16]
0.00 : 66244: ldr x19, [x20, #16]
0.00 : 66248: sub x19, x19, x27
0.00 : 6624c: cmp x19, #0x0
0.00 : 66250: b.le 66398 <__getdelim@@GLIBC_2.17+0x218>
0.00 : 66254: mov x25, #0x0 // #0
0.00 : 66258: b 662d8 <__getdelim@@GLIBC_2.17+0x158>
0.00 : 6625c: nop
0.00 : 66260: add x24, x19, x25
0.00 : 66264: ldr x3, [x22]
0.00 : 66268: add x26, x24, #0x1
0.00 : 6626c: ldr x0, [x21]
0.00 : 66270: cmp x3, x26
0.00 : 66274: b.cs 6629c <__getdelim@@GLIBC_2.17+0x11c> // b.hs, b.nlast
0.00 : 66278: lsl x3, x3, #1
0.00 : 6627c: cmp x3, x26
0.00 : 66280: csel x26, x3, x26, cs // cs = hs, nlast
0.00 : 66284: mov x1, x26
0.00 : 66288: bl 206f0 <realloc@plt>
0.00 : 6628c: cbz x0, 66438 <__getdelim@@GLIBC_2.17+0x2b8>
0.00 : 66290: str x0, [x21]
0.00 : 66294: ldr x27, [x20, #8]
0.00 : 66298: str x26, [x22]
0.00 : 6629c: mov x2, x19
0.00 : 662a0: mov x1, x27
0.00 : 662a4: add x0, x0, x25
0.00 : 662a8: bl 87390 <explicit_bzero@@GLIBC_2.25+0x50>
0.00 : 662ac: ldr x0, [x20, #8]
0.00 : 662b0: add x19, x0, x19
0.00 : 662b4: str x19, [x20, #8]
0.00 : 662b8: cbnz x28, 66410 <__getdelim@@GLIBC_2.17+0x290>
0.00 : 662bc: mov x0, x20
0.00 : 662c0: bl 73b80 <__underflow@@GLIBC_2.17>
0.00 : 662c4: cmn w0, #0x1
0.00 : 662c8: b.eq 66410 <__getdelim@@GLIBC_2.17+0x290> // b.none
0.00 : 662cc: ldp x27, x19, [x20, #8]
0.00 : 662d0: mov x25, x24
0.00 : 662d4: sub x19, x19, x27
0.00 : 662d8: mov x2, x19
0.00 : 662dc: mov w1, w23
0.00 : 662e0: mov x0, x27
0.00 : 662e4: bl 807b0 <memchr@@GLIBC_2.17>
0.00 : 662e8: cmp x0, #0x0
0.00 : 662ec: mov x28, x0
0.00 : 662f0: sub x0, x0, x27
0.00 : 662f4: csinc x19, x19, x0, eq // eq = none
0.00 : 662f8: mov x0, #0x7fffffffffffffff // #9223372036854775807
0.00 : 662fc: sub x0, x0, x25
0.00 : 66300: cmp x19, x0
0.00 : 66304: b.lt 66260 <__getdelim@@GLIBC_2.17+0xe0> // b.tstop
0.00 : 66308: adrp x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
0.00 : 6630c: ldr x0, [x0, #3624]
0.00 : 66310: mrs x2, tpidr_el0
0.00 : 66314: ldr x19, [x29, #16]
0.00 : 66318: mov w3, #0x4b // #75
0.00 : 6631c: ldr w1, [x20]
0.00 : 66320: mov x24, #0xffffffffffffffff // #-1
0.00 : 66324: str w3, [x2, x0]
0.00 : 66328: tbnz w1, #15, 66340 <__getdelim@@GLIBC_2.17+0x1c0>
0.00 : 6632c: ldr x0, [x20, #136]
0.00 : 66330: ldr w1, [x0, #4]
0.00 : 66334: sub w1, w1, #0x1
0.00 : 66338: str w1, [x0, #4]
0.00 : 6633c: cbz w1, 663b8 <__getdelim@@GLIBC_2.17+0x238>
0.00 : 66340: mov x0, x24
0.00 : 66344: ldr x28, [sp, #88]
0.00 : 66348: ldp x20, x21, [x29, #24]
0.00 : 6634c: ldp x22, x23, [x29, #40]
0.00 : 66350: ldp x24, x25, [sp, #56]
0.00 : 66354: ldp x26, x27, [sp, #72]
0.00 : 66358: ldp x29, x30, [sp], #96
0.00 : 6635c: ret
100.00 : 66360: tbz w1, #5, 66218 <__getdelim@@GLIBC_2.17+0x98>
0.00 : 66364: ldp x20, x21, [x29, #24]
0.00 : 66368: mov x24, #0xffffffffffffffff // #-1
0.00 : 6636c: ldp x22, x23, [x29, #40]
0.00 : 66370: mov x0, x24
0.00 : 66374: ldp x24, x25, [sp, #56]
0.00 : 66378: ldp x26, x27, [sp, #72]
0.00 : 6637c: ldr x28, [sp, #88]
0.00 : 66380: ldp x29, x30, [sp], #96
0.00 : 66384: ret
0.00 : 66388: mov x24, #0xffffffffffffffff // #-1
0.00 : 6638c: ldr x19, [x29, #16]
0.00 : 66390: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66394: nop
0.00 : 66398: mov x0, x20
0.00 : 6639c: bl 73b80 <__underflow@@GLIBC_2.17>
0.00 : 663a0: cmn w0, #0x1
0.00 : 663a4: b.eq 66438 <__getdelim@@GLIBC_2.17+0x2b8> // b.none
0.00 : 663a8: ldp x27, x19, [x20, #8]
0.00 : 663ac: sub x19, x19, x27
0.00 : 663b0: b 66254 <__getdelim@@GLIBC_2.17+0xd4>
0.00 : 663b4: nop
0.00 : 663b8: str xzr, [x0, #8]
0.00 : 663bc: ldxr w2, [x0]
0.00 : 663c0: stlxr w3, w1, [x0]
0.00 : 663c4: cbnz w3, 663bc <__getdelim@@GLIBC_2.17+0x23c>
0.00 : 663c8: cmp w2, #0x1
0.00 : 663cc: b.le 66340 <__getdelim@@GLIBC_2.17+0x1c0>
0.00 : 663d0: mov x1, #0x81 // #129
0.00 : 663d4: mov x2, #0x1 // #1
0.00 : 663d8: mov x3, #0x0 // #0
0.00 : 663dc: mov x8, #0x62 // #98
0.00 : 663e0: svc #0x0
0.00 : 663e4: ldp x20, x21, [x29, #24]
0.00 : 663e8: ldp x22, x23, [x29, #40]
0.00 : 663ec: b 66370 <__getdelim@@GLIBC_2.17+0x1f0>
0.00 : 663f0: ldr w2, [x0, #4]
0.00 : 663f4: add w2, w2, #0x1
0.00 : 663f8: str w2, [x0, #4]
0.00 : 663fc: tbz w1, #5, 66214 <__getdelim@@GLIBC_2.17+0x94>
0.00 : 66400: mov x24, #0xffffffffffffffff // #-1
0.00 : 66404: ldr x19, [x29, #16]
0.00 : 66408: b 66330 <__getdelim@@GLIBC_2.17+0x1b0>
0.00 : 6640c: nop
0.00 : 66410: ldr x0, [x21]
0.00 : 66414: strb wzr, [x0, x24]
0.00 : 66418: ldr w1, [x20]
0.00 : 6641c: ldr x19, [x29, #16]
0.00 : 66420: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66424: nop
0.00 : 66428: mov x24, #0xffffffffffffffff // #-1
0.00 : 6642c: ldr w1, [x20]
0.00 : 66430: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66434: nop
0.00 : 66438: mov x24, #0xffffffffffffffff // #-1
0.00 : 6643c: ldr w1, [x20]
0.00 : 66440: ldr x19, [x29, #16]
0.00 : 66444: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66448: bl e3ba0 <pthread_setcanceltype@@GLIBC_2.17+0x30>
0.00 : 6644c: b 661f8 <__getdelim@@GLIBC_2.17+0x78>
0.00 : 66450: adrp x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
0.00 : 66454: ldr x0, [x0, #3624]
0.00 : 66458: mrs x1, tpidr_el0
0.00 : 6645c: mov w2, #0x16 // #22
0.00 : 66460: mov x24, #0xffffffffffffffff // #-1
0.00 : 66464: str w2, [x1, x0]
0.00 : 66468: b 66370 <__getdelim@@GLIBC_2.17+0x1f0>
0.00 : 6646c: ldr w1, [x20]
0.00 : 66470: mov x4, x0
0.00 : 66474: tbnz w1, #15, 6648c <__getdelim@@GLIBC_2.17+0x30c>
0.00 : 66478: ldr x0, [x20, #136]
0.00 : 6647c: ldr w1, [x0, #4]
0.00 : 66480: sub w1, w1, #0x1
0.00 : 66484: str w1, [x0, #4]
0.00 : 66488: cbz w1, 66494 <__getdelim@@GLIBC_2.17+0x314>
0.00 : 6648c: mov x0, x4
0.00 : 66490: bl 20e40 <gnu_get_libc_version@@GLIBC_2.17+0x130>
0.00 : 66494: str xzr, [x0, #8]
0.00 : 66498: ldxr w2, [x0]
0.00 : 6649c: stlxr w3, w1, [x0]
0.00 : 664a0: cbnz w3, 66498 <__getdelim@@GLIBC_2.17+0x318>
0.00 : 664a4: cmp w2, #0x1
0.00 : 664a8: b.le 6648c <__getdelim@@GLIBC_2.17+0x30c>
0.00 : 664ac: mov x1, #0x81 // #129
0.00 : 664b0: mov x2, #0x1 // #1
0.00 : 664b4: mov x3, #0x0 // #0
0.00 : 664b8: mov x8, #0x62 // #98
0.00 : 664bc: svc #0x0
0.00 : 664c0: b 6648c <__getdelim@@GLIBC_2.17+0x30c>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210615091704.259202-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove duplicate '#undef E'.
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210616120339.219807-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When peeking an event, it has a short path and a long path. The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".
The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten. We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.
This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.
Fixes: 5a52f33adf02 ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A group mixed with hybrid event and global event is allowed. For
example, group leader is 'intel_pt//' and the group member is
'cpu_atom/cycles/'.
e.g.:
# perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u'
The challenge is that their available cpus are not fully matched. For
example, 'intel_pt//' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
is available on CPU16-CPU23.
When getting the group id for group member, we must be very careful.
Because the cpu for 'intel_pt//' is not equal to the cpu for
'cpu_atom/cycles/'. Actually the cpu here is the index of evsel->core.cpus,
not the real CPU ID.
e.g. cpu0 for 'intel_pt//' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
Before:
# perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
...
------------------------------------------------------------
perf_event_attr:
type 10
size 128
config 0xe601
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|IDENTIFIER
read_format ID
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
enable_on_exec 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 4084 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 4084 cpu 1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid 4084 cpu 2 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid 4084 cpu 3 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid 4084 cpu 4 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid 4084 cpu 5 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid 4084 cpu 6 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid 4084 cpu 7 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid 4084 cpu 8 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid 4084 cpu 9 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid 4084 cpu 10 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid 4084 cpu 11 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid 4084 cpu 12 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid 4084 cpu 13 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid 4084 cpu 14 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid 4084 cpu 15 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid 4084 cpu 16 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid 4084 cpu 17 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid 4084 cpu 18 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid 4084 cpu 19 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid 4084 cpu 20 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid 4084 cpu 21 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid 4084 cpu 22 group_fd -1 flags 0x8 = 28
sys_perf_event_open: pid 4084 cpu 23 group_fd -1 flags 0x8 = 29
------------------------------------------------------------
perf_event_attr:
size 128
config 0x800000000
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD|IDENTIFIER|AUX
read_format ID
inherit 1
exclude_kernel 1
exclude_hv 1
freq 1
sample_id_all 1
exclude_guest 1
aux_sample_size 4096
------------------------------------------------------------
sys_perf_event_open: pid 4084 cpu 16 group_fd 5 flags 0x8
sys_perf_event_open failed, error -22
The group_fd 5 is not correct. It should be 22 (the fd of
'intel_pt' on CPU16).
After:
# perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
...
------------------------------------------------------------
perf_event_attr:
type 10
size 128
config 0xe601
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|IDENTIFIER
read_format ID
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
enable_on_exec 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 5162 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 5162 cpu 1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid 5162 cpu 2 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid 5162 cpu 3 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid 5162 cpu 4 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid 5162 cpu 5 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid 5162 cpu 6 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid 5162 cpu 7 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid 5162 cpu 8 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid 5162 cpu 9 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid 5162 cpu 10 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid 5162 cpu 11 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid 5162 cpu 12 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid 5162 cpu 13 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid 5162 cpu 14 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid 5162 cpu 15 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid 5162 cpu 16 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid 5162 cpu 17 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid 5162 cpu 18 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid 5162 cpu 19 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid 5162 cpu 20 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid 5162 cpu 21 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid 5162 cpu 22 group_fd -1 flags 0x8 = 28
sys_perf_event_open: pid 5162 cpu 23 group_fd -1 flags 0x8 = 29
------------------------------------------------------------
perf_event_attr:
size 128
config 0x800000000
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD|IDENTIFIER|AUX
read_format ID
inherit 1
exclude_kernel 1
exclude_hv 1
freq 1
sample_id_all 1
exclude_guest 1
aux_sample_size 4096
------------------------------------------------------------
sys_perf_event_open: pid 5162 cpu 16 group_fd 22 flags 0x8 = 30
sys_perf_event_open: pid 5162 cpu 17 group_fd 23 flags 0x8 = 31
sys_perf_event_open: pid 5162 cpu 18 group_fd 24 flags 0x8 = 32
sys_perf_event_open: pid 5162 cpu 19 group_fd 25 flags 0x8 = 33
sys_perf_event_open: pid 5162 cpu 20 group_fd 26 flags 0x8 = 34
sys_perf_event_open: pid 5162 cpu 21 group_fd 27 flags 0x8 = 35
sys_perf_event_open: pid 5162 cpu 22 group_fd 28 flags 0x8 = 36
sys_perf_event_open: pid 5162 cpu 23 group_fd 29 flags 0x8 = 37
------------------------------------------------------------
...
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210609044555.27180-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Report permission error for the tracefs open and rewrite whole the error
message code around it.
You'll see a hint according to what you want to do with perf probe as
below.
$ perf probe -l
No permission to read tracefs.
Please try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'
Error: Failed to show event list.
$ perf probe -d \*
No permission to write tracefs.
Please run this command again with sudo.
Error: Failed to delete events.
This also fixes -ENOTSUP checking for mounting tracefs/debugfs.
Actually open returns -ENOENT in that case and we have to check it with
current mount point list. If we unmount debugfs and tracefs perf probe
shows correct message as below.
$ perf probe -l
Debugfs or tracefs is not mounted
Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'
Error: Failed to show event list.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162299456839.503471.13863002017089255222.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|