|
There's an issue with using threads::last_match in multithreaded mode,
which is enabled during the perf top synthesis. It might crash with
the following assertion:
perf: ...include/linux/refcount.h:109: refcount_inc:
Assertion `!(!refcount_inc_not_zero(r))' failed.
The gdb backtrace looks like this:
0x00007ffff50839fb in raise () from /lib64/libc.so.6
(gdb)
#0 0x00007ffff50839fb in raise () from /lib64/libc.so.6
#1 0x00007ffff5085800 in abort () from /lib64/libc.so.6
#2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70)
at ...include/linux/refcount.h:109
#5 0x0000000000536771 in thread__get (thread=0x7fffe8009a40)
at util/thread.c:115
#6 0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38,
threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432
#7 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
pid=2, tid=2) at util/machine.c:489
#8 0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38,
pid=2, tid=2) at util/machine.c:499
#9 0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38,
...
The failing assertion is this one:
REFCOUNT_WARN(!refcount_inc_not_zero(r), ...
The problem is that we don't serialize access to threads::last_match.
We serialize access to the threads tree, but not to
threads::last_match. Both the locked and unlocked paths read that
pointer and can set it. In multithreaded mode we can end up with an
invalid object in the thread__get call, as in the following race:
thread 1
...
machine__findnew_thread
down_write(&threads->lock);
__machine__findnew_thread
____machine__findnew_thread
th = threads->last_match;
if (th->tid == tid) {
thread__get
thread 2
...
machine__find_thread
down_read(&threads->lock);
__machine__findnew_thread
____machine__findnew_thread
th = threads->last_match;
if (th->tid == tid) {
thread__get
thread 3
...
machine__process_fork_event
machine__remove_thread
__machine__remove_thread
threads->last_match = NULL
thread__put
thread__put
Threads 1 and 2 might get a stale last_match before thread 3 clears
it. Threads 1 and 2 then race with thread 3's thread__put and
might trigger the refcnt == 0 assertion above.
The patch disables the last_match cache in multithreaded mode. The
cache was originally meant for single-threaded scenarios, where it's
common to have multiple sequential searches for the same thread.
In multithreaded mode this does not make sense, because top's threads
process different /proc entries and so the 'struct threads' object
is queried for various threads. Moreover we'd need to add more locks
to make it work.
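As a rough illustration of the approach described above (not the actual
perf code), the cache read and write can be routed through two small
helpers that do nothing unless the tool runs single-threaded. The names
thread_entry, thread_cache, cache_singlethreaded, cache_get_last_match
and cache_set_last_match are hypothetical stand-ins chosen for this
sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's thread bookkeeping. */
struct thread_entry {
	int tid;
	int refcnt;
};

struct thread_cache {
	struct thread_entry *last_match;
};

/* Assumed global: true only while a single thread touches the cache. */
bool cache_singlethreaded = true;

/* Take a reference; the object must still be alive (refcnt > 0). */
struct thread_entry *entry_get(struct thread_entry *th)
{
	assert(th->refcnt > 0);
	th->refcnt++;
	return th;
}

/* Return a referenced cached entry for tid, or NULL on a miss or when disabled. */
struct thread_entry *cache_get_last_match(struct thread_cache *c, int tid)
{
	struct thread_entry *th;

	if (!cache_singlethreaded)
		return NULL;

	th = c->last_match;
	if (th != NULL && th->tid == tid)
		return entry_get(th);

	return NULL;
}

/* Remember th as the most recent lookup, but only in single-threaded mode. */
void cache_set_last_match(struct thread_cache *c, struct thread_entry *th)
{
	if (cache_singlethreaded)
		c->last_match = th;
}
```

With the flag cleared, every lookup falls through to the lock-protected
tree, so no extra locking is needed around the cached pointer.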
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Separate the threads::last_match cache set into a separate
threads__set_last_match function. This will be useful in the following
patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Separate the threads::last_match cache read/check into a separate
threads__get_last_match function. This will be useful in the following
patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Stephane reported that pipe mode does not carry the group information
and thus the piped report won't display the grouped output for the following
command:
# perf record -e '{cycles,instructions,branches}' -a sleep 4 | perf report
It has no idea about the group setup, so it will display events
separately:
# Overhead Command Shared Object ...
# ........ ............... .......................
#
6.71% swapper [kernel.kallsyms]
2.28% offlineimap libpython2.7.so.1.0
0.78% perf [kernel.kallsyms]
...
Fix GROUP_DESC feature record to be synthesized in pipe mode, so the
report output is grouped if there are groups defined in record:
# Overhead Command Shared ...
# ........................ ............... .......
#
7.57% 0.16% 0.30% swapper [kernel
1.87% 3.15% 2.46% offlineimap libpyth
1.33% 0.00% 0.00% perf [kernel
...
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180712135202.14774-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When perf.data is recorded with the dwarf call-graph option, the
callchain shown by 'perf script' still shows the binary offsets of the
userspace symbols instead of their virtual addresses. Since the symbol
offset calculation is based on using virtual address as the ip, we see
incorrect offsets as well.
The use of virtual addresses affects the ability to find the
line number in the corresponding source file to which an address
maps, as described in commit 67540759151a ("perf unwind: Use
addr_location::addr instead of ip for entries").
This has also been addressed by temporarily converting the virtual
address to the corresponding binary offset so that it can be mapped
to the source line number correctly.
This is a follow-up for commit 19610184693c ("perf script: Show
virtual addresses instead of offsets").
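The conversion between a virtual address and a binary offset mentioned
above can be sketched as follows. This is a simplified model, not perf's
own code: the struct and function names are hypothetical, and the
mapping base used in the usage note is inferred from the "After" output
of this commit, assuming a page offset of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one mapped DSO region. */
struct map_sketch {
	uint64_t start;  /* virtual address where the mapping begins */
	uint64_t pgoff;  /* file offset that corresponds to 'start' */
};

/* Virtual address -> offset within the binary (used for srcline lookup). */
uint64_t map_ip(const struct map_sketch *m, uint64_t addr)
{
	assert(addr >= m->start);
	return addr - m->start + m->pgoff;
}

/* Offset within the binary -> virtual address (what is shown to the user). */
uint64_t unmap_ip(const struct map_sketch *m, uint64_t off)
{
	assert(off >= m->pgoff);
	return off + m->start - m->pgoff;
}

/* Symbol offset: both operands must be in the same address space. */
uint64_t sym_offset(const struct map_sketch *m, uint64_t addr,
		    uint64_t sym_start_off)
{
	return map_ip(m, addr) - sym_start_off;
}
```

With an assumed base of 0x7fffb3750000, map_ip() turns 0x7fffb38aaf28
into 0x15af28, and subtracting a symbol start of 0x15af20 gives the
+0x8 shown for __GI___inet_pton. Mixing the two address spaces in that
subtraction is what produces nonsensical offsets.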
This can be verified on a powerpc64le system running Fedora 27 as
shown below:
# perf probe -x /usr/lib64/libc-2.26.so -a inet_pton
# perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1
Before:
# perf report --stdio --no-children -s sym,srcline -g address
# Samples: 1 of event 'probe_libc:inet_pton'
# Event count (approx.): 1
#
# Overhead Symbol Source:Line
# ........ .................... ...........
#
100.00% [.] __GI___inet_pton inet_pton.c
|
---gaih_inet getaddrinfo.c:537 (inlined)
__GI_getaddrinfo getaddrinfo.c:2304 (inlined)
main ping.c:519
generic_start_main libc-start.c:308 (inlined)
__libc_start_main libc-start.c:102
...
# perf script -F comm,ip,sym,symoff,srcline,dso
ping
15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so)
libc-2.26.so[ffff80004ca0af28]
10fa53 gaih_inet+0xffff000099160f43
libc-2.26.so[ffff80004c9bfa53] (inlined)
1105b3 __GI_getaddrinfo+0xffff000099160163
libc-2.26.so[ffff80004c9c05b3] (inlined)
2d6f main+0xfffffffd9f1003df (/usr/bin/ping)
ping[fffffffecf882d6f]
2369f generic_start_main+0xffff00009916013f
libc-2.26.so[ffff80004c8d369f] (inlined)
23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so)
libc-2.26.so[ffff80004c8d3897]
After:
# perf report --stdio --no-children -s sym,srcline -g address
# Samples: 1 of event 'probe_libc:inet_pton'
# Event count (approx.): 1
#
# Overhead Symbol Source:Line
# ........ .................... ...........
#
100.00% [.] __GI___inet_pton inet_pton.c
|
---gaih_inet.constprop.7 getaddrinfo.c:537
getaddrinfo getaddrinfo.c:2304
main ping.c:519
generic_start_main.isra.0 libc-start.c:308
__libc_start_main libc-start.c:102
...
# perf script -F comm,ip,sym,symoff,srcline,dso
ping
7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
inet_pton.c:68
7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so)
getaddrinfo.c:537
7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so)
getaddrinfo.c:2304
130782d6f main+0x3df (/usr/bin/ping)
ping.c:519
7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so)
libc-start.c:308
7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so)
libc-start.c:102
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries")
Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.
It also enables users to specify wildcards, for example, perf trace -e
'open*', just like was already possible on x86, s390, and powerpc, which
means arm64 can now pass the "Check open filename arg using perf trace +
vfs_getname" test.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180706163454.f714b9ab49ecc8566a0b3565@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.
Using the existing other arch scripts resulted in this error:
tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: 25: printf: __NR3264_ftruncate: expected numeric value
because, unlike other arches, asm-generic's unistd.h does things like:
#define __NR_ftruncate __NR3264_ftruncate
Turning the script's printf %d into a %s resulted in this in the
generated syscalls.c file:
static const char *syscalltbl_arm64[] = {
[__NR3264_ftruncate] = "ftruncate",
So we use the host C compiler to fold the macros, and print them out
from within a temporary C program, in order to get the correct output:
static const char *syscalltbl_arm64[] = {
[46] = "ftruncate",
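The idea of letting the C compiler fold the nested defines can be
sketched as below. This is only an illustration of the technique, not
the temporary program that mksyscalltbl generates: the two defines
stand in for what asm-generic's unistd.h provides, and SC_ENTRY,
syscall_entry, sketch_entries, format_entry and print_table are
hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for the two-level defines found in asm-generic's unistd.h. */
#define __NR3264_ftruncate 46
#define __NR_ftruncate __NR3264_ftruncate

/* One table row: the compiler has already folded 'nr' to an integer. */
struct syscall_entry {
	int nr;
	const char *name;
};

/* Hypothetical helper: pairs the folded number with the bare name. */
#define SC_ENTRY(name) { __NR_##name, #name }

const struct syscall_entry sketch_entries[] = {
	SC_ENTRY(ftruncate),
};

/* Format one designated-initializer line such as: [46] = "ftruncate", */
int format_entry(char *buf, size_t len, const struct syscall_entry *e)
{
	assert(buf != NULL && e != NULL);
	return snprintf(buf, len, "[%d] = \"%s\",", e->nr, e->name);
}

/* Emit the whole table in the shape shown in the commit message. */
void print_table(FILE *out)
{
	char line[128];
	size_t i;

	fprintf(out, "static const char *syscalltbl_arm64[] = {\n");
	for (i = 0; i < sizeof(sketch_entries) / sizeof(sketch_entries[0]); i++) {
		format_entry(line, sizeof(line), &sketch_entries[i]);
		fprintf(out, "\t%s\n", line);
	}
	fprintf(out, "};\n");
}
```

Because the number is printed with %d from a compiled program, the
preprocessor has already resolved __NR_ftruncate through
__NR3264_ftruncate to its integer value, which a shell printf cannot do.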
Committer notes:
Testing this with a container with an old toolchain breaks because it
ends up using the system's /usr/include/asm-generic/unistd.h, included
from tools/arch/arm64/include/uapi/asm/unistd.h when what is desired is
for it to include tools/include/uapi/asm-generic/unistd.h.
Since all that tools/arch/arm64/include/uapi/asm/unistd.h does is set a
define and then include asm-generic/unistd.h, do that directly and use
tools/include/uapi/asm-generic/unistd.h as the file to get the syscall
definitions to expand.
Testing it:
tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/include/uapi/asm-generic/unistd.h
Now it works and generates the syscall string table.
Before it ended up as:
$ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/arch/arm64/include/uapi/asm/unistd.h
static const char *syscalltbl_arm64[] = {
<stdin>: In function 'main':
<stdin>:257:38: error: '__NR_getrandom' undeclared (first use in this function)
<stdin>:257:38: note: each undeclared identifier is reported only once for each function it appears in
<stdin>:258:41: error: '__NR_memfd_create' undeclared (first use in this function)
<stdin>:259:32: error: '__NR_bpf' undeclared (first use in this function)
<stdin>:260:37: error: '__NR_execveat' undeclared (first use in this function)
tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: 47: tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: /tmp/create-table-60liya: Permission denied
};
$
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180706163443.22626f5e9e10e5bab5e5c662@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Will be used for generating the syscall id/string translation table.
The arm64 unistd.h file simply #includes the asm-generic/unistd.h, so,
since we will want to know whether either changes, we grab both:
arch/arm64/include/uapi/asm/unistd.h
and
include/uapi/asm-generic/unistd.h
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180706163434.1b64ffbcc0284fb79982f53b@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If the event 'probe_libc:inet_pton' already exists, this test fails and
deletes the existing event before exiting. This will then pass for any
subsequent executions.
Instead of skipping straight to deleting the existing event when adding
a new event fails, a duplicate event is now created and the script
continues with the usual checks. Only the new duplicate event that is
created at the beginning of the test is deleted as part of the
cleanup at the end. All existing events remain as they are.
This can be observed on a powerpc64 system running Fedora 27 as shown
below.
# perf probe -x /usr/lib64/power8/libc-2.26.so -a inet_pton
Added new event:
probe_libc:inet_pton (on inet_pton in /usr/lib64/power8/libc-2.26.so)
Before:
# perf test -v "probe libc's inet_pton & backtrace it with ping"
62: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 21302
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
# perf probe --list
After:
# perf test -v "probe libc's inet_pton & backtrace it with ping"
62: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 21490
ping 21513 [035] 39357.565561: probe_libc:inet_pton_1: (7fffa4c623b0)
7fffa4c623b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so)
7fffa4c190dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)
7fffa4c19c4c getaddrinfo+0x15c (/usr/lib64/power8/libc-2.26.so)
111d93c20 main+0x3e0 (/usr/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
# perf probe --list
probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so)
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/e11fecff96e6cf4c65cdbd9012463513d7b8356c.1530724939.git.sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If there is a mismatch in the perf script output, this test fails and
exits before the event and temporary files created during its execution
are cleaned up.
This can be observed on a powerpc64 system running Fedora 27 as shown
below.
# perf test -v "probe libc's inet_pton & backtrace it with ping"
62: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 18655
ping 18674 [013] 24511.496995: probe_libc:inet_pton: (7fffa6b423b0)
7fffa6b423b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so)
7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)
FAIL: expected backtrace entry "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/power8/libc-2.26.so\)$" got "7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)"
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
# ls /tmp/expected.* /tmp/perf.data.* /tmp/perf.script.*
/tmp/expected.u31 /tmp/perf.data.Pki /tmp/perf.script.Bhs
# perf probe --list
probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so)
Cleanup of the event and the temporary files is now ensured by allowing
the cleanup code to be executed even if the lines from the backtrace do
not match their expected patterns instead of simply exiting from the
point of failure.
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/ce9fb091dd3028fba8749a1a267cfbcb264bbfb1.1530724939.git.sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For powerpc64, this test currently fails due to a mismatch in the
expected output.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
# perf test -v "probe libc's inet_pton & backtrace it with ping"
Before:
62: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 23948
ping 23965 [003] 71136.075084: probe_libc:inet_pton: (7fff996aaf28)
7fff996aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fff9965fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)
FAIL: expected backtrace entry 2 "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/libc-2.26.so\)$" got "7fff9965fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)"
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
After:
62: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 24638
ping 24655 [001] 71208.525396: probe_libc:inet_pton: (7fffa245af28)
7fffa245af28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fffa240fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)
7fffa24105b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
138d52d70 main+0x3e0 (/usr/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Maynard Johnson <maynard@us.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Fixes: e07d585e2454 ("perf tests: Switch trace+probe_libc_inet_pton to use record")
Link: http://lkml.kernel.org/r/49621ec5f37109f0655e5a8c32287ad68d85a1e5.1530724939.git.sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For powerpc64, perf will filter out the second entry in the callchain,
i.e. the LR value, if the return address of the function corresponding
to the probed location has already been saved on its caller's stack.
The state of the return address is determined using debug information.
At any point within a function, if the return address is already saved
somewhere, a DWARF expression can tell us about its location. If the
return address is still in LR only, no DWARF expression would exist.
Typically, the instructions in a function's prologue first copy the LR
value to R0 and then push R0 on to the stack. If LR has already been
copied to R0 but R0 is yet to be pushed to the stack, we can still get a
DWARF expression that says that the return address is in R0. This
indicates that getting a DWARF expression for the return address does
not guarantee that it has already been saved on the stack.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
# objdump -d /usr/lib64/libc-2.26.so | less
...
000000000015af20 <inet_pton>:
15af20: 0b 00 4c 3c addis r2,r12,11
15af24: e0 c1 42 38 addi r2,r2,-15904
15af28: a6 02 08 7c mflr r0
15af2c: f0 ff c1 fb std r30,-16(r1)
15af30: f8 ff e1 fb std r31,-8(r1)
15af34: 78 1b 7f 7c mr r31,r3
15af38: 78 23 83 7c mr r3,r4
15af3c: 78 2b be 7c mr r30,r5
15af40: 10 00 01 f8 std r0,16(r1)
15af44: c1 ff 21 f8 stdu r1,-64(r1)
15af48: 28 00 81 f8 std r4,40(r1)
...
# readelf --debug-dump=frames-interp /usr/lib64/libc-2.26.so | less
...
00027024 0000000000000024 00027028 FDE cie=00000000 pc=000000000015af20..000000000015af88
LOC CFA r30 r31 ra
000000000015af20 r1+0 u u u
000000000015af34 r1+0 c-16 c-8 r0
000000000015af48 r1+64 c-16 c-8 c+16
000000000015af5c r1+0 c-16 c-8 c+16
000000000015af78 r1+0 u u
...
# perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x18
# perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1
# perf script
Before:
ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38)
7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so)
7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
12f152d70 _init+0xbfc (/usr/bin/ping)
7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
After:
ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38)
7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so)
7fff7e26fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)
7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
12f152d70 _init+0xbfc (/usr/bin/ping)
7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
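The distinction drawn above between "a rule exists" and "the value is
on the stack" can be modelled with a small lookup over the frame table.
This is a hedged sketch with hypothetical names (ra_where, ra_rule,
ra_lookup, lr_entry_is_redundant), not the libdw-based code in perf. The
rows are transcribed from the readelf output quoted above; the final
row is left out because its ra column is blank there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Where the unwind rules say the return address lives at a given pc. */
enum ra_where {
	RA_UNDEFINED,   /* 'u': no rule, value is still only in LR */
	RA_IN_REGISTER, /* e.g. 'r0': copied out of LR, not yet stored */
	RA_ON_STACK,    /* e.g. 'c+16': saved relative to the CFA */
};

/* One row of an interpreted frame table: rule valid from 'loc' onwards. */
struct ra_rule {
	uint64_t loc;
	enum ra_where where;
};

/* Rows for inet_pton, taken from the frames-interp dump. */
const struct ra_rule inet_pton_rules[] = {
	{ 0x15af20, RA_UNDEFINED },
	{ 0x15af34, RA_IN_REGISTER },
	{ 0x15af48, RA_ON_STACK },
	{ 0x15af5c, RA_ON_STACK },
};

/* Find the rule covering pc: the last row whose loc is <= pc. */
enum ra_where ra_lookup(const struct ra_rule *rules, size_t n, uint64_t pc)
{
	enum ra_where where = RA_UNDEFINED;
	size_t i;

	assert(rules != NULL);
	for (i = 0; i < n && rules[i].loc <= pc; i++)
		where = rules[i].where;
	return where;
}

/* The LR entry duplicates a stack entry only once the value is on the stack. */
bool lr_entry_is_redundant(enum ra_where where)
{
	return where == RA_ON_STACK;
}
```

For the probe at inet_pton+0x18 (binary offset 0x15af38) the lookup
yields RA_IN_REGISTER, so the LR entry is kept, which matches the
"After" callchain where gaih_inet.constprop.7 is present.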
Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maynard Johnson <maynard@us.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/66e848a7bdf2d43b39210a705ff6d828a0865661.1530724939.git.sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For powerpc64, redundant entries in the callchain are filtered out by
determining the state of the return address and the stack frame using
DWARF debug information.
For making these filtering decisions we must analyze the debug
information for the location corresponding to the program counter value,
i.e. the first entry in the callchain, and not the LR value; otherwise,
perf may filter out either the second or the third entry in the
callchain incorrectly.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
Case 1 - Attaching a probe at inet_pton+0x8 (binary offset 0x15af28).
Return address is still in LR and a new stack frame is not yet
allocated. The LR value, i.e. the second entry, should not be
filtered out.
# objdump -d /usr/lib64/libc-2.26.so | less
...
000000000010eb10 <gaih_inet.constprop.7>:
...
10fa48: 78 bb e4 7e mr r4,r23
10fa4c: 0a 00 60 38 li r3,10
10fa50: d9 b4 04 48 bl 15af28 <inet_pton+0x8>
10fa54: 00 00 00 60 nop
10fa58: ac f4 ff 4b b 10ef04 <gaih_inet.constprop.7+0x3f4>
...
0000000000110450 <getaddrinfo>:
...
1105a8: 54 00 ff 38 addi r7,r31,84
1105ac: 58 00 df 38 addi r6,r31,88
1105b0: 69 e5 ff 4b bl 10eb18 <gaih_inet.constprop.7+0x8>
1105b4: 78 1b 71 7c mr r17,r3
1105b8: 50 01 7f e8 ld r3,336(r31)
...
000000000015af20 <inet_pton>:
15af20: 0b 00 4c 3c addis r2,r12,11
15af24: e0 c1 42 38 addi r2,r2,-15904
15af28: a6 02 08 7c mflr r0
15af2c: f0 ff c1 fb std r30,-16(r1)
15af30: f8 ff e1 fb std r31,-8(r1)
...
# perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x8
# perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1
# perf script
Before:
ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28)
7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
13fb52d70 _init+0xbfc (/usr/bin/ping)
7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
After:
ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28)
7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fffa7d6fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)
7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
13fb52d70 _init+0xbfc (/usr/bin/ping)
7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
Case 2 - Attaching a probe at _int_malloc+0x180 (binary offset 0x9cf10).
Return address is still in LR and a new stack frame has already
been allocated but not used. The caller's caller, i.e. the third
entry, is invalid and should be filtered out and not the second
one.
# objdump -d /usr/lib64/libc-2.26.so | less
...
000000000009cd90 <_int_malloc>:
9cd90: 17 00 4c 3c addis r2,r12,23
9cd94: 70 a3 42 38 addi r2,r2,-23696
9cd98: 26 00 80 7d mfcr r12
9cd9c: f8 ff e1 fb std r31,-8(r1)
9cda0: 17 00 e4 3b addi r31,r4,23
9cda4: d8 ff 61 fb std r27,-40(r1)
9cda8: 78 23 9b 7c mr r27,r4
9cdac: 1f 00 bf 2b cmpldi cr7,r31,31
9cdb0: f0 ff c1 fb std r30,-16(r1)
9cdb4: b0 ff c1 fa std r22,-80(r1)
9cdb8: 78 1b 7e 7c mr r30,r3
9cdbc: 08 00 81 91 stw r12,8(r1)
9cdc0: 11 ff 21 f8 stdu r1,-240(r1)
9cdc4: 4c 01 9d 41 bgt cr7,9cf10 <_int_malloc+0x180>
9cdc8: 20 00 a4 2b cmpldi cr7,r4,32
...
9cf08: 00 00 00 60 nop
9cf0c: 00 00 42 60 ori r2,r2,0
9cf10: e4 06 ff 7b rldicr r31,r31,0,59
9cf14: 40 f8 a4 7f cmpld cr7,r4,r31
9cf18: 68 05 9d 41 bgt cr7,9d480 <_int_malloc+0x6f0>
...
000000000009e3c0 <tcache_init.part.4>:
...
9e420: 40 02 80 38 li r4,576
9e424: 78 fb e3 7f mr r3,r31
9e428: 71 e9 ff 4b bl 9cd98 <_int_malloc+0x8>
9e42c: 00 00 a3 2f cmpdi cr7,r3,0
9e430: 78 1b 7e 7c mr r30,r3
...
000000000009f7a0 <__libc_malloc>:
...
9f8f8: 00 00 89 2f cmpwi cr7,r9,0
9f8fc: 1c ff 9e 40 bne cr7,9f818 <__libc_malloc+0x78>
9f900: c9 ea ff 4b bl 9e3c8 <tcache_init.part.4+0x8>
9f904: 00 00 00 60 nop
9f908: e8 90 22 e9 ld r9,-28440(r2)
...
# perf probe -x /usr/lib64/libc-2.26.so -a _int_malloc+0x180
# perf record -e probe_libc:_int_malloc -g ./test-malloc
# perf script
Before:
test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10)
7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so)
7fffa6dd0000 [unknown] (/usr/lib64/libc-2.26.so)
7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so)
7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so)
100006b4 main+0x38 (/home/testuser/test-malloc)
7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
After:
test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10)
7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so)
7fffa6e6e42c tcache_init.part.4+0x6c (/usr/lib64/libc-2.26.so)
7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so)
7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so)
100006b4 main+0x38 (/home/sandipan/test-malloc)
7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maynard Johnson <maynard@us.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Fixes: a60335ba3298 ("perf tools powerpc: Adjust callchain based on DWARF debug info")
Link: http://lkml.kernel.org/r/24bb726d91ed173aebc972ec3f41a2ef2249434e.1530724939.git.sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add missing documentation for --desc and --debug options to the 'perf
list' man page.
Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180717110738.10779-1-qpakzk@gmail.com
[ Clarify that --desc is by default active ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
With commit eca0fa28cd0d ("perf record: Provide detailed information on
s390 CPU") s390 platform provides detailed type/model/capacity
information in the CPU identifier string instead of just "IBM/S390".
This breaks 'perf kvm' support which uses hard coded string IBM/S390 to
compare with the CPU identifier string. Fix this by changing the
comparison.
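To illustrate why an exact match on the legacy string breaks, here is a
minimal sketch that contrasts it with a looser vendor-prefix check. It
is an assumption that the detailed identifier still begins with "IBM";
the function names are hypothetical and the actual comparison used by
the fix may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old behaviour described above: only the exact legacy string matches. */
bool cpuid_is_s390_exact(const char *cpuid)
{
	assert(cpuid != NULL);
	return strcmp(cpuid, "IBM/S390") == 0;
}

/* Looser check: accept any identifier that begins with the vendor name. */
bool cpuid_is_s390_vendor(const char *cpuid)
{
	static const char vendor[] = "IBM";

	assert(cpuid != NULL);
	return strncmp(cpuid, vendor, sizeof(vendor) - 1) == 0;
}
```

The prefix check accepts both "IBM/S390" and a longer identifier that
appends type, model and capacity fields after the vendor name, while
the exact check accepts only the former.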
Reported-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Tested-by: Stefan Raspl <raspl@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org
Fixes: eca0fa28cd0d ("perf record: Provide detailed information on s390 CPU")
Link: http://lkml.kernel.org/r/20180712070936.67547-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'perf stat' command line flag -T to display transaction counters is
currently supported for x86 only.
Add support for s390. It is based on the metrics flag -M transaction
using the architecture dependent JSON files. This requires a metric
named "transaction" in the JSON files for the platform.
Introduce a new function metricgroup__has_metric() to check for the
existence of a metric_name transaction.
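A check of this kind can be pictured as a scan over the platform's
metric table. The sketch below is not the metricgroup__has_metric()
implementation: metric_def, sketch_metrics and table_has_metric are
hypothetical names, the table content is a placeholder, and an exact
string match is assumed for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced form of one JSON-derived table entry. */
struct metric_def {
	const char *metric_name; /* NULL for plain events without a metric */
	const char *metric_expr;
};

/* Placeholder table; the expression text is not a real metric formula. */
const struct metric_def sketch_metrics[] = {
	{ NULL, NULL },
	{ "transaction", "placeholder_expr" },
};

/* Return true if any entry in the table defines a metric called 'name'. */
bool table_has_metric(const struct metric_def *table, size_t n,
		      const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < n; i++) {
		if (table[i].metric_name != NULL &&
		    strcmp(table[i].metric_name, name) == 0)
			return true;
	}
	return false;
}
```

A caller handling -T could first ask whether "transaction" exists and
only fall back to the hard-coded event list when it does not.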
As suggested by Andi Kleen, this is the new approach to support
transaction counters. Other architectures will follow.
Output before:
[root@p23lp27 perf]# ./perf stat -T -- sleep 1
Cannot set up transaction events
[root@p23lp27 perf]#
Output after:
[root@s35lp76 perf]# ./perf stat -T -- ~/mytesttx 1 >/tmp/111
Performance counter stats for '/root/mytesttx 1':
1 tx_c_tend # 13.0 transaction
1 tx_nc_tend
11 tx_nc_tabort
0 tx_c_tabort_special
0 tx_c_tabort_no_special
0.001070109 seconds time elapsed
[root@s35lp76 perf]#
Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180626071701.58190-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf stat' displays transactional counters using flag -T on x86. On
s390 use a JSON file defined metric named transaction to achieve the
same result.
Output before:
none
Output after:
[root@s35lp76 perf]# ./perf stat -M transaction -- \
~/mytesttx 1 >/tmp/111
Performance counter stats for '/root/mytesttx 1':
1 tx_c_tend # 13.0 transaction
1 tx_nc_tend
11 tx_nc_tabort
0 tx_c_tabort_special
0 tx_c_tabort_no_special
0.001061232 seconds time elapsed
[root@s35lp76 perf]#
Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180621080452.61012-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Correct the support of detailed/verbose PMU event description by using
the "Unit": keyword in the json files to address event names referring to
the /sys/devices/cpum_[cs]f devices.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180621080452.61012-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This reverts commit 038586c34301578e538f6c5aa79ca82bce1b9152.
Fix the support of detailed/verbose PMU event description by using the
"Unit": keyword in the json files to address event names referring to the
/sys/devices/cpum_[cs]f devices.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180621080452.61012-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If generating the instruction sample fails, there is no need to execute
the rest of cs_etm__flush(). Bail out immediately and return the error
code.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1529298599-3876-3-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch introduces an invalid address macro and uses it to replace
the dummy value '0xdeadbeefdeadbeefUL'.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1529298599-3876-2-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We want to allow having mixed events with/without callchains, not
using a global flag to show callchains, but allowing suppressing
callchains when they are present.
So invert the logic of the last parameter to hists__fprint() to
that effect.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ohqyisr6qge79qa95ojslptx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Extend regression testing to cover case of complex event names enabled
by the cset f92da71280fb ("perf record: Enable arbitrary event names
thru name= modifier").
Testing it:
# perf test
1: vmlinux symtab matches kallsyms : Skip
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok <===!
7: Simple expression parser : Ok
...
Committer testing:
# perf test "event definition"
6: Parse event definition strings : Ok
# perf test -v 6 2> /tmp/before
# perf test -v 6 2> /tmp/after
# diff -u /tmp/before /tmp/after
--- /tmp/before 2018-06-19 10:50:21.485572638 -0300
+++ /tmp/after 2018-06-19 10:50:40.886572896 -0300
@@ -1,6 +1,6 @@
6: Parse event definition strings :
--- start ---
-test child forked, pid 24259
+test child forked, pid 24904
running test 0 'syscalls:sys_enter_openat'Using CPUID GenuineIntel-6-3D
registering plugin: /root/.traceevent/plugins/plugin_kvm.so
registering plugin: /root/.traceevent/plugins/plugin_hrtimer.so
@@ -136,9 +136,11 @@
running test 50 '4:0x6530160/name=numpmu/'
running test 51 'L1-dcache-misses/name=cachepmu/'
running test 52 'intel_pt//u'
+running test 53 'cycles/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'/Duk'
running test 0 'cpu/config=10,config1,config2=3,period=1000/u'
running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u'
running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/'
+running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp'
el-capacity -> cpu/event=0x54,umask=0x2/
el-conflict -> cpu/event=0x54,umask=0x1/
el-start -> cpu/event=0xc8,umask=0x1/
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ad30b774-219b-7b80-c610-4e9e298cf8a7@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some of the comments in the perf events code use articles incorrectly,
using 'a' for words beginning with a vowel sound, where 'an' should be
used.
Signed-off-by: Tobias Tefke <tobias.tefke@tutanota.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: alexander.shishkin@linux.intel.com
Cc: jolsa@redhat.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/20180709105715.22938-1-tobias.tefke@tutanota.com
[ Fix a few more perf related 'a event' typo fixes from all around the kernel and tooling tree. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tool fixes from Ingo Molnar:
"Misc tooling fixes: python3 related fixes, gcc8 fix, bashism fixes and
some other smaller fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Use python-config --includes rather than --cflags
perf script python: Fix dict reference counting
perf stat: Fix --interval_clear option
perf tools: Fix compilation errors on gcc8
perf test shell: Prevent temporary editor files from being considered test scripts
perf llvm-utils: Remove bashism from kernel include fetch script
perf test shell: Make perf's inet_pton test more portable
perf test shell: Replace '|&' with '2>&1 |' to work with more shells
perf scripts python: Add Python 3 support to EventClass.py
perf scripts python: Add Python 3 support to sched-migration.py
perf scripts python: Add Python 3 support to Util.py
perf scripts python: Add Python 3 support to SchedGui.py
perf scripts python: Add Python 3 support to Core.py
perf tools: Generate a Python script compatible with Python 2 and 3
|
|
Commit 0c3b7e42616f ("tools build: Add support for host programs format")
introduced host_c_flags which referenced CHOSTFLAGS. The actual name of the
variable is HOSTCFLAGS. Fix this up.
Fixes: 0c3b7e42616f ("tools build: Add support for host programs format")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Builds started failing in Fedora on Python 3.7 with:
`.gnu.debuglto_.debug_macro' referenced in section
`.gnu.debuglto_.debug_macro' of
util/scripting-engines/trace-event-python.o: defined in discarded
section
In Fedora, Python 3.7 added -flto to the list of --cflags and since it
was only applied to util/scripting-engines/trace-event-python.c and
scripts/python/Perf-Trace-Util/Context.c, linking failed.
It's not the first time the addition of flags has broken builds: commit
c6707fdef7e2 ("perf tools: Fix up build in hardnened environments")
appears to have fixed a similar problem. "python-config --includes"
provides the proper -I flags and doesn't introduce additional CFLAGS.
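As a rough model of the difference, the Python sketch below keeps only
the -I flags from a flag string. The sample string is hypothetical (real
python-config output varies by distribution), and the fix itself simply
calls "python-config --includes" instead of filtering anything.

```python
import shlex


def include_flags(cflags):
    """Keep only the -I<dir> flags from a compiler flag string.

    Assumption: include directories are written as a single "-I<dir>"
    token; a separate "-I <dir>" pair is not handled by this sketch.
    """
    return [flag for flag in shlex.split(cflags) if flag.startswith("-I")]


# Hypothetical --cflags output that drags in -flto and other options.
sample = "-I/usr/include/python3.7m -Wno-unused-result -flto -O2"
print(include_flags(sample))
```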
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180710154612.6285-1-jcline@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The dictionaries are attached to the parameter tuple that steals the
references and takes care of releasing them when appropriate. The code
should not decrement the reference counts explicitly. E.g. if libpython
has been built with reference debugging enabled, the superfluous DECREFs
will trigger this error when running perf script:
Fatal Python error: Objects/tupleobject.c:238 object at
0x7f10f2041b40 has negative ref count -1
Aborted (core dumped)
If the reference debugging is not enabled, the superfluous DECREFs might
cause the dict objects to be silently released while they are still in
use. This may trigger various other assertions or just cause perf
crashes and/or weird and unexpected data changes in the stored Python
objects.
Signed-off-by: Janne Huttunen <janne.huttunen@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jaroslav Skarvada <jskarvad@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1531133990-17485-1-git-send-email-janne.huttunen@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we display an extra header line, like:
# perf stat -I 1000 -a --interval-clear
# time counts unit events
insn per cycle branch-misses of all branches
2.964917103 3855.349912 cpu-clock (msec) # 3.855 CPUs utilized
2.964917103 23,993 context-switches # 0.006 M/sec
2.964917103 1,301 cpu-migrations # 0.329 K/sec
...
Fix the condition to get the proper output:
# perf stat -I 1000 -a --interval-clear
# time counts unit events
2.359048938 1432.492228 cpu-clock (msec) # 1.432 CPUs utilized
2.359048938 7,613 context-switches # 0.002 M/sec
2.359048938 419 cpu-migrations # 0.133 K/sec
...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 9660e08ee8cb ("perf stat: Add --interval-clear option")
Link: http://lkml.kernel.org/r/20180702134202.17745-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We are getting the following warnings on gcc8 that break compilation:
$ make
CC jvmti/jvmti_agent.o
jvmti/jvmti_agent.c: In function ‘jvmti_open’:
jvmti/jvmti_agent.c:252:35: error: ‘/jit-’ directive output may be truncated \
writing 5 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
There's no point in checking the result of the snprintf call in
jvmti_open; the following open call will fail in case the
name is mangled or too long.
Use the tools/lib/ function scnprintf, which consumes the return value
of the underlying snprintf() call, and thus get rid of those warnings.
$ make DEBUG=1
CC arch/x86/util/perf_regs.o
arch/x86/util/perf_regs.c: In function ‘arch_sdt_arg_parse_op’:
arch/x86/util/perf_regs.c:229:4: error: ‘strncpy’ output truncated before terminating nul
copying 2 bytes from a string of the same length [-Werror=stringop-truncation]
strncpy(prefix, "+0", 2);
^~~~~~~~~~~~~~~~~~~~~~~~
Use scnprintf instead of strncpy (which we know is safe here)
to get rid of that warning.
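The two return-value conventions can be modelled in a few lines of
Python. This is only a model of the lengths the C functions report, with
hypothetical helper names; it does not touch any buffer.

```python
def snprintf_len(size, text):
    """Model of snprintf(): the length the complete output would need.

    The buffer size does not influence the result, which is why callers
    have to compare it against size to detect truncation.
    """
    return len(text)


def scnprintf_len(size, text):
    """Model of scnprintf(): the characters actually stored, excluding
    the terminating NUL, so the result never exceeds size - 1."""
    if size == 0:
        return 0
    return min(len(text), size - 1)


print(snprintf_len(4, "abcdef"), scnprintf_len(4, "abcdef"))
```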
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180702134202.17745-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
scripts
Allows a perf shell test developer to concurrently edit and run their
test scripts, avoiding perf test attempts to execute their editor
temporary files, such as seen here:
$ sudo taskset -c 0 ./perf test -vvvvvvvv -F 63
63: 0VIM 8.0 :
--- start ---
sh: 1: ./tests/shell/.record+probe_libc_inet_pton.sh.swp: Permission denied
---- end ----
0VIM 8.0: FAILED!
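One way to express such a filter, assuming editor temporary files can be
recognized by a leading dot (the message above does not spell out the
exact test used), is sketched here in Python with a hypothetical helper:

```python
def candidate_scripts(names):
    """Return directory entries that should be treated as test scripts.

    Assumption: hidden files, such as vim's ".<name>.swp", are skipped
    by checking for a leading dot.
    """
    return sorted(n for n in names if not n.startswith("."))


names = [
    "record+probe_libc_inet_pton.sh",
    ".record+probe_libc_inet_pton.sh.swp",
    "trace+probe_vfs_getname.sh",
]
print(candidate_scripts(names))
```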
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124658.15a506b41fc4539c08eb9426@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Like system(), popen() calls /bin/sh, which may or may not be bash.
When the script is run on dash and reaches that line, it yields:
exit: Illegal number: -1
checkbashisms report on script content:
possible bashism (exit|return with negative status code):
exit -1
Remove the bashism and use the more portable non-zero failure
status code 1.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124652.8d0af7e2281fd3fd8262cacc@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Debian based systems such as Ubuntu have dash as their default shell.
Even if the normal or root user's shell is bash, certain scripts still
call /bin/sh, which points to dash, so we fix this perf test by
rewriting it in a more portable way.
BEFORE:
$ sudo perf test -v 64
64: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 31942
./tests/shell/record+probe_libc_inet_pton.sh: 18: ./tests/shell/record+probe_libc_inet_pton.sh: expected[0]=ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\): not found
./tests/shell/record+probe_libc_inet_pton.sh: 19: ./tests/shell/record+probe_libc_inet_pton.sh: expected[1]=.*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/lib/x86_64-linux-gnu/libc-2.27.so|inlined\)$: not found
./tests/shell/record+probe_libc_inet_pton.sh: 29: ./tests/shell/record+probe_libc_inet_pton.sh: expected[2]=getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/lib/x86_64-linux-gnu/libc-2.27.so\)$: not found
./tests/shell/record+probe_libc_inet_pton.sh: 30: ./tests/shell/record+probe_libc_inet_pton.sh: expected[3]=.*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$: not found
ping 31963 [004] 83577.670613: probe_libc:inet_pton: (7fe15f87f4b0)
./tests/shell/record+probe_libc_inet_pton.sh: 39: ./tests/shell/record+probe_libc_inet_pton.sh: Bad substitution
./tests/shell/record+probe_libc_inet_pton.sh: 41: ./tests/shell/record+probe_libc_inet_pton.sh: Bad substitution
test child finished with -2
---- end ----
probe libc's inet_pton & backtrace it with ping: Skip
AFTER:
$ sudo perf test -v 64
64: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 32277
ping 32295 [001] 83679.690020: probe_libc:inet_pton: (7ff244f504b0)
7ff244f504b0 __GI___inet_pton+0x0 (/lib/x86_64-linux-gnu/libc-2.27.so)
7ff244f14ce4 getaddrinfo+0x124 (/lib/x86_64-linux-gnu/libc-2.27.so)
556ac036b57d _init+0xb75 (/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124643.2089b3ce59960eba34e87b27@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since we do not specify bash (and/or zsh) as a requirement, use the
standard error redirection that is more widely supported.
BEFORE:
$ sudo perf test -v 62
62: Check open filename arg using perf trace + vfs_getname:
--- start ---
test child forked, pid 27305
./tests/shell/trace+probe_vfs_getname.sh: 20: ./tests/shell/trace+probe_vfs_getname.sh: Syntax error: "&" unexpected
test child finished with -2
---- end ----
Check open filename arg using perf trace + vfs_getname: Skip
AFTER:
$ sudo perf test -v 62
64: Check open filename arg using perf trace + vfs_getname :
--- start ---
test child forked, pid 23008
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
0.361 ( 0.008 ms): touch/23032 openat(dfd: CWD, filename: /tmp/temporary_file.VEh0n, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 4
test child finished with 0
---- end ----
Check open filename arg using perf trace + vfs_getname: Ok
Similar to commit 35435cd06081, with the same title.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124633.0a9f4bea54b8d2c28f265de2@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support both Python 2 and Python 3 in EventClass.py. ``print`` is now a
function rather than a statement. This should have no functional change.
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a73aac-e0734bdc-dcab-4c61-8333-d8be97524aa0-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support both Python 2 and Python 3 in the sched-migration.py script.
This should have no functional change.
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a737a5-44ec436f-3440-4cac-a03f-ddfa589bf308-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support both Python 2 and Python 3 in Util.py. The dict class no longer
has a ``has_key`` method and print is now a function rather than a
statement. This should have no functional change.
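A minimal sketch of the two idioms, using a hypothetical helper rather
than code taken from Util.py: membership is tested with "in" instead of
has_key(), and print is called as a function.

```python
def count_by_key(items):
    """Count occurrences of each item in a dict."""
    counts = {}
    for item in items:
        # Python 2 only spelling: counts.has_key(item)
        if item in counts:
            counts[item] += 1
        else:
            counts[item] = 1
    return counts


# A single parenthesized argument prints the same on Python 2 and 3.
print(count_by_key(["cpu0", "cpu1", "cpu0"]))
```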
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a730c6-8db8b9b1-da2d-4ee3-96bf-47e0ae9796bd-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix a single syntax error in SchedGui.py to support both Python 2 and
Python 3. This should have no functional change.
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a72d26-75729663-fe55-4309-8c9b-302e065ed2f1-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support both Python 2 and Python 3 in Core.py. This should have no
functional change.
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a72ebe-e572899e-f445-4765-98f0-c314935727f9-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When generating a Python script with "perf script -g python", produce
one that is compatible with Python 2 and 3. The difference between the
two generated scripts is:
--- python2-perf-script.py 2018-05-08 15:35:00.865889705 -0400
+++ python3-perf-script.py 2018-05-08 15:34:49.019789564 -0400
@@ -7,6 +7,8 @@
# be retrieved using Python functions of the form common_*(context).
# See the perf-script-python Documentation for the list of available functions.
+from __future__ import print_function
+
import os
import sys
@@ -18,10 +20,10 @@
def trace_begin():
- print "in trace_begin"
+ print("in trace_begin")
def trace_end():
- print "in trace_end"
+ print("in trace_end")
def raw_syscalls__sys_enter(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
@@ -29,26 +31,26 @@
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
- print "id=%d, args=%s" % \
- (id, args)
+ print("id=%d, args=%s" % \
+ (id, args))
- print 'Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}'
+ print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')
for node in common_callchain:
if 'sym' in node:
- print "\t[%x] %s" % (node['ip'], node['sym']['name'])
+ print("\t[%x] %s" % (node['ip'], node['sym']['name']))
else:
- print " [%x]" % (node['ip'])
+ print(" [%x]" % (node['ip']))
- print "\n"
+ print()
def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
- print get_dict_as_string(event_fields_dict)
- print 'Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}'
+ print(get_dict_as_string(event_fields_dict))
+ print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')
def print_header(event_name, cpu, secs, nsecs, pid, comm):
- print "%-20s %5u %05u.%09u %8u %-20s " % \
- (event_name, cpu, secs, nsecs, pid, comm),
+ print("%-20s %5u %05u.%09u %8u %-20s " % \
+ (event_name, cpu, secs, nsecs, pid, comm), end="")
def get_dict_as_string(a_dict, delimiter=' '):
return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])
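The print_header() change in the diff above replaces the Python 2
trailing comma with end="". A small Python 3 sketch of that behaviour
follows; header_line is a hypothetical helper that captures the output in
a string, and Python 2 would additionally need the __future__ import
shown in the diff.

```python
import io


def header_line(event_name, cpu, secs, nsecs, pid, comm):
    """Format the header the way print_header() does, without a newline."""
    buf = io.StringIO()
    print("%-20s %5u %05u.%09u %8u %-20s " %
          (event_name, cpu, secs, nsecs, pid, comm), end="", file=buf)
    return buf.getvalue()


line = header_line("raw_syscalls__sys_enter", 1, 12, 34, 56, "ls")
print(repr(line))
```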
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Herton Krzesinski <herton@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/0100016341a7278a-d178c724-2b0f-49ca-be93-80a7d51aaa0d-000000@email.amazonses.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
perf bench: (Jiri Olsa):
- Fix NUMA report output code handling of less than 1s runtimes.
perf script: (Ravi Bangoria)
- Add missing output fields in a 'perf script -h' hint.
- Fix crash because of missing evsel->priv.
- Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE], which
is just an end of features header marker.
perf stat: (Thomas Richter)
- Remove duplicate event counting
perf test:
- Wire parsing error handling in 'parse events' test (Jiri Olsa)
- Fix 'session topology' test on s/390 (Thomas Richter)
eBPF: (Yonghong Song)
- Fix a clang 7.0 compilation error when building perf linking
with libclang
intel-pt: (Adrian Hunter)
- Fix packet decoding of CYC packets.
Copies of kernel files: (Arnaldo Carvalho de Melo)
- Synchronize drm/drm.h UAPI
- Update x86's syscall_64.tbl, adding support for 'io_pgetevents' and 'rseq'
in 'perf trace'.
- Update powerpc uapi/asm/unistd.h, adding support for the 'rseq' syscall.
- Update if_link.h and bpf.h, no effect on tool features.
PowerPC: (Sandipan Das)
- Fix crash if callchain is empty.
s/390: (Thomas Richter)
- Support random socket_id assignment in the perf header.
- Support s390 random socket_id assignment in perf.data file.
- Make PMU alias definitions taken from sysfs and JSON files comparable
by normalizing them wrt spaces and newlines.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.
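The idea can be sketched in Python with a small dispatch table. The
feature ids, handlers and the process_feature name are hypothetical; in
perf the table is the C array feat_ops[] and the missing entry leads to a
crash rather than a failed lookup.

```python
# Hypothetical feature ids; LAST_FEATURE is only an end marker and has
# no handler of its own.
FEAT_A, FEAT_B, LAST_FEATURE = 0, 1, 2

FEAT_OPS = {
    FEAT_A: lambda payload: "a:" + payload,
    FEAT_B: lambda payload: "b:" + payload,
}


def process_feature(feat_id, payload):
    """Dispatch a feature record, ignoring the end marker."""
    if feat_id == LAST_FEATURE:
        return None  # nothing to process, just the end of the features
    handler = FEAT_OPS.get(feat_id)
    if handler is None:
        raise ValueError("unknown feature id %d" % feat_id)
    return handler(payload)


print(process_feature(FEAT_A, "x"), process_feature(LAST_FEATURE, ""))
```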
Before:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#
After:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 57b5de463925 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf script' in piped mode is crashing because evsel->priv is not set
properly. Fix it.
Before:
# perf record -o - -- ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#
After:
# perf record -o - -- ls | perf script
<SNIP 'ls' output>
ls 2282 1031.731974: 250000 cpu-clock:uhH: 7effe4b3d29e
ls 2282 1031.732222: 250000 cpu-clock:uhH: 7effe4b3a650
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a14390fde64e ("perf script: Allow creating per-event dump files")
Link: http://lkml.kernel.org/r/20180625124220.6434-3-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A few fields are missing in a perf script -F hint. Add them.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180625124220.6434-2-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we can hit the following assert when running numa bench:
$ perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZ0cm --thp 1
perf: bench/numa.c:1577: __bench_numa: Assertion `!(!(((wait_stat) & 0x7f) == 0))' failed.
The assertion is correct, because we hit the SIGFPE in the following line:
Thread 2.2 "thread 0/0" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffd28c6700 (LWP 11750)]
0x000.. in worker_thread (__tdata=0x7.. ) at bench/numa.c:1257
1257 td->speed_gbs = bytes_done / (td->runtime_ns / NSEC_PER_SEC) / 1e9;
We don't check whether the runtime is at least 1 second, so the
(td->runtime_ns / NSEC_PER_SEC) term can be zero and this might end up
with zero division within FPU.
Add the check to prevent this.
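The arithmetic can be reproduced in Python. The function below is a
sketch, not the code from bench/numa.c, and reporting 0 for runs shorter
than one second is an assumption about how the guard behaves.

```python
NSEC_PER_SEC = 1000000000


def speed_gbs(bytes_done, runtime_ns):
    """Throughput in GB/s, guarding against sub-second runtimes."""
    secs = runtime_ns // NSEC_PER_SEC  # integer division: 0 when < 1s
    if secs == 0:
        return 0.0  # assumption: report zero instead of dividing by zero
    return bytes_done // secs / 1e9


print(speed_gbs(4 * 10**9, 2 * NSEC_PER_SEC))  # two full seconds
print(speed_gbs(10**9, NSEC_PER_SEC // 2))     # half a second
```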
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180620094036.17278-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf stat' shows a mismatch regarding counter names on
s390:
Run command:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v --
~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 573146 573146
tx_nc_tend: 1 573146 573146
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.001037252 seconds time elapsed
[root@s35lp76 perf]#
shows transaction counter tx_nc_tend with value 3 but it was triggered
only once as seen by the output of mytesttx.
When looking up the event name tx_nc_tend the following function
sequence is called:
parse_events_multi_pmu_add()
+--> perf_pmu__scan() being called with NULL argument
+--> pmu_read_sysfs() scans directory ../devices/ for
all PMUs
+--> perf_pmu__find() tries to find a PMU in the
global pmu list.
+--> pmu_lookup() called to read all file
entries when not in global
list.
pmu_lookup() causes the issue. It calls
+---> pmu_aliases() to read all the entries in the PMU directory.
On s390 this is named
/sys/devices/cpum_cf/events.
+--> pmu_aliases_parse() reads all files and creates an
alias for each file name.
So we end up with first entry created by
reading the sysfs file
[root@s35lp76 perf]# cat /sys/devices/cpum_cf
/events/TX_NC_TEND
event=0x008d
[root@s35lp76 perf]#
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x008d
'/
After all files in this directory have been
read and aliases created this function is called:
+--> pmu_add_cpu_aliases()
This function looks up the CPU tables
created by the json files.
With json files for s390 now available all
the aliases are added to
the PMU alias list a second time.
The second entry is added by
reading the json file converted by jevent
resulting in file pmu-events/pmu-events.c:
{
.name = "tx_nc_tend",
.event = "event=0x8d",
.desc = "Unit: cpum_cf Completed TEND \
instructions \
in non-constrained TX mode",
.topic = "extended",
.long_desc = "A TEND instruction has \
completed in a \
non-constrained \
transactional-execution mode",
.pmu = "cpum_cf",
},
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x8d'/
Functions pmu_aliases_parse() and pmu_add_cpu_aliases() both use
__perf_pmu__new_alias() to add an alias to the PMU alias list. There is
no check whether an alias already exists.
So we end up with 2 entries for tx_nc_tend in the PMU alias list.
Once the PMU alias list for this PMU is set up,
parse_events_multi_pmu_add() reads the complete alias list and adds each
alias with parse_events_add_pmu() to the global perfev_list. This
causes the alias to be added multiple times to the event list.
Fix this by making __perf_pmu__new_alias() merge alias definitions if
an alias is already on the alias list. Also print a debug message when
the two definitions differ in some fields.
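The merge-on-insert idea can be sketched in a few lines of C. This is
only an illustration under stated assumptions, not the code from the
patch: struct alias, dup_str(), alias_add_or_merge() and alias_count()
are hypothetical names, the record is reduced to two fields, and
letting the later definition replace the earlier one is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified alias record; the real perf structure has more fields. */
struct alias {
	char *name;		/* alias name, e.g. "tx_nc_tend" */
	char *str;		/* event terms, e.g. "event=0x8d" */
	struct alias *next;
};

/* Copy a string into freshly allocated memory (ISO C only). */
static char *dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Add an alias, or merge it into an existing entry with the same name.
 * Returns 1 if a new entry was created, 0 if merged, -1 on allocation failure.
 */
static int alias_add_or_merge(struct alias **list, const char *name,
			      const char *str)
{
	struct alias *a;

	assert(list && name && str);

	for (a = *list; a; a = a->next) {
		if (strcmp(a->name, name) != 0)
			continue;
		if (strcmp(a->str, str) != 0) {
			char *copy = dup_str(str);

			if (!copy)
				return -1;
			/* Report the mismatch, then let the later definition win (assumption). */
			fprintf(stderr, "alias %s differs: '%s' != '%s'\n",
				name, a->str, str);
			free(a->str);
			a->str = copy;
		}
		return 0;
	}

	a = malloc(sizeof(*a));
	if (!a)
		return -1;
	a->name = dup_str(name);
	a->str = dup_str(str);
	if (!a->name || !a->str) {
		free(a->name);
		free(a->str);
		free(a);
		return -1;
	}
	a->next = *list;
	*list = a;
	return 1;
}

/* Count the entries on the list. */
static size_t alias_count(const struct alias *list)
{
	size_t n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}
```

With this sketch, adding "tx_nc_tend" once from the sysfs scan and once
from the JSON table leaves a single entry on the list, so the event
would be added to the event list only once.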
Output before:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Output after:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
1 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
PMU alias definitions in sysfs files may have spaces, newlines and
numbers with leading zeroes. Some alias definitions may also appear in
JSON files without spaces, etc.
Scan alias definitions, remove leading zeroes, spaces, newlines, etc.,
and rebuild the string to make the alias->str member comparable.
s390 for example has terms specified as event=0x0091 (read from files
../<PMU>/events/<FILE>) and terms specified as event=0x91 (read from JSON
files).
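A rough sketch of such a normalization, assuming a simple character
scan rather than whatever term parsing the patch itself uses. The
names alias_str_normalize() and alias_str_equal() are hypothetical,
and only values that directly follow '=' are treated as numbers.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Rewrite a term string in place: drop all whitespace (including the
 * trailing newline) and strip leading zeroes from the number that
 * follows each '='. The result is never longer than the input.
 */
static void alias_str_normalize(char *s)
{
	char *r = s, *w = s;
	int value_start = 0;	/* set right after '=' has been copied */

	assert(s != NULL);

	while (*r) {
		unsigned char c = (unsigned char)*r;

		if (isspace(c)) {
			r++;
			continue;
		}
		if (value_start) {
			value_start = 0;
			if (r[0] == '0' && (r[1] == 'x' || r[1] == 'X')) {
				/* Keep the hex prefix, skip zeroes before another hex digit. */
				*w++ = r[0];
				*w++ = r[1];
				r += 2;
				while (r[0] == '0' && isxdigit((unsigned char)r[1]))
					r++;
				continue;
			}
			/* Decimal: skip zeroes that are followed by another digit. */
			while (r[0] == '0' && isdigit((unsigned char)r[1]))
				r++;
			continue;
		}
		if (c == '=')
			value_start = 1;
		*w++ = *r++;
	}
	*w = '\0';
}

/* Normalize both strings in place, then compare; returns 1 when equal. */
static int alias_str_equal(char *a, char *b)
{
	alias_str_normalize(a);
	alias_str_normalize(b);
	return strcmp(a, b) == 0;
}
```

In this sketch the sysfs form "event=0x0091" followed by a newline and
the JSON form "event=0x91" compare as equal after normalization.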
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove a trailing newline when reading sysfs file contents such as
/sys/devices/cpum_cf/events/TX_NC_TEND. The newline shows up in the
debug output when the verbose option -v is used.
Output before:
tx_nc_tend -> 'cpum_cf'/'event=0x008d
'/
Output after:
tx_nc_tend -> 'cpum_cf'/'event=0x8d'/
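Stripping the newline from a buffer read from sysfs could look roughly
like this. It is a minimal sketch; strip_trailing_newline() is a
hypothetical helper name, not a function taken from the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Remove one trailing '\n' from a NUL-terminated buffer, if present.
 * Returns the resulting string length.
 */
static size_t strip_trailing_newline(char *buf)
{
	size_t len;

	assert(buf != NULL);

	len = strlen(buf);
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

Called on the contents of a file such as TX_NC_TEND, this leaves the
event term without the line break that otherwise splits the debug
output across two lines.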
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Arnaldo reported a perf build failure with the latest llvm/clang compiler
(7.0).
$ make LIBCLANGLLVM=1 -C tools/perf/
<SNIP>
CC /tmp/tmp.t53Qo38zci/tests/kmod-path.o
util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::SmallVectorImpl<char> >
perf::getBPFObjectFromModule(llvm::Module*)’:
util/c++/clang.cpp:150:43: error: no matching function for call to
‘llvm::TargetMachine::addPassesToEmitFile(llvm::legacy::PassManager&,
llvm::raw_svector_ostream&, llvm::TargetMachine::CodeGenFileType)’
TargetMachine::CGFT_ObjectFile)) {
^
In file included from util/c++/clang.cpp:25:0:
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note: candidate:
virtual bool llvm::TargetMachine::addPassesToEmitFile(
llvm::legacy::PassManagerBase&, llvm::raw_pwrite_stream&,
llvm::raw_pwrite_stream*, llvm::TargetMachine::CodeGenFileType, bool,
llvm::MachineModuleInfo*)
virtual bool addPassesToEmitFile(PassManagerBase &, raw_pwrite_stream &,
^~~~~~~~~~~~~~~~~~~
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note:
candidate expects 6 arguments, 3 provided
mv: cannot stat '/tmp/tmp.t53Qo38zci/util/c++/.clang.o.tmp': No such file or directory
make[7]: *** [/home/acme/git/perf/tools/build/Makefile.build:101:
/tmp/tmp.t53Qo38zci/util/c++/clang.o] Error 1
make[6]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: c++] Error 2
make[5]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[5]: *** Waiting for unfinished jobs....
CC /tmp/tmp.t53Qo38zci/tests/thread-map.o
The signature of addPassesToEmitFile changed in llvm 7.0, which caused
the failure. Fix this by using the proper function signature for each
compiler version.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180616174739.1076733-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|