Age | Commit message (Collapse) | Author |
|
Option doesn't take a value, make sure the man pages agree. For example:
$ perf evlist --verbose=1
Error: option `verbose' takes no value
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210226183145.1878782-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To be used with -S or -s, using just this new option implies -s,
examples:
# perf trace --errno-summary sleep 1
Summary of events:
sleep (10793), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.427 1000.427 1000.427 1000.427 0.00%
mmap 8 0 0.026 0.002 0.003 0.005 9.18%
close 5 0 0.018 0.001 0.004 0.009 48.97%
mprotect 4 0 0.017 0.003 0.004 0.006 16.49%
openat 3 0 0.012 0.003 0.004 0.005 9.41%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 22.77%
read 4 0 0.005 0.001 0.001 0.002 22.33%
access 1 1 0.004 0.004 0.004 0.004 0.00%
ENOENT: 1
fstat 3 0 0.004 0.001 0.001 0.002 17.18%
lseek 3 0 0.003 0.001 0.001 0.001 11.62%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.32%
EINVAL: 1
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Works as well together with --failure and -S, i.e. collect the stats and
show just the syscalls that failed:
# perf trace --failure -S --errno-summary sleep 1
0.032 arch_prctl(option: 0x3001, arg2: 0x7fffdb11b580) = -1 EINVAL (Invalid argument)
0.045 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
Summary of events:
sleep (10806), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.094 1000.094 1000.094 1000.094 0.00%
mmap 8 0 0.026 0.002 0.003 0.005 9.06%
close 5 0 0.018 0.001 0.004 0.010 49.58%
mprotect 4 0 0.017 0.003 0.004 0.006 17.56%
openat 3 0 0.014 0.004 0.005 0.006 12.29%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 22.75%
read 4 0 0.005 0.001 0.001 0.002 17.19%
access 1 1 0.005 0.005 0.005 0.005 0.00%
ENOENT: 1
fstat 3 0 0.004 0.001 0.001 0.002 21.66%
lseek 3 0 0.003 0.001 0.001 0.001 11.71%
arch_prctl 2 1 0.002 0.001 0.001 0.001 2.66%
EINVAL: 1
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-l0mjwczkpouov7lss5zn8d9h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Similar to what is in 'perf record', works just like there:
# perf trace -e msr:*
328.297 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.302 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.306 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.317 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.322 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.327 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.331 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.336 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
328.340 :0/0 ^Cmsr:write_msr(msr: FS_BASE, val: 140240388381888)
#
So, for a system wide trace session looking at the write_msr tracepoint
we see a flood of MSR_FS_BASE, we need to get the number for that:
# grep FS_BASE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
[0xc0000100 - x86_64_specific_MSRs_offset] = "FS_BASE",
#
And then use it in a filter:
# perf trace -e msr:* --filter="msr!=0xc0000100"
<SNIP>
942.177 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068232)
942.199 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3057135655252)
942.203 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068222)
942.231 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056998373022)
942.241 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068236)
<SNIP>
#
Ok, lets filter that too, too noisy:
# grep TSC_DEADLINE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
[0x000006E0] = "IA32_TSC_DEADLINE",
#
# perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" -a sleep 0.1
0.000 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
0.066 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.070 CPU 0/KVM/4895 msr:write_msr(msr: 0x830, val: 34359740667)
0.099 CPU 0/KVM/4895 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199021993472)
0.100 CPU 0/KVM/4895 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
0.101 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
0.109 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
1.000 :0/0 msr:write_msr(msr: 0x830, val: 17179871485)
18.893 :0/0 msr:write_msr(msr: 0x83f, val: 246)
28.810 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
40.117 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
40.127 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
40.139 CPU 0/KVM/4895 msr:write_msr(msr: LSTAR, val: -2130661312)
40.141 CPU 0/KVM/4895 msr:write_msr(msr: SYSCALL_MASK, val: 14080)
40.142 CPU 0/KVM/4895 msr:write_msr(msr: TSC_AUX)
40.144 CPU 0/KVM/4895 msr:write_msr(msr: KERNEL_GS_BASE)
40.147 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL)
40.148 CPU 0/KVM/4895 msr:write_msr(msr: IA32_FLUSH_CMD, val: 1)
40.151 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
^C
#
One can combine that with filtering pids as well:
# perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" --filter-pids 4895 -a sleep 0.09
0.000 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
0.291 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
0.294 gnome-terminal/2790 msr:write_msr(msr: LSTAR, val: -1935671280)
0.295 gnome-terminal/2790 msr:write_msr(msr: TSC_AUX, val: 6)
10.940 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
15.943 gnome-shell/2096 msr:write_msr(msr: 0x830, val: 4294969597)
16.975 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
19.560 :0/0 msr:write_msr(msr: 0x83f, val: 246)
25.162 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
25.807 JS Watchdog/3635 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
25.820 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
25.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
26.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
29.942 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
45.313 :0/0 msr:write_msr(msr: 0x83f, val: 246)
56.945 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
60.946 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
74.096 JS Watchdog/8971 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
74.130 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
79.673 :0/0 msr:write_msr(msr: 0x83f, val: 246)
79.947 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 17179871485)
#
Or for just a pid, with callchains:
# grep SYSCALL_MAS /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
[0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK",
# perf trace -e msr:* --filter="msr==0xc0000084" --pid 2790 --call-graph=dwarf
0.000 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
kvm_on_user_return ([kvm])
fire_user_return_notifiers ([kernel.kallsyms])
exit_to_usermode_loop ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
__GI___poll (inlined)
9299.073 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
kvm_on_user_return ([kvm])
fire_user_return_notifiers ([kernel.kallsyms])
exit_to_usermode_loop ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
__GI___poll (inlined)
9348.374 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
kvm_on_user_return ([kvm])
fire_user_return_notifiers ([kernel.kallsyms])
exit_to_usermode_loop ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
__GI___poll (inlined)
<SNIP>
#
Ok, just another form of KVM to emit MSRs :-)
Next step: elliminate those greps by getting the filter expression,
looking for arg names, then for the arrays associated with it to do a
reverse lookup.
Also allow those filters to be associated with strace-like syscall
names.
After that: augment the 'val' arg for 'msr:write_msr' based on the first
arg, 'msr'.
Then, do that with eBPF too, not just with tracepoint filters.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-95bfe5d4tzy5f66bx49d05rj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So far we used the libtraceevent printing routines when showing
tracepoint arguments, but since 'perf trace' has a lot of beautifiers
for syscall arguments, and since some of those can be used to augment
tracepoint arguments, add a routine to make use of those beautifiers
and allow the user to choose which one to use.
The default now is to use the same beautifiers used for the strace-like
sys_enter+sys_exit lines, but the user can choose the libtraceevent ones
by either using the:
perf trace --libtraceevent_print
command line option, or by setting:
# cat ~/.perfconfig
[trace]
tracepoint_beautifiers = libtraceevent
For instance, here are some examples:
# perf trace -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
0.000 sched:sched_wakeup(comm: "perf", pid: 5273 (perf), prio: 120, success: 1, target_cpu: 6)
0.621 nanosleep(rqtp: 0x7ffdd06d1140, rmtp: NULL) ...
0.628 sched:sched_switch(prev_comm: "sleep", prev_pid: 5273 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/6", next_pid: 0, next_prio: 120)
1000.879 sched:sched_wakeup(comm: "sleep", pid: 5273 (sleep), prio: 120, success: 1, target_cpu: 6)
0.621 ... [continued]: nanosleep()) = 0
1001.026 exit_group(error_code: 0) = ?
1001.216 sched:sched_process_exit(comm: "sleep", pid: 5273 (sleep), prio: 120)
#
And then using libtraceevent, as before:
# perf trace --libtraceevent_print -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
0.000 sched:sched_wakeup(comm=perf pid=5288 prio=120 target_cpu=001)
0.739 nanosleep(rqtp: 0x7ffeba6c2f40, rmtp: NULL) ...
0.747 sched:sched_switch(prev_comm=sleep prev_pid=5288 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
1000.902 sched:sched_wakeup(comm=sleep pid=5288 prio=120 target_cpu=001)
0.739 ... [continued]: nanosleep()) = 0
1001.012 exit_group(error_code: 0) = ?
#
The new default allocates an array of 'struct syscall_arg_fmt' for the
tracepoint arguments and, just like with syscall arguments, tries to
find suitable syscall_arg__scnprintf_NAME() routines to augment those
tracepoint arguments based on their type (as in the tracefs "format"
file), or even in their name + type, for instance arguntents with names
ending in "fd" with type "int" get the fd scnprintf beautifier attached,
etc.
Soon this will take advantage of the kernel BTF information to augment
enumerations based on the tracefs "format" type info.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-o8qdluotkcb3b1x2gjqrejcl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Just like with 'perf script':
# perf trace -e sched:*,syscalls:*sleep* sleep 1
0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005
0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005
0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346
0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns]
0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0
0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns]
0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120]
1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005
1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005
1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0
1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120
# perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1
0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns]
0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120]
1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004
1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004
1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0
1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120
# perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1
0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns]
0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120]
1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000
1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000
#
# perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1
0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0
0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns]
0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120]
1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002
1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002
1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Initial use case:
Dumping the maps setup by tools/perf/examples/bpf/augmented_raw_syscalls.c,
which so far are just booleans, showing just non-zeroed entries:
# cat ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
[trace]
#add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
add_events = /wb/augmented_raw_syscalls.o
$ date
Tue Feb 19 16:29:33 -03 2019
$ ls -la /wb/augmented_raw_syscalls.o
-rwxr-xr-x. 1 root root 14048 Jan 24 12:09 /wb/augmented_raw_syscalls.o
$ file /wb/augmented_raw_syscalls.o
/wb/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
$
# trace -e recvmmsg,sendmmsg --map-dump foobar
ERROR: BPF map "foobar" not found
# trace -e recvmmsg,sendmmsg --map-dump filtered_pids
ERROR: BPF map "filtered_pids" not found
# trace -e recvmmsg,sendmmsg --map-dump pids_filtered
[2583] = 1,
[2267] = 1,
^Z
[1]+ Stopped trace -e recvmmsg,sendmmsg --map-dump pids_filtered
# pidof trace
2267
# ps ax|grep gnome-terminal|grep -v grep
2583 ? Ssl 58:33 /usr/libexec/gnome-terminal-server
^C
# trace -e recvmmsg,sendmmsg --map-dump syscalls
[299] = 1,
[307] = 1,
^C
# grep x64_recvmmsg arch/x86/entry/syscalls/syscall_64.tbl
299 64 recvmmsg __x64_sys_recvmmsg
# grep x64_sendmmsg arch/x86/entry/syscalls/syscall_64.tbl
307 64 sendmmsg __x64_sys_sendmmsg
#
Next step probably will be something like 'perf stat's --interval-print and
--interval-clear.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ztxj25rtx37ixo9cfajt8ocy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I was trigger happy on this one, as using ordered_events as implemented
by Jiri for use with the --block code under discussion on lkml incurs
in delaying processing to form batches that then get ordered and then
printed.
With 'perf trace' we want to process the events as they go, without that
delay, and doing it that way works well for the common case which is to
trace a thread or a workload started by 'perf trace'.
So revert back to not using ordered_events but add an option to select
that mode so that users can experiment with their particular use case to
see if works better, i.e. if the added delay is not a problem and the
ordering helps.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8ki7sld6rusnjhhtaly26i5o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Call it 'nr', as in this context it should be expressive enough, i.e.:
# perf trace -e sched:*waking/nr=8,call-graph=fp/
0.000 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
try_to_wake_up ([kernel.kallsyms])
sched_clock ([kernel.kallsyms])
3.933 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
try_to_wake_up ([kernel.kallsyms])
sched_clock ([kernel.kallsyms])
3.970 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
20.069 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
37.170 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
53.267 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
70.365 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
75.781 Web Content/3649 sched:sched_waking:comm=JS Helper pid=3670 prio=120 target_cpu=000
try_to_wake_up ([kernel.kallsyms])
try_to_wake_up ([kernel.kallsyms])
wake_up_q ([kernel.kallsyms])
futex_wake ([kernel.kallsyms])
do_futex ([kernel.kallsyms])
__x64_sys_futex ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
pthread_cond_signal@@GLIBC_2.3.2 (/usr/lib64/libpthread-2.26.so)
#
# perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> trace:3367 [120]
0.046 :0/0 sched:sched_switch:swapper/1:0 [120] S ==> kworker/u16:58:2722 [120]
570.670 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ef00 len=66
__dev_queue_xmit ([kernel.kallsyms])
1106.141 jbd2/dm-0-8/476 block:block_plug:[jbd2/dm-0-8]
1106.175 jbd2/dm-0-8/476 block:block_unplug:[jbd2/dm-0-8] 1
1618.088 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
1810.000 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051ef00 len=52
__dev_queue_xmit ([kernel.kallsyms])
3857.974 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051f900 len=52
__dev_queue_xmit ([kernel.kallsyms])
4790.277 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
4790.448 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
#
The global --max-events has precendence:
# trace --max-events 3 -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> qemu-system-x86:2252 [120]
0.029 qemu-system-x8/2252 sched:sched_switch:qemu-system-x86:2252 [120] D ==> swapper/0:0 [120]
58.047 DNS Res~er #14/31661 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff9346966af100 len=84
__dev_queue_xmit ([kernel.kallsyms])
__libc_send (/usr/lib64/libpthread-2.26.so)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-s4jswltvh660ughvg9nwngah@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow stopping tracing after a number of events take place, considering
strace-like syscalls formatting as one event per enter/exit pair or when
in a multi-process tracing session a syscall is interrupted and printed
ending with '...'.
Examples included in the documentation:
Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here):
$ perf trace -e open* --max-events 4
[root@jouet perf]# trace -e open* --max-events 4
2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31
2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
$
Trace the first minor page fault when running a workload:
# perf trace -F min --max-stack=7 --max-events 1 sleep 1
0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k)
__clear_user ([kernel.kallsyms])
load_elf_binary ([kernel.kallsyms])
search_binary_handler ([kernel.kallsyms])
__do_execve_file.isra.33 ([kernel.kallsyms])
__x64_sys_execve ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
#
Trace the next min page page fault to take place on the first CPU:
# perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0
0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.)
js::gc::FreeSpan::initAsEmpty (inlined)
js::gc::Arena::setAsNotAllocated (inlined)
js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so)
js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined)
js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined)
JSThinInlineString::new_<(js::AllowGC)1> (inlined)
AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined)
js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
[0x18b26e6bc2bd] (/tmp/perf-17136.map)
Tracing the next four ext4 operations on a specific CPU:
# perf trace -e ext4:*/call-graph=fp/ --max-events 4 --cpu 3
0.000 mutt/3849 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 57277 lblk 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.097 mutt/3849 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 57277 found 0 [0/0) 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.141 mutt/3849 ext4:ext4_ext_map_blocks_enter:dev 253,2 ino 57277 lblk 0 len 1 flags
ext4_ext_map_blocks ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.184 mutt/3849 ext4:ext4_ext_load_extent:dev 253,2 ino 57277 lblk 1516511 pblk 18446744071750013657
__read_extent_tree_block ([kernel.kallsyms])
__read_extent_tree_block ([kernel.kallsyms])
ext4_find_extent ([kernel.kallsyms])
ext4_ext_map_blocks ([kernel.kallsyms])
ext4_map_blocks ([kernel.kallsyms])
ext4_mpage_readpages ([kernel.kallsyms])
read_pages ([kernel.kallsyms])
__do_page_cache_readahead ([kernel.kallsyms])
ondemand_readahead ([kernel.kallsyms])
generic_file_read_iter ([kernel.kallsyms])
__vfs_read ([kernel.kallsyms])
vfs_read ([kernel.kallsyms])
ksys_read ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rudá Moura <ruda.moura@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sweh107bs7ol5bzls0m4tqdz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For instance:
# perf probe "vfs_getname=getname_flags:72 pathname=result->name:string"
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
# perf trace --failure sleep 1
0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
For reference, here are all the syscalls in this case:
# perf trace sleep 1
? ( ): sleep/10976 ... [continued]: execve()) = 0
0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0
0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000
0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832
0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0
0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000
0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000
0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0
0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000
0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000
0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0
0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0
0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0
0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0
0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0
0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000
0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000
0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0
0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000
0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0
0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0
1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0
1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0
1000.596 ( ): sleep/10976 exit_group()
#
And can be done systemwide, etc, with backtraces:
# perf trace --max-stack=16 --failure sleep 1
0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
__access (inlined)
dl_main (/usr/lib64/ld-2.26.so)
#
Or for some specific syscalls:
# perf trace --max-stack=16 -e openat --failure cat /tmp/rien
cat: /tmp/rien: No such file or directory
0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory
__libc_open64 (inlined)
main (/usr/bin/cat)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/cat)
#
Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces:
# perf trace -a --max-stack=16 --failure -e inotify* sleep 2
819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory
__GI_inotify_add_watch (inlined)
_ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3)
_ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3)
im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3)
g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3)
glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3)
g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3)
start_thread (/usr/lib64/libpthread-2.26.so)
__GI___clone (inlined)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One can set a cgroup as a default cgroup to be used by all events or
set cgroups with the 'perf stat' and 'perf record' behaviour, i.e.
'-G A' will be the cgroup for events defined so far in the command line.
Here in my main machine, with a kvm instance running a rhel6 guinea pig
I have:
# ls -la /sys/fs/cgroup/perf_event/ | grep drw
drwxr-xr-x. 14 root root 360 Mar 6 12:04 ..
drwxr-xr-x. 3 root root 0 Mar 6 15:05 machine.slice
#
So I can go ahead and use that cgroup hierarchy, say lets see what
syscalls are being emitted by threads in that 'machine.slice' hierarchy
that are taking more than 100ms:
# perf trace --duration 100 -G machine.slice
0.188 (249.850 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
250.274 (249.743 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
500.224 (249.755 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
750.097 (249.934 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1000.244 (249.780 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1250.197 (249.796 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1500.124 (249.859 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1750.076 (172.900 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
902.570 (1021.116 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
1923.825 (305.133 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
2000.172 (229.002 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
^C #
If we look inside that cgroup hierarchy we get:
# ls -la /sys/fs/cgroup/perf_event/machine.slice/ | grep drw
drwxr-xr-x. 3 root root 0 Mar 6 15:05 .
drwxr-xr-x. 2 root root 0 Mar 6 16:16 machine-qemu\x2d2\x2drhel6.sandy.scope
#
There is just one, but lets say there were more and we would want to see
5 seconds worth of syscall summary for the threads in that cgroup:
# perf trace --summary -G machine.slice/machine-qemu\\x2d2\\x2drhel6.sandy.scope/ -a sleep 5
Summary of events:
qemu-system-x86 (23667), 143858 events, 24.2%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
ppoll 28492 4348.631 0.000 0.153 11.616 1.05%
futex 19661 140.801 0.001 0.007 2.993 3.20%
read 18440 68.084 0.001 0.004 1.653 4.33%
ioctl 5387 24.768 0.002 0.005 0.134 1.62%
CPU 0/KVM (23744), 449455 events, 75.8%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
ioctl 148364 3401.812 0.000 0.023 11.801 1.15%
futex 36131 404.127 0.001 0.011 7.377 2.63%
writev 29452 339.688 0.003 0.012 1.740 1.36%
write 11315 45.992 0.001 0.004 0.105 1.10%
#
See the documentation about how to set more than one cgroup for
different events in the same command line.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-t126jh4occqvu0xdqlcjygex@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To help with debugging, like the interrupted out of order issue that
will be dealt with in the next patch in this series, changing the code
to deal with:
raw_syscalls:sys_enter 411967179.269 Timer 9609/9626 [2]
raw_syscalls:sys_enter 411967179.213 file:// Content 9609/9609 [3]
328.038 (18446744073709.496 ms): Timer/9626 futex(uaddr: 0x7fc0d4027044, op: WAIT|PRIV, utime: 0x7fc0b0ffdb50 ) ...
raw_syscalls:sys_exit 411967179.225 file:// Content 9609/9609 [3]
327.982 ( 0.012 ms): file:// Conten/9609 futex(uaddr: 0x7fc0d4027040, op: WAKE|PRIV, val: 1 ) = 1
That long duration is the bug.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-fljqiibjn7wet24jd1ed7abc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add missing --force option to the man page.
Signed-off-by: Sihyeon Jang <uneedsihyeon@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1510842367-11011-2-git-send-email-uneedsihyeon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So now we can use:
# perf trace -e pkey_*
532.784 ( 0.006 ms): pkey/16018 pkey_alloc(init_val: DISABLE_WRITE) = -1 EINVAL Invalid argument
532.795 ( 0.004 ms): pkey/16018 pkey_mprotect(start: 0x7f380d0a6000, len: 4096, prot: READ|WRITE, pkey: -1) = 0
532.801 ( 0.002 ms): pkey/16018 pkey_free(pkey: -1 ) = -1 EINVAL Invalid argument
^C[root@jouet ~]#
Or '-e epoll*', '-e *msg*', etc.
Combining syscall names with perf events, tracepoints, etc, continues to
be valid, i.e. this is possible:
# perf probe -L sys_nanosleep
<SyS_nanosleep@/home/acme/git/linux/kernel/time/hrtimer.c:0>
0 SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
struct timespec __user *, rmtp)
{
struct timespec64 tu;
5 if (get_timespec64(&tu, rqtp))
6 return -EFAULT;
if (!timespec64_valid(&tu))
9 return -EINVAL;
11 current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE;
12 current->restart_block.nanosleep.rmtp = rmtp;
13 return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
}
# perf probe my_probe="sys_nanosleep:12 rmtp"
Added new event:
probe:my_probe (on sys_nanosleep:12 with rmtp)
You can now use it in all perf tools, such as:
perf record -e probe:my_probe -aR sleep 1
#
# perf trace -e probe:my_probe/max-stack=5/,*sleep sleep 1
0.427 ( 0.003 ms): sleep/16690 nanosleep(rqtp: 0x7ffefc245090) ...
0.430 ( ): probe:my_probe:(ffffffffbd112923) rmtp=0)
sys_nanosleep ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
__nanosleep_nocancel (/usr/lib64/libc-2.25.so)
0.427 (1000.208 ms): sleep/16690 ... [continued]: nanosleep()) = 0
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elycoi8wy6y0w9dkj7ox1mzz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf trace supports --no-syscalls option but it's not listed in the man
page. (Though, I see an example using --no-syscalls in EXAMPLES
section.)
Committer note:
The --no-syscalls option tells 'perf trace' not to automagically ask for
raw_syscalls:sys_{enter,exit} to then format it in a strace like way.
This become more used as 'perf trace' got support for arbitrary events,
such as tracepoints, so more and more we use:
# perf trace --no-syscalls -e nmi:*
0.000 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 36649 handled: 1)
0.019 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 2907 handled: 0)
0.676 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 9401 handled: 1)
0.680 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 288 handled: 0)
0.701 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 4977 handled: 1)
0.703 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 67 handled: 0)
0.736 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 8549 handled: 1)
^C#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1492063332-5745-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Makes it easier to specify both events and syscalls (to be formatter
strace-like), i.e. previously one would have to do:
# perf trace -e nanosleep --event sched:sched_switch usleep 1
Now it is possible to do:
# perf trace -e nanosleep,sched:sched_switch usleep 1
0.000 ( 0.021 ms): usleep/17962 nanosleep(rqtp: 0x7ffdedd61ec0) ...
0.021 ( ): sched:sched_switch:usleep:17962 [120] S ==> swapper/1:0 [120])
0.000 ( 0.066 ms): usleep/17962 ... [continued]: nanosleep()) = 0
#
The old style --expr and using both -e and --event continues to work.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ieg6bakub4657l9e6afn85r4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In the perf wiki todo-list[1], there is an entry regarding initial-delay
and 'perf trace'; the following small patch tries to fulfill this point.
It has been generated against the branch tip/perf/core.
It has only been implemented in the "trace__run" case.
Ex.:
$ sudo strace -- ./perf trace --delay 5 sleep 1 2>&1
...
fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
ioctl(7, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
ioctl(11, PERF_EVENT_IOC_SET_OUTPUT, 0x7) = 0
fcntl(11, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
ioctl(11, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
write(6, "\0", 1) = 1
close(6) = 0
nanosleep({0, 5000000}, NULL) = 0 # DELAY OF 5 MS BEFORE ENABLING THE EVENTS
ioctl(3, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(4, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(5, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(7, PERF_EVENT_IOC_ENABLE, 0) = 0
...
[1]: https://perf.wiki.kernel.org/index.php/Todo
Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161010054328.4028-2-alexis.berlemont@gmail.com
[ Add entry to the manpage, cut'n'pasted from stat's and record's ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.
Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c587 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
/proc/sys/kernel/perf_event_max_stack
There is an upper limit to what tooling considers a valid callchain,
and it was tied to the hardcoded value in the kernel,
PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl,
make it read it and use that as the upper limit, falling back to
PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To reduce the chances we'll overflow the mmap buffer, manual fine tuning
trumps this.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wxygbxmp1v9mng1ea28wet02@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If one uses:
# perf trace --min-stack 16
Then it implicitly means that callgraphs should be enabled, and the best
option in terms of widespread availability is "dwarf".
Further work needed to choose a better alternative, LBR, in capable
systems.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xtjmnpkyk42npekxz3kynzmx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Counterpart to --max-stack, to help focusing on deeply nested calls. Can
be combined with --duration, etc.
E.g.:
System wide syscall tracing looking for call stacks longer than 66:
# trace --mmap-pages 32768 --filter-pid 2711 --call-graph dwarf,16384 --min-stack 66
Or more compactly:
# trace -m 32768 --filt 2711 --call dwarf,16384 --min-st 66
363.027 ( 0.002 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24230, nfds: 1, timeout_msecs: 4294967295 ) = 1
[0xf6fdd] (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0)
_xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0)
xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0)
_XSend+0x19e (/usr/lib64/libX11.so.6.3.0)
_XReply+0x82 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
363.038 ( 0.006 ms): gnome-shell/2287 writev(fd: 5<socket:[32540]>, vec: 0x7ffc5ea243a0, vlen: 3 ) = 4
__GI___writev+0x2d (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x359 (/usr/lib64/libxcb.so.1.1.0)
_xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0)
xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0)
_XSend+0x19e (/usr/lib64/libX11.so.6.3.0)
_XReply+0x82 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
363.086 ( 0.042 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24250, nfds: 1, timeout_msecs: 4294967295 ) = 1
[0xf6fdd] (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0)
wait_for_reply+0xb7 (/usr/lib64/libxcb.so.1.1.0)
xcb_wait_for_reply+0x61 (/usr/lib64/libxcb.so.1.1.0)
_XReply+0x127 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jncuxju9fibq2rl6olhqwjw6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Similar to the one in the other tools (report, script, top).
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lh7kk5a5t3erwxw31ah0cgar@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The kernel parts are not that useful:
# trace -m 512 -e nanosleep --call dwarf usleep 1
0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0
syscall_slow_exit_work ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
__nanosleep (/usr/lib64/libc-2.22.so)
usleep (/usr/lib64/libc-2.22.so)
main (/usr/bin/usleep)
__libc_start_main (/usr/lib64/libc-2.22.so)
_start (/usr/bin/usleep)
#
So lets just use perf_event_attr.exclude_callchain_kernel to avoid
collecting it in the ring buffer:
# trace -m 512 -e nanosleep --call dwarf usleep 1
0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0
__nanosleep (/usr/lib64/libc-2.22.so)
usleep (/usr/lib64/libc-2.22.so)
main (/usr/bin/usleep)
__libc_start_main (/usr/lib64/libc-2.22.so)
_start (/usr/bin/usleep)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now, one can print the call chain for every encountered sys_exit event,
e.g.:
$ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep
1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0
syscall_slow_exit_work ([kernel.kallsyms])
syscall_return_slowpath ([kernel.kallsyms])
int_ret_from_sys_call ([kernel.kallsyms])
__nanosleep (/usr/lib/libc-2.23.so)
[unknown] (/usr/lib/libQt5Core.so.5.6.0)
QThread::sleep (/usr/lib/libQt5Core.so.5.6.0)
main (path/to/ex_sleep)
__libc_start_main (/usr/lib/libc-2.23.so)
_start (path/to/ex_sleep)
Note that it is advised to increase the number of mmap pages to prevent
event losses when using this new feature. Often, adding `-m 10M` to the
`perf trace` invocation is enough.
This feature is also available in strace when built with libunwind via
`strace -k`. Performance wise, this solution is much better:
$ time find path/to/linux &> /dev/null
real 0m0.051s
user 0m0.013s
sys 0m0.037s
$ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null
real 0m2.624s
user 0m1.203s
sys 0m1.333s
$ time strace -k find path/to/linux &> /dev/null
real 0m35.398s
user 0m10.403s
sys 0m23.173s
Note that it is currently not possible to configure the print output.
Adding such a feature, similar to what is available in `perf script` via
its `--fields` knob can be added later on.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com
[ Split from a larger patch, do not print the IP, left align,
remove dup call symbol__init(), added man page entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The -i flag was incorrectly listed as a short flag for --no-inherit. It
should have only been listed as a short flag for --input.
This documentation error has existed since the --input flag was
introduced in 6810fc915f7a89d8134edb3996dbbf8eac386c26 (perf trace: Add
option to analyze events in a file versus live).
Signed-off-by: Peter Feiner <pfeiner@google.com>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1446657706-14518-1-git-send-email-pfeiner@google.com
Fixes: 6810fc915f7a ("perf trace: Add option to analyze events in a file versus live")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The time out to limit the individual proc map processing was hard code
to 500ms. This patch introduce a new option --proc-map-timeout to make
the time limit configurable.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This comes from the desire of having -e/--expr to have the same meaning
as for 'strace', while other perf tools use it for --event, which
'trace' honours, i.e. all perf tools have --event in common, but trace
uses -e for strace's --expr.
Clarify it in the --help output.
Reported-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5j94bcsdmcbeu2xthnzsj60d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Forgot to do it when adding the feature.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mx152b6x9cgknhw91vsyjlnd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When tracing in X we get event loops due to the tracing activity, i.e.
updates to a gnome-terminal that generate syscalls for X.org, etc.
To get a more useful view of what is happening, syscall wise, system
wide, we need to filter those, like in:
# ps ax|egrep '981|2296|1519' | grep -v egrep
981 tty1 Ss+ 5:40 /usr/bin/Xorg :0 -background none ...
1519 ? Sl 2:22 /usr/bin/gnome-shell
2296 ? Sl 4:16 /usr/libexec/gnome-terminal-server
#
# trace -e write --filter-pids 981,2296,1519
0.385 ( 0.021 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
0.922 ( 0.014 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5006.525 ( 0.029 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
5007.235 ( 0.023 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5177.646 ( 0.018 ms): rtkit-daemon/782 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7f7eea70be88, count: 8) = 8
8314.497 ( 0.004 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af7b0, count: 8) = 8
8314.518 ( 0.002 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af0e0, count: 8) = 8
^C#
When this option is used the tracer pid is also filtered.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-f5qmiyy7c0uxdm21ncatpeek@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch fix spelling typos found in tool/perf/Documentation.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/1410275930-17207-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, we may either trace syscalls or syscalls+pagefaults.
We'd like to be able to trace *only* pagefaults and this commit
implements this feature.
Example:
[root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F -p `pidof xchat`
0.000 ( 0.000 ms): xchat/4574 majfault [g_unichar_get_script+0x11] => /usr/lib64/libglib-2.0.so.0.3800.2@0xc403b (x.)
0.202 ( 0.000 ms): xchat/4574 majfault [_cairo_hash_table_lookup+0x53] => 0x2280ff0 (?.)
20.854 ( 0.000 ms): xchat/4574 majfault [gdk_cairo_set_source_pixbuf+0x110] => /usr/bin/xchat@0x6da1f (x.)
1022.000 ( 0.000 ms): xchat/4574 majfault [__memcpy_sse2_unaligned+0x29] => 0x7ff5a8ca0400 (?.)
^C[root@zoo /]#
Below we can see malloc calls, 'trace' reading symbol tables in libraries to
resolve symbols, etc.
[root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F all --cpu 1 sleep 10
0.000 ( 0.000 ms): chrome/26589 minfault [0x1b53129] => /tmp/perf-26589.map@0x33cbcbf7f000 (x.)
96.477 ( 0.000 ms): libvirtd/947 minfault [copy_user_enhanced_fast_string+0x5] => 0x7f7685bba000 (?k)
113.164 ( 0.000 ms): Xorg/1063 minfault [0x786da] => 0x7fce52882a3c (?.)
7162.801 ( 0.000 ms): chrome/3747 minfault [0x8e1a89] => 0xfcaefed0008 (?.)
<SNIP>
7773.138 ( 0.000 ms): chrome/3886 minfault [0x8e1a89] => 0xfcb0ce28008 (?.)
7992.022 ( 0.000 ms): chrome/26574 minfault [0x1b5a708] => 0x3de7b5fc5000 (?.)
8108.949 ( 0.000 ms): qemu-system-x8/4537 majfault [_int_malloc+0xee] => 0x7faffc466d60 (?.)
8108.975 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc466d60 (?.)
<SNIP>
8148.174 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc4eb500 (?.)
8270.855 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0xdb] => 0x45d092bc004 (?.)
8270.869 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0x108] => 0x45d09150000 (?.)
no symbols found in /usr/lib64/libspice-server.so.1.9.0, maybe install a debug package?
8273.831 ( 0.000 ms): trace/20198 majfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libspice-server.so.1.9.0@0xdf000 (d.)
<SNIP>
8275.121 ( 0.000 ms): trace/20198 minfault [dso__load+0x38] => 0x14fe756 (?.)
no symbols found in /usr/lib64/libelf-0.158.so, maybe install a debug package?
8275.142 ( 0.000 ms): trace/20198 minfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libelf-0.158.so@0x0 (d.)
<SNIP>
[root@zoo /]#
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1403799268-1367-6-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch adds optional pagefault tracing support to 'perf trace'.
Using -F/--pf option user can specify whether he wants minor, major or
all pagefault events to be traced. This patch adds only live mode,
record and replace will come in a separate patch.
Example output:
1756272.905 ( 0.000 ms): curl/5937 majfault [0x7fa7261978b6] => /usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0@0x85288 (d.)
1862866.036 ( 0.000 ms): wget/8460 majfault [__clear_user+0x3f] => 0x659cb4 (?k)
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1403799268-1367-3-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Per request from Pekka make --summary a summary only option meaning do
not show the individual system calls. Add another option to see all
syscalls along with the summary. In addition use 's' and 'S' as
shortcuts for the options.
Requested-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Pekka Enberg <penberg@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1384273875-3751-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Initially it tries to find a probe:vfs_getname that should be setup
with:
perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
or with slight changes to cope with code flux in the getname_flags code.
In the future, if a "vfs:getname" tracepoint becomes available, then it
will be preferred.
This is not strictly required and more expensive method of reading the
/proc/pid/fd/ symlink will be used when the fd->path array entry is not
populated by a previous vfs_getname + open syscall ret sequence.
As with any other 'perf probe' probe the setup must be done just once
and the probe will be left inactive, waiting for users, be it 'perf
trace' of any other tool.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When enabled dumps a summary of all syscalls by task with the usual
statistics -- min, max, average and relative stddev. For example,
make - 26341 : 3344 [ 17.4% ] 0.000 ms
read : 52 0.000 4.802 0.644 30.08
write : 20 0.004 0.036 0.010 21.72
open : 24 0.003 0.046 0.014 23.68
close : 64 0.002 0.055 0.008 22.53
stat : 2714 0.002 0.222 0.004 4.47
fstat : 18 0.001 0.041 0.006 46.26
mmap : 30 0.003 0.009 0.006 5.71
mprotect : 8 0.006 0.039 0.016 32.16
munmap : 12 0.007 0.077 0.020 38.25
brk : 48 0.002 0.014 0.004 10.18
rt_sigaction : 18 0.002 0.002 0.002 2.11
rt_sigprocmask : 60 0.002 0.128 0.010 32.88
access : 2 0.006 0.006 0.006 0.00
pipe : 12 0.004 0.048 0.013 35.98
vfork : 34 0.448 0.980 0.692 3.04
execve : 20 0.000 0.387 0.046 56.66
wait4 : 34 0.017 9923.287 593.221 68.45
fcntl : 8 0.001 0.041 0.013 48.79
getdents : 48 0.002 0.079 0.013 19.62
getcwd : 2 0.005 0.005 0.005 0.00
chdir : 2 0.070 0.070 0.070 0.00
getrlimit : 2 0.045 0.045 0.045 0.00
arch_prctl : 2 0.002 0.002 0.002 0.00
setrlimit : 2 0.002 0.002 0.002 0.00
openat : 94 0.003 0.005 0.003 2.11
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1381289214-24885-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The record option is a convience alias to include the -e raw_syscalls:*
argument to perf-record. All other options are passed to perf-record's
handler. Resulting data file can be analyzed by perf-trace -i.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1380395584-9025-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding possibility to specify mmap size via -m/--mmap-pages
by appending unit size character (B/K/M/G) to the
number, like:
$ perf record -m 8K ls
$ perf record -m 2M ls
The size is rounded up appropriately to follow perf
mmap restrictions.
If no unit is specified the number provides pages as
of now, like:
$ perf record -m 8 ls
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1378031796-17892-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Enabled by default, disable with --no-comm, e.g.:
181.821 (0.001 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: PEEK|TRUNC|CMSG_CLOEXEC ) = 20
181.824 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000
181.825 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000
181.834 (0.002 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: CMSG_CLOEXEC ) = 20
181.836 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000
181.838 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000
181.705 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 1256
181.710 (0.002 ms): evolution-addr/10924 geteuid( ) = 1000
181.712 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.727 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 1256
181.731 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.734 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.908 (0.002 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 20
181.913 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.915 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.930 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 20
181.934 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.937 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
220.718 (0.010 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200
220.741 (0.000 ms): dbus-daemon/10711 ... [continued]: epoll_wait()) = 1
220.759 (0.004 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200
220.780 (0.002 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200
220.788 (0.001 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = -1 EAGAIN Resource temporarily unavailable
220.760 (0.004 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200
220.771 (0.023 ms): perf/26347 open(filename: 0xf2e780, mode: 15918976 ) = 19
220.850 (0.002 ms): perf/26347 close(fd: 19 ) = 0
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6be5jvnkdzjptdrebfn5263n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Current timestamp shown for output is time relative to firt sample. This
patch adds an option to show the absolute perf_clock timestamp which is
useful when comparing output across commands (e.g., perf-trace to
perf-script).
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1378319865-55695-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allows capture of raw_syscall:* events and analyzed at a later time.
v2: change -i option from inherit to input name for consistency with
other perf commands
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377750593-48046-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ain6q4u8g3bpnh18yhw24v2x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we can ask for all but a set of syscalls to be traced.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9j6hvap23qanyl96wx4mrj9k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Common arguments like thread id, CPU list, mmap pages, etc should be
consistent across perf commands.
v3: Updated man page
v2: rebased to latest core branch
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1377018945-21940-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To output all 'trace' output to a filename, just like 'strace -ofile'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6q1homkwoayhmoq64y5vhel6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Similar to -e in strace, i.e. a comma separated list of syscall names
to trace.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5zku7q5wug3103k1dzn3yy63@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
[root@sandy ~]# perf trace --sched --duration 0.100 --pid `pidof firefox`
<SNIP>
17079.847 ( 0.009 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
17079.892 ( 0.010 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable
17079.921 ( 0.013 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
17079.949 ( 0.009 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable
^C
_____________________________________________________________________
__) Summary of events (__
[ task - pid ] [ events ] [ ratio ] [ runtime ]
_____________________________________________________________________
firefox - 17643 : 18013 [ 72.2% ] 359.110 ms
firefox - 17663 : 41 [ 0.2% ] 21.439 ms
firefox - 17664 : 6840 [ 27.4% ] 133.642 ms
firefox - 17667 : 46 [ 0.2% ] 0.682 ms
[root@sandy ~]#
This is equivalent to the 'perf trace summary' subcomand in the tmp.perf/trace2
branch.
Another example, setting a huge duration filter to get just a system
wide summary:
[root@sandy ~]# perf trace --duration 10000.0 --sched
^C
_____________________________________________________________________
__) Summary of events (__
[ task - pid ] [ events ] [ ratio ] [ runtime ]
_____________________________________________________________________
scsi_eh_1 - 258 : 15 [ 0.0% ] 0.133 ms
kworker/0:1H - 322 : 13 [ 0.0% ] 0.032 ms
jbd2/dm-0-8 - 384 : 4 [ 0.0% ] 0.115 ms
flush-253:0 - 470 : 1 [ 0.0% ] 0.027 ms
firefox - 950 : 4783 [ 0.1% ] 24.863 ms
firefox - 992 : 1883 [ 0.1% ] 6.808 ms
firefox - 995 : 35 [ 0.0% ] 0.111 ms
ksoftirqd/6 - 4362 : 2 [ 0.0% ] 0.005 ms
ksoftirqd/7 - 4365 : 1 [ 0.0% ] 0.007 ms
Xorg - 4671 : 148 [ 0.0% ] 0.912 ms
gnome-settings- - 4846 : 14 [ 0.0% ] 0.086 ms
seahorse-daemon - 4847 : 14 [ 0.0% ] 0.092 ms
gnome-panel - 4875 : 46 [ 0.0% ] 0.159 ms
gnome-power-man - 4918 : 16 [ 0.0% ] 0.065 ms
gvfs-afc-volume - 4992 : 77 [ 0.0% ] 0.136 ms
gnome-screensav - 5114 : 24 [ 0.0% ] 0.128 ms
xchat - 8082 : 466 [ 0.0% ] 2.019 ms
synergyc - 8369 : 941 [ 0.0% ] 3.291 ms
synergyc - 8371 : 85 [ 0.0% ] 1.817 ms
jbd2/dm-4-8 - 9352 : 4 [ 0.0% ] 0.109 ms
rpcbind - 9786 : 3 [ 0.0% ] 0.017 ms
rtkit-daemon - 12802 : 10 [ 0.0% ] 0.038 ms
rtkit-daemon - 12803 : 8 [ 0.0% ] 0.000 ms
udisks-daemon - 13020 : 27 [ 0.0% ] 0.240 ms
kworker/7:0 - 14651 : 669 [ 0.0% ] 2.616 ms
kworker/5:1 - 16220 : 2 [ 0.0% ] 0.069 ms
kworker/4:0 - 19776 : 13 [ 0.0% ] 0.176 ms
openvpn - 20131 : 133 [ 0.0% ] 0.762 ms
plugin-containe - 20508 : 60658 [ 1.7% ] 131.153 ms
npviewer.bin - 20520 : 72208 [ 2.0% ] 138.945 ms
npviewer.bin - 20542 : 35 [ 0.0% ] 0.074 ms
npviewer.bin - 20543 : 30 [ 0.0% ] 0.074 ms
npviewer.bin - 20547 : 35 [ 0.0% ] 0.092 ms
npviewer.bin - 20552 : 35 [ 0.0% ] 0.093 ms
sshd - 20645 : 32 [ 0.0% ] 0.071 ms
npviewer.bin - 21053 : 35 [ 0.0% ] 0.074 ms
npviewer.bin - 21054 : 35 [ 0.0% ] 0.097 ms
kworker/0:2 - 21169 : 149 [ 0.0% ] 1.143 ms
kworker/3:0 - 22171 : 113 [ 0.0% ] 96.892 ms
flush-253:4 - 22410 : 1 [ 0.0% ] 0.028 ms
kworker/6:0 - 24581 : 25 [ 0.0% ] 0.275 ms
kworker/1:0 - 25572 : 4 [ 0.0% ] 0.103 ms
kworker/2:1 - 26299 : 138 [ 0.0% ] 1.440 ms
kworker/0:0 - 26325 : 1 [ 0.0% ] 0.003 ms
perf - 26330 : 3506967 [ 96.1% ] 6648.310 ms
[root@sandy ~]#
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-mzuli0srnxyi1o029py6537x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Example:
[acme@sandy linux]$ perf trace --duration 0.025 usleep 1
2.221 ( 0.958 ms): 6724 execve(arg0: 140733557168278, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = -2
3.690 ( 1.443 ms): 6724 execve(arg0: 140733557168295, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = 0
3.979 ( 0.048 ms): 6724 open(filename: 208733843841, flags: 0, mode: 1 ) = 3
4.071 ( 0.075 ms): 6724 open(filename: 139744419925673, flags: 0, mode: 0 ) = 3
4.318 ( 0.056 ms): 6724 nanosleep(rqtp: 140734030404608, rmtp: 0 ) = 0
[acme@sandy linux]$ perf trace --duration 0.100 usleep 1
1.143 ( 1.021 ms): 6726 execve(arg0: 140736323962279, arg1: 140736323972752, arg2: 34926752, arg3: 140736323961824, arg4: 7955998171588342573, arg5: 6725) = 0
[acme@sandy linux]$
Cherry picked from tmp.perf/trace2 branch.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-oslw2j2958we9qf0ctra4whd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Initially should look loosely like the venerable 'strace' tool, but
using the infrastructure in the perf tools to allow tracing extra
targets:
[acme@sandy linux]$ perf trace --hell
Error: unknown option `hell'
usage: perf trace <PID>
-p, --pid <pid> trace events on existing process id
--tid <tid> trace events on existing thread id
--all-cpus system-wide collection from all CPUs
--cpu <cpu> list of cpus to monitor
--no-inherit child tasks do not inherit counters
--mmap-pages <n> number of mmap data pages
--uid <user> user to profile
[acme@sandy linux]$
Those should have the same semantics as when using with 'perf record'.
It gets stuck sometimes, but hey, it works sometimes too!
In time it should support perf.data based workloads, i.e. it should have
a:
-o filename
Command line option that will produce a perf.data file that can then be
used with 'perf trace' or any of the other perf tools (script, report,
etc).
It will also eventually have the set of functionalities described in the
previous 'trace' prototype by Thomas Gleixner:
"Announcing a new utility: 'trace'"
http://lwn.net/Articles/415728/
Also planned is to have some of the features suggested in the comments
of that LWN article.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-v9x3q9rv4caxtox7wtjpchq5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Free the perf trace name space and rename the trace to 'script' which is a
better match for the scripting engine.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|