|
There is a redundant ')' at the tail of each event. So remove it.
$ sudo perf trace --no-syscalls -e 'kmem:*' -a
899.342 kmem:kfree:(vfs_writev+0xb9) call_site=ffffffff9c453979 ptr=(nil))
899.344 kmem:kfree:(___sys_recvmsg+0x188) call_site=ffffffff9c9b8b88 ptr=(nil))
Signed-off-by: Changbin Du <changbin.du@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1520937601-24952-1-git-send-email-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To match the recently added event header information to --tui, e.g.:
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
Samples: 128 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682
_raw_spin_lock_irqsave() /proc/kcore
0.78 nop
7.03 push %rbx
3.12 pushfq
6.25 pop %rax
nop
mov %rax,%rbx
3.12 cli
nop
xor %eax,%eax
mov $0x1,%edx
79.69 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq *ffffffffb30eaed0
mov %rbx,%rax
pop %rbx
← retq
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So at the top we'll have two lines, like this, from 'perf report':
# perf report --group --ignore-vmlinux
=====================================================================================================
Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895
_raw_spin_lock_irqsave /proc/kcore
Percent │ nop
│ push %rbx
0.00 14.29 0.00 │ pushfq
9.09 0.00 0.00 │ pop %rax
9.09 0.00 20.00 │ nop
│ mov %rax,%rbx
│ cli
4.55 7.14 0.00 │ nop
│ xor %eax,%eax
│ mov $0x1,%edx
│ lock cmpxchg %edx,(%rdi)
77.27 78.57 70.00 │ test %eax,%eax
│ ↓ jne 2b
│ mov %rbx,%rax
0.00 0.00 10.00 │ pop %rbx
│ ← retq
│2b: mov %eax,%esi
│ → callq queued_spin_lock_slowpath
│ mov %rbx,%rax
│ pop %rbx
Press 'h' for help on│key bindings
=====================================================================================================
9.09 + 9.09 + 4.55 + 77.27 = 100
14.29 + 7.14 + 78.57 = 100
20 + 70 + 10 = 100
We can do the math by using 't' to toggle from 'percent' to nr
=====================================================================================================
Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895
_raw_spin_lock_irqsave /proc/kcore
Period │ nop
│ push %rbx
0 79273 0 │ pushfq
190455 0 0 │ pop %rax
198038 0 3045 │ nop
│ mov %rax,%rbx
│ cli
217233 32562 0 │ nop
│ xor %eax,%eax
│ mov $0x1,%edx
│ lock cmpxchg %edx,(%rdi)
3421649 979174 28273 │ test %eax,%eax
│ ↓ jne 2b
│ mov %rbx,%rax
0 0 5193 │ pop %rbx
│ ← retq
│2b: mov %eax,%esi
│ → callq queued_spin_lock_slowpath
│ mov %rbx,%rax
│ pop %rbx
Press 'h' for help on│key bindings
=====================================================================================================
79273 + 190455 + 198038 + 3045 + 217233 + 32562 + 3421649 + 979174 + 28273 + 5193 = 5154895
Or number of samples:
=====================================================================================================
Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895
_raw_spin_lock_irqsave /proc/kcore
Samples │ nop
│ push %rbx
0 2 0 │ pushfq
2 0 0 │ pop %rax
2 0 2 │ nop
│ mov %rax,%rbx
│ cli
1 1 0 │ nop
│ xor %eax,%eax
│ mov $0x1,%edx
│ lock cmpxchg %edx,(%rdi)
17 11 7 │ test %eax,%eax
│ ↓ jne 2b
│ mov %rbx,%rax
0 0 1 │ pop %rbx
│ ← retq
│2b: mov %eax,%esi
│ → callq queued_spin_lock_slowpath
│ mov %rbx,%rax
│ pop %rbx
Press 'h' for help on key bindings
=====================================================================================================
2 + 2 + 2 + 2 + 1 + 1 + 17 + 11 + 7 + 1 = 46
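The figures in the three views are consistent with the 'Percent' columns being each line's share of that event's samples. A minimal sketch of that arithmetic, using the sample counts shown above (the helper name is made up for illustration and is not part of perf):

```python
# Per-line sample counts for the three events, copied from the 'Samples' view.
samples = [
    (0, 2, 0),    # pushfq
    (2, 0, 0),    # pop %rax
    (2, 0, 2),    # nop
    (1, 1, 0),    # nop
    (17, 11, 7),  # test %eax,%eax
    (0, 0, 1),    # pop %rbx
]


def column_percents(rows):
    """Return, for every line, its share of each event column's samples."""
    totals = [sum(col) for col in zip(*rows)]
    return [
        tuple(round(100.0 * n / t, 2) for n, t in zip(row, totals))
        for row in rows
    ]


percents = column_percents(samples)
total_samples = sum(sum(row) for row in samples)

for row in percents:
    print("%6.2f %6.2f %6.2f" % row)
print("total samples:", total_samples)
```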
Suggested-by: Martin Liška <mliska@suse.cz>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-ezccyxld50wtwyt66np6aomo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To print a string using the total period (nr_events) and the number of
samples for a given annotation, i.e. for a given symbol, the counterpart
to hists__scnprintf_samples_period(), that is for all the samples in a
session (be it a live session, think 'perf top' or a perf.data file,
think 'perf report').
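As a rough illustration of the kind of string this produces, here is a sketch that formats a title from a sample count and a total period. The function names and the exact scaling rule are assumptions made for the example, not the C code in the tree:

```python
def scale_count(value):
    """Scale a count down to K/M/G, returning (scaled value, unit suffix)."""
    for unit in ("", "K", "M"):
        if value <= 1000:
            return value, unit
        value //= 1000
    return value, "G"


def format_title(nr_samples, event_name, period, freq_hz=None):
    """Build a 'Samples: ... Event count (approx.): ...' title line."""
    count, unit = scale_count(nr_samples)
    title = f"Samples: {count}{unit} of event '{event_name}'"
    if freq_hz is not None:
        # The sampling frequency is only shown when it is known.
        title += f", {freq_hz} Hz"
    return title + f", Event count (approx.): {period}"


print(format_title(128, "cycles:ppp", 48617682, freq_hz=4000))
```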
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This will be useful for the annotate browser as well, which wants to have
extra title lines. The current ui_browser unconditionally reserves
the first line for a browser title and the last one for status messages.
But some browsers, like the buckets one (hists browser), need extra
lines to show headers, which can be shown or hidden: press 'H' in
'perf top' or 'perf report' to see this feature.
So move that logic to the core ui_browser used by the hists_browser
('perf top' and 'perf report' main interface) so that it can be used by
the annotate browser too.
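The layout bookkeeping described here can be sketched as follows. This is only an illustration of the row accounting, with a hypothetical function name and parameters, not the ui_browser code:

```python
def visible_rows(terminal_height, extra_title_lines=0):
    """Rows left for entries once the reserved lines are subtracted.

    One line is always reserved for the browser title and one for status
    messages; extra_title_lines covers optional header lines.
    """
    reserved = 1 + extra_title_lines + 1
    return max(terminal_height - reserved, 0)


print(visible_rows(24))                       # title + status only
print(visible_rows(24, extra_title_lines=1))  # one header line shown
```

Keeping the extra-line count in the shared browser means each specific browser only has to state how many header lines it wants.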
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-r38xm3ut37ulbg1o5tn5iise@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The previous patch made this function useful to non-TUI parts of the
tools, but left it next to the function from which it was carved, so that
the patch showed the process more clearly.
Now just move it outside the TUI parts so that we can finally use it,
even when the TUI code doesn't get built/linked.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-hqj7hvcr3mu5lvcqp3cssio6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
That is, do not use any struct hists_browser internals, so that it can be
shared with the other UIs and tools.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-w8mczjnqnbcj9yzfkv9ja6ro@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename it to hists_browser__scnprintf_title() to better reflect that it
provides a scnprintf-like function operating on a hists_browser
instance.
This paves the way to have a non-hists_browser specific function to
scnprintf format a title with per evsel information to use in other
tools or UIs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-sntpyzxsnme9jvuz2qntwoh2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since a new option '--build-options' was created for 'perf version',
we need to document it.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We keep getting bug reports from users who build perf on their own
without installing some needed libraries such as libelf or
libbfd/libiberty.
perf still builds, but it is missing important functionality.
This patch provides a new option '-vv' for perf which will print the
compiled-in status of libraries.
The 'perf -vv' is mapped to 'perf version --build-options'.
For example:
$ ./perf -vv
perf version 4.13.rc5.g6727c5
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
v3:
One bug is found in v2. It didn't process the option like '-vabc'
correctly. Fix this bug.
v2:
Use a global variable version_verbose to record the number of 'v'.
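The 'v' counting mentioned in the version notes can be sketched like this. The function name is hypothetical, and treating '-vabc' as "not a version flag" is an assumption about the intended behaviour, since the notes only say it was handled incorrectly in v2:

```python
def version_verbosity(arg):
    """Return how many 'v' flags an argument such as '-v' or '-vv' carries.

    Anything that is not a single dash followed only by 'v' characters
    (for example '-vabc' or '--version') yields 0.
    """
    if len(arg) < 2 or not arg.startswith("-") or arg.startswith("--"):
        return 0
    flags = arg[1:]
    # '-vabc' is not a run of 'v' flags, so it must not count as '-v'.
    return len(flags) if set(flags) == {"v"} else 0


for candidate in ("-v", "-vv", "-vabc", "--version"):
    print(candidate, version_verbosity(candidate))
```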
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch checks the values passed by CFLAGS (-DHAVE_XXX) and then
prints the status of libraries.
For example, if HAVE_DWARF_SUPPORT is defined, that means the library
"dwarf" is compiled-in. The patch will print the status "on" for this
library; otherwise it prints the status "OFF".
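A small sketch of the on/OFF decision, assuming the set of defined HAVE_XXX macros is available as plain strings. The real patch does this at compile time in C; the function below is purely illustrative:

```python
def build_option_lines(features, defined_macros):
    """Return one status line per (library name, macro name) pair."""
    lines = []
    for name, macro in features:
        # A library counts as compiled-in when its macro was defined.
        status = "on" if macro in defined_macros else "OFF"
        lines.append(f"{name}: [ {status} ] # {macro}")
    return lines


features = [
    ("dwarf", "HAVE_DWARF_SUPPORT"),
    ("libaudit", "HAVE_LIBAUDIT_SUPPORT"),
]
for line in build_option_lines(features, {"HAVE_DWARF_SUPPORT"}):
    print(line)
```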
A new option '--build-options' created for 'perf version' supports the
printing of library status.
For example:
$ ./perf version --build-options
or
./perf --version --build-options
or
./perf -v --build-options
perf version 4.13.rc5.g6727c5
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
v4:
1. Also print the macro name. That would make it easier
to grep around in the source looking for where code
related to a particular feature is located.
2. Update since HAVE_DWARF_GETLOCATIONS is renamed to
HAVE_DWARF_GETLOCATIONS_SUPPORT
v3:
1. Remove the following unnecessary help message.
[ on ]: library is compiled-in
[ OFF ]: library is disabled in make configuration
OR library is not installed in build environment
2. Create '--build-options' option.
3. Use standard option parsing API 'parse_options'.
v2:
1. Use IS_BUILTIN macro to replace #ifdef/#endif block.
2. Print color for on/OFF.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In Makefile.config, to make all library flags have the _SUPPORT suffix,
rename HAVE_DWARF_GETLOCATIONS to HAVE_DWARF_GETLOCATIONS_SUPPORT.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For most libraries, perf.config records -DHAVE_XXX in CFLAGS depending on
whether the library is compiled-in. The C code then knows if the library
is compiled-in or not.
While for glibc, no -DHAVE_GLIBC_SUPPORT exists.
For python and perl libraries, only -DNO_PYTHON and -DNO_LIBPERL exist.
To make the code more consistent, the patch creates -DHAVE_LIBPYTHON_SUPPORT
and -DHAVE_LIBPERL_SUPPORT if the python and perl libraries are compiled-in.
Since the existing flags -DNO_PYTHON and -DNO_LIBPERL are being used in many
places in C code, this patch doesn't remove them. In a follow-up patch, we will
reconstruct the C code and then use HAVE_XXX instead.
v3:
Move 'CFLAGS += -DHAVE_LIBPYTHON_SUPPORT' and 'CFLAGS +=
-DHAVE_LIBPERL_SUPPORT' to other places to avoid duplicated feature checking.
v2:
Create -DHAVE_GLIBC_SUPPORT, -DHAVE_LIBPYTHON_SUPPORT and
-DHAVE_LIBPERL_SUPPORT.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding IS_BUILTIN macro and its dependencies into tools world.
It's taken from kernel's include/linux/kconfig.h, which can't be taken
completely due to its kconfig dependencies.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1522402036-22915-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For instance:
# perf probe "vfs_getname=getname_flags:72 pathname=result->name:string"
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
# perf trace --failure sleep 1
0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
For reference, here are all the syscalls in this case:
# perf trace sleep 1
? ( ): sleep/10976 ... [continued]: execve()) = 0
0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0
0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000
0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832
0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0
0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000
0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000
0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0
0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000
0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000
0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0
0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0
0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0
0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0
0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0
0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000
0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000
0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0
0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000
0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0
0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0
1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0
1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0
1000.596 ( ): sleep/10976 exit_group()
#
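To make the effect of --failure concrete, here is a sketch that post-filters 'perf trace' style lines and keeps only those with a negative return value. It is a user-space illustration of the result, not how perf trace implements the option, and the regular expression is an assumption about the line format seen above:

```python
import re

# Matches the "= <return value>" that follows the closing ')' of a syscall.
RETURN_RE = re.compile(r"\)\s*=\s*(-?\d+)")


def failed_syscalls(lines):
    """Keep only the lines whose syscall returned a negative value."""
    failed = []
    for line in lines:
        match = RETURN_RE.search(line)
        if match and match.group(1).startswith("-"):
            failed.append(line)
    return failed


trace = [
    "0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000",
    "0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, "
    "mode: R) = -1 ENOENT No such file or directory",
    "0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0",
    "1000.596 ( ): sleep/10976 exit_group()",
]
for line in failed_syscalls(trace):
    print(line)
```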
And can be done systemwide, etc, with backtraces:
# perf trace --max-stack=16 --failure sleep 1
0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
__access (inlined)
dl_main (/usr/lib64/ld-2.26.so)
#
Or for some specific syscalls:
# perf trace --max-stack=16 -e openat --failure cat /tmp/rien
cat: /tmp/rien: No such file or directory
0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory
__libc_open64 (inlined)
main (/usr/bin/cat)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/cat)
#
Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces:
# perf trace -a --max-stack=16 --failure -e inotify* sleep 2
819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory
__GI_inotify_add_watch (inlined)
_ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3)
_ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3)
im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3)
g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3)
glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3)
g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3)
start_thread (/usr/lib64/libpthread-2.26.so)
__GI___clone (inlined)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Due to these commits:
1da961d72ab0 ("x86/cpufeatures: Add Intel Total Memory Encryption cpufeature")
7958b2246fad ("x86/cpufeatures: Add Intel PCONFIG cpufeature")
To silence this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
Nothing in those csets requires changes in tools/perf/, so just
sync it to silence the build.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-m2yl8wj0uxs8pncq2ncfcx46@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add DSO size to perf report/top sort output list.
This includes adding a map__size fn to map.h, which is
approximately equal to the DSO data file_size:
DSO                           file size  map (end-start)  file / (end-start)
libwebkit2gtk-4.0.so.37.24.9   43260072         41295872                105%
libglib-2.0.so.0.5400.1         1125680          1118208                101%
libc-2.26.so                    1960656          1925120                102%
libdbus-1.so.3.14.13             309456           303104                102%
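The comparison in this table can be reproduced with a few lines. The sketch assumes the map size is simply end minus start, as the text says; the function names are illustrative and the start address of 0 is a placeholder, since only the sizes are given:

```python
def map_size(start, end):
    """Size of a mapping, computed from its start and end addresses."""
    return end - start


def file_to_map_percent(file_size, start, end):
    """File size as a percentage of the mapped size, rounded to an integer."""
    return round(100 * file_size / map_size(start, end))


# (DSO name, file size, mapped size), taken from the table.
rows = [
    ("libwebkit2gtk-4.0.so.37.24.9", 43260072, 41295872),
    ("libglib-2.0.so.0.5400.1", 1125680, 1118208),
    ("libc-2.26.so", 1960656, 1925120),
    ("libdbus-1.so.3.14.13", 309456, 303104),
]
for name, file_size, mapped in rows:
    print(name, f"{file_to_map_percent(file_size, 0, mapped)}%")
```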
Sample output:
$ ./perf report -s dso_size,dso
Samples: 2K of event 'cycles:uppp', Event count (approx.): 128373340
Overhead DSO size Shared Object
90.62% unknown [unknown]
2.87% 1118208 libglib-2.0.so.0.5400.1
1.92% 303104 libdbus-1.so.3.14.13
1.42% 1925120 libc-2.26.so
0.77% 41295872 libwebkit2gtk-4.0.so.37.24.9
0.61% 335872 libgobject-2.0.so.0.5400.1
0.41% 1052672 libgdk-3.so.0.2200.25
0.36% 106496 libpthread-2.26.so
0.29% 221184 dbus-daemon
0.17% 159744 ld-2.26.so
0.13% 49152 libwayland-client.so.0.3.0
0.12% 1642496 libgio-2.0.so.0.5400.1
0.09% 7327744 libgtk-3.so.0.2200.25
0.09% 12324864 libmozjs-52.so.0.0.0
0.05% 4796416 perf
0.04% 843776 libgjs.so.0.0.0
0.03% 1409024 libmutter-clutter-1.so
Committer testing:
To sort by DSO size, use:
# perf report -F dso_size,dso,overhead -s dso_size
<SNIP>
3465216 libdns-export.so.174.0.1 0.00%
3522560 libgc.so.1.0.3 0.00%
3538944 libbfd-2.29-13.fc27.so 0.59%
3670016 libunistring.so.2.1.0 0.00%
3723264 libguile-2.0.so.22.8.1 0.00%
3776512 libgio-2.0.so.0.5400.3 0.00%
3891200 libc-2.26.so 0.96%
3944448 libmozjs-17.0.so 0.00%
4218880 libperl.so.5.26.1 0.18%
4452352 libpython2.7.so.1.0 0.02%
4472832 perf 0.02%
4603904 git 0.01%
4751360 libcrypto.so.1.1.0g 0.00%
5005312 libslang.so.2.3.1 0.00%
7315456 libgtk-3.so.0.2200.26 0.09%
8818688 i965_dri.so 2.46%
8818688 i965_dri.so (deleted) 1.26%
12414976 libmozjs-52.so.0.0.0 0.03%
23642112 cc1 2.02%
27889664 [kernel.kallsyms] 25.41%
80834560 libxul.so (deleted) 15.68%
98078720 chrome 32.03%
1056964608 [kernel.kallsyms] 1.59%
#
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180327060956.1c01ebe67a2a941bb4468c6f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cannon Lake supports C1/C3/C6/C7, PC2/PC3/PC6/PC7/PC8/PC9/PC10
state residency counters; this patch enables those counters.
( The MSR information is based on Intel Software Developers' Manual,
Vol. 4, Order No. 335592. )
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: Harry Pan <harry.pan@intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan.liang@intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: gs0622@gmail.com
Link: http://lkml.kernel.org/r/20180309121549.630-3-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch enables RAPL counters (energy consumption counters)
support for Cannon Lake processors.
( ESU and power domains refer to Intel Software Developers' Manual,
Vol. 4, Order No. 335592. )
Usage example:
$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: Harry Pan <harry.pan@intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: colin.king@canonical.com
Cc: gs0622@gmail.com
Cc: kan.liang@linux.intel.com
Link: http://lkml.kernel.org/r/20180309121549.630-2-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This is a cosmetic patch that deals with the address filter structure's
ambiguous fields 'filter' and 'range'. The former stands to mean that the
filter's *action* should be to filter the traces to its address range if
it's set or stop tracing if it's unset. This is confusing and hard on the
eyes, so this patch replaces it with 'action' enum. The 'range' field is
completely redundant (meaning that the filter is an address range as
opposed to a single address trigger), as we can use zero size to mean the
same thing.
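The shape of the change can be sketched as below. The type and member names are hypothetical stand-ins for the kernel structure, and only the two actions named in the description are modeled:

```python
from dataclasses import dataclass
from enum import Enum


class FilterAction(Enum):
    """What the filter does when its address is hit (illustrative names)."""
    STOP = 0    # stop tracing
    FILTER = 1  # restrict tracing to the address range


@dataclass
class AddrFilter:
    offset: int
    size: int
    action: FilterAction

    def is_range(self):
        # A zero size means a single-address trigger, so no separate
        # 'range' flag is needed.
        return self.size != 0


trigger = AddrFilter(offset=0x1000, size=0, action=FilterAction.STOP)
region = AddrFilter(offset=0x1000, size=0x200, action=FilterAction.FILTER)
print(trigger.is_range(), region.is_range())
```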
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180329120648.11902-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Conflicts:
kernel/events/hw_breakpoint.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Be consistent when checking if a perf_mmap instance had
its ring buffer unmapped, fixing segfaults noticed in
'perf trace' (Kan Liang, Arnaldo Carvalho de Melo)
- Avoid adding the same option multiple times to the 'diff'
command in check-headers.sh (Jiri Olsa)
- Add vendor event files (JSON format) to various IBM
s390 models (z10EC, z10BC, z196, zEC12, zBC12, z13
and z14) (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
modification of a breakpoint - simplify it and remove the pointless
local variables.
Also update the stale Docbook while at it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add CPU measurement counter facility event description files (json
files) for IBM z14.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-5-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add CPU measurement counter facility event description files (json
files) for IBM z13.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-4-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add CPU measurement counter facility event description files (json
files) for IBM zEC12 and zBC12.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-3-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add CPU measurement counter facility event description files (json
files) for IBM z196.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-2-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add CPU measurement counter facility event description files (JSON
files) for IBM z10EC and z10BC.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The previous patch is insufficient to cure the reported 'perf trace'
segfault, as it only cures the perf_mmap__read_done() case, moving the
segfault to the perf_mmap__read_init() function. Fix it by doing the same
refcount check.
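The pattern being applied is "check the reference count before touching the buffer". A toy model of it follows; the class, method names and the error value returned are assumptions for illustration and do not mirror the C code line by line:

```python
import errno


class RingBufferMap:
    """Toy stand-in for a reference-counted ring buffer mapping."""

    def __init__(self):
        self.refcnt = 1
        self.base = bytearray(16)

    def put(self):
        """Drop a reference; the buffer goes away with the last one."""
        self.refcnt -= 1
        if self.refcnt == 0:
            self.base = None  # models the unmap

    def read_init(self):
        """Refuse to start a read pass on an already unmapped buffer."""
        if self.refcnt == 0:
            return -errno.ENOENT
        return 0

    def read_done(self):
        """Finish a read pass; returns False if the buffer is gone."""
        if self.refcnt == 0:
            return False
        self.base[0] = 0  # safe: the buffer is still mapped
        return True
```

Both entry points carry the same guard, so a buffer released between passes is never dereferenced.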
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8872481bd048 ("perf mmap: Introduce perf_mmap__read_init()")
Link: https://lkml.kernel.org/r/20180326144127.GF18897@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There is a segmentation fault when running 'perf trace'. For example:
[root@jouet e]# perf trace -e *chdir -o /tmp/bla perf report --ignore-vmlinux -i ../perf.data
The perf_mmap__consume() could unmap the mmap. It needs to check the
refcnt in perf_mmap__read_done().
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ee023de05f35 ("perf mmap: Introduce perf_mmap__read_done()")
Link: http://lkml.kernel.org/r/1522071729-16776-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently the "opts" variable is not zero-ed and we keep on adding to
it, ending up with:
$ check-headers.sh 2>&1
+ opts=' "-B"'
+ opts=' "-B" "-B"'
+ opts=' "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B" "-B" "-B"'
Fix this by initializing it in the check() function, right before
starting the loop.
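The accumulation is easy to reproduce in a small model. The sketch below uses Python rather than the shell of check-headers.sh, and the function names and header file names are hypothetical; it only shows the difference between a variable that outlives the function and one that is reset right before the loop.

```python
# Hypothetical model of the bug: 'opts' lives outside check(), so every
# call appends to whatever the previous call left behind.
opts = ""


def check_buggy(path, *args):
    global opts
    for arg in args:
        opts += ' "%s"' % arg
    return opts


def check_fixed(path, *args):
    opts = ""                      # reset right before the loop
    for arg in args:
        opts += ' "%s"' % arg
    return opts


files = ("a.h", "b.h", "c.h")      # made-up header names
buggy = [check_buggy(f, "-B") for f in files]
fixed = [check_fixed(f, "-B") for f in files]
```

With the reset in place, each call sees only its own options.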
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180321140515.2252-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch fixes a bug in how the pebs->real_ip is handled in the PEBS
handler. real_ip only exists in Haswell and later processors. It is
actually the eventing IP, i.e., where the event occurred, as opposed
to the pebs->ip which is the PEBS interrupt IP which is always off
by one.
The problem is that the real_ip just like the IP needs to be fixed up
because PEBS does not record all the machine state registers, and
in particular the code segment (cs). This is why we have the set_linear_ip()
function. The problem was that set_linear_ip() was only used on the pebs->ip
and not the pebs->real_ip.
We have profiles which ran into invalid callstacks because of this.
Here is an example:
..... 0: ffffffffffffff80 recent entry, marker kernel v
..... 1: 000000000040044d <= user address in kernel space!
..... 2: fffffffffffffe00 marker enter user v
..... 3: 000000000040044d
..... 4: 00000000004004b6 oldest entry
Debugging output in get_perf_callchain():
[ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0
The problem is that the kernel entry in 1: points to a user level
address. How can that be?
The reason is that with PEBS sampling the instruction that caused the event
to occur and the instruction where the CPU was when the interrupt was posted
may be far apart. And sometime during that time window, the privilege level may
change. This happens, for instance, when the PEBS sample is taken close to a
kernel entry point. Here the PEBS eventing IP (real_ip) captured a user level
instruction. But by the time the PMU interrupt fired, the processor had already
entered kernel space. This is why the debug output shows a user address with
user_mode() false.
The problem comes from PEBS not recording the code segment (cs) register.
The register is used in x86_64 to determine if executing in kernel vs user
space. This is okay because the kernel has a software workaround called
set_linear_ip(). But the issue in setup_pebs_sample_data() is that
set_linear_ip() is never called on the real_ip value when it is available
(Haswell and later) and precise_ip > 1.
This patch fixes this problem and eliminates the callchain discrepancy.
The patch restructures the code around set_linear_ip() to minimize the number
of times the IP has to be set.
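The mismatch described above can be modelled in a few lines. This is a simplified Python sketch, not the kernel code: the Regs class is hypothetical, kernel_ip() is reduced to a top-bit test, and the vm86 flag handling of the real set_linear_ip() is left out. The selector values are the x86_64 ones (0x10 for the kernel code segment, as in the regs->cs=10 debug line, and 0x33 for user code).

```python
from dataclasses import dataclass

KERNEL_CS = 0x10   # kernel code segment selector on x86_64
USER_CS = 0x33     # user code segment selector on x86_64


@dataclass
class Regs:
    ip: int
    cs: int


def kernel_ip(ip):
    """Simplified: addresses with the top bit set are kernel addresses."""
    return ip >= 1 << 63


def user_mode(regs):
    # The low two bits of cs carry the privilege level.
    return (regs.cs & 3) != 0


def set_linear_ip(regs, ip):
    """Set ip and derive a matching cs from the address itself."""
    regs.cs = KERNEL_CS if kernel_ip(ip) else USER_CS
    regs.ip = ip


EVENTING_IP = 0x40044d             # user address from the callchain above

# State at interrupt time: the CPU has already entered the kernel.
buggy = Regs(ip=0xffffffff81a00020, cs=KERNEL_CS)
buggy.ip = EVENTING_IP             # real_ip stored without fixing up cs

fixed = Regs(ip=0xffffffff81a00020, cs=KERNEL_CS)
set_linear_ip(fixed, EVENTING_IP)  # real_ip goes through the fixup
```

In the buggy case the sample carries a user address while user_mode() is false, which is the combination shown in the debugging output; after the fixup the two agree.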
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
No changes in refcount semantics -- use DEFINE_STATIC_KEY_FALSE()
for initialization and replace:
static_key_slow_inc|dec() => static_branch_inc|dec()
static_key_false() => static_branch_unlikely()
Added a '_key' suffix to rdpmc_always_available, for better self-documentation.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/20180326210929.5244-5-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
"One small fix for stm32-dmamux fixing buffer overflow"
* tag 'dmaengine-fix-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma:
dmaengine: stm32-dmamux: fix a potential buffer overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 and PTI fixes from Ingo Molnar:
"Misc fixes:
- fix EFI pagetables freeing
- fix vsyscall pagetable setting on Xen PV guests
- remove ancient CONFIG_X86_PPRO_FENCE=y - x86 is TSO again
- fix two binutils (ld) development version related incompatibilities
- clean up breakpoint handling
- fix an x86 self-test"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64: Don't use IST entry for #BP stack
x86/efi: Free efi_pgd with free_pages()
x86/vsyscall/64: Use proper accessor to update P4D entry
x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk
x86/boot/64: Verify alignment of the LOAD segment
x86/build/64: Force the linker to use 2MB page size
selftests/x86/ptrace_syscall: Fix for yet more glibc interference
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Make posix clock ID usage Spectre-safe"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Protect posix clock array access against speculation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Two sched debug output related fixes: a console output fix and
formatting fixes"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Adjust newlines for better alignment
sched/debug: Fix per-task line continuation for console output
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc kernel side fixes.
Generic:
- cgroup events counting fix
x86:
- Intel PMU truncated-parameter fix
- RDPMC fix
- API naming fix/rename
- uncore driver big-hardware PCI enumeration fix
- uncore driver filter constraint fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/cgroup: Fix child event counting bug
perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers
perf/x86/intel: Rename confusing 'freerunning PEBS' API and implementation to 'large PEBS'
perf/x86/intel/uncore: Add missing filter constraint for SKX CHA event
perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period()
perf/x86/intel: Disable userspace RDPMC usage for large PEBS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Two fixes: tighten up a jump-labels warning to not trigger on certain
modules and fix confusing (and non-existent) mutex API documentation"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
jump_label: Disable jump labels in __exit code
locking/mutex: Improve documentation
|
|
Tabs on a console with long lines do not wrap properly, so correctly
account for the line length when computing the tab placement location.
Reported-by: James Holderness <j4_james@hotmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Move non-TUI specific annotation routines out of the TUI browser so
that it can be used in other UIs, and to demonstrate that introduce
a 'perf annotate --stdio2' option that will apply those formatting
routines to provide a non-interactive annotation mode (Arnaldo Carvalho de Melo)
- Add 'P' hotkey to the annotation TUI, to dump the current annotated
symbol to a file, easing reporting through e-mail, by getting rid of the
spaces + right hand side scrollbar chars (Arnaldo Carvalho de Melo)
- Support --ignore-vmlinux to 'perf report' and 'perf annotate', that
was already present in 'perf top', to use /proc/{kcore,kallsyms},
allowing to see what is in fact running (patched stuff, alternatives,
ftrace, etc), not the initial state of the kernel (vmlinux) (Arnaldo Carvalho de Melo)
- Support 'jump' instructions to a different function, treating them
as 'call' instructions (Arnaldo Carvalho de Melo)
- Fix some jump artifacts when using vmlinux + ASM functions, where
the ELF symtab for instance, for entry_SYSCALL_64 includes that and
what comes after the 'syscall_return_via_sysret' label, but the
objdump -dS prints the jump targets + offsets using the
syscall_return_via_sysret address, which was confusing 'perf annotate'.
See the cset comments for further info (Arnaldo Carvalho de Melo)
- Report error from dwfl_attach_state() in the unwind code (Martin Vuille)
- Reference Py_None before returning it in the python extension (Petr Machata)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull mqueuefs revert from Eric Biederman:
"This fixes a regression that came in the merge window for v4.16.
The problem is that the permissions for mounting and using the
mqueuefs filesystem are broken. The necessary permission check is
missing, letting people who should not be able to mount mqueuefs mount
mqueuefs. The field sb->s_user_ns is set incorrectly, not allowing the
mounter of mqueuefs to remount and otherwise have proper control over
the filesystem.
Al Viro and I see the path to the necessary fixes differently and I am
not even certain at this point he actually sees all of the necessary
fixes. Given a couple weeks we can probably work something out but I
don't see the review being resolved in time for the final v4.16. I
don't want v4.16 shipping with a nasty regression. So unfortunately I
am sending a revert"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
Revert "mqueue: switch to on-demand creation of internal mount"
|
|
This reverts commit 36735a6a2b5e042db1af956ce4bcc13f3ff99e21.
Aleksa Sarai <asarai@suse.de> writes:
> [REGRESSION v4.16-rc6] [PATCH] mqueue: forbid unprivileged user access to internal mount
>
> Felix reported weird behaviour on 4.16.0-rc6 with regards to mqueue[1],
> which was introduced by 36735a6a2b5e ("mqueue: switch to on-demand
> creation of internal mount").
>
> Basically, the reproducer boils down to being able to mount mqueue if
> you create a new user namespace, even if you don't unshare the IPC
> namespace.
>
> Previously this was not possible, and you would get an -EPERM. The mount
> is the *host* mqueue mount, which is being cached and just returned from
> mqueue_mount(). To be honest, I'm not sure if this is safe or not (or if
> it was intentional -- since I'm not familiar with mqueue).
>
> To me it looks like there is a missing permission check. I've included a
> patch below that I've compile-tested, and should block the above case.
> Can someone please tell me if I'm missing something? Is this actually
> safe?
>
> [1]: https://github.com/docker/docker/issues/36674
The issue is a lot deeper than a missing permission check. sb->s_user_ns
was improperly set as well. So in addition to the filesystem being
mounted when it should not be mounted, some things are not allowed that should
be.
We are practically at the release of 4.16 and there is no agreement between
Al Viro and myself on what the code should look like to fix things properly.
So revert the code to what it was before so that we can take our time
and discuss this properly.
Fixes: 36735a6a2b5e ("mqueue: switch to on-demand creation of internal mount")
Reported-by: Felix Abecassis <fabecassis@nvidia.com>
Reported-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Two fixes for pin control for v4.16:
- Renesas SH-PFC: remove a duplicate clkout pin which was causing
crashes
- fix Samsung out of bounds exceptions"
* tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: samsung: Validate alias coming from DT
pinctrl: sh-pfc: r8a7795: remove duplicate of CLKOUT pin in pinmux_pins[]
|
|
With the cherry-picked perf/urgent commit merged separately we can now
merge all the fixes without conflicts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pick up a cherry-picked commit.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull kprobe fixes from Steven Rostedt:
"The documentation for kprobe events says that symbol offsets can take
both a + and - sign to get to before and after the symbol address.
But in actuality, the code does not support the minus. This fixes that
issue, and adds a few more selftests to kprobe events"
* tag 'trace-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
selftests: ftrace: Add a testcase for probepoint
selftests: ftrace: Add a testcase for string type with kprobe_event
selftests: ftrace: Add probe event argument syntax testcase
tracing: probeevent: Fix to support minus offset from symbol
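The signed symbol offset mentioned in the pull message can be sketched as a small parser. This is an illustrative Python model, not the kernel's probe event parser: parse_symbol_offset() is a hypothetical helper, and vfs_read is used only as a sample symbol name.

```python
import re

# SYMBOL, optionally followed by +OFFSET or -OFFSET (decimal or 0x hex).
_SPEC = re.compile(r"([A-Za-z_.][\w.]*)(?:([+-])(0[xX][0-9a-fA-F]+|\d+))?")


def parse_symbol_offset(spec):
    """Split 'symbol', 'symbol+off' or 'symbol-off' into (symbol, offset)."""
    m = _SPEC.fullmatch(spec)
    if m is None:
        raise ValueError("bad symbol offset: %r" % spec)
    symbol, sign, digits = m.groups()
    if sign is None:
        return symbol, 0
    base = 16 if digits.lower().startswith("0x") else 10
    offset = int(digits, base)
    return symbol, -offset if sign == "-" else offset


plus = parse_symbol_offset("vfs_read+0x10")
minus = parse_symbol_offset("vfs_read-8")
bare = parse_symbol_offset("vfs_read")
```

Accepting only '+' in the optional group would reproduce the behaviour the pull message describes as not matching the documentation.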
|
|
There's nothing IST-worthy about #BP/int3. We don't allow kprobes
in the small handful of places in the kernel that run at CPL0 with
an invalid stack, and 32-bit kernels have used normal interrupt
gates for #BP forever.
Furthermore, we don't allow kprobes in places that have usergs while
in kernel mode, so "paranoid" is also unnecessary.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
|
|
These types of jumps were confusing the annotate browser:
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
Percent│ffffffff81a00020: swapgs
<SNIP>
│ffffffff81a00128: ↓ jae ffffffff81a00139 <syscall_return_via_sysret+0x53>
<SNIP>
│ffffffff81a00155: → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8>
I.e. the syscall_return_via_sysret function is actually "inside" the
entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53)
are relative to syscall_return_via_sysret, not to entry_SYSCALL_64.
Or this may be some artifact in how the assembler marks the start and
end of a function and how this ends up in the ELF symtab for vmlinux,
i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but
just right after it.
From readelf -sw vmlinux:
80267: ffffffff81a00020 315 NOTYPE GLOBAL DEFAULT 1 entry_SYSCALL_64
316: ffffffff81a000e6 0 NOTYPE LOCAL DEFAULT 1 syscall_return_via_sysret
0xffffffff81a00020 + 315 > 0xffffffff81a000e6
So instead of looking for offsets after that last '+' sign, calculate
offsets for jump target addresses that are inside the function being
disassembled from the absolute address, 0xffffffff81a00139 in this case,
subtracting from it the objdump address for the start of the function
being disassembled, entry_SYSCALL_64() in this case.
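The arithmetic can be checked with the addresses quoted above. The Python sketch below is only an illustration and not the perf annotate code; the two helper functions are hypothetical names for the old approach (offset after the last '+') and the new one (absolute target minus function start).

```python
ENTRY_START = 0xffffffff81a00020    # entry_SYSCALL_64, from readelf
ENTRY_SIZE = 315                    # symbol size, from readelf
SYSRET_START = 0xffffffff81a000e6   # syscall_return_via_sysret


def offset_after_last_plus(operands):
    """Old approach: take what follows the last '+' in '<sym+0xNN>'."""
    return int(operands.rsplit("+", 1)[1].rstrip(">"), 16)


def offset_from_address(operands, func_start):
    """New approach: absolute jump target minus the function start."""
    target = int(operands.split()[0], 16)
    return target - func_start


# Operands of the 'jae' line in the objdump output quoted above.
operands = "ffffffff81a00139 <syscall_return_via_sysret+0x53>"

old = offset_after_last_plus(operands)
new = offset_from_address(operands, ENTRY_START)

# The inner label starts before entry_SYSCALL_64 ends.
inside = ENTRY_START <= SYSRET_START < ENTRY_START + ENTRY_SIZE
```

The old value, 0x53, is the bogus 'jae 53' target in the "before" listing, while the new value, 0x119, matches the '119:' label in the "after" listing.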
So, before this patch:
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
Percent│ pop %r10
│ pop %r9
│ pop %r8
│ pop %rax
│ pop %rsi
│ pop %rdx
│ pop %rsi
│ mov %rsp,%rdi
│ mov %gs:0x5004,%rsp
│ pushq 0x28(%rdi)
│ pushq (%rdi)
│ push %rax
│ ↑ jmp 6c
│ mov %cr3,%rdi
│ ↑ jmp 62
│ mov %rdi,%rax
│ and $0x7ff,%rdi
│ bt %rdi,%gs:0x2219a
│ ↑ jae 53
│ btr %rdi,%gs:0x2219a
│ mov %rax,%rdi
│ ↑ jmp 5b
After:
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
0.65 │ → jne swapgs_restore_regs_and_return_to_usermode
│ pop %r10
│ pop %r9
│ pop %r8
│ pop %rax
│ pop %rsi
│ pop %rdx
│ pop %rsi
│ mov %rsp,%rdi
│ mov %gs:0x5004,%rsp
│ pushq 0x28(%rdi)
│ pushq (%rdi)
│ push %rax
│ ↓ jmp 132
│ mov %cr3,%rdi
│ ┌──jmp 128
│ │ mov %rdi,%rax
│ │ and $0x7ff,%rdi
│ │ bt %rdi,%gs:0x2219a
│ │↓ jae 119
│ │ btr %rdi,%gs:0x2219a
│ │ mov %rax,%rdi
│ │↓ jmp 121
│119:│ mov %rax,%rdi
│ │ bts $0x3f,%rdi
│121:│ or $0x800,%rdi
│128:└─→or $0x1000,%rdi
│ mov %rdi,%cr3
│132: pop %rax
│ pop %rdi
│ pop %rsp
│ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8>
With those at least navigating to the right destination, an improvement
for these cases seems to be to somehow mark those inner functions,
which in this case could be:
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
│syscall_return_via_sysret:
│ pop %r15
│ pop %r14
│ pop %r13
│ pop %r12
│ pop %rbp
│ pop %rbx
│ pop %rsi
│ pop %r10
│ pop %r9
│ pop %r8
│ pop %rax
│ pop %rsi
│ pop %rdx
│ pop %rsi
│ mov %rsp,%rdi
│ mov %gs:0x5004,%rsp
│ pushq 0x28(%rdi)
│ pushq (%rdi)
│ push %rax
│ ↓ jmp 132
│ mov %cr3,%rdi
│ ┌──jmp 128
│ │ mov %rdi,%rax
│ │ and $0x7ff,%rdi
│ │ bt %rdi,%gs:0x2219a
│ │↓ jae 119
│ │ btr %rdi,%gs:0x2219a
│ │ mov %rax,%rdi
│ │↓ jmp 121
│119:│ mov %rax,%rdi
│ │ bts $0x3f,%rdi
│121:│ or $0x800,%rdi
│128:└─→or $0x1000,%rdi
│ mov %rdi,%cr3
│132: pop %rax
│ pop %rdi
│ pop %rsp
│ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8>
This all gets much better viewed if one uses 'perf report --ignore-vmlinux'
forcing the usage of /proc/kcore + /proc/kallsyms, when the above
actually gets down to:
# perf report --ignore-vmlinux
## do '/64', will show the function names containing '64',
## navigate to /entry_SYSCALL_64_after_hwframe.annotation,
## press 'A' to annotate, then 'P' to print that annotation
## to a file
## From another xterm (or see on screen, this 'P' thing is for
## getting rid of those right side scroll bars/spaces):
# cat /entry_SYSCALL_64_after_hwframe.annotation
entry_SYSCALL_64_after_hwframe() /proc/kcore
Event: cycles:ppp
Percent
Disassembly of section load0:
ffffffff9aa00044 <load0>:
11.97 push %rax
4.85 push %rdi
push %rsi
2.59 push %rdx
2.27 push %rcx
0.32 pushq $0xffffffffffffffda
1.29 push %r8
xor %r8d,%r8d
1.62 push %r9
0.65 xor %r9d,%r9d
1.62 push %r10
xor %r10d,%r10d
5.50 push %r11
xor %r11d,%r11d
3.56 push %rbx
xor %ebx,%ebx
4.21 push %rbp
xor %ebp,%ebp
2.59 push %r12
0.97 xor %r12d,%r12d
3.24 push %r13
xor %r13d,%r13d
2.27 push %r14
xor %r14d,%r14d
4.21 push %r15
xor %r15d,%r15d
0.97 mov %rsp,%rdi
5.50 → callq do_syscall_64
14.56 mov 0x58(%rsp),%rcx
7.44 mov 0x80(%rsp),%r11
0.32 cmp %rcx,%r11
→ jne swapgs_restore_regs_and_return_to_usermode
0.32 shl $0x10,%rcx
0.32 sar $0x10,%rcx
3.24 cmp %rcx,%r11
→ jne swapgs_restore_regs_and_return_to_usermode
2.27 cmpq $0x33,0x88(%rsp)
1.29 → jne swapgs_restore_regs_and_return_to_usermode
mov 0x30(%rsp),%r11
8.74 cmp %r11,0x90(%rsp)
→ jne swapgs_restore_regs_and_return_to_usermode
0.32 test $0x10100,%r11
→ jne swapgs_restore_regs_and_return_to_usermode
0.32 cmpq $0x2b,0xa0(%rsp)
0.65 → jne swapgs_restore_regs_and_return_to_usermode
I.e. using kallsyms makes the function start/end be determined differently
than using what is in the vmlinux ELF symtab, and the hits actually
go to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the
start of entry_SYSCALL_64:
ENTRY(entry_SYSCALL_64)
UNWIND_HINT_EMPTY
<SNIP>
pushq $__USER_CS /* pt_regs->cs */
pushq %rcx /* pt_regs->ip */
GLOBAL(entry_SYSCALL_64_after_hwframe)
pushq %rax /* pt_regs->orig_ax */
PUSH_AND_CLEAR_REGS rax=$-ENOSYS
And it goes and ends at:
cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */
jne swapgs_restore_regs_and_return_to_usermode
/*
* We win! This label is here just for ease of understanding
* perf profiles. Nothing jumps here.
*/
syscall_return_via_sysret:
/* rcx and r11 are already restored (see code above) */
UNWIND_HINT_EMPTY
POP_REGS pop_rdi=0 skip_r11rcx=1
So perhaps some people should really just play with '--ignore-vmlinux'
to force /proc/kcore + kallsyms.
One idea is to do both, i.e. have a vmlinux annotation and a
kcore+kallsyms one, when possible, and even show the patched location,
etc.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|