Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This includes perf namespace support kernel side fixes, plus an
accumulated set of perf tooling fixes - including UAPI header
synchronization that should make the perf build less noisy"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
tooling/headers: Synchronize updated s390 and x86 UAPI headers
tools headers: Syncronize mman.h ABI header
tools headers: Synchronize prctl.h ABI header
tools headers: Synchronize KVM arch ABI headers
tools headers: Synchronize drm/i915_drm.h
tools headers uapi: Synchronize drm/drm.h
tools headers: Synchronize perf_event.h header
tools headers: Synchronize kernel ABI headers wrt SPDX tags
tools/headers: Synchronize kernel x86 UAPI headers
perf intel-pt: Bring instruction decoder files into line with the kernel
perf test: Fix test 21 for s390x
perf bench numa: Fixup discontiguous/sparse numa nodes
perf top: Use signal interface for SIGWINCH handler
perf top: Fix window dimensions change handling
perf: Fix header.size for namespace events
perf top: Ignore kptr_restrict when not sampling the kernel
perf record: Ignore kptr_restrict when not sampling the kernel
perf report: Ignore kptr_restrict when not sampling the kernel
perf evlist: Add helper to check if attr.exclude_kernel is set in all evsels
perf test shell: Fix test case probe libc's inet_pton on s390x
...
|
|
to v4.15
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There were two trivial updates to these upstream UAPI headers:
arch/s390/include/uapi/asm/kvm.h
arch/s390/include/uapi/asm/kvm_perf.h
arch/x86/lib/x86-opcode-map.txt
Synchronize them with their tooling copies.
(The x86 opcode map includes a new instruction pattern now.)
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The new ORC unwinder breaks the build of a 64-bit kernel on a 32-bit
host. Building the kernel on a i386 or x32 host fails with:
orc_dump.c: In function 'orc_dump':
orc_dump.c:105:26: error: passing argument 2 of 'elf_getshdrnum' from incompatible pointer type [-Werror=incompatible-pointer-types]
if (elf_getshdrnum(elf, &nr_sections)) {
^
In file included from /usr/local/include/gelf.h:32:0,
from elf.h:22,
from warn.h:26,
from orc_dump.c:20:
/usr/local/include/libelf.h:304:12: note: expected 'size_t * {aka unsigned int *}' but argument is of type 'long unsigned int *'
extern int elf_getshdrnum (Elf *__elf, size_t *__dst);
^~~~~~~~~~~~~~
orc_dump.c:190:17: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Sxword {aka long long int}' [-Werror=format=]
printf("%s+%lx:", name, rela.r_addend);
~~^ ~~~~~~~~~~~~~
%llx
Fix the build failure.
Another problem is that if the user specifies HOSTCC or HOSTLD
variables, they are ignored in the objtool makefile. Change the
Makefile to respect these variables.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Joachim <svenjoac@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 627fce14809b ("objtool: Add ORC unwind table generation")
Link: http://lkml.kernel.org/r/19f0e64d8e07e30a7b307cd010eb780c404fe08d.1512252895.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Remove the backward/forward concept to make it uniform with user
interface (the '--overwrite' option).
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf record' can switch its output data file. The new output should
only store the data after switching. However, in overwrite backward
mode, the new output still can have data from before switching. That
also brings extra overhead.
At the end of mmap_read(), the position of the processed ring buffer is
saved in md->prev. Next mmap_read should be end in md->prev if it is not
overwriten. That avoids processing duplicate data. However, md->prev is
discarded. So next the mmap_read() has to process whole valid ring
buffer, which probably includes old processed data.
Avoid calling backward_rb_find_range() when md->prev is still
available.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf record' backward recording doesn't work as we expected: it never
overwrites when ring buffer gets full.
Test:
Run a busy python printing task background like this:
while True:
print 123
send SIGUSR2 to perf to capture snapshot, then:
# ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101520743 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101521251 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101521692 ]
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Dump perf.data.2017110101521936 ]
[ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]
# ./perf script -i ./perf.data.2017110101520743 | head -n3
perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4
# ./perf script -i ./perf.data.2017110101521251 | head -n3
perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4
# ./perf script -i ./perf.data.2017110101521692 | head -n3
perf 2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
perf 2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
python 2545 [000] 12449.310800: raw_syscalls:sys_exit: NR 1 = 4
Timestamps never change, but my background task is a dead loop, can
easily overwhelm the ring buffer.
This patch fixes it by forcing unsetting PROT_WRITE for a backward ring
buffer, so all backward ring buffers become overwrite ring buffers.
Test result:
# ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101285323 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101290053 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2017110101290446 ]
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Dump perf.data.2017110101290837 ]
[ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]
# ./perf script -i ./perf.data.2017110101285323 | head -n3
python 2545 [000] 11064.268083: raw_syscalls:sys_exit: NR 1 = 4
python 2545 [000] 11064.268084: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
python 2545 [000] 11064.268086: raw_syscalls:sys_exit: NR 1 = 4
# ./perf script -i ./perf.data.2017110101290 | head -n3
failed to open ./perf.data.2017110101290: No such file or directory
# ./perf script -i ./perf.data.2017110101290053 | head -n3
python 2545 [000] 11071.564062: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
python 2545 [000] 11071.564064: raw_syscalls:sys_exit: NR 1 = 4
python 2545 [000] 11071.564066: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
# ./perf script -i ./perf.data.2017110101290 | head -n3
perf.data.2017110101290053 perf.data.2017110101290446 perf.data.2017110101290837
# ./perf script -i ./perf.data.2017110101290446 | head -n3
sshd 1321 [000] 11075.499473: raw_syscalls:sys_exit: NR 14 = 0
sshd 1321 [000] 11075.499474: raw_syscalls:sys_enter: NR 14 (2, 7ffe98899490, 0, 8, 0, 3000)
sshd 1321 [000] 11075.499474: raw_syscalls:sys_exit: NR 14 = 0
# ./perf script -i ./perf.data.2017110101290837 | head -n3
python 2545 [000] 11079.280844: raw_syscalls:sys_exit: NR 1 = 4
python 2545 [000] 11079.280847: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
python 2545 [000] 11079.280850: raw_syscalls:sys_exit: NR 1 = 4
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There are codes that print messages to the screen between assignment of
the use_browser variable and setup_browser().
But since the GUI browser is not initialized during that period, all
messages fail to show if the user passed the --gtk option to perf as GTK
is not initialized yet.
Reorder the code to assign use_browser variable right before
setup_browser() is called.
Signed-off-by: Seokho Song <0xdevssh@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171204160244.6332-1-0xdevssh@gmail.com
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
identification for mapfile.csv
The powerpc cpuid information includes chip revision information.
Changes between chip revisions are usually minor bug fixes and usually
do not affect the operation of the performance monitoring hardware.
The original mapfile.csv matching requires enumerating every possible
cpuid string. When a new minor chip revision is produced a new entry
has to be added to the mapfile.csv and the code recompiled to allow perf
to have the implementation specific perf events for this new minor
revision. For users of various distibutions of Linux having to wait for
a new release of the kernel's perf tool to be built with these trivial
patches is inconvenient.
Using regular expressions rather than exactly string matching of the
entire cpuid string allows developers to write mapfile.csv files that do
not require patches and recompiles for each of these minor version
changes. If special cases need to be made for some particular versions,
they can be placed earlier in the mapfile.csv file before the more
general matches.
Signed-off-by: William Cohen <wcohen@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shriya <shriyak@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20171204145728.16792-1-wcohen@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1512188201-14109-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
All perf_mmap__read_forward() read from read-write ring buffer, so no
need check_messup. Reading from backward ring buffer doesn't require
check_messup because it never mess up. Cleanup arguments lists.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-6-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'overwrite' argument is always 'false'. Remove it from arguments list of
perf_mmap__push().
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
evlist->overwrite is set to false in all users. It can be removed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
All users of perf_evlist__mmap_ex set !overwrite. Remove it from its
arguments list.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now all perf_evlist__mmap's users doesn't set 'overwrite'. Remove it
from arguments list.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On Fedora systems the perl and python CFLAGS/LDFLAGS include the
hardened specs from redhat-rpm-config package. We apply them only for
perl/python objects, which makes them not compatible with the rest of
the objects and the build fails with:
/usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -f
+PIC
/usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile w
+ith -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile.perf:507: perf] Error 1
make[1]: *** [Makefile.perf:210: sub-make] Error 2
make: *** [Makefile:69: all] Error 2
Mainly it's caused by perl/python objects being compiled with:
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
which prevent the final link impossible, because it will check
for 'proper' objects with following option:
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On some platforms(arm/arm64) which uses cpus map to get corresponding
cpuid string, cpuid can be NULL for PMUs other than CORE PMUs. Adding
check for NULL cpuid in function perf_pmu__find_map to avoid
segmentation fault.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-6-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is not a full event list, but a short list of useful events.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-5-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On some platforms, PMU core devices sysfs name is not cpu.
Adding function is_pmu_core to detect PMU core devices using
core device specific hints in sysfs.
For arm64 platforms, all core devices have file "cpus" in sysfs.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-y1woxt1k2pqqwpprhonnft2s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are some small misc driver fixes for 4.15-rc3 to resolve reported
issues. Specifically these are:
- binder fix for a memory leak
- vpd driver fixes for a number of reported problems
- hyperv driver fix for memory accesses where it shouldn't be.
All of these have been in linux-next for a while. There's also one
more MAINTAINERS file update that came in today to get the Android
developer's emails correct, which is also in this pull request, that
was not in linux-next, but should not be an issue"
* tag 'char-misc-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
MAINTAINERS: update Android driver maintainers.
firmware: vpd: Fix platform driver and device registration/unregistration
firmware: vpd: Tie firmware kobject to device lifetime
firmware: vpd: Destroy vpd sections in remove function
hv: kvp: Avoid reading past allocated blocks from KVP file
Drivers: hv: vmbus: Fix a rescind issue
ANDROID: binder: fix transaction leak.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few minor USB fixes for 4.15-rc3.
The largest here is the Kconfig text and configuration changes for the
USB TypeC build options that you reported during the -rc1 merge
window. The others are all just small fixes for reported issues, as
well as some new device ids.
The most "interesting" of anything here is the usbip fixes as it seems
lots of people are starting to pay attention to that driver at the
moment. These fixes should resolve all of the reported problems as of
now.
Of course there are the usual xhci and gadget fixes as well, can't go
a pull request without those...
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
usb: xhci: fix panic in xhci_free_virt_devices_depth_first
xhci: Don't show incorrect WARN message about events for empty rings
usbip: fix usbip attach to find a port that matches the requested speed
usbip: Fix USB device hang due to wrong enabling of scatter-gather
uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices
usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub
usb: build drivers/usb/common/ when USB_SUPPORT is set
usb: hub: Cycle HUB power when initialization fails
USB: core: Add type-specific length check of BOS descriptors
usb: host: fix incorrect updating of offset
USB: ulpi: fix bus-node lookup
USB: usbfs: Filter flags passed in from user space
usb: add user selectable option for the whole USB Type-C Support
usb: f_fs: Force Reserved1=1 in OS_DESC_EXT_COMPAT
usb: gadget: core: Fix ->udc_set_speed() speed handling
usb: gadget: allow to enable legacy drivers without USB_ETH
usb: gadget: udc: renesas_usb3: fix number of the pipes
usb: gadget: don't dereference g until after it has been null checked
USB: serial: usb_debug: add new USB device id
usb: bdc: fix platform_no_drv_owner.cocci warnings
...
|
|
The regs_query_register_offset() helper function converts
register name like "%r0" to an offset of a register in user_pt_regs
It is required by the BPF prologue generator.
The user_pt_regs structure was recently added to "asm/ptrace.h".
Hence, update tools/perf/check-headers.sh to keep the header file
in sync with kernel changes.
Suggested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Synchronize the uapi kernel header files which solves the broken
uapi export of pt_regs. Because of arch-specific uapi headers,
extended the include path in the Makefile.
With this change, the test_verifier program compiles and runs successfully
on s390.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The get_cpuid_str function returns the MIDR string of the first online
cpu from the range of cpus associated with the PMU CORE device.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-3-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The cpuid string will not be same on all CPUs on heterogeneous platforms
like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid
string from associated CPUs of PMU CORE device.
Also optimise arguments to function pmu_add_cpu_aliases.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On s390, object files must be compiled with position-indepedent code in
order to be incrementally linked or linked to shared libraries.
Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
is built properly.
Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux s390 list <linux-s390@vger.kernel.org>
LPU-Reference: 1512031765-9382-1-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-a8wga8hrl0d0r84cal96fmgv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reusing the thread_map__new_by_uid() proc scanning already in place to
return a map with all threads in the system.
Based-on-a-patch-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-khh28q0wwqbqtrk32bfe07hd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In current stat-shadow.c, the rbtree deleting is ignored.
The patch adds the implementation to node_delete method of rblist.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512125856-22056-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we have a rblist__delete() which is used to delete a rblist.
While rblist__delete() will free the pointer of rblist at the end.
It's an inconvenience for the user to delete a rblist which is not
allocated by something like malloc(). For example, the rblist is
embedded in a larger data structure.
This patch creates a new function rblist__exit() which is similar to
rblist__delete() but it will not free the pointer of rblist.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512125856-22056-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The command 'perf annotate' parses the output of objdump and also
investigates the comments produced by objdump. For example the
output of objdump produces (on x86):
23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
# 234008 <stderr@@GLIBC_2.2.5+0x9a8>
and the function mov__parse() is called to investigate the complete
line. Mov__parse() breaks this line into several parts and finally
calls function comment__symbol() to parse the data after the comment
character '#'. Comment__symbol() expects a hexadecimal address followed
by a symbol in '<' and '>' brackets.
However the 2nd parameter given to function comment__symbol()
always points to the comment character '#'. The address parsing
always returns 0 because the character '#' is not a digit and
strtoull() fails without being noticed.
Fix this by advancing the second parameter to function comment__symbol()
by one byte before invocation and add an error check after strtoull()
has been called.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 6de783b6f50f ("perf annotate: Resolve symbols using objdump comment")
Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch fixes a bug introduced with commit d9f8dfa9baf9 ("perf
annotate s390: Implement jump types for perf annotate").
'perf annotate' displays annotated assembler output by reading output of
command objdump and parsing the disassembled lines. For each shown
mnemonic this function sequence is executed:
disasm_line__new()
|
+--> disasm_line__init_ins()
|
+--> ins__find()
|
+--> arch->associate_instruction_ops()
The s390x specific function assigned to function pointer
associate_instruction_ops refers to function s390__associate_ins_ops().
This function checks for supported mnemonics and assigns a NULL pointer
to unsupported mnemonics. However even the NULL pointer is added to the
architecture dependend instruction array.
This leads to an extremely large architecture instruction array
(due to array resize logic in function arch__grow_instructions()).
Depending on the objdump output being parsed the array can end up
with several ten-thousand elements.
This patch checks if a mnemonic is supported and only adds supported
ones into the architecture instruction array. The array does not contain
elements with NULL pointers anymore.
Before the patch (With some debug printf output):
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb
real 8m49.679s
user 7m13.008s
sys 0m1.649s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxbb | tail -1
__ins__find sorted:1 nr_instructions:87433 ins:0x341583c0
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 87433.
After the patch (With some printf debug output:)
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa
real 1m24.553s
user 0m0.587s
sys 0m1.530s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxaa | tail -1
__ins__find sorted:1 nr_instructions:56 ins:0x3f406570
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 56 which is sensible.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Waker threads in the futex wake-parallel benchmark are started by a loop
using pthread_create(). However, there is no synchronization for when
the waker threads wake the waiting threads. Comparison of the waker
threads' measurement timestamps show they are not all running
concurrently because older waker threads finish their task before newer
waker threads even start.
This patch uses a barrier to better synchronize the waker threads.
Signed-off-by: James Yang <james.yang@arm.com
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20171127042101.3659-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
[ Disable the wake-parallel test for systems without pthread_barrier_t ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As 'perf bench futex wake-parallel" will use this, which is not
available in older systems such as versions of the android NDK used in
my container build tests (r12b and r15c at the moment).
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: James Yang <james.yang@arm.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1i7iv54in4wj08lwo55b0pzv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull networking fixes from David Miller:
1) Various TCP control block fixes, including one that crashes with
SELinux, from David Ahern and Eric Dumazet.
2) Fix ACK generation in rxrpc, from David Howells.
3) ipvlan doesn't set the mark properly in the ipv4 route lookup key,
from Gao Feng.
4) SIT configuration doesn't take on the frag_off ipv4 field
configuration properly, fix from Hangbin Liu.
5) TSO can fail after device down/up on stmmac, fix from Lars Persson.
6) Various bpftool fixes (mostly in JSON handling) from Quentin Monnet.
7) Various SKB leak fixes in vhost/tun/tap (mostly observed as
performance problems). From Wei Xu.
8) mvpps's TX descriptors were not zero initialized, from Yan Markman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match()
tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()
rxrpc: Fix the MAINTAINERS record
rxrpc: Use correct netns source in rxrpc_release_sock()
liquidio: fix incorrect indentation of assignment statement
stmmac: reset last TSO segment size after device open
ipvlan: Add the skb->mark as flow4's member to lookup route
s390/qeth: build max size GSO skbs on L2 devices
s390/qeth: fix GSO throughput regression
s390/qeth: fix thinko in IPv4 multicast address tracking
tap: free skb if flags error
tun: free skb in early errors
vhost: fix skb leak in handle_rx()
bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg()
bnxt_en: fix dst/src fid for vxlan encap/decap actions
bnxt_en: wildcard smac while creating tunnel decap filter
bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown
phylink: ensure we take the link down when phylink_stop() is called
sfp: warn about modules requiring address change sequence
sfp: improve RX_LOS handling
...
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2017-12-02
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a compilation warning in xdp redirect tracepoint due to
missing bpf.h include that pulls in struct bpf_map, from Xie.
2) Limit the maximum number of attachable BPF progs for a given
perf event as long as uabi is not frozen yet. The hard upper
limit is now 64 and therefore the same as with BPF multi-prog
for cgroups. Also add related error checking for the sample
BPF loader when enabling and attaching to the perf event, from
Yonghong.
3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log
case, so that the test case can always pass and not fail in
some environments due to too low default limit, also from
Yonghong.
4) Fix up a missing license header comment for kernel/bpf/offload.c,
from Jakub.
5) Several fixes for bpftool, among others a crash on incorrect
arguments when json output is used, error message handling
fixes on unknown options and proper destruction of json writer
for some exit cases, all from Quentin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
- add missing module information to the Mediatek cpufreq driver module
(Jesse Chan)
- fix config dependencies for the Loongson cpufreq driver (James Hogan)
- fix two issues related to CPU offline in the cpupower utility
(Abhishek Goel).
* tag 'pm-4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: mediatek: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
cpufreq: Add Loongson machine dependencies
cpupower : Fix cpupower working when cpu0 is offline
cpupowerutils: bench - Fix cpu online check
|
|
The default rlimit RLIMIT_MEMLOCK is 64KB. In certain cases,
e.g. in a test machine mimicking our production system, this test may
fail due to unable to charge the required memory for prog load:
# ./test_verifier_log
Test log_level 0...
ERROR: Program load returned: ret:-1/errno:1, expected ret:-1/errno:22
Changing the default rlimit RLIMIT_MEMLOCK to unlimited makes
the test always pass.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
It was reported that the whole futex bench breaks when dealing with
non-contiguously numbered cpus.
$ echo 0 | sudo tee /sys/devices/system/cpu/cpu3/online
$ ./perf bench futex all
perf: pthread_create: Operation not permitted
Run summary [PID 14934]: 7 threads, each ....
James had implemented an approach with cpumaps that use an in house
flavor. Instead of re-inventing the wheel, I've redone the patch such
that we use the perf's util/cpumap.c interface instead.
Applies to all futex benchmarks.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Originally-from: James Yang <james.yang@arm.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20171127042101.3659-2-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
usbip attach fails to find a free port when the device on the first port
is a USB_SPEED_SUPER device and non-super speed device is being attached.
It keeps checking the first port and returns without a match getting stuck
in a loop.
Fix it check to find the first port with matching speed.
Reported-by: Juan Zea <juan.zea@qindel.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In the Makefile, targets install, doc and doc-install should be added to
.PHONY. Let's fix this.
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Programs and documentation not managed by package manager are generally
installed under /usr/local/, instead of the user's home directory. In
particular, `man` is generally able to find manual pages under
`/usr/local/share/man`.
bpftool generally follows perf's example, and perf installs to home
directory. However bpftool requires root credentials, so it seems
sensible to follow the more common convention of installing files under
/usr/local instead. So, make /usr/local the default prefix for
installing the binary with `make install`, and the documentation with
`make doc-install`. Also, create /usr/local/sbin if it does not exist.
Note that the bash-completion file, however, is still installed under
/usr/share/bash-completion/completions, as the default setup for bash
does not attempt to load completion files under /usr/local/.
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The end-of-line character inside the string would break JSON compliance.
Remove it, `p_err()` already adds a '\n' character for plain output
anyway.
Fixes: 9a5ab8bf1d6d ("tools: bpftool: turn err() and info() macros into functions")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
If `getopt_long()` meets an unknown option, it prints its own error
message to standard error output. While this does not strictly break
JSON output, it is the only case bpftool prints something to standard
error output if JSON output is required. All other errors are printed on
standard output as JSON objects, so that an external program does not
have to parse stderr.
This is changed by setting the global variable `opterr` to 0.
Furthermore, p_err() is used to reproduce the error message in a more
JSON-friendly way, so that users still get to know what the erroneous
option is.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The writer is cleaned at the end of the main function, but not if the
program exits sooner in usage(). Let's keep it clean and destroy the
writer before exiting.
Destruction and actual call to exit() are moved to another function so
that clean exit can also be performed without printing usage() hints.
Fixes: d35efba99d92 ("tools: bpftool: introduce --json and --pretty options")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
If bad or unrecognised parameters are specified after JSON output is
requested, `usage()` will try to output null JSON object before the
writer is created.
To prevent this, create the writer as soon as the `--json` option is
parsed.
Fixes: 004b45c0e51a ("tools: bpftool: provide JSON output for all possible commands")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Print file names of files that differ. For example, instead of:
Warning: Intel PT: x86 instruction decoder differs from kernel
print:
Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1511253326-22308-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The PERF_RECORD_USER_ events are synthesized by the tool to assist in
processing the PERF_RECORD_ ones generated by the kernel, the printing
of that information doesn't come with a perf_sample structure, so, when
dumping the event fields using 'perf report -D' there were columns that
end up not being printed.
To tidy up a bit this, fake a perf_sample structure with zeroes to have
the missing columns printed and avoid the occasional surprise with that.
Before:
0 0x45b8 [0x68]: PERF_RECORD_MMAP -1/0: [0xffffffffc12ec000(0x4000) @ 0]: x /lib/modules/4.14.0+/kernel/fs/nls/nls_utf8.ko
0x4620 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27820
0x4648 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x4660 [0x28]: PERF_RECORD_COMM: perf:27820/27820
0x4a58 [0x8]: PERF_RECORD_FINISHED_ROUND
447723433020976 0x4688 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 27820/27820: 0xffffffff8f1b6d7a period: 1 addr: 0
After:
$ perf report -D | grep PERF_RECORD_ | head
0 0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
0 0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 32555
0 0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x148 [0x28]: PERF_RECORD_COMM: perf:32555/32555
0 0x4e8 [0x8]: PERF_RECORD_FINISHED_ROUND
448743409421205 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:32555/32555
448743409431883 0x198 [0x68]: PERF_RECORD_MMAP2 32555/32555: [0x55e11d75a000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
448743409443873 0x200 [0x70]: PERF_RECORD_MMAP2 32555/32555: [0x7f0ced316000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
448743409454790 0x270 [0x60]: PERF_RECORD_MMAP2 32555/32555: [0x7ffe84f6d000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
448743409479500 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 32555/32555: 0xffffffff8f84c7e7 period: 1 addr: 0
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 9aefcab0de47 ("perf session: Consolidate the dump code")
Link: https://lkml.kernel.org/n/tip-todcu15x0cwgppkh1gi6uhru@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a tip for Node.js USDT(User-Level Statically Defined Tracing) probes
in tips.txt
Signed-off-by: Hansuk Hong <flavono123@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20171123160546.9722-1-flavono123@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support for computing 'perf stat' style metrics in 'perf script'.
When using leader sampling we can get metrics for each sampling period
by computing formulas over the values of the different group members.
This allows things like fine grained IPC tracking through sampling, much
more fine grained than with 'perf stat'.
The metric is still averaged over the sampling period, it is not just
for the sampling point.
This patch adds a new metric output field for 'perf script' that uses
the existing 'perf stat' metrics infrastructure to compute any metrics
supported by 'perf stat'.
For example to sample IPC:
$ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
$ perf script -F metric,ip,sym,time,cpu,comm
...
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: metric: 0.13 insn per cycle
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: metric: 0.23 insn per cycle
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: metric: 0.46 insn per cycle
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: metric: 0.45 insn per cycle
TopDown:
This requires disabling SMT if you have it enabled, because SMT would
require sampling per core, which is not supported.
$ perf record -e '{ref-cycles,topdown-fetch-bubbles,\
topdown-recovery-bubbles,\
topdown-slots-retired,topdown-total-slots,\
topdown-slots-issued}:S' -a sleep 1
$ perf script --header -I -F cpu,ip,sym,event,metric,period
...
[000] 121108 ref-cycles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 190350 topdown-fetch-bubbles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 2055 topdown-recovery-bubbles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 148729 topdown-slots-retired: ffffffff8165222e copy_user_enhanced_fast_string
[000] 144324 topdown-total-slots: ffffffff8165222e copy_user_enhanced_fast_string
[000] 160852 topdown-slots-issued: ffffffff8165222e copy_user_enhanced_fast_string
[000] metric: 33.0% frontend bound
[000] metric: 3.5% bad speculation
[000] metric: 25.8% retiring
[000] metric: 37.7% backend bound
[000] 112112 ref-cycles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 357222 topdown-fetch-bubbles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 3325 topdown-recovery-bubbles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 323553 topdown-slots-retired: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 270507 topdown-total-slots: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 341226 topdown-slots-issued: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] metric: 33.0% frontend bound
[000] metric: 2.9% bad speculation
[000] metric: 29.9% retiring
[000] metric: 34.2% backend bound
...
v2:
Use evsel->priv for new fields
Port to new base line, support fp output.
Handle stats in ->stats, not ->priv
Minor cleanups
Extra explanation about the use of the term 'averaging', from Andi in the
thread in the Link: tag below:
<quote Andi>
The current samples contains the sum of event counts for a sampling period.
EventA-1 EventA-2 EventA-3 EventA-4
EventB-1 EventB-2 EventC-3
gap with no events overflow
|-----------------------------------------------------------------|
period-start period-end
^ ^
| |
previous sample current sample
So EventA = 4 and EventB = 3 at the sample point
I generate a metric, let's say EventA / EventB. It applies to the whole period.
But the metric is over a longer time which does not have the same behavior. For
example the gap above doesn't have any events, while they are clustered at the
beginning and end of the sample period.
But we're summing everything together. The metric doesn't know that the gap is
different than the busy period.
That's what I'm trying to express with averaging.
</quote>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171117214300.32746-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|