diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2016-01-11 14:39:17 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-01-11 14:39:17 -0800 |
commit | 5cb52b5e1654f3f1ed9c32e34456d98559c85aa0 (patch) | |
tree | 737c73d6aef99a17f57c2974f1e2a142a5f1a377 /tools/perf/util/help.c | |
parent | 24af98c4cf5f5e69266e270c7f3fb34b82ff6656 (diff) | |
parent | 3eb9ede23bdd96e9ba60e2b4d4d17a7c35d58448 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Intel Knights Landing support. (Harish Chegondi)
- Intel Broadwell-EP uncore PMU support. (Kan Liang)
- Core code improvements. (Peter Zijlstra.)
- Event filter, LBR and PEBS fixes. (Stephane Eranian)
- Enable cycles:pp on Intel Atom. (Stephane Eranian)
- Add cycles:ppp support for Skylake. (Andi Kleen)
- Various x86 NMI overhead optimizations. (Andi Kleen)
- Intel PT enhancements. (Takao Indoh)
- AMD cache events fix. (Vince Weaver)
Tons of tooling changes:
- Show random perf tool tips in the 'perf report' bottom line
(Namhyung Kim)
- perf report now defaults to --group if the perf.data file has
grouped events, try it with:
# perf record -e '{cycles,instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
# perf report
# Samples: 1K of event 'anon group { cycles, instructions }'
# Event count (approx.): 1955219195
#
# Overhead Command Shared Object Symbol
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1>
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
- Introduce the 'perf stat record/report' workflow:
Generate perf.data files from 'perf stat', to tap into the
scripting capabilities perf has instead of defining a 'perf stat'
specific scripting support to calculate event ratios, etc.
Simple example:
$ perf stat record -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$ perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$
It generates PERF_RECORD_ userspace records to store the details:
$ perf report -D | grep PERF_RECORD
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
[acme@ssdandy linux]$
An effort was made to make perf.data files generated like this to
not generate cryptic messages when processed by older tools.
The 'perf script' bits need rebasing, will go up later.
- Make command line options always available, even when they depend
on some feature being enabled, warning the user about use of such
options (Wang Nan)
- Support hw breakpoint events (mem:0xAddress) in the default output
mode in 'perf script' (Wang Nan)
- Fixes and improvements for supporting annotating ARM binaries,
support ARM call and jump instructions, more work needed to have
arch specific stuff separated into tools/perf/arch/*/annotate/
(Russell King)
- Add initial 'perf config' command, for now just with a --list
command to the contents of the configuration file in use and a
basic man page describing its format, commands for doing edits and
detailed documentation are being reviewed and proof-read. (Taeung
Song)
- Allows BPF scriptlets specify arguments to be fetched using DWARF
info, using a prologue generated at compile/build time (He Kuang,
Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang
Nan)
- BPF programs now can specify 'perf probe' tunables via its section
name, separating key=val values using semicolons (Wang Nan)
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter then in the
kernel, at arbitrary place.
# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
return err == 0 && port == 443;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#
- Use 'perf probe' various options to list functions, see what
variables can be collected at any given point, experiment first
collecting without a filter, then filter, use it together with
'perf trace', 'perf top', with or without callchains, if it
explodes, please tell us!
- Introduce a new callchain mode: "folded", that will list per line
representations of all callchains for a give histogram entry,
facilitating 'perf report' output processing by other tools, such
as Brendan Gregg's flamegraph tools (Namhyung Kim)
E.g:
# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#
The user can also select one of "count", "period" or "percent" as
the first column.
... and lots of infrastructure enhancements, plus fixes and other
changes, features I failed to list - see the shortlog and the git log
for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits)
perf evlist: Add --trace-fields option to show trace fields
perf record: Store data mmaps for dwarf unwind
perf libdw: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Use find_map function in access_dso_mem
perf evlist: Remove perf_evlist__(enable|disable)_event functions
perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
perf report: Show random usage tip on the help line
perf hists: Export a couple of hist functions
perf diff: Use perf_hpp__register_sort_field interface
perf tools: Add overhead/overhead_children keys defaults via string
perf tools: Remove list entry from struct sort_entry
perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
perf script: Align event name properly
perf tools: Add missing headers in perf's MANIFEST
perf tools: Do not show trace command if it's not compiled in
perf report: Change default to use event group view
perf top: Decay periods in callchains
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
tools lib: Sync tools/lib/find_bit.c with the kernel
...
Diffstat (limited to 'tools/perf/util/help.c')
-rw-r--r-- | tools/perf/util/help.c | 339 |
1 files changed, 0 insertions, 339 deletions
diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c deleted file mode 100644 index 86c37c472263..000000000000 --- a/tools/perf/util/help.c +++ /dev/null @@ -1,339 +0,0 @@ -#include "cache.h" -#include "../builtin.h" -#include "exec_cmd.h" -#include "levenshtein.h" -#include "help.h" -#include <termios.h> - -void add_cmdname(struct cmdnames *cmds, const char *name, size_t len) -{ - struct cmdname *ent = malloc(sizeof(*ent) + len + 1); - - ent->len = len; - memcpy(ent->name, name, len); - ent->name[len] = 0; - - ALLOC_GROW(cmds->names, cmds->cnt + 1, cmds->alloc); - cmds->names[cmds->cnt++] = ent; -} - -static void clean_cmdnames(struct cmdnames *cmds) -{ - unsigned int i; - - for (i = 0; i < cmds->cnt; ++i) - zfree(&cmds->names[i]); - zfree(&cmds->names); - cmds->cnt = 0; - cmds->alloc = 0; -} - -static int cmdname_compare(const void *a_, const void *b_) -{ - struct cmdname *a = *(struct cmdname **)a_; - struct cmdname *b = *(struct cmdname **)b_; - return strcmp(a->name, b->name); -} - -static void uniq(struct cmdnames *cmds) -{ - unsigned int i, j; - - if (!cmds->cnt) - return; - - for (i = j = 1; i < cmds->cnt; i++) - if (strcmp(cmds->names[i]->name, cmds->names[i-1]->name)) - cmds->names[j++] = cmds->names[i]; - - cmds->cnt = j; -} - -void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes) -{ - size_t ci, cj, ei; - int cmp; - - ci = cj = ei = 0; - while (ci < cmds->cnt && ei < excludes->cnt) { - cmp = strcmp(cmds->names[ci]->name, excludes->names[ei]->name); - if (cmp < 0) - cmds->names[cj++] = cmds->names[ci++]; - else if (cmp == 0) - ci++, ei++; - else if (cmp > 0) - ei++; - } - - while (ci < cmds->cnt) - cmds->names[cj++] = cmds->names[ci++]; - - cmds->cnt = cj; -} - -static void pretty_print_string_list(struct cmdnames *cmds, int longest) -{ - int cols = 1, rows; - int space = longest + 1; /* min 1 SP between words */ - struct winsize win; - int max_cols; - int i, j; - - get_term_dimensions(&win); - max_cols = win.ws_col - 1; /* don't print *on* the edge */ - - if (space < max_cols) - cols = max_cols / space; - rows = (cmds->cnt + cols - 1) / cols; - - for (i = 0; i < rows; i++) { - printf(" "); - - for (j = 0; j < cols; j++) { - unsigned int n = j * rows + i; - unsigned int size = space; - - if (n >= cmds->cnt) - break; - if (j == cols-1 || n + rows >= cmds->cnt) - size = 1; - printf("%-*s", size, cmds->names[n]->name); - } - putchar('\n'); - } -} - -static int is_executable(const char *name) -{ - struct stat st; - - if (stat(name, &st) || /* stat, not lstat */ - !S_ISREG(st.st_mode)) - return 0; - - return st.st_mode & S_IXUSR; -} - -static void list_commands_in_dir(struct cmdnames *cmds, - const char *path, - const char *prefix) -{ - int prefix_len; - DIR *dir = opendir(path); - struct dirent *de; - struct strbuf buf = STRBUF_INIT; - int len; - - if (!dir) - return; - if (!prefix) - prefix = "perf-"; - prefix_len = strlen(prefix); - - strbuf_addf(&buf, "%s/", path); - len = buf.len; - - while ((de = readdir(dir)) != NULL) { - int entlen; - - if (prefixcmp(de->d_name, prefix)) - continue; - - strbuf_setlen(&buf, len); - strbuf_addstr(&buf, de->d_name); - if (!is_executable(buf.buf)) - continue; - - entlen = strlen(de->d_name) - prefix_len; - if (has_extension(de->d_name, ".exe")) - entlen -= 4; - - add_cmdname(cmds, de->d_name + prefix_len, entlen); - } - closedir(dir); - strbuf_release(&buf); -} - -void load_command_list(const char *prefix, - struct cmdnames *main_cmds, - struct cmdnames *other_cmds) -{ - const char *env_path = getenv("PATH"); - const char *exec_path = perf_exec_path(); - - if (exec_path) { - list_commands_in_dir(main_cmds, exec_path, prefix); - qsort(main_cmds->names, main_cmds->cnt, - sizeof(*main_cmds->names), cmdname_compare); - uniq(main_cmds); - } - - if (env_path) { - char *paths, *path, *colon; - path = paths = strdup(env_path); - while (1) { - if ((colon = strchr(path, PATH_SEP))) - *colon = 0; - if (!exec_path || strcmp(path, exec_path)) - list_commands_in_dir(other_cmds, path, prefix); - - if (!colon) - break; - path = colon + 1; - } - free(paths); - - qsort(other_cmds->names, other_cmds->cnt, - sizeof(*other_cmds->names), cmdname_compare); - uniq(other_cmds); - } - exclude_cmds(other_cmds, main_cmds); -} - -void list_commands(const char *title, struct cmdnames *main_cmds, - struct cmdnames *other_cmds) -{ - unsigned int i, longest = 0; - - for (i = 0; i < main_cmds->cnt; i++) - if (longest < main_cmds->names[i]->len) - longest = main_cmds->names[i]->len; - for (i = 0; i < other_cmds->cnt; i++) - if (longest < other_cmds->names[i]->len) - longest = other_cmds->names[i]->len; - - if (main_cmds->cnt) { - const char *exec_path = perf_exec_path(); - printf("available %s in '%s'\n", title, exec_path); - printf("----------------"); - mput_char('-', strlen(title) + strlen(exec_path)); - putchar('\n'); - pretty_print_string_list(main_cmds, longest); - putchar('\n'); - } - - if (other_cmds->cnt) { - printf("%s available from elsewhere on your $PATH\n", title); - printf("---------------------------------------"); - mput_char('-', strlen(title)); - putchar('\n'); - pretty_print_string_list(other_cmds, longest); - putchar('\n'); - } -} - -int is_in_cmdlist(struct cmdnames *c, const char *s) -{ - unsigned int i; - - for (i = 0; i < c->cnt; i++) - if (!strcmp(s, c->names[i]->name)) - return 1; - return 0; -} - -static int autocorrect; -static struct cmdnames aliases; - -static int perf_unknown_cmd_config(const char *var, const char *value, void *cb) -{ - if (!strcmp(var, "help.autocorrect")) - autocorrect = perf_config_int(var,value); - /* Also use aliases for command lookup */ - if (!prefixcmp(var, "alias.")) - add_cmdname(&aliases, var + 6, strlen(var + 6)); - - return perf_default_config(var, value, cb); -} - -static int levenshtein_compare(const void *p1, const void *p2) -{ - const struct cmdname *const *c1 = p1, *const *c2 = p2; - const char *s1 = (*c1)->name, *s2 = (*c2)->name; - int l1 = (*c1)->len; - int l2 = (*c2)->len; - return l1 != l2 ? l1 - l2 : strcmp(s1, s2); -} - -static void add_cmd_list(struct cmdnames *cmds, struct cmdnames *old) -{ - unsigned int i; - - ALLOC_GROW(cmds->names, cmds->cnt + old->cnt, cmds->alloc); - - for (i = 0; i < old->cnt; i++) - cmds->names[cmds->cnt++] = old->names[i]; - zfree(&old->names); - old->cnt = 0; -} - -const char *help_unknown_cmd(const char *cmd) -{ - unsigned int i, n = 0, best_similarity = 0; - struct cmdnames main_cmds, other_cmds; - - memset(&main_cmds, 0, sizeof(main_cmds)); - memset(&other_cmds, 0, sizeof(main_cmds)); - memset(&aliases, 0, sizeof(aliases)); - - perf_config(perf_unknown_cmd_config, NULL); - - load_command_list("perf-", &main_cmds, &other_cmds); - - add_cmd_list(&main_cmds, &aliases); - add_cmd_list(&main_cmds, &other_cmds); - qsort(main_cmds.names, main_cmds.cnt, - sizeof(main_cmds.names), cmdname_compare); - uniq(&main_cmds); - - if (main_cmds.cnt) { - /* This reuses cmdname->len for similarity index */ - for (i = 0; i < main_cmds.cnt; ++i) - main_cmds.names[i]->len = - levenshtein(cmd, main_cmds.names[i]->name, 0, 2, 1, 4); - - qsort(main_cmds.names, main_cmds.cnt, - sizeof(*main_cmds.names), levenshtein_compare); - - best_similarity = main_cmds.names[0]->len; - n = 1; - while (n < main_cmds.cnt && best_similarity == main_cmds.names[n]->len) - ++n; - } - - if (autocorrect && n == 1) { - const char *assumed = main_cmds.names[0]->name; - - main_cmds.names[0] = NULL; - clean_cmdnames(&main_cmds); - fprintf(stderr, "WARNING: You called a perf program named '%s', " - "which does not exist.\n" - "Continuing under the assumption that you meant '%s'\n", - cmd, assumed); - if (autocorrect > 0) { - fprintf(stderr, "in %0.1f seconds automatically...\n", - (float)autocorrect/10.0); - poll(NULL, 0, autocorrect * 100); - } - return assumed; - } - - fprintf(stderr, "perf: '%s' is not a perf-command. See 'perf --help'.\n", cmd); - - if (main_cmds.cnt && best_similarity < 6) { - fprintf(stderr, "\nDid you mean %s?\n", - n < 2 ? "this": "one of these"); - - for (i = 0; i < n; i++) - fprintf(stderr, "\t%s\n", main_cmds.names[i]->name); - } - - exit(1); -} - -int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused, - const char *prefix __maybe_unused) -{ - printf("perf version %s\n", perf_version_string); - return 0; -} |