summaryrefslogtreecommitdiff
path: root/tools/perf/util/stat.h
AgeCommit message (Collapse)Author
2019-08-26perf record: Move record_opts and other record decls out of perf.hArnaldo Carvalho de Melo
And into a separate util/record.h, to better isolate things and make sure that those who use record_opts and the other moved declarations are explicitly including the necessary header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26perf stat: Remove needless headers from stat.hArnaldo Carvalho de Melo
Just a forward declaration for 'struct timespec' is needed, ditch the rest. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6shdqw801oqe7ax6r307k27r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29libperf: Adopt xyarray class from perfJiri Olsa
Move the xyarray class from perf to libperf, because it's going to be used in both. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-58-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf evlist: Rename struct perf_evlist to struct evlistJiri Olsa
Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf evsel: Rename struct perf_evsel to struct evselJiri Olsa
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf cpu_map: Rename struct cpu_map to struct perf_cpu_mapJiri Olsa
Rename struct cpu_map to struct perf_cpu_map, so it could be part of libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-10perf stat: Support per-die aggregationKan Liang
It is useful to aggregate counts per die. E.g. Uncore becomes die-scope on Xeon Cascade Lake-AP. Introduce a new option "--per-die" to support per-die aggregation. The global id for each core has been changed to socket + die id + core id. The global id for each die is socket + die id. Add die information for per-core aggregation. The output of per-core aggregation will be changed from "S0-C0" to "S0-D0-C0". Any scripts which rely on the output format of per-core aggregation probably be broken. For 'perf stat record/report', there is no die information when processing the old perf.data. The per-die result will be the same as per-socket. Committer notes: Renamed 'die' variable to 'die_id' to fix the build in some systems: CC /tmp/build/perf/builtin-script.o cc1: warnings being treated as errors builtin-stat.c: In function 'perf_env__get_die': builtin-stat.c:963: error: declaration of 'die' shadows a global declaration util/util.h:19: error: shadowed declaration is here mv: cannot stat `/tmp/build/perf/.builtin-stat.o.tmp': No such file or directory Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-bsnhx7vgsuu6ei307mw60mbj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-19perf tools: Remove perf_tool from event_op2Jiri Olsa
Now that we keep a perf_tool pointer inside perf_session, there's no need to have a perf_tool argument in the event_op2 callback. Remove it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move the display functions to stat-display.cJiri Olsa
Move perf_evlist__print_counters() with all its dependency functions to the stat-display.c object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-44-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'metric_events' to 'struct perf_stat_config'Jiri Olsa
Move the static variable 'metric_events' to 'struct perf_stat_config', so that it can be passed around and used outside 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-43-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'walltime_*' data to 'struct perf_stat_config'Jiri Olsa
Move the static variables 'walltime_*' to 'struct perf_stat_config', so that it can be passed around and used outside 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-42-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'no_merge' data to 'struct perf_stat_config'Jiri Olsa
Move the static variable 'no_merge' to 'struct perf_stat_config', so that it can be passed around and used outside 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-40-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'big_num' data to 'struct perf_stat_config'Jiri Olsa
Move the static variable 'big_num' to 'struct perf_stat_config', so that it can be passed around and used outside 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-39-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move *_aggr_* data to 'struct perf_stat_config'Jiri Olsa
Move the *_aggr_* global variables to 'struct perf_stat_config', so that it can be passed around and used outside 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-37-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move ru_* data to 'struct perf_stat_config'Jiri Olsa
Move the 'ru_*' global variables to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-36-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'print_mixed_hw_group_error' to 'struct perf_stat_config'Jiri Olsa
Move the 'print_mixed_hw_group_error' global variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-35-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'print_free_counters_hint' to 'struct perf_stat_config'Jiri Olsa
Move the 'print_free_counters_hint' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-34-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'null_run' to 'struct perf_stat_config'Jiri Olsa
Move the static 'null_run' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-33-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Add 'walltime_nsecs_stats' pointer to 'struct perf_stat_config'Jiri Olsa
Add 'walltime_nsecs_stats' pointer to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. It's initialized to point to stat's walltime_nsecs_stats value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-32-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'metric_only_len' to 'struct perf_stat_config'Jiri Olsa
Move the static 'metric_only_len' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-29-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'run_count' to 'struct perf_stat_config'Jiri Olsa
Move the static 'run_count' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-28-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'unit_width' to 'struct perf_stat_config'Jiri Olsa
Move the static 'unit_width' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-24-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'metric_only' to 'struct perf_stat_config'Jiri Olsa
Move the static 'metric_only' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'interval_clear' to 'struct perf_stat_config'Jiri Olsa
Move the static 'interval_clear' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move csv_* to 'struct perf_stat_config'Jiri Olsa
Move the static csv_* variables to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Pass a 'struct perf_stat_config' argument to global print functionsJiri Olsa
Add 'struct perf_stat_config' argument to the global print functions, so that these functions can be used out of the 'perf stat' command code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move perf_stat_synthesize_config() to stat.cJiri Olsa
So that it can be used globally. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move create_perf_stat_counter() to stat.cJiri Olsa
Move create_perf_stat_counter() to the 'stat' class, so that we can use it globally. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Add 'identifier' flag to 'struct perf_stat_config'Jiri Olsa
Add 'identifier' flag to 'struct perf_stat_config' to carry the info whether to use PERF_SAMPLE_IDENTIFIER for events. This makes create_perf_stat_counter() independent. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'no_inherit' to 'struct perf_stat_config'Jiri Olsa
Move the static 'no_inherit' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf stat: Move 'initial_delay' to 'struct perf_stat_config'Jiri Olsa
Move the static 'initial_delay' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Add 'struct perf_stat_config' argument to create_perf_stat_counter() and use its 'initial_delay' member instead of the static one. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf stat: Add --table option to display time of each runJiri Olsa
Add --table option to display time for each run (-r option), like: $ perf stat --null -r 5 --table perf bench sched pipe Performance counter stats for './perf bench sched pipe' (5 runs): # Table of individual measurements: 5.379 (-0.176) 5.243 (-0.311) 5.238 (-0.317) 5.536 (-0.019) 6.377 (+0.823) # Final result: 5.555 +- 0.213 seconds time elapsed ( +- 3.83% ) Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180423090823.32309-8-jolsa@kernel.org [ Document the new option in 'perf stat's man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-16perf stat: Make function perf_stat_evsel_id_init staticThomas Richter
Function perf_stat_evsel_id_init() has global linkage but is only used in util/stat.c. Make it static. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180312103807.45069-2-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf stat: Add support to print counts after a period of timeyuzhoujian
Introduce a new option to print counts after N milliseconds and update 'perf stat' documentation accordingly. Show below is the output of the new option for perf stat. $ perf stat --time 2000 -e cycles -a Performance counter stats for 'system wide': 157,260,423 cycles 2.003060766 seconds time elapsed We can print the count deltas after N milliseconds with this new introduced option. This option is not supported with "-I" option. In addition, according to Kangliang's patch(19afd10410957), the monitoring overhead for system-wide core event could be very high if the interval-print parameter was below 100ms, and the limitation value is 10ms. So the same warning will be displayed when the time is set between 10ms to 100ms, and the minimal time is limited to 10ms. Users can make a decision according to their spcific cases. Committer notes: This actually stops the workload after the specified time, then prints the counts. So I renamed the option to --timeout and updated the documentation to state that it will not just print the counts after the specified time, but will really stop the 'perf stat' session and print the counts. The rename from 'time' to 'timeout' also fixes the build in systems where 'time' is used by glibc and can't be used as a name of a variable, such as centos:5 and centos:6. Changes since v3: - none. Changes since v2: - modify the time check in __run_perf_stat func to keep some consistency with the workload case. - add the warning when the time is set between 10ms to 100ms. - add the pr_err when the time is set below 10ms. Changes since v1: - none. Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1517217923-8302-3-git-send-email-ufo19890607@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf stat: Add support to print counts for fixed timesyuzhoujian
Introduce a new option to print counts for fixed number of times and update 'perf stat' documentation accordingly. Show below is the output of the new option for perf stat. $ perf stat -I 1000 --interval-count 2 -e cycles -a # time counts unit events 1.002827089 93,884,870 cycles 2.004231506 56,573,446 cycles We can just print the counts for several times with this newly introduced option. The usage of it is a little like 'vmstat', and it should be used together with "-I" option. $ vmstat -n 1 2 procs ---------memory-------------- --swap- ----io-- -system-- ------cpu--- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 78270544 547484 51732076 0 0 0 20 1 1 1 0 99 0 0 0 0 0 78270512 547484 51732080 0 0 0 16 477 1555 0 0 100 0 0 Changes since v3: - merge interval_count check and times check to one line. - fix the wrong indent in stat.h - use stat_config.times instead of 'times' in cmd_stat function. Changes since v2: - none. Changes since v1: - change the name of the new option "times-print" to "interval-count". - keep the new option interval specifically. Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Resort '--per-thread' resultJin Yao
There are many threads reported if we enable '--per-thread' globally. 1. Most of the threads are not counted or counting value 0. This patch removes these threads. 2. We also resort the threads in display according to the counting value. It's useful for user to see the hottest threads easily. For example, the new results would be: root@skl:/tmp# perf stat --per-thread ^C Performance counter stats for 'system wide': perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized rcu_sched-8 10 context-switches # 0.117 M/sec kworker/u16:1-18249 7 context-switches # 0.056 M/sec sshd-23111 4 context-switches # 0.014 M/sec vmstat-23127 4 context-switches # 0.003 M/sec perf-24165 4 context-switches # 0.930 K/sec kworker/0:2-19991 3 context-switches # 0.022 M/sec kworker/u16:2-23146 3 context-switches # 0.039 M/sec kworker/4:1-15354 2 context-switches # 0.070 M/sec kworker/6:0-17528 2 context-switches # 0.084 M/sec sshd-23058 2 context-switches # 0.010 M/sec ksoftirqd/0-7 1 context-switches # 0.113 M/sec watchdog/0-11 1 context-switches # 0.092 M/sec watchdog/1-14 1 context-switches # 0.118 M/sec watchdog/2-20 1 context-switches # 0.412 M/sec watchdog/3-26 1 context-switches # 0.346 M/sec watchdog/4-32 1 context-switches # 0.362 M/sec watchdog/5-38 1 context-switches # 0.677 M/sec watchdog/6-44 1 context-switches # 0.671 M/sec watchdog/7-50 1 context-switches # 0.340 M/sec kworker/4:1H-1887 1 context-switches # 0.076 M/sec thermald-2841 1 context-switches # 0.004 M/sec gmain-2700 1 context-switches # 0.024 M/sec irqbalance-2780 1 context-switches # 0.001 M/sec kworker/3:2-12870 1 context-switches # 0.098 M/sec kworker/5:2-31362 1 context-switches # 0.086 M/sec kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec rcu_sched-8 1 cpu-migrations # 0.012 M/sec sshd-23058 1 cpu-migrations # 0.005 M/sec perf-24165 8,833,385 cycles # 2.053 GHz vmstat-23127 1,702,699 cycles # 1.090 GHz irqbalance-2780 739,847 cycles # 0.894 GHz sshd-23111 269,506 cycles # 0.968 GHz thermald-2841 204,556 cycles # 0.886 GHz sshd-23058 158,780 cycles # 0.766 GHz kworker/0:2-19991 112,981 cycles # 0.843 GHz kworker/u16:1-18249 100,926 cycles # 0.803 GHz rcu_sched-8 74,024 cycles # 0.865 GHz kworker/u16:2-23146 55,984 cycles # 0.726 GHz gmain-2700 34,278 cycles # 0.820 GHz kworker/4:1-15354 20,665 cycles # 0.728 GHz kworker/6:0-17528 16,445 cycles # 0.688 GHz kworker/5:2-31362 9,492 cycles # 0.816 GHz watchdog/3-26 8,695 cycles # 3.006 GHz kworker/4:1H-1887 8,238 cycles # 0.624 GHz watchdog/4-32 7,580 cycles # 2.747 GHz kworker/3:2-12870 7,306 cycles # 0.715 GHz watchdog/2-20 7,274 cycles # 2.995 GHz watchdog/0-11 6,988 cycles # 0.642 GHz ksoftirqd/0-7 6,376 cycles # 0.719 GHz watchdog/1-14 5,340 cycles # 0.630 GHz watchdog/5-38 4,061 cycles # 2.749 GHz watchdog/6-44 3,976 cycles # 2.667 GHz watchdog/7-50 3,418 cycles # 1.161 GHz vmstat-23127 2,511,699 instructions # 1.48 insn per cycle perf-24165 1,829,908 instructions # 0.21 insn per cycle irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle thermald-2841 143,544 instructions # 0.70 insn per cycle sshd-23111 128,138 instructions # 0.48 insn per cycle sshd-23058 57,654 instructions # 0.36 insn per cycle rcu_sched-8 44,063 instructions # 0.60 insn per cycle kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle gmain-2700 13,691 instructions # 0.40 insn per cycle kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle watchdog/0-11 3,135 instructions # 0.45 insn per cycle watchdog/1-14 3,135 instructions # 0.59 insn per cycle watchdog/2-20 3,135 instructions # 0.43 insn per cycle watchdog/3-26 3,135 instructions # 0.36 insn per cycle watchdog/4-32 3,135 instructions # 0.41 insn per cycle watchdog/5-38 3,135 instructions # 0.77 insn per cycle watchdog/6-44 3,135 instructions # 0.79 insn per cycle watchdog/7-50 3,135 instructions # 0.92 insn per cycle vmstat-23127 539,181 branches # 345.139 M/sec perf-24165 375,364 branches # 87.245 M/sec irqbalance-2780 262,092 branches # 316.593 M/sec thermald-2841 31,611 branches # 136.915 M/sec sshd-23111 21,874 branches # 78.596 M/sec sshd-23058 10,682 branches # 51.528 M/sec rcu_sched-8 8,693 branches # 101.633 M/sec kworker/u16:1-18249 7,891 branches # 62.808 M/sec kworker/0:2-19991 5,761 branches # 42.998 M/sec kworker/u16:2-23146 4,099 branches # 53.138 M/sec kworker/4:1-15354 2,755 branches # 97.110 M/sec gmain-2700 2,638 branches # 63.127 M/sec kworker/6:0-17528 2,216 branches # 92.739 M/sec kworker/5:2-31362 1,132 branches # 97.360 M/sec kworker/3:2-12870 1,081 branches # 105.773 M/sec kworker/4:1H-1887 725 branches # 54.887 M/sec ksoftirqd/0-7 707 branches # 79.716 M/sec watchdog/0-11 652 branches # 59.860 M/sec watchdog/1-14 652 branches # 76.923 M/sec watchdog/2-20 652 branches # 268.423 M/sec watchdog/3-26 652 branches # 225.372 M/sec watchdog/4-32 652 branches # 236.318 M/sec watchdog/5-38 652 branches # 441.435 M/sec watchdog/6-44 652 branches # 437.290 M/sec watchdog/7-50 652 branches # 221.467 M/sec vmstat-23127 8,960 branch-misses # 1.66% of all branches irqbalance-2780 3,047 branch-misses # 1.16% of all branches perf-24165 2,876 branch-misses # 0.77% of all branches sshd-23111 1,843 branch-misses # 8.43% of all branches thermald-2841 1,444 branch-misses # 4.57% of all branches sshd-23058 1,379 branch-misses # 12.91% of all branches kworker/u16:1-18249 982 branch-misses # 12.44% of all branches rcu_sched-8 893 branch-misses # 10.27% of all branches kworker/u16:2-23146 578 branch-misses # 14.10% of all branches kworker/0:2-19991 376 branch-misses # 6.53% of all branches gmain-2700 280 branch-misses # 10.61% of all branches kworker/6:0-17528 196 branch-misses # 8.84% of all branches kworker/4:1-15354 187 branch-misses # 6.79% of all branches kworker/5:2-31362 123 branch-misses # 10.87% of all branches watchdog/0-11 95 branch-misses # 14.57% of all branches watchdog/4-32 89 branch-misses # 13.65% of all branches kworker/3:2-12870 80 branch-misses # 7.40% of all branches watchdog/3-26 61 branch-misses # 9.36% of all branches kworker/4:1H-1887 60 branch-misses # 8.28% of all branches watchdog/2-20 52 branch-misses # 7.98% of all branches ksoftirqd/0-7 47 branch-misses # 6.65% of all branches watchdog/1-14 46 branch-misses # 7.06% of all branches watchdog/7-50 13 branch-misses # 1.99% of all branches watchdog/5-38 8 branch-misses # 1.23% of all branches watchdog/6-44 7 branch-misses # 1.07% of all branches 3.695150786 seconds time elapsed root@skl:/tmp# perf stat --per-thread -M IPC,CPI ^C Performance counter stats for 'system wide': vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC thermald-2841 1,472,670 inst_retired.any # 1.3 IPC sshd-23111 977,374 inst_retired.any # 1.2 IPC perf-24163 483,779 inst_retired.any # 0.2 IPC gmain-2700 341,213 inst_retired.any # 0.9 IPC sshd-23058 148,891 inst_retired.any # 0.8 IPC rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC rcu_sched-8 14,474 inst_retired.any # 0.8 IPC kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC watchdog/0-11 1,479 inst_retired.any # 0.3 IPC watchdog/1-14 1,479 inst_retired.any # 0.3 IPC watchdog/2-20 1,479 inst_retired.any # 0.4 IPC watchdog/3-26 1,479 inst_retired.any # 0.4 IPC watchdog/4-32 1,479 inst_retired.any # 0.3 IPC watchdog/5-38 1,479 inst_retired.any # 0.3 IPC watchdog/6-44 1,479 inst_retired.any # 0.7 IPC watchdog/7-50 1,479 inst_retired.any # 0.7 IPC kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC perf-24163 2,249,872 cpu_clk_unhalted.thread vmstat-23127 1,352,455 cpu_clk_unhalted.thread thermald-2841 1,161,140 cpu_clk_unhalted.thread sshd-23111 807,827 cpu_clk_unhalted.thread gmain-2700 375,535 cpu_clk_unhalted.thread sshd-23058 194,071 cpu_clk_unhalted.thread kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread kworker/0:2-19991 46,550 cpu_clk_unhalted.thread rcu_sched-8 18,855 cpu_clk_unhalted.thread rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread kworker/4:1-15354 8,812 cpu_clk_unhalted.thread kworker/5:2-31362 6,812 cpu_clk_unhalted.thread kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread kworker/6:0-17528 5,111 cpu_clk_unhalted.thread kworker/7:2-23145 4,667 cpu_clk_unhalted.thread watchdog/0-11 4,663 cpu_clk_unhalted.thread watchdog/1-14 4,663 cpu_clk_unhalted.thread watchdog/4-32 4,626 cpu_clk_unhalted.thread watchdog/5-38 4,403 cpu_clk_unhalted.thread watchdog/3-26 3,936 cpu_clk_unhalted.thread watchdog/2-20 3,850 cpu_clk_unhalted.thread kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread watchdog/6-44 2,017 cpu_clk_unhalted.thread watchdog/7-50 2,017 cpu_clk_unhalted.thread vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI thermald-2841 1,472,670 inst_retired.any # 0.8 CPI sshd-23111 977,374 inst_retired.any # 0.8 CPI perf-24163 495,037 inst_retired.any # 4.7 CPI gmain-2700 341,213 inst_retired.any # 1.1 CPI sshd-23058 148,891 inst_retired.any # 1.3 CPI rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI rcu_sched-8 14,474 inst_retired.any # 1.3 CPI kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI watchdog/0-11 1,479 inst_retired.any # 3.2 CPI watchdog/1-14 1,479 inst_retired.any # 3.2 CPI watchdog/2-20 1,479 inst_retired.any # 2.6 CPI watchdog/3-26 1,479 inst_retired.any # 2.7 CPI watchdog/4-32 1,479 inst_retired.any # 3.1 CPI watchdog/5-38 1,479 inst_retired.any # 3.0 CPI watchdog/6-44 1,479 inst_retired.any # 1.4 CPI watchdog/7-50 1,479 inst_retired.any # 1.4 CPI kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI perf-24163 2,302,323 cycles vmstat-23127 1,352,455 cycles thermald-2841 1,161,140 cycles sshd-23111 807,827 cycles gmain-2700 375,535 cycles sshd-23058 194,071 cycles kworker/u16:1-18249 114,306 cycles rtkit-daemon-3288 103,547 cycles kworker/0:2-19991 46,550 cycles rcu_sched-8 18,855 cycles rtkit-daemon-3289 17,549 cycles kworker/4:1-15354 8,812 cycles kworker/5:2-31362 6,812 cycles kworker/4:1H-1887 5,270 cycles kworker/6:0-17528 5,111 cycles kworker/7:2-23145 4,667 cycles watchdog/0-11 4,663 cycles watchdog/1-14 4,663 cycles watchdog/4-32 4,626 cycles watchdog/5-38 4,403 cycles watchdog/3-26 3,936 cycles watchdog/2-20 3,850 cycles kworker/u16:2-23146 2,654 cycles watchdog/6-44 2,017 cycles watchdog/7-50 2,017 cycles 2.175726600 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Allocate shadow stats buffer for threadsJin Yao
After perf_evlist__create_maps() being executed, we can get all threads from /proc. And via thread_map__nr(), we can also get the number of threads. With the number of threads, the patch allocates a buffer which will record the shadow stats for these threads. The buffer pointer is saved in stat_config. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Remove a set of shadow stats static variablesJin Yao
In previous patches, we have reconstructed the code and let it not access the static variables directly. This patch removes these static variables. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Print per-thread shadow statsJin Yao
The function perf_stat__print_shadow_stats() is called to print the shadow stats on a set of static variables. But the static variables are the limitations to support per-thread shadow stats. This patch lets the perf_stat__print_shadow_stats() support to print the shadow stats from a input parameter 'st'. It will not directly get value from static variable. Instead, it now uses runtime_stat_avg() and runtime_stat_n() to get and compute the values. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Update per-thread shadow statsJin Yao
The functions perf_stat__update_shadow_stats() is called to update the shadow stats on a set of static variables. But the static variables are the limitations to be extended to support per-thread shadow stats. This patch lets the perf_stat__update_shadow_stats() support to update the shadow stats on a input parameter 'st' and uses update_runtime_stat() to update the stats. It will not directly update the static variables as before. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Create the runtime_stat init/exit functionJin Yao
It mainly initializes and releases the rblist which is defined in struct runtime_stat. For the original rblist 'runtime_saved_values', it's still kept there for keeping the patch bisectable. The rblist 'runtime_saved_values' will be removed in later patch at switching time. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf stat: Define a structure for per-thread shadow statsJin Yao
Perf has a set of static variables to record the runtime shadow metrics stats. While if we want to record the runtime shadow stats for per-thread, it will be the limitation. This patch creates a structure and the next patches will use this structure to update the runtime shadow stats for per-thread. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-07Merge branch 'linus' into perf/core, to fix conflictsIngo Molnar
Conflicts: tools/perf/arch/arm/annotate/instructions.c tools/perf/arch/arm64/annotate/instructions.c tools/perf/arch/powerpc/annotate/instructions.c tools/perf/arch/s390/annotate/instructions.c tools/perf/arch/x86/tests/intel-cqm.c tools/perf/ui/tui/progress.c tools/perf/util/zlib.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-30perf stat: Move the shadow stats scale computation in ↵Jiri Olsa
perf_stat__update_shadow_stats Move the shadow stats scale computation to the perf_stat__update_shadow_stats() function, so it's centralized and we don't forget to do it. It also saves few lines of code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-htg7mmyxv6pcrf57qyo6msid@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-13perf stat: Support JSON metrics in perf statAndi Kleen
Add generic support for standalone metrics specified in JSON files to perf stat. A metric is a formula that uses multiple events to compute a higher level result (e.g. IPC). Previously metrics were always tied to an event and automatically enabled with that event. But now change it that we can have standalone metrics. They are in the same JSON data structure as events, but don't have an event name. We also allow to organize the metrics in metric groups, which allows a short cut to select several related metrics at once. Add a new -M / --metrics option to perf stat that adds the metrics or metric groups specified. Add the core code to manage and parse the metric groups. They are collected from the JSON data structures into a separate rblist. When computing shadow values look for metrics in that list. Then they are computed using the existing saved values infrastructure in stat-shadow.c The actual JSON metrics are in a separate pull request. % perf stat -M Summary --metric-only -a sleep 1 Performance counter stats for 'system wide': Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization 317614222.0 1392930775.0 0.0 0.0 0.2 0.1 1.001497549 seconds time elapsed % perf stat -M GFLOPs flops Performance counter stats for 'flops': 3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%) 14 fp_comp_ops_exe.sse_scalar_double (66.65%) 0 fp_comp_ops_exe.sse_packed_double (66.67%) 0 fp_comp_ops_exe.sse_packed_single (66.70%) 0 simd_fp_256.packed_double (66.70%) 0 simd_fp_256.packed_single (66.67%) 0 duration_time 3.238372845 seconds time elapsed v2: Add missing header file v3: Move find_map to pmu.c Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26perf evsel: Add read_counter()Jiri Olsa
Add perf_evsel__read_counter() to read single or group counter. After calling this function the counter's evsel::counts struct is filled with values for the counter and member of its group if there are any. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170726120206.9099-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf stat: Add support to measure SMI costKan Liang
Implementing a new --smi-cost mode in perf stat to measure SMI cost. During the measurement, the /sys/device/cpu/freeze_on_smi will be set. The measurement can be done with one counter (unhalted core cycles), and two free running MSR counters (IA32_APERF and SMI_COUNT). In practice, the percentages of SMI core cycles should be more useful than absolute value. So the output will be the percentage of SMI core cycles and SMI#. metric_only will be set by default. SMI cycles% = (aperf - unhalted core cycles) / aperf Here is an example output. Performance counter stats for 'sudo echo ': SMI cycles% SMI# 0.1% 1 0.010858678 seconds time elapsed Users who wants to get the actual value can apply additional --no-metric-only. Signed-off-by: Kan Liang <Kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf stat: Output JSON MetricExpr metricAndi Kleen
Add generic infrastructure to perf stat to output ratios for "MetricExpr" entries in the event lists. Many events are more useful as ratios than in raw form, typically some count in relation to total ticks. Transfer the MetricExpr information from the alias to the evsel. We mark the events that need to be collected for MetricExpr, and also link the events using them with a pointer. The code is careful to always prefer the right event in the same group to minimize multiplexing errors. At the moment only a single relation is supported. Then add a rblist to the stat shadow code that remembers stats based on the cpu and context. Then finally update and retrieve and print these values similarly to the existing hardcoded perf metrics. We use the simple expression parser added earlier to evaluate the expression. Normally we just output the result without further commentary, but for --metric-only this would lead to empty columns. So for this case use the original event as description. There is no attempt to automatically add the MetricExpr event, if it is missing, however we suggest it to the user, because the user tool doesn't have enough information to reliably construct a group that is guaranteed to schedule. So we leave that to the user. % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' 1.000147889 800,085,181 unc_p_clockticks 1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6 2.000448381 800,218,217 unc_p_clockticks 2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8 3.000639852 800,243,057 unc_p_clockticks 3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3 % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only # time freq_max_os_cycles % 1.000127077 0.9 2.000301436 0.7 3.000456379 0.0 v2: Change from DivideBy to MetricExpr v3: Use expr__ prefix. Support more than one other event. v4: Update description v5: Only print warning message once for multiple PMUs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06perf stat: Add computation of TopDown formulasAndi Kleen
Implement the TopDown formulas in 'perf stat'. The topdown basic metrics reported by the kernel are collected, and the formulas are computed and output as normal metrics. See the kernel commit exporting the events for details on the used metrics. Committer note: Output example: # perf stat --topdown -a usleep 1 Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound S0-C0 2 23.8% 11.6% 28.3% 36.3% S0-C1 2 16.2% 15.7% 36.5% 31.6% 0.000579956 seconds time elapsed # v2: Always print all metrics, only use thresholds for coloring. v3: Mark retiring over threshold green, not red. v4: Only print one decimal digit Fix color printing of one metric v5: Avoid printing -0.0 v6: Remove extra frontend event lookup Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1464119559-17203-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>