Age | Commit message (Collapse) | Author |
|
One more step in the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ibtn3vua76f934t7woyf26w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Continuing the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-z8d14wrw393a0fbvmnk1bqd9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At some point those stopped being used, prune them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-p2k98mj3ff2uk1z95sbl5r6e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There are still lots of lookups by name, even if just when loading
vmlinux, till that code is studied to figure out if its possible to do
away with those map lookup by names, provide a way to sort it using
libc's qsort/bsearch.
Doing it at the first lookup defers the sorting a bit, and as the code
stands now, is never done for user maps, just for the kernel ones.
# perf probe -l
# perf probe -x ~/bin/perf -L __map_groups__find_by_name
<__map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 static struct map *__map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
struct map **mapp;
4 if (mg->maps_by_name == NULL &&
5 map__groups__sort_by_name_from_rbtree(mg))
6 return NULL;
8 mapp = bsearch(name, mg->maps_by_name, mg->nr_maps, sizeof(*mapp), map__strcmp_name);
9 if (mapp)
10 return *mapp;
11 return NULL;
12 }
struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
{
# perf probe -x ~/bin/perf 'found=__map_groups__find_by_name:10 name:string'
Added new event:
probe_perf:found (on __map_groups__find_by_name:10 in /home/acme/bin/perf with name:string)
You can now use it in all perf tools, such as:
perf record -e probe_perf:found -aR sleep 1
#
# perf probe -x ~/bin/perf -L map_groups__find_by_name
<map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
2 struct maps *maps = &mg->maps;
struct map *map;
5 down_read(&maps->lock);
7 if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
8 map = mg->last_search_by_name;
9 goto out_unlock;
}
/*
* If we have mg->maps_by_name, then the name isn't in the rbtree,
* as mg->maps_by_name mirrors the rbtree when lookups by name are
* made.
*/
16 map = __map_groups__find_by_name(mg, name);
17 if (map || mg->maps_by_name != NULL)
18 goto out_unlock;
/* Fallback to traversing the rbtree... */
21 maps__for_each_entry(maps, map)
22 if (strcmp(map->dso->short_name, name) == 0) {
23 mg->last_search_by_name = map;
24 goto out_unlock;
}
27 map = NULL;
out_unlock:
30 up_read(&maps->lock);
31 return map;
32 }
int dso__load_vmlinux(struct dso *dso, struct map *map,
const char *vmlinux, bool vmlinux_allocated)
# perf probe -x ~/bin/perf 'fallback=map_groups__find_by_name:21 name:string'
Added new events:
probe_perf:fallback (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
probe_perf:fallback_1 (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
You can now use it in all perf tools, such as:
perf record -e probe_perf:fallback_1 -aR sleep 1
#
# perf probe -l
probe_perf:fallback (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
probe_perf:fallback_1 (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
probe_perf:found (on __map_groups__find_by_name:10@util/symbol.c in /home/acme/bin/perf with name_string)
#
# perf stat -e probe_perf:*
Now run 'perf top' in another term and then, after a while, stop 'perf stat':
Furthermore, if we ask for interval printing, we can see that that is done just
at the start of the workload:
# perf stat -I1000 -e probe_perf:*
# time counts unit events
1.000319513 0 probe_perf:found
1.000319513 0 probe_perf:fallback_1
1.000319513 0 probe_perf:fallback
2.001868092 23,251 probe_perf:found
2.001868092 0 probe_perf:fallback_1
2.001868092 0 probe_perf:fallback
3.002901597 0 probe_perf:found
3.002901597 0 probe_perf:fallback_1
3.002901597 0 probe_perf:fallback
4.003358591 0 probe_perf:found
4.003358591 0 probe_perf:fallback_1
4.003358591 0 probe_perf:fallback
^C
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c5lmbyr14x448rcfii7y6t3k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Lets see if it helps:
First look at the probeable lines for the function that does lookups by
name in a map_groups struct:
# perf probe -x ~/bin/perf -L map_groups__find_by_name
<map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
2 struct maps *maps = &mg->maps;
struct map *map;
5 down_read(&maps->lock);
7 if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
8 map = mg->last_search_by_name;
9 goto out_unlock;
}
12 maps__for_each_entry(maps, map)
13 if (strcmp(map->dso->short_name, name) == 0) {
14 mg->last_search_by_name = map;
15 goto out_unlock;
}
18 map = NULL;
out_unlock:
21 up_read(&maps->lock);
22 return map;
23 }
int dso__load_vmlinux(struct dso *dso, struct map *map,
const char *vmlinux, bool vmlinux_allocated)
#
Now add a probe to the place where we reuse the last search:
# perf probe -x ~/bin/perf map_groups__find_by_name:8
Added new event:
probe_perf:map_groups__find_by_name (on map_groups__find_by_name:8 in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:map_groups__find_by_name -aR sleep 1
#
Now lets do a system wide 'perf stat' counting those events:
# perf stat -e probe_perf:*
Leave it running and lets do a 'perf top', then, after a while, stop the
'perf stat':
# perf stat -e probe_perf:*
^C
Performance counter stats for 'system wide':
3,603 probe_perf:map_groups__find_by_name
44.565253139 seconds time elapsed
#
yeah, good to have.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tcz37g3nxv3tvxw3q90vga3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is only used for the kernel maps, shave 24 bytes out 'struct map'
and just traverse the existing per ip rbtree to look for maps by name,
use a front end cache to reuse the last search if its the same name.
After this 'struct map' is down to just two cachelines:
$ pahole -C map ~/bin/perf
struct map {
union {
struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 */
struct list_head node; /* 0 16 */
} __attribute__((__aligned__(8))); /* 0 24 */
u64 start; /* 24 8 */
u64 end; /* 32 8 */
_Bool erange_warned; /* 40 1 */
/* XXX 3 bytes hole, try to pack */
u32 priv; /* 44 4 */
u32 prot; /* 48 4 */
u32 flags; /* 52 4 */
u64 pgoff; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u64 reloc; /* 64 8 */
u32 maj; /* 72 4 */
u32 min; /* 76 4 */
u64 ino; /* 80 8 */
u64 ino_generation; /* 88 8 */
u64 (*map_ip)(struct map *, u64); /* 96 8 */
u64 (*unmap_ip)(struct map *, u64); /* 104 8 */
struct dso * dso; /* 112 8 */
refcount_t refcnt; /* 120 4 */
/* size: 128, cachelines: 2, members: 17 */
/* sum members: 121, holes: 1, sum holes: 3 */
/* padding: 4 */
/* forced alignments: 1 */
} __attribute__((__aligned__(8)));
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-bvr8fqfgzxtgnhnwt5sssx5g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No need to iterate via the ->names rbtree, as all the entries there
as in maps->entries as well, reuse __maps__purge() for that.
Doing it this way we can kill maps__for_each_entry_by_name(),
maps__for_each_entry_by_name_safe(), maps__{first,next}_by_name().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ps0nrio8pydyo23rr2s696ue@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We were just passing a map to look for and reuse its map->groups member,
but the idea is that this is going away, as a map can be in multiple
rb_trees when being reused via a map_node, so do as all the other
map_groups methods and pass as its first arg the object being operated
on.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-nmi2pbggqloogwl6vxrvex5a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To reduce boilerplate, providing a more compact form to iterate over the
maps in a map_group.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-gc3go6fmdn30twusg91t2q56@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To reduce boilerplate, provide a more compact form using an idiom
present in other trees of data structures.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit e5adfc3e7e77 ("perf map: Synthesize maps only for thread group
leader") changed the recording side so that we no longer get mmap events
for threads other than the thread group leader (when synthesising these
events for threads which exist before perf is started).
When a file recorded after this change is loaded, the lack of mmap
records mean that unwinding is not set up for any other threads.
This can be seen in a simple record/report scenario:
perf record --call-graph=dwarf -t $TID
perf report
If $TID is a process ID then the report will show call graphs, but if
$TID is a secondary thread the output is as if --call-graph=none was
specified.
Following the rationale in that commit, move the libunwind fields into
struct map_groups and update the libunwind functions to take this
instead of the struct thread. This is only required for
unwind__finish_access which must now be called from map_groups__delete
and the others are changed for symmetry.
Note that unwind__get_entries keeps the thread argument since it is
required for symbol lookup and the libdw unwind provider uses the thread
ID.
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e5adfc3e7e77 ("perf map: Synthesize maps only for thread group leader")
Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add map_groups__merge_in test to test the map_groups__merge_in function
usage - merging kcore maps into existing eBPF maps.
Committer testing:
# perf test merge
59: map_groups__merge_in : Ok
# perf test -v merge
59: map_groups__merge_in :
--- start ---
test child forked, pid 8349
test child finished with 0
---- end ----
map_groups__merge_in: Ok
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
And since machine.h only needs what is in there, make it stop including
map.h and instead include this newly introduced map_groups.h instead.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dbob25fv5rp2rjpwlnterf38@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|