path: root/kernel/trace
Age  Commit message  Author
2013-04-09  kernel: tracing: Use strlcpy instead of strncpy  Chen Gang
Use strlcpy() instead of strncpy() as it will always add a '\0' to the end of the string even if the buffer is smaller than what is being copied. Link: http://lkml.kernel.org/r/51624254.30301@asianux.com Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
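The truncation behaviour this commit relies on can be sketched in user space. The helper `bounded_copy()` below is a hypothetical stand-in that mirrors the documented contract of strlcpy() (copy at most size - 1 bytes, always terminate, return the source length); it is not the kernel implementation. By contrast, strncpy(buf, "ftrace", 4) would fill all four bytes and leave buf without a terminating '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for strlcpy(): copies at most
 * size - 1 bytes and always NUL-terminates when size > 0.
 * Returns strlen(src) so the caller can detect truncation.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Returns 1 if the source did not fit into dst, 0 otherwise. */
static int copy_was_truncated(char *dst, const char *src, size_t size)
{
	assert(dst != NULL && src != NULL);
	return bounded_copy(dst, src, size) >= size;
}
```

Comparing the return value against the buffer size is the usual way to notice that a name was cut short.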
2013-04-08  ftrace: Do not call stub functions in control loop  Steven Rostedt (Red Hat)
The function tracing control loop used by perf spits out a warning if the called function is not a control function. This is because the control function references a per cpu allocated data structure on struct ftrace_ops that is not allocated for other types of functions.

Commit 0a016409e42 "ftrace: Optimize the function tracer list loop" had an optimization done to all function tracing loops to optimize for a single registered ops. Unfortunately, this allows for a slight race when tracing starts or ends, where the stub function might be called after the current registered ops is removed. In this case we get the following dump:

    root# perf stat -e ftrace:function sleep 1
    [ 74.339105] WARNING: at include/linux/ftrace.h:209 ftrace_ops_control_func+0xde/0xf0()
    [ 74.349522] Hardware name: PRIMERGY RX200 S6
    [ 74.357149] Modules linked in: sg igb iTCO_wdt ptp pps_core iTCO_vendor_support i7core_edac dca lpc_ich i2c_i801 coretemp edac_core crc32c_intel mfd_core ghash_clmulni_intel dm_multipath acpi_power_meter pcspkr microcode vhost_net tun macvtap macvlan nfsd kvm_intel kvm auth_rpcgss nfs_acl lockd sunrpc uinput xfs libcrc32c sd_mod crc_t10dif sr_mod cdrom mgag200 i2c_algo_bit drm_kms_helper ttm qla2xxx mptsas ahci drm libahci scsi_transport_sas mptscsih libata scsi_transport_fc i2c_core mptbase scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
    [ 74.446233] Pid: 1377, comm: perf Tainted: G W 3.9.0-rc1 #1
    [ 74.453458] Call Trace:
    [ 74.456233] [<ffffffff81062e3f>] warn_slowpath_common+0x7f/0xc0
    [ 74.462997] [<ffffffff810fbc60>] ? rcu_note_context_switch+0xa0/0xa0
    [ 74.470272] [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
    [ 74.478117] [<ffffffff81062e9a>] warn_slowpath_null+0x1a/0x20
    [ 74.484681] [<ffffffff81102ede>] ftrace_ops_control_func+0xde/0xf0
    [ 74.491760] [<ffffffff8162f400>] ftrace_call+0x5/0x2f
    [ 74.497511] [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
    [ 74.503486] [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
    [ 74.509500] [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
    [ 74.516088] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
    [ 74.522268] [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
    [ 74.528837] [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
    [ 74.536696] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
    [ 74.542878] [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
    [ 74.548869] [<ffffffff81105c67>] unregister_ftrace_function+0x27/0x50
    [ 74.556243] [<ffffffff8111eadf>] perf_ftrace_event_register+0x9f/0x140
    [ 74.563709] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
    [ 74.569887] [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
    [ 74.575898] [<ffffffff8111e94e>] perf_trace_destroy+0x2e/0x50
    [ 74.582505] [<ffffffff81127ba9>] tp_perf_event_destroy+0x9/0x10
    [ 74.589298] [<ffffffff811295d0>] free_event+0x70/0x1a0
    [ 74.595208] [<ffffffff8112a579>] perf_event_release_kernel+0x69/0xa0
    [ 74.602460] [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
    [ 74.608667] [<ffffffff8112a640>] put_event+0x90/0xc0
    [ 74.614373] [<ffffffff8112a740>] perf_release+0x10/0x20
    [ 74.620367] [<ffffffff811a3044>] __fput+0xf4/0x280
    [ 74.625894] [<ffffffff811a31de>] ____fput+0xe/0x10
    [ 74.631387] [<ffffffff81083697>] task_work_run+0xa7/0xe0
    [ 74.637452] [<ffffffff81014981>] do_notify_resume+0x71/0xb0
    [ 74.643843] [<ffffffff8162fa92>] int_signal+0x12/0x17

To fix this a new ftrace_ops flag is added that denotes the ftrace_list_end ftrace_ops stub as just that, a stub. This flag is now checked in the control loop and the function is not called if the flag is set.

Thanks to Jovi for not just reporting the bug, but also pointing out where the bug was in the code.

Link: http://lkml.kernel.org/r/514A8855.7090402@redhat.com
Link: http://lkml.kernel.org/r/1364377499-1900-15-git-send-email-jovi.zhangwei@huawei.com
Tested-by: WANG Chao <chaowang@redhat.com>
Reported-by: WANG Chao <chaowang@redhat.com>
Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08  ftrace: Consistently restore trace function on sysctl enabling  Jan Kiszka
If we reenable ftrace via sysctl, we currently set ftrace_trace_function based on the previous simplistic algorithm. This is inconsistent with what update_ftrace_function does. So better call that helper instead. Link: http://lkml.kernel.org/r/5151D26F.1070702@siemens.com Cc: stable@vger.kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08  tracing: Fix race with update_max_tr_single and changing tracers  Steven Rostedt (Red Hat)
The commit 34600f0e9 "tracing: Fix race with max_tr and changing tracers" fixed the updating of the main buffers with the race of changing tracers, but left out the fix to the updating of just a per cpu buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08  ftrace: Fix strncpy() use, use strlcpy() instead of strncpy()  Chen Gang
For NUL terminated string we always need to set '\0' at the end. Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: rostedt@goodmis.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/516243B7.9020405@asianux.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-08  perf: Fix strncpy() use, use strlcpy() instead of strncpy()  Chen Gang
For NUL terminated string we always need to set '\0' at the end. Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: rostedt@goodmis.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/51624254.30301@asianux.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-23  Export blk_fill_rwbs()  Kent Overstreet
Exported so it can be used by bcache's tracepoints Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com>
2013-03-20  tracing: Update debugfs README file  Steven Rostedt (Red Hat)
Update the README file in debugfs/tracing to something more useful. What's currently in the file is very old and what it shows doesn't have much use. Heck, it tells you how to mount debugfs! But to read this file you would have already needed to mount it. Replace the file with current up-to-date information. It's rather limited, but what do you expect from a pseudo README file. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-18  Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent  Ingo Molnar
Pull tracing fixes from Steven Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-15  tracing: Fix ftrace_dump()  Steven Rostedt (Red Hat)
ftrace_dump() had a lot of issues. What ftrace_dump() does, is when ftrace_dump_on_oops is set (via a kernel parameter or sysctl), it will dump out the ftrace buffers to the console when either an oops, panic, or a sysrq-z occurs.

This was written a long time ago when ftrace was fragile to recursion. But it wasn't written well even for that.

There's a possible deadlock that can occur if a ftrace_dump() is happening and an NMI triggers another dump. This is because it grabs a lock before checking if the dump ran.

It also totally disables ftrace, and tracing for no good reasons.

As the ring_buffer now checks if it is read via an oops or NMI, where there's a chance that the buffer gets corrupted, it will disable itself. No need to have ftrace_dump() do the same.

ftrace_dump() is now cleaned up where it uses an atomic counter to make sure only one dump happens at a time. A simple atomic_inc_return() is all that is needed to guard against both other CPUs and NMIs. No need for a spinlock, as if one CPU is running the dump, no other CPU needs to do it too.

The tracing_on variable is turned off and not turned on. The original code did this, but it wasn't pretty. By just disabling this variable we get the result of not seeing traces that happen between crashes.

For sysrq-z, it doesn't get turned on, but the user can always write a '1' to the tracing_on file. If they are using sysrq-z, then they should know about tracing_on.

The new code is much easier to read and less error prone. No more deadlock possibility when an NMI triggers here.

Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
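The "only one dump at a time" guard described in this commit can be sketched in user space with C11 atomics. This is a minimal illustration, not the kernel code: the names `dump_running`, `dump_try_enter()` and `dump_exit()` are hypothetical, and `atomic_fetch_add(...) + 1` stands in for the kernel's atomic_inc_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter: non-zero while a dump is in progress. */
static atomic_int dump_running;

/*
 * Returns true only for the first caller. A second caller (another
 * CPU, or an NMI that interrupts the dump) sees a count other than 1,
 * undoes its increment and backs off without taking any lock.
 */
static bool dump_try_enter(void)
{
	/* atomic_fetch_add() returns the old value, so old + 1 is the new count. */
	if (atomic_fetch_add(&dump_running, 1) + 1 != 1) {
		atomic_fetch_sub(&dump_running, 1);
		return false;
	}
	return true;
}

/* Called by the winner once the dump has finished. */
static void dump_exit(void)
{
	assert(atomic_load(&dump_running) >= 1);
	atomic_fetch_sub(&dump_running, 1);
}
```

Because a losing caller never waits, an NMI arriving in the middle of a dump cannot deadlock against the interrupted context, which is the property the spinlock version lacked.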
2013-03-15  tracing: Rename trace_event_mutex to trace_event_sem  zhangwei(Jovi)
trace_event_mutex is an rw semaphore now, not a mutex, change the name. Link: http://lkml.kernel.org/r/513D843B.40109@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> [ Forward ported to my new code ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Fix comment about prefix in arch_syscall_match_sym_name()  zhangwei(Jovi)
ppc64 has its own syscall prefix like ".SyS" or ".sys". Make the comment in arch_syscall_match_sym_name() more understandable. Link: http://lkml.kernel.org/r/513D842F.40205@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Convert trace_destroy_fields() to static  zhangwei(Jovi)
trace_destroy_fields() is not used outside of the file. It can be a static function. Link: http://lkml.kernel.org/r/513D842A.2000907@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Move find_event_field() into trace_events.c  zhangwei(Jovi)
By moving find_event_field() and trace_find_field() into trace_events.c, the ftrace_common_fields list and trace_get_fields() can become local to the trace_events.c file. find_event_field() is renamed to trace_find_event_field() to conform to the tracing global function names. Link: http://lkml.kernel.org/r/513D8426.9070109@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> [ rostedt: Modified trace_find_field() to trace_find_event_field() ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Use TRACE_MAX_PRINT instead of constant  zhangwei(Jovi)
TRACE_MAX_PRINT macro is defined, but is not used. Link: http://lkml.kernel.org/r/513D8421.4070404@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Use pr_warn_once instead of open coded implementation  zhangwei(Jovi)
Use pr_warn_once, instead of making an open coded implementation. Link: http://lkml.kernel.org/r/513D8419.20400@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  ring-buffer: Add ring buffer startup selftest  Steven Rostedt (Red Hat)
When testing my large changes to the ftrace system, there was a bug that looked like the ring buffer was dropping events. I wrote up a quick integrity checker of the ring buffer to see if it was. Although the bug ended up being something stupid I did in ftrace, and had nothing to do with the ring buffer, I figured if I spent the time to write up this test, I might as well include it in the kernel. I cleaned it up a bit, as the original version was rather ugly. Not saying this version is pretty, but it's a beauty queen compared to what I originally wrote. To enable the start up test, set CONFIG_RING_BUFFER_STARTUP_TEST. Note, it runs for 10 seconds, so it will slow your boot time by at least 10 more seconds. What it does is documented in both the comments and the Kconfig help. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add "perf" trace_clock  Steven Rostedt (Red Hat)
The function trace_clock() calls "local_clock()" which is exactly the same clock that perf uses. I'm not sure why perf doesn't call trace_clock(), as trace_clock() doesn't have any users. But now it does. As trace_clock() calls local_clock() like perf does, I added the trace_clock "perf" option that uses trace_clock(). Now the ftrace buffers can use the same clock as perf uses. This will be useful when perf starts reading the ftrace buffers, and will be able to interleave them with the same clock data. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add "uptime" trace clock that uses jiffies  Steven Rostedt (Red Hat)
Add a simple trace clock called "uptime" for those that are interested in the uptime of the trace. It uses jiffies as that's the safest method, as other uptime clocks grab seq locks, which could cause a deadlock if taken from an event or function tracer. Requested-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add function-trace option to disable function tracing of latency tracers  Steven Rostedt (Red Hat)
Currently, the only way to stop the latency tracers from doing function tracing is to fully disable the function tracer from the proc file system:

    echo 0 > /proc/sys/kernel/ftrace_enabled

This is a big hammer approach as it disables function tracing for all users. This includes kprobes, perf, stack tracer, etc.

Instead, create a function-trace option that the latency tracers can check to determine if it should enable function tracing or not. This option can be set or cleared even while the tracer is active and the tracers will disable or enable function tracing depending on how the option was set.

Instead of using the proc file, disable latency function tracing with

    echo 0 > /debug/tracing/options/function-trace

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Remove most or all of stack tracer stack size from stack_max_size  Steven Rostedt (Red Hat)
Currently, the depth reported in the stack tracer stack_trace file does not match the stack_max_size file. This is because the stack_max_size includes the overhead of stack tracer itself while the depth does not.

The first time a max is triggered, a calculation is now performed that figures out the overhead of the stack tracer and subtracts it from the stack_max_size variable. The overhead is stored and is subtracted from the reported stack size for comparing for a new max.

Now the stack_max_size corresponds to the reported depth:

    # cat stack_max_size
    4640
    # cat stack_trace
            Depth    Size   Location    (48 entries)
            -----    ----   --------
      0)     4640      32   _raw_spin_lock+0x18/0x24
      1)     4608     112   ____cache_alloc+0xb7/0x22d
      2)     4496      80   kmem_cache_alloc+0x63/0x12f
      3)     4416      16   mempool_alloc_slab+0x15/0x17
    [...]

While testing against an older gcc on x86 that uses mcount instead of fentry, I found that passing in ip + MCOUNT_INSN_SIZE let the stack trace show one more function deep which was missing before.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Fix stack tracer with fentry use  Steven Rostedt (Red Hat)
When gcc 4.6 on x86 is used, the function tracer will use the new option -mfentry which does a call to "fentry" at every function instead of "mcount". The significance of this is that fentry is called as the first operation of the function instead of the mcount usage of being called after the stack.

This causes the stack tracer to show some bogus results for the size of the last function traced, as well as showing "ftrace_call" instead of the function. This is due to the stack frame not being set up by the function that is about to be traced.

    # cat stack_trace
            Depth    Size   Location    (48 entries)
            -----    ----   --------
      0)     4824     216   ftrace_call+0x5/0x2f
      1)     4608     112   ____cache_alloc+0xb7/0x22d
      2)     4496      80   kmem_cache_alloc+0x63/0x12f

The 216 size for ftrace_call includes both the ftrace_call stack (which includes the saving of registers it does), as well as the stack size of the parent.

To fix this, if CC_USING_FENTRY is defined, then the stack_tracer will reserve the first item in stack_dump_trace[] array when calling save_stack_trace(), and it will fill it in with the parent ip. Then the code will look for the parent pointer on the stack and give the real size of the parent's stack pointer:

    # cat stack_trace
            Depth    Size   Location    (14 entries)
            -----    ----   --------
      0)     2640      48   update_group_power+0x26/0x187
      1)     2592     224   update_sd_lb_stats+0x2a5/0x4ac
      2)     2368     160   find_busiest_group+0x31/0x1f1
      3)     2208     256   load_balance+0xd9/0x662

I'm Cc'ing stable, although it's not urgent, as it only shows bogus size for item #0, the rest of the trace is legit. It should still be corrected in previous stable releases.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Use stack of calling function for stack tracer  Steven Rostedt (Red Hat)
Use the stack of stack_trace_call() instead of check_stack() as the test pointer for max stack size. It makes it a bit cleaner and a little more accurate. Adding stable, as a later fix depends on this patch. Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add function probe to trigger stack traces  Steven Rostedt (Red Hat)
Add a function probe that will cause a stack trace to be traced in the ring buffer when the given function(s) are called.

format is:

    <function>:stacktrace[:<count>]

    echo 'schedule:stacktrace' > /debug/tracing/set_ftrace_filter
    cat /debug/tracing/trace_pipe
    kworker/2:0-4329 [002] ...2 2933.558007: <stack trace>
     => kthread
     => ret_from_fork
    <idle>-0 [000] .N.2 2933.558019: <stack trace>
     => rest_init
     => start_kernel
     => x86_64_start_reservations
     => x86_64_start_kernel
    kworker/2:0-4329 [002] ...2 2933.558109: <stack trace>
     => kthread
     => ret_from_fork
    [...]

This can be set to only trace a specific amount of times:

    echo 'schedule:stacktrace:3' > /debug/tracing/set_ftrace_filter
    cat /debug/tracing/trace_pipe
    <...>-58 [003] ...2 841.801694: <stack trace>
     => kthread
     => ret_from_fork
    <idle>-0 [001] .N.2 841.801697: <stack trace>
     => start_secondary
    <...>-2059 [001] ...2 841.801736: <stack trace>
     => wait_for_common
     => wait_for_completion
     => flush_work
     => tty_flush_to_ldisc
     => input_available_p
     => n_tty_poll
     => tty_poll
     => do_select
     => core_sys_select
     => sys_select
     => system_call_fastpath

To remove these:

    echo '!schedule:stacktrace' > /debug/tracing/set_ftrace_filter
    echo '!schedule:stacktrace:0' > /debug/tracing/set_ftrace_filter

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add skip argument to trace_dump_stack()  Steven Rostedt (Red Hat)
Although the trace_dump_stack() already skips three functions in the call to stack trace, which gets the stack trace to start at the caller of the function, the caller may want to skip some more too (as it may have helper functions). Add a skip argument to the trace_dump_stack() that lets the caller skip back tracing functions that it doesn't care about. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add function probe triggers to enable/disable events  Steven Rostedt (Red Hat)
Add triggers to function tracer that lets an event get enabled or disabled when a function is called:

format is:

    <function>:enable_event:<system>:<event>[:<count>]
    <function>:disable_event:<system>:<event>[:<count>]

    echo 'schedule:enable_event:sched:sched_switch' > /debug/tracing/set_ftrace_filter

Every time schedule is called, it will enable the sched_switch event.

    echo 'schedule:disable_event:sched:sched_switch:2' > /debug/tracing/set_ftrace_filter

The first two times schedule is called while the sched_switch event is enabled, it will disable it. It will not count for a time that the event is already disabled (or enabled for enable_event).

[ fixed return without mutex_unlock() - thanks to Dan Carpenter and smatch ]

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
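The counting rule in this commit (a trigger only uses up one of its counts when it actually changes the event's state) can be modelled with a small user-space sketch. The types and the function `trigger_fire()` are hypothetical names chosen for illustration; the kernel's data structures are different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a trace event's on/off state. */
struct event_state {
	bool enabled;
};

/* Hypothetical model of an enable_event / disable_event trigger. */
struct event_trigger {
	struct event_state *event;
	bool enable;	/* true: enable_event, false: disable_event */
	long count;	/* remaining uses; -1 means unlimited */
};

/* Called each time the probed function (e.g. schedule) runs. */
static void trigger_fire(struct event_trigger *t)
{
	assert(t != NULL && t->event != NULL);

	if (t->count == 0)
		return;		/* count exhausted: trigger is inert */
	if (t->event->enabled == t->enable)
		return;		/* already in the wanted state: not counted */

	t->event->enabled = t->enable;
	if (t->count > 0)
		t->count--;
}
```

With a count of 2 and disable_event, the trigger disables the event twice at most; calls made while the event is already disabled leave the count untouched.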
2013-03-15  tracing: Add a way to soft disable trace events  Steven Rostedt (Red Hat)
In order to let triggers enable or disable events, we need a 'soft' method for doing so. For example, if a function probe is added that lets a user enable or disable events when a function is called, that change must be done without taking locks or a mutex, and definitely it can't sleep. But the full enabling of a tracepoint is expensive.

By adding a 'SOFT_DISABLE' flag, and converting the flags to be updated without the protection of a mutex (using set/clear_bit()), this soft disable flag can be used to allow critical sections to enable or disable events from being traced (after the event has been placed into "SOFT_MODE").

Some caveats though: The comm recorder (to map pids with a comm) can not be soft disabled (yet). If you disable an event with a "soft" disable and wait a while before reading the trace, the comm cache may be replaced and you'll get a bunch of <...> for comms in the trace.

Reading the "enable" file for an event that is disabled will now give you "0*" where the '*' denotes that the tracepoint is still active but the event itself is "disabled".

[ fixed _BIT used in & operation : thanks to Dan Carpenter and smatch ]

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
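The idea of flipping a flag bit without a mutex can be sketched in user space with C11 atomics, which play the role that set_bit()/clear_bit() play in the kernel. Everything below is an assumption made for illustration: the flag names only echo the commit message, and `struct event` is not the kernel's event structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the names only echo the commit message. */
#define EV_ENABLED      (1u << 0)	/* tracepoint is registered and active */
#define EV_SOFT_MODE    (1u << 1)	/* event may be soft enabled/disabled */
#define EV_SOFT_DISABLE (1u << 2)	/* tracepoint fires, nothing is recorded */

struct event {
	atomic_uint flags;
};

/* Lock-free and non-sleeping, so usable from a function probe. */
static void event_soft_disable(struct event *ev)
{
	atomic_fetch_or(&ev->flags, EV_SOFT_DISABLE);
}

static void event_soft_enable(struct event *ev)
{
	atomic_fetch_and(&ev->flags, ~EV_SOFT_DISABLE);
}

/* Checked on every hit of the tracepoint before writing to the buffer. */
static bool event_should_record(struct event *ev)
{
	unsigned int f;

	assert(ev != NULL);
	f = atomic_load(&ev->flags);
	return (f & EV_ENABLED) && !(f & EV_SOFT_DISABLE);
}
```

The expensive step (registering the tracepoint) is done once when the event enters soft mode; afterwards only a single bit changes, which is why the tracepoint itself stays active while the event reads as disabled.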
2013-03-15  ftrace: Use manual free after synchronize_sched() not call_rcu_sched()  Steven Rostedt (Red Hat)
The entries to the probe hash must be freed after a synchronize_sched() after the entry has been removed from the hash. As the entries are registered with ops that may have their own callbacks, and these callbacks may sleep, we can not use call_rcu_sched() because the rcu callbacks registered with that are called from a softirq context. Instead of using call_rcu_sched(), manually save the entries on a free_list and at the end of the loop that removes the entries, do a synchronize_sched() and then go through the free_list, freeing the entries. Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
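The "collect on a free_list, wait once, then free" pattern from this commit can be sketched in user space. This is only a model under stated assumptions: `wait_for_readers()` is a hypothetical stub standing in for synchronize_sched(), and `struct probe_entry` and the flat table are invented for the example rather than taken from the kernel's probe hash.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical probe entry; free_next links it onto the deferred list. */
struct probe_entry {
	struct probe_entry *free_next;
	int id;
};

/* Counts how often the grace-period stub was invoked. */
static int grace_periods;

/* Stub standing in for synchronize_sched(): a real one would block. */
static void wait_for_readers(void)
{
	grace_periods++;
}

/* Unlinks every entry, waits once, then frees them. Returns the number freed. */
static int remove_entries(struct probe_entry **table, size_t n)
{
	struct probe_entry *free_list = NULL;
	int freed = 0;

	assert(table != NULL);

	for (size_t i = 0; i < n; i++) {
		struct probe_entry *e = table[i];

		if (!e)
			continue;
		table[i] = NULL;		/* no new reader can find it */
		e->free_next = free_list;	/* defer the actual free */
		free_list = e;
	}

	wait_for_readers();			/* one wait for the whole batch */

	while (free_list) {
		struct probe_entry *e = free_list;

		free_list = e->free_next;
		free(e);			/* may call sleeping callbacks here */
		freed++;
	}
	return freed;
}
```

Freeing from the caller's own context, after a single wait, is what allows per-entry callbacks that sleep; a callback queued through call_rcu_sched() would run in softirq context instead.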
2013-03-15  ftrace: Clean up function probe methods  Steven Rostedt (Red Hat)
When a function probe is created, a "callback" method is called for each function that the probe is attached to. On release of the probe, each function entry calls the "free" method. First, "callback" is a confusing name and does not really match what it does. Callback sounds like it will be called when the probe triggers. But that's not the case. This is really an "init" function, so let's rename it as such. Secondly, both "init" and "free" do not pass enough information back to the handlers. Pass back the ops, ip and data for each time the method is called. We have the information, might as well use it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add snapshot trigger to function probes  Steven Rostedt (Red Hat)
    echo 'schedule:snapshot:1' > /debug/tracing/set_ftrace_filter

This will cause the scheduler to trigger a snapshot the next time it's called (you can use any function that's not called by NMI).

Even though it triggers only once, you still need to remove it with:

    echo '!schedule:snapshot:0' > /debug/tracing/set_ftrace_filter

The :1 can be left off for the first command:

    echo 'schedule:snapshot' > /debug/tracing/set_ftrace_filter

But this will cause all calls to schedule to trigger a snapshot. This must be removed without the ':0'

    echo '!schedule:snapshot' > /debug/tracing/set_ftrace_filter

As adding a "count" is a different operation (internally).

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add alloc/free_snapshot() to replace duplicate code  Steven Rostedt (Red Hat)
Add alloc_snapshot() and free_snapshot() to allocate and free the snapshot buffer respectively, and use these to remove duplicate code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  ftrace: Fix function probe to only enable needed functions  Steven Rostedt (Red Hat)
Currently the function probe enables all functions and runs a "hash" against every function call to see if it should call a probe. This is extremely wasteful. Note, a probe is something like: echo schedule:traceoff > /debug/tracing/set_ftrace_filter When schedule is called, the probe will disable tracing. But currently, it has a call back for *all* functions, and checks to see if the called function is the probe that is needed. The probe function has been created before ftrace was rewritten to allow for more than one "op" to be registered by the function tracer. When probes were created, it couldn't limit the functions without also limiting normal function calls. But now we can, it's about time to update the probe code. Todo, have separate ops for different entries. That is, assign a ftrace_ops per probe, instead of one op for all probes. But as there's not many probes assigned, this may not be that urgent. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  ftrace: Separate unlimited probes from count limited probes  Steven Rostedt (Red Hat)
The function tracing probes that trigger traceon or traceoff can be set to unlimited, or given a count of # of times to execute. By separating these two types of probes, we can then use the dynamic ftrace function filtering directly, and remove the brute force "check if this function called is my probe" routines in ftrace. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Consolidate ftrace_trace_onoff_unreg() into callback  Steven Rostedt (Red Hat)
The only thing ftrace_trace_onoff_unreg() does is to do a strcmp() against the cmd parameter to determine what op to unregister. But this compare is also done after the location that this function is called (and returns). By moving the check for '!' to unregister after the strcmp(), the callback function itself can just do the unregister and we can get rid of the helper function. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Consolidate updating of count for traceon/off  Steven Rostedt (Red Hat)
Remove some duplicate code and replace it with a helper function. This makes the code a bit cleaner. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Let tracing_snapshot() be used by modules but not NMI  Steven Rostedt (Red Hat)
Add EXPORT_SYMBOL_GPL() to let the tracing_snapshot() functions be called from modules. Also add a test to see if the snapshot was called from NMI context and just warn in the tracing buffer if so, and return. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add internal ftrace trace_puts() for ftrace to use  Steven Rostedt (Red Hat)
There's a few places that ftrace uses trace_printk() for internal use, but this requires context (normal, softirq, irq, NMI) buffers to keep things lockless. But the trace_puts() does not, as it can write the string directly into the ring buffer. Make an internal helper for trace_puts() and have the internal functions use that. This way the extra context buffers are not used. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add trace_puts() for even faster trace_printk() tracing  Steven Rostedt (Red Hat)
The trace_printk() is extremely fast and is very handy as it can be used in any context (including NMIs!). But it still requires scanning the fmt string for parsing the args. Even the trace_bprintk() requires a scan to know what args will be saved, although it doesn't copy the format string itself. Several times trace_printk() has no args, and wastes cpu cycles scanning the fmt string. Adding trace_puts() allows the developer to use an even faster tracing method that only saves the pointer to the string in the ring buffer without doing any format parsing at all. This will help remove even more of the "Heisenbug" effect, when debugging. Also fixed up the F_printk()s for the ftrace internal bprint and print events. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Fix the branch tracer that broke with buffer change  Steven Rostedt (Red Hat)
The change to add the trace_buffer struct to have the trace array have both the main buffer and max buffer broke the branch tracer because the change did not update that code. As the branch tracer adds a significant amount of overhead, and must be selected via a selection (not an allyesconfig) it was missed in testing. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add alloc_snapshot kernel command line parameter  Steven Rostedt (Red Hat)
If debugging the kernel, and the developer wants to use tracing_snapshot() in places where tracing_snapshot_alloc() may be difficult (or more likely, the developer is lazy and doesn't want to bother with tracing_snapshot_alloc() at all), then adding alloc_snapshot to the kernel command line parameter will tell ftrace to allocate the snapshot buffer (if configured) when it allocates the main tracing buffer. I also noticed that ring_buffer_expanded and tracing_selftest_disabled had inconsistent use of boolean "true" and "false" with "0" and "1". I cleaned that up too. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Move the tracing selftest code into its own function  Steven Rostedt (Red Hat)
Move the tracing startup selftest code into its own function and when not enabled, always have that function succeed. This makes the register_tracer() function much more readable. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  ring-buffer: Do not use schedule_work_on() for current CPU  Steven Rostedt (Red Hat)
Ring buffer updates that are done while the ring buffer is active need to be completed on the CPU that is used for the ring buffer per_cpu buffer. To accomplish this, schedule_work_on() is used to schedule work on the given CPU. Now there's no reason to use schedule_work_on() if the process doing the update happens to be on the CPU that it is processing. It has already filled the requirement. Instead, just do the work and continue. This is needed for tracing_snapshot_alloc() where it may be called really early in boot, where the work queues have not been set up yet. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
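The shortcut described in this commit (do the work inline when already on the target CPU, otherwise hand it to that CPU) can be modelled in user space as follows. All names are hypothetical: `queue_on_cpu()` is a stub standing in for schedule_work_on() that simply runs the work and counts the call, and the CPU numbers are passed in rather than queried.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*work_fn)(void *arg);

/* Counts how often work had to be handed to another CPU. */
static int queued_count;

/* Stub standing in for schedule_work_on(); a real one needs work queues. */
static void queue_on_cpu(int cpu, work_fn fn, void *arg)
{
	(void)cpu;
	queued_count++;
	fn(arg);
}

/* Runs fn for target_cpu, directly if the caller is already on that CPU. */
static void run_on_cpu(int target_cpu, int current_cpu, work_fn fn, void *arg)
{
	assert(fn != NULL);

	if (target_cpu == current_cpu) {
		fn(arg);	/* requirement already met: no work queue needed */
		return;
	}
	queue_on_cpu(target_cpu, fn, arg);
}

/* Example work item: increments the integer it is given. */
static void bump(void *arg)
{
	(*(int *)arg)++;
}
```

The direct path never touches the queueing machinery, which is what makes it usable early in boot before work queues exist.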
2013-03-15  tracing: Add internal tracing_snapshot() functions  Steven Rostedt (Red Hat)
The new snapshot feature is quite handy. It's a way for the user to take advantage of the spare buffer that, until then, only the latency tracers used to "snapshot" the buffer when it hit a max latency. Now users can trigger a "snapshot" manually when some condition is hit in a program. But a snapshot currently can not be triggered by a condition inside the kernel.

With the addition of tracing_snapshot() and tracing_snapshot_alloc(), snapshots can now be taken when a condition is hit, and the developer wants to snapshot the case without stopping the trace. Note, any snapshot will overwrite the old one, so take care in how this is done.

These new functions are to be used like tracing_on(), tracing_off() and trace_printk() are. That is, they should never be called in the mainline Linux kernel. They are solely for the purpose of debugging.

The tracing_snapshot() will not allocate a buffer, but it is safe to be called from any context (except NMIs). But if a snapshot buffer isn't allocated when it is called, it will write to the live buffer, complaining about the lack of a snapshot buffer, and then stop tracing (giving you the "permanent snapshot").

tracing_snapshot_alloc() will allocate the snapshot buffer if it was not already allocated and then take the snapshot. This routine *may sleep*, and must be called from context that can sleep. The allocation is done with GFP_KERNEL and not atomic.

If you need a snapshot in an atomic context, say in early boot, then it is best to call the tracing_snapshot_alloc() before then, where it will allocate the buffer, and then you can use the tracing_snapshot() anywhere you want and still get snapshots.

Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Prevent deleting instances when they are being read  Steven Rostedt (Red Hat)
Add a ref count to the trace_array structure and prevent removal of instances that have open descriptors.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
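A reference count guarding removal can be sketched as follows. This is a userspace model with hypothetical names (`struct instance_model`, `instance_open()`, `instance_release()`, `instance_remove()`); returning -EBUSY for a busy instance is an assumption made for illustration, since the commit message does not say which error is reported.

```c
#include <assert.h>
#include <errno.h>

struct instance_model {
    int ref;        /* number of open descriptors */
    int removed;    /* set once the instance has been deleted */
};

/* Opening a file of the instance takes a reference. */
static void instance_open(struct instance_model *in)
{
    in->ref++;
}

/* Closing the file drops the reference again. */
static void instance_release(struct instance_model *in)
{
    assert(in->ref > 0);
    in->ref--;
}

/* Removal is refused while any descriptor is still open. */
static int instance_remove(struct instance_model *in)
{
    if (in->ref > 0)
        return -EBUSY;   /* assumed error code, see lead-in */
    in->removed = 1;
    return 0;
}
```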
2013-03-15  tracing: Add per_cpu directory into tracing instances  Steven Rostedt (Red Hat)
Add the per_cpu directory to the created tracing instances:

 cd /sys/kernel/debug/tracing/instances
 mkdir foo
 ls foo/per_cpu/cpu0
 buffer_size_kb  snapshot_raw  trace  trace_pipe_raw
 snapshot  stats  trace_pipe

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add snapshot feature to instances  Steven Rostedt (Red Hat)
Add the "snapshot" file to the multi-buffer instances.

 cd /sys/kernel/debug/tracing/instances
 mkdir foo
 ls foo
 buffer_size_kb  buffer_total_size_kb  events  free_buffer  set_event  snapshot
 trace  trace_clock  trace_marker  trace_options  trace_pipe  tracing_on
 cat foo/snapshot
 # tracer: nop
 #
 #
 # * Snapshot is freed *
 #
 # Snapshot commands:
 # echo 0 > snapshot : Clears and frees snapshot buffer
 # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
 #                      Takes a snapshot of the main buffer.
 # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
 #                      (Doesn't have to be '2' works with any number that
 #                       is not a '0' or '1')

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
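The three write commands listed in the help text can be modelled as a small state machine. This is a userspace sketch with hypothetical names (`struct snap_model`, `snapshot_write()`); it tracks only whether the buffer is allocated and whether it holds data, not the buffer contents.

```c
#include <assert.h>
#include <stdbool.h>

struct snap_model {
    bool allocated;   /* snapshot buffer exists */
    bool has_data;    /* snapshot buffer currently holds a snapshot */
};

/* Model of writing a number to the "snapshot" file. */
static void snapshot_write(struct snap_model *s, long val)
{
    assert(s);

    switch (val) {
    case 0:
        /* Clears and frees the snapshot buffer. */
        s->allocated = false;
        s->has_data = false;
        break;
    case 1:
        /* Allocates if needed, then takes a snapshot of the main buffer. */
        s->allocated = true;
        s->has_data = true;
        break;
    default:
        /* Any other number: clears the buffer but never allocates it. */
        if (s->allocated)
            s->has_data = false;
        break;
    }
}
```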
2013-03-15  tracing: Consolidate buffer allocation code  Steven Rostedt (Red Hat)
There's a bit of duplicate code in creating the trace buffers for the normal trace buffer and the max trace buffer among the instances and the main global_trace. This code can be consolidated and cleaned up a bit, making it more readable and reducing duplication.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Have trace_array keep track if snapshot buffer is allocated  Steven Rostedt (Red Hat)
The snapshot buffer belongs to the trace array, not the tracer that is running. The trace array should be the data structure that keeps track of whether or not the snapshot buffer is allocated, not the tracer descriptor. Having the trace array keep track of it makes modifications so much easier.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add snapshot_raw to extract the raw data from snapshot  Steven Rostedt (Red Hat)
Add a 'snapshot_raw' per_cpu file that allows tools to read the raw binary data of the snapshot buffer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15  tracing: Add config option to allow snapshot to swap per cpu  Steven Rostedt (Red Hat)
When the preempt or irq latency tracers are enabled, they require the ring buffer to be able to swap the per cpu sub buffers between two main buffers. This adds a slight overhead to tracing, as the trace recording needs to perform some checks to synchronize between recording and swaps that might be happening on other CPUs.

The config RING_BUFFER_ALLOW_SWAP is set when a user of the ring buffer needs the "swap cpu" feature; otherwise the extra checks are not built in and add no tracing overhead.

The snapshot feature will swap per CPU if the RING_BUFFER_ALLOW_SWAP config is set. But that only gets set by things like OPROFILE and the irqs and preempt latency tracers.

This config is added to let the user decide to include this feature with the snapshot, regardless of whether another user of the ring buffer sets this config.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
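How a build-time option removes the extra checks from the recording path can be shown with a preprocessor sketch. Everything here is hypothetical: `MODEL_ALLOW_SWAP` merely stands in for the RING_BUFFER_ALLOW_SWAP config, and `struct rb_model`, `rb_record()` and `rb_swap_cpu()` are illustrative names, not ring buffer API.

```c
#include <assert.h>

/* Uncomment to model a build where the "swap cpu" feature is selected. */
/* #define MODEL_ALLOW_SWAP 1 */

struct rb_model {
    int events;        /* events recorded */
    int swap_checks;   /* synchronization checks done while recording */
};

/* Recording path: the swap check exists only when the option is set. */
static void rb_record(struct rb_model *rb)
{
    assert(rb);
#ifdef MODEL_ALLOW_SWAP
    rb->swap_checks++;   /* extra work needed to stay safe against swaps */
#endif
    rb->events++;
}

/* Per-CPU swap: returns 0 on success, -1 if the feature is not built in. */
static int rb_swap_cpu(struct rb_model *a, struct rb_model *b)
{
#ifdef MODEL_ALLOW_SWAP
    struct rb_model tmp = *a;

    *a = *b;
    *b = tmp;
    return 0;
#else
    (void)a;
    (void)b;
    return -1;
#endif
}
```

With the macro left undefined, as written, the recording path performs no swap checks at all, which is the overhead saving the option is meant to preserve for users who do not need per-CPU swaps.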