summaryrefslogtreecommitdiff
path: root/kernel/trace/trace.c
AgeCommit message (Collapse)Author
2020-08-07Merge tag 'trace-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - The biggest news in that the tracing ring buffer can now time events that interrupted other ring buffer events. Before this change, if an interrupt came in while recording another event, and that interrupt also had an event, those events would all have the same time stamp as the event it interrupted. Now, with the new design, those events will have a unique time stamp and rightfully display the time for those events that were recorded while interrupting another event. - Bootconfig how has an "override" operator that lets the users have a default config, but then add options to override the default. - A fix was made to properly filter function graph tracing to the ftrace PIDs. This came in at the end of the -rc cycle, and needs to be backported. - Several clean ups, performance updates, and minor fixes as well. * tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (39 commits) tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE tracing: Use trace_sched_process_free() instead of exit() for pid tracing bootconfig: Fix to find the initargs correctly Documentation: bootconfig: Add bootconfig override operator tools/bootconfig: Add testcases for value override operator lib/bootconfig: Add override operator support kprobes: Remove show_registers() function prototype tracing/uprobe: Remove dead code in trace_uprobe_register() kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler ftrace: Fix ftrace_trace_task return value tracepoint: Use __used attribute definitions from compiler_attributes.h tracepoint: Mark __tracepoint_string's __used trace : Have tracing buffer info use kvzalloc instead of kzalloc tracing: Remove outdated comment in stack handling ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines ftrace: Setup correct FTRACE_FL_REGS flags for module tracing/hwlat: Honor the tracing_cpumask tracing/hwlat: Drop the duplicate assignment in start_kthread() tracing: Save one trace_event->type by using __TRACE_LAST_TYPE ...
2020-08-07tracing: Add trace_array_init_printk() to initialize instance trace_printk() ↵Steven Rostedt (VMware)
buffers As trace_array_printk() used with not global instances will not add noise to the main buffer, they are OK to have in the kernel (unlike trace_printk()). This require the subsystem to create their own tracing instance, and the trace_array_printk() only writes into those instances. Add trace_array_init_printk() to initialize the trace_printk() buffers without printing out the WARNING message. Reported-by: Sean Paul <sean@poorly.run> Reviewed-by: Sean Paul <sean@poorly.run> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-08-06Merge tag 'fsnotify_for_v5.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: - fanotify fix for softlockups when there are many queued events - performance improvement to reduce fsnotify overhead when not used - Amir's implementation of fanotify events with names. With these you can now efficiently monitor whole filesystem, eg to mirror changes to another machine. * tag 'fsnotify_for_v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (37 commits) fanotify: compare fsid when merging name event fsnotify: create method handle_inode_event() in fsnotify_operations fanotify: report parent fid + child fid fanotify: report parent fid + name + child fid fanotify: add support for FAN_REPORT_NAME fanotify: report events with parent dir fid to sb/mount/non-dir marks fanotify: add basic support for FAN_REPORT_DIR_FID fsnotify: remove check that source dentry is positive fsnotify: send event with parent/name info to sb/mount/non-dir marks audit: do not set FS_EVENT_ON_CHILD in audit marks mask inotify: do not set FS_EVENT_ON_CHILD in non-dir mark mask fsnotify: pass dir and inode arguments to fsnotify() fsnotify: create helper fsnotify_inode() fsnotify: send event to parent and child with single callback inotify: report both events on parent and child with single callback dnotify: report both events on parent and child with single callback fanotify: no external fh buffer in fanotify_name_event fanotify: use struct fanotify_info to parcel the variable size buffer fsnotify: add object type "child" to object type iterator fanotify: use FAN_EVENT_ON_CHILD as implicit flag on sb/mount/non-dir marks ...
2020-08-03trace : Have tracing buffer info use kvzalloc instead of kzallocZhaoyang Huang
High order memory stuff within trace could introduce OOM, use kvzalloc instead. Please find the bellowing for the call stack we run across in an android system. The scenario happens when traced_probes is woken up to get a large quantity of trace even if free memory is even higher than watermark_low.  traced_probes invoked oom-killer: gfp_mask=0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null), order=2, oom_score_adj=-1 traced_probes cpuset=system-background mems_allowed=0 CPU: 3 PID: 588 Comm: traced_probes Tainted: G W O 4.14.181 #1 Hardware name: Generic DT based system (unwind_backtrace) from [<c010d824>] (show_stack+0x20/0x24) (show_stack) from [<c0b2e174>] (dump_stack+0xa8/0xec) (dump_stack) from [<c027d584>] (dump_header+0x9c/0x220) (dump_header) from [<c027cfe4>] (oom_kill_process+0xc0/0x5c4) (oom_kill_process) from [<c027cb94>] (out_of_memory+0x220/0x310) (out_of_memory) from [<c02816bc>] (__alloc_pages_nodemask+0xff8/0x13a4) (__alloc_pages_nodemask) from [<c02a6a1c>] (kmalloc_order+0x30/0x48) (kmalloc_order) from [<c02a6a64>] (kmalloc_order_trace+0x30/0x118) (kmalloc_order_trace) from [<c0223d7c>] (tracing_buffers_open+0x50/0xfc) (tracing_buffers_open) from [<c02e6f58>] (do_dentry_open+0x278/0x34c) (do_dentry_open) from [<c02e70d0>] (vfs_open+0x50/0x70) (vfs_open) from [<c02f7c24>] (path_openat+0x5fc/0x169c) (path_openat) from [<c02f75c4>] (do_filp_open+0x94/0xf8) (do_filp_open) from [<c02e7650>] (do_sys_open+0x168/0x26c) (do_sys_open) from [<c02e77bc>] (SyS_openat+0x34/0x38) (SyS_openat) from [<c0108bc0>] (ret_fast_syscall+0x0/0x28) Link: https://lkml.kernel.org/r/1596155265-32365-1-git-send-email-zhaoyang.huang@unisoc.com Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30tracing: Remove outdated comment in stack handlingVincent Whitchurch
This comment describes the behaviour before commit 2a820bf74918 ("tracing: Use percpu stack trace buffer more intelligently"). Since that commit, interrupts and NMIs do use the per-cpu stacks so the comment is no longer correct. Remove it. (Note that the FTRACE_STACK_SIZE mentioned in the comment has never existed, it probably should have said FTRACE_STACK_ENTRIES.) Link: https://lkml.kernel.org/r/20200727092840.18659-1-vincent.whitchurch@axis.com Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-27fsnotify: create helper fsnotify_inode()Amir Goldstein
Simple helper to consolidate biolerplate code. Link: https://lore.kernel.org/r/20200722125849.17418-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-23tracefs: Remove unnecessary debug_fs checks.Peter Enderborg
This is a preparation for debugfs restricted mode. We don't need debugfs to trace, the removed check stop tracefs to work if debugfs is not initialised. We instead tries to automount within debugfs and relay on it's handling. The code path is to create a backward compatibility from when tracefs was part of debugfs, it is now standalone and does not need debugfs. When debugfs is in restricted it is compiled in but not active and return EPERM to clients and tracefs wont work if it assumes it is active it is compiled in kernel. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Enderborg <peter.enderborg@sony.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20200716071511.26864-2-peter.enderborg@sony.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPUNicholas Piggin
On a 144 thread system, `perf ftrace` takes about 20 seconds to start up, due to calling synchronize_rcu() for each CPU. cat /proc/108560/stack 0xc0003e7eb336f470 __switch_to+0x2e0/0x480 __wait_rcu_gp+0x20c/0x220 synchronize_rcu+0x9c/0xc0 ring_buffer_reset_cpu+0x88/0x2e0 tracing_reset_online_cpus+0x84/0xe0 tracing_open+0x1d4/0x1f0 On a system with 10x more threads, it starts to become an annoyance. Batch these up so we disable all the per-cpu buffers first, then synchronize_rcu() once, then reset each of the buffers. This brings the time down to about 0.5s. Link: https://lkml.kernel.org/r/20200625053403.2386972-1-npiggin@gmail.com Tested-by: Anton Blanchard <anton@ozlabs.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-30tracing: Move pipe reference to trace array instead of current_tracerSteven Rostedt (VMware)
If a process has the trace_pipe open on a trace_array, the current tracer for that trace array should not be changed. This was original enforced by a global lock, but when instances were introduced, it was moved to the current_trace. But this structure is shared by all instances, and a trace_pipe is for a single instance. There's no reason that a process that has trace_pipe open on one instance should prevent another instance from changing its current tracer. Move the reference counter to the trace_array instead. This is marked as "Fixes" but is more of a clean up than a true fix. Backport if you want, but its not critical. Fixes: cf6ab6d9143b1 ("tracing: Add ref count to tracer for when they are being read by pipe") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-29tracing: Only allow trace_array_printk() to be used by instancesSteven Rostedt (VMware)
To prevent default "trace_printks()" from spamming the top level tracing ring buffer, only allow trace instances to use trace_array_printk() (which can be used without the trace_printk() start up warning). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-16tracing: Remove unused event variable in tracing_iter_resetYangHui
We do not use the event variable, just remove it. Signed-off-by: YangHui <yanghui.def@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-09Merge tag 'trace-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "No new features this release. Mostly clean ups, restructuring and documentation. - Have ftrace_bug() show ftrace errors before the WARN, as the WARN will reboot the box before the error messages are printed if panic_on_warn is set. - Have traceoff_on_warn disable tracing sooner (before prints) - Write a message to the trace buffer that its being disabled when disable_trace_on_warning() is set. - Separate out synthetic events from histogram code to let it be used by other parts of the kernel. - More documentation on histogram design. - Other small fixes and clean ups" * tag 'trace-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Remove obsolete PREEMPTIRQ_EVENTS kconfig option tracing/doc: Fix ascii-art in histogram-design.rst tracing: Add a trace print when traceoff_on_warning is triggered ftrace,bug: Improve traceoff_on_warn selftests/ftrace: Distinguish between hist and synthetic event checks tracing: Move synthetic events to a separate file tracing: Fix events.rst section numbering tracing/doc: Fix typos in histogram-design.rst tracing: Add hist_debug trace event files for histogram debugging tracing: Add histogram-design document tracing: Check state.disabled in synth event trace functions tracing/probe: reverse arguments to list_add tools/bootconfig: Add a summary of test cases and return error ftrace: show debugging information when panic_on_warn set
2020-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz Augusto von Dentz. 2) Add GSO partial support to igc, from Sasha Neftin. 3) Several cleanups and improvements to r8169 from Heiner Kallweit. 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a device self-test. From Andrew Lunn. 5) Start moving away from custom driver versions, use the globally defined kernel version instead, from Leon Romanovsky. 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin. 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet. 8) Add sriov and vf support to hinic, from Luo bin. 9) Support Media Redundancy Protocol (MRP) in the bridging code, from Horatiu Vultur. 10) Support netmap in the nft_nat code, from Pablo Neira Ayuso. 11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina Dubroca. Also add ipv6 support for espintcp. 12) Lots of ReST conversions of the networking documentation, from Mauro Carvalho Chehab. 13) Support configuration of ethtool rxnfc flows in bcmgenet driver, from Doug Berger. 14) Allow to dump cgroup id and filter by it in inet_diag code, from Dmitry Yakunin. 15) Add infrastructure to export netlink attribute policies to userspace, from Johannes Berg. 16) Several optimizations to sch_fq scheduler, from Eric Dumazet. 17) Fallback to the default qdisc if qdisc init fails because otherwise a packet scheduler init failure will make a device inoperative. From Jesper Dangaard Brouer. 18) Several RISCV bpf jit optimizations, from Luke Nelson. 19) Correct the return type of the ->ndo_start_xmit() method in several drivers, it's netdev_tx_t but many drivers were using 'int'. From Yunjian Wang. 20) Add an ethtool interface for PHY master/slave config, from Oleksij Rempel. 21) Add BPF iterators, from Yonghang Song. 22) Add cable test infrastructure, including ethool interfaces, from Andrew Lunn. Marvell PHY driver is the first to support this facility. 23) Remove zero-length arrays all over, from Gustavo A. R. Silva. 24) Calculate and maintain an explicit frame size in XDP, from Jesper Dangaard Brouer. 25) Add CAP_BPF, from Alexei Starovoitov. 26) Support terse dumps in the packet scheduler, from Vlad Buslov. 27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei. 28) Add devm_register_netdev(), from Bartosz Golaszewski. 29) Minimize qdisc resets, from Cong Wang. 30) Get rid of kernel_getsockopt and kernel_setsockopt in order to eliminate set_fs/get_fs calls. From Christoph Hellwig. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits) selftests: net: ip_defrag: ignore EPERM net_failover: fixed rollback in net_failover_open() Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv" Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv" vmxnet3: allow rx flow hash ops only when rss is enabled hinic: add set_channels ethtool_ops support selftests/bpf: Add a default $(CXX) value tools/bpf: Don't use $(COMPILE.c) bpf, selftests: Use bpf_probe_read_kernel s390/bpf: Use bcr 0,%0 as tail call nop filler s390/bpf: Maintain 8-byte stack alignment selftests/bpf: Fix verifier test selftests/bpf: Fix sample_cnt shared between two threads bpf, selftests: Adapt cls_redirect to call csum_level helper bpf: Add csum_level helper for fixing up csum levels bpf: Fix up bpf_skb_adjust_room helper's skb csum setting sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf() crypto/chtls: IPv6 support for inline TLS Crypto/chcr: Fixes a coccinile check error Crypto/chcr: Fixes compilations warnings ...
2020-06-03Merge branch 'work.splice' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull splice updates from Al Viro: "Christoph's assorted splice cleanups" * 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: rename pipe_buf ->steal to ->try_steal fs: make the pipe_buf_operations ->confirm operation optional fs: make the pipe_buf_operations ->steal operation optional trace: remove tracing_pipe_buf_ops pipe: merge anon_pipe_buf*_ops fs: simplify do_splice_from fs: simplify do_splice_to
2020-06-02mm: remove vmalloc_sync_(un)mappings()Joerg Roedel
These functions are not needed anymore because the vmalloc and ioremap mappings are now synchronized when they are created or torn down. Remove all callers and function definitions. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200515140023.25469-7-joro@8bytes.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01tracing: Add a trace print when traceoff_on_warning is triggeredSteven Rostedt (VMware)
When "traceoff_on_warning" is enabled and a warning happens, there can still be many trace events happening on other CPUs between the time the warning occurred and the last trace event on that same CPU. This can cause confusion in examining the trace, as it may not be obvious where the warning happened. By adding a trace print into the trace just before disabling tracing, it makes it obvious where the warning occurred, and the developer doesn't have to look at other means to see what CPU it occurred on. Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-20fs: make the pipe_buf_operations ->confirm operation optionalChristoph Hellwig
Just return 0 for success if it is not present. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20fs: make the pipe_buf_operations ->steal operation optionalChristoph Hellwig
Just return 1 for failure if it is not present. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20trace: remove tracing_pipe_buf_opsChristoph Hellwig
tracing_pipe_buf_ops has identical ops to default_pipe_buf_ops, so use that instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Move the bpf verifier trace check into the new switch statement in HEAD. Resolve the overlapping changes in hinic, where bug fixes overlap the addition of VF support. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07tracing: Make tracing_snapshot_instance_cond() staticZou Wei
Fix the following sparse warning: kernel/trace/trace.c:950:6: warning: symbol 'tracing_snapshot_instance_cond' was not declared. Should it be static? Link: http://lkml.kernel.org/r/1587614905-48692-1-git-send-email-zou_wei@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07tracing: Add a vmalloc_sync_mappings() for safe measureSteven Rostedt (VMware)
x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu areas can be complex, to say the least. Mappings may happen at boot up, and if nothing synchronizes the page tables, those page mappings may not be synced till they are used. This causes issues for anything that might touch one of those mappings in the path of the page fault handler. When one of those unmapped mappings is touched in the page fault handler, it will cause another page fault, which in turn will cause a page fault, and leave us in a loop of page faults. Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split vmalloc_sync_all() into vmalloc_sync_unmappings() and vmalloc_sync_mappings(), as on system exit, it did not need to do a full sync on x86_64 (although it still needed to be done on x86_32). By chance, the vmalloc_sync_all() would synchronize the page mappings done at boot up and prevent the per cpu area from being a problem for tracing in the page fault handler. But when that synchronization in the exit of a task became a nop, it caused the problem to appear. Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com> Suggested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-04-27sysctl: pass kernel pointers to ->proc_handlerChristoph Hellwig
Instead of having all the sysctl handlers deal with user pointers, which is rather hairy in terms of the BPF interaction, copy the input to and from userspace in common code. This also means that the strings are always NUL-terminated by the common code, making the API a little bit safer. As most handler just pass through the data to one of the common handlers a lot of the changes are mechnical. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-03tracing: Do not allocate buffer in trace_find_next_entry() in atomicSteven Rostedt (VMware)
When dumping out the trace data in latency format, a check is made to peek at the next event to compare its timestamp to the current one, and if the delta is of a greater size, it will add a marker showing so. But to do this, it needs to save the current event otherwise peeking at the next event will remove the current event. To save the event, a temp buffer is used, and if the event is bigger than the temp buffer, the temp buffer is freed and a bigger buffer is allocated. This allocation is a problem when called in atomic context. The only way this gets called via atomic context is via ftrace_dump(). Thus, use a static buffer of 128 bytes (which covers most events), and if the event is bigger than that, simply return NULL. The callers of trace_find_next_entry() need to handle a NULL case, as that's what would happen if the allocation failed. Link: https://lore.kernel.org/r/20200326091256.GR11705@shao2-debian Fixes: ff895103a84ab ("tracing: Save off entry when peeking at next entry") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-27ftrace: Create set_ftrace_notrace_pid to not trace tasksSteven Rostedt (VMware)
There's currently a way to select a task that should only be traced by functions, but there's no way to select a task not to be traced by the function tracer. Add a set_ftrace_notrace_pid file that acts the same as set_ftrace_pid (and is also affected by function-fork), but the task pids in this file will not be traced even if they are listed in the set_ftrace_pid file. This makes it easy for tools like trace-cmd to "hide" itself from the function tracer when it is recording other tasks. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-27ring-buffer/tracing: Have iterator acknowledge dropped eventsSteven Rostedt (VMware)
Have the ring_buffer_iterator set a flag if events were dropped as it were to go and peek at the next event. Have the trace file display this fact if it happened with a "LOST EVENTS" message. Link: http://lkml.kernel.org/r/20200317213417.045858900@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-27tracing: Do not disable tracing when reading the trace fileSteven Rostedt (VMware)
When opening the "trace" file, it is no longer necessary to disable tracing. Note, a new option is created called "pause-on-trace", when set, will cause the trace file to emulate its original behavior. Link: http://lkml.kernel.org/r/20200317213416.903351225@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-19ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()Steven Rostedt (VMware)
When the ring buffer was first created, the iterator followed the normal producer/consumer operations where it had both a peek() operation, that just returned the event at the current location, and a read(), that would return the event at the current location and also increment the iterator such that the next peek() or read() will return the next event. The only use of the ring_buffer_read() is currently to move the iterator to the next location and nothing now actually reads the event it returns. Rename this function to its actual use case to ring_buffer_iter_advance(), which also adds the "iter" part to the name, which is more meaningful. As the timestamp returned by ring_buffer_read() was never used, there's no reason that this new version should bother having returning it. It will also become a void function. Link: http://lkml.kernel.org/r/20200317213416.018928618@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-19tracing: Save off entry when peeking at next entrySteven Rostedt (VMware)
In order to have the iterator read the buffer even when it's still updating, it requires that the ring buffer iterator saves each event in a separate location outside the ring buffer such that its use is immutable. There's one use case that saves off the event returned from the ring buffer interator and calls it again to look at the next event, before going back to use the first event. As the ring buffer iterator will only have a single copy, this use case will no longer be supported. Instead, have the one use case create its own buffer to store the first event when looking at the next event. This way, when looking at the first event again, it wont be corrupted by the second read. Link: http://lkml.kernel.org/r/20200317213415.722539921@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-03-19tracing: Use address-of operator on section symbolsNathan Chancellor
Clang warns: ../kernel/trace/trace.c:9335:33: warning: array comparison always evaluates to true [-Wtautological-compare] if (__stop___trace_bprintk_fmt != __start___trace_bprintk_fmt) ^ 1 warning generated. These are not true arrays, they are linker defined symbols, which are just addresses. Using the address of operator silences the warning and does not change the runtime result of the check (tested with some print statements compiled in with clang + ld.lld and gcc + ld.bfd in QEMU). Link: http://lkml.kernel.org/r/20200220051011.26113-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/893 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-26Merge tag 'trace-v5.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing and bootconfig updates: "Fixes and changes to bootconfig before it goes live in a release. Change in API of bootconfig (before it comes live in a release): - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists - Set CONFIG_BOOT_CONFIG to 'n' by default - Show error if "bootconfig" on cmdline but not compiled in - Prevent redefining the same value - Have a way to append values - Added a SELECT BLK_DEV_INITRD to fix a build failure Synthetic event fixes: - Switch to raw_smp_processor_id() for recording CPU value in preempt section. (No care for what the value actually is) - Fix samples always recording u64 values - Fix endianess - Check number of values matches number of fields - Fix a printing bug Fix of trace_printk() breaking postponed start up tests Make a function static that is only used in a single file" * tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue bootconfig: Add append value operator support bootconfig: Prohibit re-defining value on same key bootconfig: Print array as multiple commands for legacy command line bootconfig: Reject subkey and value on same parent key tools/bootconfig: Remove unneeded error message silencer bootconfig: Add bootconfig magic word for indicating bootconfig explicitly bootconfig: Set CONFIG_BOOT_CONFIG=n by default tracing: Clear trace_state when starting trace bootconfig: Mark boot_config_checksum() static tracing: Disable trace_printk() on post poned tests tracing: Have synthetic event test use raw_smp_processor_id() tracing: Fix number printing bug in print_synth_event() tracing: Check that number of vals matches number of synth event fields tracing: Make synth_event trace functions endian-correct tracing: Make sure synth_event_trace() example always uses u64
2020-02-20tracing: Disable trace_printk() on post poned testsSteven Rostedt (VMware)
The tracing seftests checks various aspects of the tracing infrastructure, and one is filtering. If trace_printk() is active during a self test, it can cause the filtering to fail, which will disable that part of the trace. To keep the selftests from failing because of trace_printk() calls, trace_printk() checks the variable tracing_selftest_running, and if set, it does not write to the tracing buffer. As some tracers were registered earlier in boot, the selftest they triggered would fail because not all the infrastructure was set up for the full selftest. Thus, some of the tests were post poned to when their infrastructure was ready (namely file system code). The postpone code did not set the tracing_seftest_running variable, and could fail if a trace_printk() was added and executed during their run. Cc: stable@vger.kernel.org Fixes: 9afecfbb95198 ("tracing: Postpone tracer start-up tests till the system is more robust") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-06Merge tag 'trace-v5.6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added new "bootconfig". This looks for a file appended to initrd to add boot config options, and has been discussed thoroughly at Linux Plumbers. Very useful for adding kprobes at bootup. Only enabled if "bootconfig" is on the real kernel command line. - Created dynamic event creation. Merges common code between creating synthetic events and kprobe events. - Rename perf "ring_buffer" structure to "perf_buffer" - Rename ftrace "ring_buffer" structure to "trace_buffer" Had to rename existing "trace_buffer" to "array_buffer" - Allow trace_printk() to work withing (some) tracing code. - Sort of tracing configs to be a little better organized - Fixed bug where ftrace_graph hash was not being protected properly - Various other small fixes and clean ups * tag 'trace-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (88 commits) bootconfig: Show the number of nodes on boot message tools/bootconfig: Show the number of bootconfig nodes bootconfig: Add more parse error messages bootconfig: Use bootconfig instead of boot config ftrace: Protect ftrace_graph_hash with ftrace_sync ftrace: Add comment to why rcu_dereference_sched() is open coded tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu tracing: Annotate ftrace_graph_hash pointer with __rcu bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline tracing: Use seq_buf for building dynevent_cmd string tracing: Remove useless code in dynevent_arg_pair_add() tracing: Remove check_arg() callbacks from dynevent args tracing: Consolidate some synth_event_trace code tracing: Fix now invalid var_ref_vals assumption in trace action tracing: Change trace_boot to use synth_event interface tracing: Move tracing selftests to bottom of menu tracing: Move mmio tracer config up with the other tracers tracing: Move tracing test module configs together tracing: Move all function tracing configs together tracing: Documentation for in-kernel synthetic event API ...
2020-02-05Merge branch 'work.recursive_removal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs recursive removal updates from Al Viro: "We have quite a few places where synthetic filesystems do an equivalent of 'rm -rf', with varying amounts of code duplication, wrong locking, etc. That really ought to be a library helper. Only debugfs (and very similar tracefs) are converted here - I have more conversions, but they'd never been in -next, so they'll have to wait" * 'work.recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems
2020-01-30tracing: Add trace_array_find/_get() to find instance trace arraysTom Zanussi
Add a new trace_array_find() function that can be used to find a trace array given the instance name, and replace existing code that does the same thing with it. Also add trace_array_find_get() which does the same but returns the trace array after upping its refcount. Also make both available for use outside of trace.c. Link: http://lkml.kernel.org/r/cb68528c975eba95bee4561ac67dd1499423b2e5.1580323897.git.zanussi@kernel.org Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-30tracing: eval_map_next() should always increase position indexVasily Averin
if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. Link: http://lkml.kernel.org/r/7ad85b22-1866-977c-db17-88ac438bc764@virtuozzo.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> [ This is not a bug fix, it just makes it "technically correct" which is why I applied it. NULL is only returned on an anomaly which triggers a WARN_ON ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-25tracing: Use pr_err() instead of WARN() for memory failuresSteven Rostedt (VMware)
As warnings can trigger panics, especially when "panic_on_warn" is set, memory failure warnings can cause panics and fail fuzz testers that are stressing memory. Create a MEM_FAIL() macro to use instead of WARN() in the tracing code (perhaps this should be a kernel wide macro?), and use that for memory failure issues. This should stop failing fuzz tests due to warnings. Link: https://lore.kernel.org/r/CACT4Y+ZP-7np20GVRu3p+eZys9GPtbu+JpfV+HtsufAzvTgJrg@mail.gmail.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Decrement trace_array when bootconfig creates an instanceSteven Rostedt (VMware)
The trace_array_get_by_name() creates a ftrace instance and trace_array_put() is used to remove the reference. Even though the trace_array_get_by_name() creates the instance, it also adds a reference count to it, that prevents user space from removing it. As the bootconfig just creates the instance on boot up, it should still be used where it can be deleted by user space after boot. A trace_array_put() is required to let that happen. Also, change the documentation on trace_array_get_by_name() to make this not be so confusing. Link: https://lore.kernel.org/r/20200124205927.76128804@rorschach.local.home Fixes: 4f712a4d04a4e ("tracing/boot: Add instance node support") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Remove unneeded NULL checkDan Carpenter
We checked "iter->trace" earlier so there is no need to check here. Link: http://lkml.kernel.org/r/20141122183012.GB6994@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-22tracing: Fix uninitialized buffer var on early exit to trace_vbprintk()Steven Rostedt (VMware)
If we exit due to a bad input to trace_printk() (highly unlikely), then the buffer variable will not be initialized when we unnest the ring buffer. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-20tracing: Do not set trace clock if tracefs lockdown is in effectMasami Ichikawa
When trace_clock option is not set and unstable clcok detected, tracing_set_default_clock() sets trace_clock(ThinkPad A285 is one of case). In that case, if lockdown is in effect, null pointer dereference error happens in ring_buffer_set_clock(). Link: http://lkml.kernel.org/r/20200116131236.3866925-1-masami256@gmail.com Cc: stable@vger.kernel.org Fixes: 17911ff38aa58 ("tracing: Add locked_down checks to the open calls of files created for tracefs") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1788488 Signed-off-by: Masami Ichikawa <masami256@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-16tracing: Allow trace_printk() to nest in other tracing codeSteven Rostedt (VMware)
trace_printk() is used to debug the kernel which includes the tracing infrastructure. But because it writes to the ring buffer, and so does much of the tracing infrastructure, the ring buffer's recursive detection will drop writes to the ring buffer that is in the same context as the current write is happening (it allows interrupts to write when normal context is writing, but wont let normal context write while normal context is writing). This can cause confusion and think that the code is where the trace_printk() exists is not hit. To solve this, up the recursive nesting of the ring buffer when trace_printk() is called before it writes to the buffer itself. Note, this does make it dangerous to use trace_printk() in the ring buffer code itself, because this basically disables the recursion protection of trace_printk() buffer writes. But as trace_printk() is only used for debugging, and if this does occur, the developer will see the cause real quick (recursive blowing up of the stack). Thus the developer can deal with that. But having trace_printk() silently ignored is a much bigger problem, and disabling recursive protection is a small price to pay to fix it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing/boot: Add cpu_mask option supportMasami Hiramatsu
Add ftrace.cpumask option support to boot-time tracing. This sets cpumask for each instance. - ftrace.[instance.INSTANCE.]cpumask = CPUMASK; Set the trace cpumask. Note that the CPUMASK should be a string which <tracefs>/tracing_cpumask can accepts. Link: http://lkml.kernel.org/r/157867243625.17873.13613922641273149372.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing/boot: Add boot-time tracingMasami Hiramatsu
Setup tracing options via extra boot config in addition to kernel command line. This adds following commands support. These are applied to the global trace instance. - ftrace.options = OPT1[,OPT2...] Enable given ftrace options. - ftrace.trace_clock = CLOCK Set given CLOCK to ftrace's trace_clock. - ftrace.buffer_size = SIZE Configure ftrace buffer size to SIZE. You can use "KB" or "MB" for that SIZE. - ftrace.events = EVENT[, EVENT2...] Enable given events on boot. You can use a wild card in EVENT. - ftrace.tracer = TRACER Set TRACER to current tracer on boot. (e.g. function) Note that this is NOT replacing the kernel parameters, because this boot config based setting is later than that. If you want to trace earlier boot events, you still need kernel parameters. Link: http://lkml.kernel.org/r/157867237723.17873.17494943526320587488.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing: kprobes: Output kprobe event to printk bufferMasami Hiramatsu
Since kprobe-events use event_trigger_unlock_commit_regs() directly, that events doesn't show up in printk buffer if "tp_printk" is set. Use trace_event_buffer_commit() in kprobe events so that it can invoke output_printk() as same as other trace events. Link: http://lkml.kernel.org/r/157867233085.17873.5210928676787339604.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> [ Adjusted data var declaration placement in __kretprobe_trace_func() ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing: Apply soft-disabled and filter to tracepoints printkMasami Hiramatsu
Apply soft-disabled and the filter rule of the trace events to the printk output of tracepoints (a.k.a. tp_printk kernel parameter) as same as trace buffer output. Link: http://lkml.kernel.org/r/157867231876.17873.15825819592284704068.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing: Make struct ring_buffer less ambiguousSteven Rostedt (VMware)
As there's two struct ring_buffers in the kernel, it causes some confusion. The other one being the perf ring buffer. It was agreed upon that as neither of the ring buffers are generic enough to be used globally, they should be renamed as: perf's ring_buffer -> perf_buffer ftrace's ring_buffer -> trace_buffer This implements the changes to the ring buffer that ftrace uses. Link: https://lore.kernel.org/r/20191213140531.116b3200@gandalf.local.home Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-13tracing: Rename trace_buffer to array_bufferSteven Rostedt (VMware)
As we are working to remove the generic "ring_buffer" name that is used by both tracing and perf, the ring_buffer name for tracing will be renamed to trace_buffer, and perf's ring buffer will be renamed to perf_buffer. As there already exists a trace_buffer that is used by the trace_arrays, it needs to be first renamed to array_buffer. Link: https://lore.kernel.org/r/20191213153553.GE20583@krava Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-12-21tracing: Fix lock inversion in trace_event_enable_tgid_record()Prateek Sood
Task T2 Task T3 trace_options_core_write() subsystem_open() mutex_lock(trace_types_lock) mutex_lock(event_mutex) set_tracer_flag() trace_event_enable_tgid_record() mutex_lock(trace_types_lock) mutex_lock(event_mutex) This gives a circular dependency deadlock between trace_types_lock and event_mutex. To fix this invert the usage of trace_types_lock and event_mutex in trace_options_core_write(). This keeps the sequence of lock usage consistent. Link: http://lkml.kernel.org/r/0101016eef175e38-8ca71caf-a4eb-480d-a1e6-6f0bbc015495-000000@us-west-2.amazonses.com Cc: stable@vger.kernel.org Fixes: d914ba37d7145 ("tracing: Add support for recording tgid of tasks") Signed-off-by: Prateek Sood <prsood@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-12-10simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystemsAl Viro
two requirements: no file creations in IS_DEADDIR and no cross-directory renames whatsoever. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>