summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2018-10-10tracing: probeevent: Add $argN for accessing function argsMasami Hiramatsu
Add $argN special fetch variable for accessing function arguments. This allows user to trace the Nth argument easily at the function entry. Note that this returns most probably assignment of registers and stacks. In some case, it may not work well. If you need to access correct registers or stacks you should use perf-probe. Link: http://lkml.kernel.org/r/152465888632.26224.3412465701570253696.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Add array type supportMasami Hiramatsu
Add array type support for probe events. This allows user to get arraied types from memory address. The array type syntax is TYPE[N] Where TYPE is one of types (u8/16/32/64,s8/16/32/64, x8/16/32/64, symbol, string) and N is a fixed value less than 64. The string array type is a bit different from other types. For other base types, <base-type>[1] is equal to <base-type> (e.g. +0(%di):x32[1] is same as +0(%di):x32.) But string[1] is not equal to string. The string type itself represents "char array", but string array type represents "char * array". So, for example, +0(%di):string[1] is equal to +0(+0(%di)):string. Link: http://lkml.kernel.org/r/152465891533.26224.6150658225601339931.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Add symbol typeMasami Hiramatsu
Add "symbol" type to probeevent, which is an alias of u32 or u64 (depends on BITS_PER_LONG). This shows the result value in symbol+offset style. This type is only available with kprobe events. Link: http://lkml.kernel.org/r/152465882860.26224.14779072294412467338.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Unify fetch_insn processing common partMasami Hiramatsu
Unify the fetch_insn bottom process (from stage 2: dereference indirect data) from kprobe and uprobe events, since those are mostly same. Link: http://lkml.kernel.org/r/152465879965.26224.8547240824606804815.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Append traceprobe_ for exported functionMasami Hiramatsu
Append traceprobe_ for exported function set_print_fmt() as same as other functions. Link: http://lkml.kernel.org/r/152465877071.26224.11143125027282999726.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Return consumed bytes of dynamic areaMasami Hiramatsu
Cleanup string fetching routine so that returns the consumed bytes of dynamic area and store the string information as data_loc format instead of data_rloc. This simplifies the fetcharg loop. Link: http://lkml.kernel.org/r/152465874163.26224.12125143907501289031.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Unify fetch type tablesMasami Hiramatsu
Unify {k,u}probe_fetch_type_table to probe_fetch_type_table because the main difference of those type tables (fetcharg methods) are gone. Now we can consolidate it. Link: http://lkml.kernel.org/r/152465871274.26224.13999436317830479698.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Introduce new argument fetching codeMasami Hiramatsu
Replace {k,u}probe event argument fetching framework with switch-case based. Currently that is implemented with structures, macros and chain of function-pointers, which is more complicated than necessary and may get a performance penalty by retpoline. This simplify that with an array of "fetch_insn" (opcode and oprands), and make process_fetch_insn() just interprets it. No function pointers are used. Link: http://lkml.kernel.org/r/152465868340.26224.2551120475197839464.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Remove NOKPROBE_SYMBOL from print functionsMasami Hiramatsu
Remove unneeded NOKPROBE_SYMBOL from print functions since the print functions are only used when printing out the trace data, and not from kprobe handler. Link: http://lkml.kernel.org/r/152465865422.26224.10111548170594014954.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Cleanup argument field definitionMasami Hiramatsu
Cleanup event argument definition code in one place for maintenancability. Link: http://lkml.kernel.org/r/152465862529.26224.9068605421476018902.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10tracing: probeevent: Cleanup print argument functionsMasami Hiramatsu
Cleanup the print-argument function to decouple it into print-name and print-value, so that it can support more flexible expression, like array type. Link: http://lkml.kernel.org/r/152465859635.26224.13452846788717102315.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-10trace_uprobe: support reference counter in fd-based uprobeSong Liu
This patch enables uprobes with reference counter in fd-based uprobe. Highest 32 bits of perf_event_attr.config is used to stored offset of the reference count (semaphore). Format information in /sys/bus/event_source/devices/uprobe/format/ is updated to reflect this new feature. Link: http://lkml.kernel.org/r/20181002053636.1896903-1-songliubraving@fb.com Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-01Merge tag 'v4.19-rc6' into for-4.20/blockJens Axboe
Merge -rc6 in, for two reasons: 1) Resolve a trivial conflict in the blk-mq-tag.c documentation 2) A few important regression fixes went into upstream directly, so they aren't in the 4.20 branch. Signed-off-by: Jens Axboe <axboe@kernel.dk> * tag 'v4.19-rc6': (780 commits) Linux 4.19-rc6 MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c cpufreq: qcom-kryo: Fix section annotations perf/core: Add sanity check to deal with pinned event failure xen/blkfront: correct purging of persistent grants Revert "xen/blkfront: When purging persistent grants, keep them in the buffer" selftests/powerpc: Fix Makefiles for headers_install change blk-mq: I/O and timer unplugs are inverted in blktrace dax: Fix deadlock in dax_lock_mapping_entry() x86/boot: Fix kexec booting failure in the SEV bit detection code bcache: add separate workqueue for journal_write to avoid deadlock drm/amd/display: Fix Edid emulation for linux drm/amd/display: Fix Vega10 lightup on S3 resume drm/amdgpu: Fix vce work queue was not cancelled when suspend Revert "drm/panel: Add device_link from panel device to DRM device" xen/blkfront: When purging persistent grants, keep them in the buffer clocksource/drivers/timer-atmel-pit: Properly handle error cases block: fix deadline elevator drain for zoned block devices ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-26tracing: Trivia spelling fix containerof() -> container_of()Andy Shevchenko
This is the only location on kernel that has wrong spelling of the container_of() helper. Fix it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-09-24trace_uprobe/sdt: Prevent multiple reference counter for same uprobeRavi Bangoria
We assume to have only one reference counter for one uprobe. Don't allow user to add multiple trace_uprobe entries having same inode+offset but different reference counter. Ex, # echo "p:sdt_tick/loop2 /home/ravi/tick:0x6e4(0x10036)" > uprobe_events # echo "p:sdt_tick/loop2_1 /home/ravi/tick:0x6e4(0xfffff)" >> uprobe_events bash: echo: write error: Invalid argument # dmesg trace_kprobe: Reference counter offset mismatch. There is one exception though: When user is trying to replace the old entry with the new one, we allow this if the new entry does not conflict with any other existing entries. Link: http://lkml.kernel.org/r/20180820044250.11659-4-ravi.bangoria@linux.ibm.com Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Song Liu <songliubraving@fb.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-09-24uprobes: Support SDT markers having reference count (semaphore)Ravi Bangoria
Userspace Statically Defined Tracepoints[1] are dtrace style markers inside userspace applications. Applications like PostgreSQL, MySQL, Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc have these markers embedded in them. These markers are added by developer at important places in the code. Each marker source expands to a single nop instruction in the compiled code but there may be additional overhead for computing the marker arguments which expands to couple of instructions. In case the overhead is more, execution of it can be omitted by runtime if() condition when no one is tracing on the marker: if (reference_counter > 0) { Execute marker instructions; } Default value of reference counter is 0. Tracer has to increment the reference counter before tracing on a marker and decrement it when done with the tracing. Implement the reference counter logic in core uprobe. User will be able to use it from trace_uprobe as well as from kernel module. New trace_uprobe definition with reference counter will now be: <path>:<offset>[(ref_ctr_offset)] where ref_ctr_offset is an optional field. For kernel module, new variant of uprobe_register() has been introduced: uprobe_register_refctr(inode, offset, ref_ctr_offset, consumer) No new variant for uprobe_unregister() because it's assumed to have only one reference counter for one uprobe. [1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation Note: 'reference counter' is called as 'semaphore' in original Dtrace (or Systemtap, bcc and even in ELF) documentation and code. But the term 'semaphore' is misleading in this context. This is just a counter used to hold number of tracers tracing on a marker. This is not really used for any synchronization. So we are calling it a 'reference counter' in kernel / perf code. Link: http://lkml.kernel.org/r/20180820044250.11659-2-ravi.bangoria@linux.ibm.com Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> [Only trace_uprobe.c] Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-09-24tracing/kprobe: Remove unneeded extra strchr() from create_trace_kprobe()Steven Rostedt (VMware)
By utilizing a temporary variable, we can avoid adding another call to strchr(). Instead, save the first call to a temp variable, and then use that variable as the reference to set the event variable. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-09-21blkcg: remove bio->bi_css and instead use bio->bi_blkgDennis Zhou (Facebook)
Prior patches ensured that all bios are now associated with some blkg. This now makes bio->bi_css unnecessary as blkg maintains a reference to the blkcg already. This patch removes the field bi_css and transfers corresponding uses to access via bi_blkg. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-17ring-buffer: Allow for rescheduling when removing pagesVaibhav Nagarnaik
When reducing ring buffer size, pages are removed by scheduling a work item on each CPU for the corresponding CPU ring buffer. After the pages are removed from ring buffer linked list, the pages are free()d in a tight loop. The loop does not give up CPU until all pages are removed. In a worst case behavior, when lot of pages are to be freed, it can cause system stall. After the pages are removed from the list, the free() can happen while the work is rescheduled. Call cond_resched() in the loop to prevent the system hangup. Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com Cc: stable@vger.kernel.org Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Reported-by: Jason Behmer <jbehmer@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-23Merge tag 'trace-v4.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Masami found an off by one bug in the code that keeps "notrace" functions from being traced by kprobes. During my testing, I found that there's places that we may want to add kprobes to notrace, thus we may end up changing this code before 4.19 is released. The history behind this change is that we found that adding kprobes to various notrace functions caused the kernel to crashed. We took the safe route and decided not to allow kprobes to trace any notrace function. But because notrace is added to functions that just cause weird side effects to the function tracer, but are still safe, preventing kprobes for all notrace functios may be too much of a big hammer. One such place is __schedule() is marked notrace, to keep function tracer from doing strange recursive loops when it gets traced with NEED_RESCHED set. With this change, one can not add kprobes to the scheduler. Masami also added code to use gcov on ftrace" * tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Fix to check notrace function with correct range tracing: Allow gcov profiling on only ftrace subsystem
2018-08-22Merge tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull more block updates from Jens Axboe: - Set of bcache fixes and changes (Coly) - The flush warn fix (me) - Small series of BFQ fixes (Paolo) - wbt hang fix (Ming) - blktrace fix (Steven) - blk-mq hardware queue count update fix (Jianchao) - Various little fixes * tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block: (31 commits) block/DAC960.c: make some arrays static const, shrinks object size blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter blk-mq: init hctx sched after update ctx and hctx mapping block: remove duplicate initialization tracing/blktrace: Fix to allow setting same value pktcdvd: fix setting of 'ret' error return for a few cases block: change return type to bool block, bfq: return nbytes and not zero from struct cftype .write() method block, bfq: improve code of bfq_bfqq_charge_time block, bfq: reduce write overcharge block, bfq: always update the budget of an entity when needed block, bfq: readd missing reset of parent-entity service blk-wbt: fix IO hang in wbt_wait() block: don't warn for flush on read-only device bcache: add the missing comments for smp_mb()/smp_wmb() bcache: remove unnecessary space before ioctl function pointer arguments bcache: add missing SPDX header bcache: move open brace at end of function definitions to next line bcache: add static const prefix to char * array declarations bcache: fix code comments style ...
2018-08-21tracing/kprobes: Fix to check notrace function with correct rangeMasami Hiramatsu
Fix within_notrace_func() to check notrace function correctly. Since the ftrace_location_range(start, end) function checks the range inclusively (start <= ftrace-loc <= end), the end address must not include the entry address of next function. However, within_notrace_func() uses kallsyms_lookup_size_offset() to get the function size and calculate the end address from adding the size to the entry address. This means the end address is the entry address of the next function. In the result, within_notrace_func() fails to find notrace function if the next function of the target function is ftraced. Let's subtract 1 from the end address so that ftrace_location_range() can check it correctly. Link: http://lkml.kernel.org/r/153485669706.16611.17726752296213785504.stgit@devbox Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function") Reported-by: Michael Rodin <michael@rodin.online> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-21tracing: Allow gcov profiling on only ftrace subsystemMasami Hiramatsu
Add GCOV_PROFILE_FTRACE to allow gcov profiling on only files in ftrace subsystem. This config option will be used for checking kselftest/ftrace coverage. Link: http://lkml.kernel.org/r/153483647755.32472.4746349899604275441.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-20Merge tag 'trace-v4.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockdep and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix SPDX format headers to use C++ style comments tracing: Add SPDX License format tags to tracing files tracing: Add SPDX License format to bpf_trace.c blktrace: Add SPDX License format header s390/ftrace: Add -mfentry and -mnop-mcount support tracing: Add -mcount-nop option support tracing: Avoid calling cc-option -mrecord-mcount for every Makefile tracing: Handle CC_FLAGS_FTRACE more accurately Uprobe: Additional argument arch_uprobe to uprobe_write_opcode() Uprobes: Simplify uprobe_register() body tracepoints: Free early tracepoints after RCU is initialized uprobes: Use synchronize_rcu() not synchronize_sched() tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister() ftrace: Remove unused pointer ftrace_swapper_pid tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing/irqsoff: Handle preempt_count for different configs tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing: irqsoff: Account for additional preempt_disable trace: Use rcu_dereference_raw for hooks from trace-event subsystem tracing/kprobes: Fix within_notrace_func() to check only notrace functions ...
2018-08-16tracing: Fix SPDX format headers to use C++ style commentsSteven Rostedt (VMware)
The Linux kernel adopted the SPDX License format headers to ease license compliance management, and uses the C++ '//' style comments for the SPDX header tags. Some files in the tracing directory used the C style /* */ comments for them. To be consistent across all files, replace the /* */ C style SPDX tags with the C++ // SPDX tags. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16tracing: Add SPDX License format tags to tracing filesSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16tracing: Add SPDX License format to bpf_trace.cSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16tracing/blktrace: Fix to allow setting same valueSteven Rostedt (VMware)
Masami Hiramatsu reported: Current trace-enable attribute in sysfs returns an error if user writes the same setting value as current one, e.g. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable bash: echo: write error: Invalid argument # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable bash: echo: write error: Device or resource busy But this is not a preferred behavior, it should ignore if new setting is same as current one. This fixes the problem as below. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable Link: http://lkml.kernel.org/r/20180816103802.08678002@gandalf.local.home Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: stable@vger.kernel.org Fixes: cd649b8bb830d ("blktrace: remove sysfs_blk_trace_enable_show/store()") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-16blktrace: Add SPDX License format headerSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-15tracing: Add -mcount-nop option supportVasily Gorbik
-mcount-nop gcc option generates the calls to the profiling functions as nops which allows to avoid patching mcount jump with NOP instructions initially. -mcount-nop gcc option will be activated if platform selects HAVE_NOP_MCOUNT and gcc actually supports it. In addition to that CC_USING_NOP_MCOUNT is defined and could be used by architectures to adapt ftrace patching behavior. Link: http://lkml.kernel.org/r/patch-3.thread-aa7b8d.git-e02ed2dc082b.your-ad-here.call-01533557518-ext-9465@work.hours Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-15Merge tag 'printk-for-4.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Different vendors have a different expectation about a console quietness. Make it configurable to reduce bike-shedding about the upstream default - Decide about the message visibility when the message is stored. It avoids races caused by a delayed console handling - Always store printk() messages into the per-CPU buffers again in NMI. The only exception is when flushing trace log in panic(). There the risk of loosing messages is worth an eventual reordering - Handle invalid %pO printf modifiers correctly - Better handle %p printf modifier tests before crng is initialized - Some clean up * tag 'printk-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: lib/vsprintf: Do not handle %pO[^F] as %px printk: Fix warning about unused suppress_message_printing printk/nmi: Prevent deadlock when accessing the main log buffer in NMI printk: Create helper function to queue deferred console handling printk: Split the code for storing a message into the log buffer printk: Clean up syslog_print_all() printk: Remove unnecessary kmalloc() from syslog during clear printk: Make CONSOLE_LOGLEVEL_QUIET configurable printk: make sure to print log on console. lib/test_printf.c: accept "ptrval" as valid result for plain 'p' tests
2018-08-14Merge tag 'docs-4.19' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation update from Jonathan Corbet: "This was a moderately busy cycle for docs, with the usual collection of small fixes and updates. We also have new ktime_get_*() docs from Arnd, some kernel-doc fixes, a new set of Italian translations (non so se vale la pena, ma non fa male - speriamo bene), and some extensive early memory-management documentation improvements from Mike Rapoport" * tag 'docs-4.19' of git://git.lwn.net/linux: (52 commits) Documentation: corrections to console/console.txt Documentation: add ioctl number entry for v4l2-subdev.h Remove gendered language from management style documentation scripts/kernel-doc: Escape all literal braces in regexes docs/mm: add description of boot time memory management docs/mm: memblock: add overview documentation docs/mm: memblock: add kernel-doc description for memblock types docs/mm: memblock: add kernel-doc comments for memblock_add[_node] docs/mm: memblock: update kernel-doc comments mm/memblock: add a name for memblock flags enumeration docs/mm: bootmem: add overview documentation docs/mm: bootmem: add kernel-doc description of 'struct bootmem_data' docs/mm: bootmem: fix kernel-doc warnings docs/mm: nobootmem: fixup kernel-doc comments mm/bootmem: drop duplicated kernel-doc comments Documentation: vm.txt: Adding 'nr_hugepages_mempolicy' parameter description. doc:it_IT: translation for kernel-hacking docs: Fix the reference labels in Locking.rst doc: tracing: Fix a typo of trace_stat mm: Introduce new type vm_fault_t ...
2018-08-14Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: "First pull request for this merge window, there will also be a followup request with some stragglers. This pull request contains: - Fix for a thundering heard issue in the wbt block code (Anchal Agarwal) - A few NVMe pull requests: * Improved tracepoints (Keith) * Larger inline data support for RDMA (Steve Wise) * RDMA setup/teardown fixes (Sagi) * Effects log suppor for NVMe target (Chaitanya Kulkarni) * Buffered IO suppor for NVMe target (Chaitanya Kulkarni) * TP4004 (ANA) support (Christoph) * Various NVMe fixes - Block io-latency controller support. Much needed support for properly containing block devices. (Josef) - Series improving how we handle sense information on the stack (Kees) - Lightnvm fixes and updates/improvements (Mathias/Javier et al) - Zoned device support for null_blk (Matias) - AIX partition fixes (Mauricio Faria de Oliveira) - DIF checksum code made generic (Max Gurtovoy) - Add support for discard in iostats (Michael Callahan / Tejun) - Set of updates for BFQ (Paolo) - Removal of async write support for bsg (Christoph) - Bio page dirtying and clone fixups (Christoph) - Set of bcache fix/changes (via Coly) - Series improving blk-mq queue setup/teardown speed (Ming) - Series improving merging performance on blk-mq (Ming) - Lots of other fixes and cleanups from a slew of folks" * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits) blkcg: Make blkg_root_lookup() work for queues in bypass mode bcache: fix error setting writeback_rate through sysfs interface null_blk: add lock drop/acquire annotation Blk-throttle: reduce tail io latency when iops limit is enforced block: paride: pd: mark expected switch fall-throughs block: Ensure that a request queue is dissociated from the cgroup controller block: Introduce blk_exit_queue() blkcg: Introduce blkg_root_lookup() block: Remove two superfluous #include directives blk-mq: count the hctx as active before allocating tag block: bvec_nr_vecs() returns value for wrong slab bcache: trivial - remove tailing backslash in macro BTREE_FLAG bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section bcache: set max writeback rate when I/O request is idle bcache: add code comments for bset.c bcache: fix mistaken comments in request.c bcache: fix mistaken code comments in bcache.h bcache: add a comment in super.c bcache: avoid unncessary cache prefetch bch_btree_node_get() bcache: display rate debug parameters to 0 when writeback is not running ...
2018-08-10uprobes: Use synchronize_rcu() not synchronize_sched()Steven Rostedt (VMware)
While debugging another bug, I was looking at all the synchronize*() functions being used in kernel/trace, and noticed that trace_uprobes was using synchronize_sched(), with a comment to synchronize with {u,ret}_probe_trace_func(). When looking at those functions, the data is protected with "rcu_read_lock()" and not with "rcu_read_lock_sched()". This is using the wrong synchronize_*() function. Link: http://lkml.kernel.org/r/20180809160553.469e1e32@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 70ed91c6ec7f8 ("tracing/uprobes: Support ftrace_event_file base multibuffer") Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-10tracing: Fix synchronizing to event changes with ↵Steven Rostedt (VMware)
tracepoint_synchronize_unregister() Now that some trace events can be protected by srcu_read_lock(tracepoint_srcu), we need to make sure all locations that depend on this are also protected. There were many places that did a synchronize_sched() thinking that it was enough to protect againts access to trace events. This use to be the case, but now that we use SRCU for _rcuidle() trace events, they may not be protected by synchronize_sched(), as they may be called in paths that RCU is not watching for preempt disable. Fixes: e6753f23d961d ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-10ftrace: Remove unused pointer ftrace_swapper_pidColin Ian King
Pointer ftrace_swapper_pid is defined but is never used hence it is redundant and can be removed. The use of this variable was removed in commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events do"). Cleans up clang warning: warning: 'ftrace_swapper_pid' defined but not used [-Wunused-const-variable=] Link: http://lkml.kernel.org/r/20180809125609.13142-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-10tracing: More reverting of "tracing: Centralize preemptirq tracepoints and ↵Steven Rostedt (VMware)
unify their usage" Joel Fernandes created a nice patch that cleaned up the duplicate hooks used by lockdep and irqsoff latency tracer. It made both use tracepoints. But the latency tracer is triggering warnings when using tracepoints to call into the latency tracer's routines. Mainly, they can be called from NMI context. If that happens, then the SRCU may not work properly because on some architectures, SRCU is not safe to be called in both NMI and non-NMI context. This is a partial revert of the clean up patch c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") that adds back the direct calls into the latency tracer. It also only calls the trace events when not in NMI. Link: http://lkml.kernel.org/r/20180809210654.622445925@goodmis.org Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-10tracing/irqsoff: Handle preempt_count for different configsSteven Rostedt (VMware)
I was hitting the following warning: WARNING: CPU: 0 PID: 1 at kernel/trace/trace_irqsoff.c:631 tracer_hardirqs_off+0x15/0x2a Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc6-test+ #13 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 EIP: tracer_hardirqs_off+0x15/0x2a Code: ff 85 c0 74 0e 8b 45 00 8b 50 04 8b 45 04 e8 35 ff ff ff 5d c3 55 64 a1 cc 37 51 c1 a9 ff ff ff 7f 89 e5 53 89 d3 89 ca 75 02 <0f> 0b e8 90 fc ff ff 85 c0 74 07 89 d8 e8 0c ff ff ff 5b 5d c3 55 EAX: 80000000 EBX: c04337f0 ECX: c04338e3 EDX: c04338e3 ESI: c04337f0 EDI: c04338e3 EBP: f2aa1d68 ESP: f2aa1d64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046 CR0: 80050033 CR2: 00000000 CR3: 01668000 CR4: 001406f0 Call Trace: trace_irq_disable_rcuidle+0x63/0x6c trace_hardirqs_off+0x26/0x30 default_send_IPI_mask_allbutself_logical+0x31/0x93 default_send_IPI_allbutself+0x37/0x48 native_send_call_func_ipi+0x4d/0x6a smp_call_function_many+0x165/0x19d ? add_nops+0x34/0x34 ? trace_hardirqs_on_caller+0x2d/0x2d ? add_nops+0x34/0x34 smp_call_function+0x1f/0x23 on_each_cpu+0x15/0x43 ? trace_hardirqs_on_caller+0x2d/0x2d ? trace_hardirqs_on_caller+0x2d/0x2d ? trace_irq_disable_rcuidle+0x1/0x6c text_poke_bp+0xa0/0xc2 ? trace_hardirqs_on_caller+0x2d/0x2d arch_jump_label_transform+0xa7/0xcb ? trace_irq_disable_rcuidle+0x5/0x6c __jump_label_update+0x3e/0x6d jump_label_update+0x7d/0x81 static_key_slow_inc_cpuslocked+0x58/0x6d static_key_slow_inc+0x19/0x20 tracepoint_probe_register_prio+0x19e/0x1d1 ? start_critical_timings+0x1c/0x1c tracepoint_probe_register+0xf/0x11 irqsoff_tracer_init+0x21/0xf2 tracer_init+0x16/0x1a trace_selftest_startup_irqsoff+0x25/0xc4 run_tracer_selftest+0xca/0x131 register_tracer+0xd5/0x172 ? trace_event_define_fields_preemptirq_template+0x45/0x45 init_irqsoff_tracer+0xd/0x11 do_one_initcall+0xab/0x1e8 ? rcu_read_lock_sched_held+0x3d/0x44 ? trace_initcall_level+0x52/0x86 kernel_init_freeable+0x195/0x21a ? rest_init+0xb4/0xb4 kernel_init+0xd/0xe4 ret_from_fork+0x2e/0x38 It is due to running a CONFIG_PREEMPT_VOLUNTARY kernel, which would trigger this warning every time: WARN_ON_ONCE(preempt_count()); Because on CONFIG_PREEMPT_VOLUNTARY, preempt_count() is always zero. This warning is to make sure preempt_count is set because tracer_hardirqs_on() does a preempt_enable_notrace() to make the preempt_trace() work properly, as being called by a trace event, the trace event code disables preemption, and the tracer wants to know what the preemption was before it was called. Instead of enabling preemption like this, just record the preempt_count, subtract PREEMPT_DISABLE_OFFSET from it (which is zero with !CONFIG_PREEMPT set), and pass that value to the necessary functions, which should use the passed in parameter instead of calling preempt_count() directly. Fixes: da5b3ebb45277 ("tracing: irqsoff: Account for additional preempt_disable") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-10tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and ↵Steven Rostedt (VMware)
unify their usage" Joel Fernandes created a nice patch that cleaned up the duplicate hooks used by lockdep and irqsoff latency tracer. It made both use tracepoints. But it caused lockdep to trigger several false positives. We have not figured out why yet, but removing lockdep from using the trace event hooks and just call its helper functions directly (like it use to), makes the problem go away. This is a partial revert of the clean up patch c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") that adds direct calls for lockdep, but also keeps most of the clean up done to get rid of the horrible preprocessor if statements. Link: http://lkml.kernel.org/r/20180806155058.5ee875f4@gandalf.local.home Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-06tracing: irqsoff: Account for additional preempt_disableJoel Fernandes (Google)
Recently we tried to make the preemptirqsoff tracer to use irqsoff tracepoint probes. However this causes issues as reported by Masami: [2.271078] Testing tracer preemptirqsoff: .. no entries found ..FAILED! [2.381015] WARNING: CPU: 0 PID: 1 at /home/mhiramat/ksrc/linux/kernel/ trace/trace.c:1512 run_tracer_selftest+0xf3/0x154 This is due to the tracepoint code increasing the preempt nesting count by calling an additional preempt_disable before calling into the preemptoff tracer which messes up the preempt_count() check in tracer_hardirqs_off. To fix this, make the irqsoff tracer probes balance the additional outer preempt_disable with a preempt_enable_notrace. The other way to fix this is to just use SRCU for all tracepoints. However we can't do that because we can't use NMIs from RCU context. Link: http://lkml.kernel.org/r/20180806034049.67949-1-joel@joelfernandes.org Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-05Merge tag 'v4.18-rc6' into for-4.19/block2Jens Axboe
Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a merge conflict down the line. Signed-of-by: Jens Axboe <axboe@kernel.dk>
2018-08-03trace: Use rcu_dereference_raw for hooks from trace-event subsystemJoel Fernandes (Google)
Since we switched to using SRCU for tracepoints used in the idle path, we can no longer use rcu_dereference_sched for dereferencing points in trace-event hooks. Since tracepoints can now use either SRCU or sched-RCU, just use rcu_dereference_raw for traceevents just like we're doing when dereferencing the tracepoint table. This prevents an RCU warning reported by Masami: [ 282.060593] WARNING: can't dereference registers at 00000000f3c7f62b [ 282.063200] ============================= [ 282.064082] WARNING: suspicious RCU usage [ 282.064963] 4.18.0-rc6+ #15 Tainted: G W [ 282.066048] ----------------------------- [ 282.066923] /home/mhiramat/ksrc/linux/kernel/trace/trace_events.c:242 suspicious rcu_dereference_check() usage! [ 282.068974] [ 282.068974] other info that might help us debug this: [ 282.068974] [ 282.070770] [ 282.070770] RCU used illegally from idle CPU! [ 282.070770] rcu_scheduler_active = 2, debug_locks = 1 [ 282.072938] RCU used illegally from extended quiescent state! [ 282.074183] no locks held by swapper/0/0. [ 282.075071] [ 282.075071] stack backtrace: [ 282.076121] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W [ 282.077782] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [ 282.079604] Call Trace: [ 282.080212] <IRQ> [ 282.080755] dump_stack+0x85/0xcb [ 282.081523] trace_event_ignore_this_pid+0x66/0x70 [ 282.082541] trace_event_raw_event_preemptirq_template+0xa2/0xb0 [ 282.083774] ? interrupt_entry+0xc4/0xe0 [ 282.084665] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 282.085669] trace_hardirqs_off_caller+0x90/0xd0 [ 282.086597] trace_hardirqs_off_thunk+0x1a/0x1c [ 282.087433] ? call_function_interrupt+0xa/0x20 [ 282.088201] interrupt_entry+0xc4/0xe0 [ 282.088848] ? call_function_interrupt+0xa/0x20 [ 282.089579] </IRQ> [ 282.090029] ? native_safe_halt+0x2/0x10 [ 282.090695] ? default_idle+0x1f/0x160 [ 282.091330] ? default_idle_call+0x24/0x40 [ 282.091997] ? do_idle+0x210/0x250 [ 282.092658] ? cpu_startup_entry+0x6f/0x80 [ 282.093338] ? start_kernel+0x49d/0x4bd [ 282.093987] ? secondary_startup_64+0xa5/0xb0 Link: http://lkml.kernel.org/r/20180803023407.225852-1-joel@joelfernandes.org Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-02tracing/kprobes: Fix within_notrace_func() to check only notrace functionsMasami Hiramatsu
Fix within_notrace_func() to check only notrace functions and to ignore the kprobe-event which can not solve symbol addresses. within_notrace_func() returns true if the given kprobe events probe point seems to be out-of-range. But that is not the correct place to check for it, it should be done in kprobes afterwards. kprobe-events allow users to define a probe point on "currently unloaded module" so that it can trace the event during module load. In this case, the user will put a probe on a symbol which is not in kallsyms yet and it hits the within_notrace_func(). As a result, kprobe-events always refuses if user defines a probe on a "currenly unloaded module". Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function") Link: http://lkml.kernel.org/r/153319624799.29007.13604430345640129926.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-02Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-08-01ftrace: Use true and false for boolean values in ops_references_rec()Gustavo A. R. Silva
Return statements in functions returning bool should use true or false instead of an integer value. This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20180802010056.GA31012@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01ring-buffer: Make ring_buffer_record_is_set_on() return boolSteven Rostedt (VMware)
The value of ring_buffer_record_is_set_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01ring-buffer: Make ring_buffer_record_is_on() return boolSteven Rostedt (VMware)
The value of ring_buffer_record_is_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01tracing: Make tracer_tracing_is_on() return boolSteven Rostedt (VMware)
There's code that expects tracer_tracing_is_on() to be either true or false, not some random number. Currently, it should only return one or zero, but just in case, change its return value to bool, to enforce it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread existsSteven Rostedt (VMware)
The start function of the hwlat tracer should never be called when the hwlat thread already exists. If it is called, do a WARN_ON(). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01ftrace: Add missing check for existing hwlat threadErica Bugden
The hwlat tracer uses a kernel thread to measure latencies. The function that creates this kernel thread, start_kthread(), can be called when the tracer is initialized and when the tracer is explicitly enabled. start_kthread() does not check if there is an existing hwlat kernel thread and will create a new one each time it is called. This causes the reference to the previous thread to be lost. Without the thread reference, the old kernel thread becomes unstoppable and continues to use CPU time even after the hwlat tracer has been disabled. This problem can be observed when a system is booted with tracing enabled and the hwlat tracer is configured like this: echo hwlat > current_tracer; echo 1 > tracing_on Add the missing check for an existing kernel thread in start_kthread() to prevent this problem. This function and the rest of the hwlat kernel thread setup and teardown are already serialized because they are called through the tracer core code with trace_type_lock held. [ Note, this only fixes the symptom. The real fix was not to call this function when tracing_on was already one. But this still makes the code more robust, so we'll add it. ] Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de Signed-off-by: Erica Bugden <erica.bugden@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>