summaryrefslogtreecommitdiff
path: root/drivers/tty/sysrq.c
AgeCommit message (Collapse)Author
2020-10-02tty/sysrq: Extend the sysrq_key_table to cover capital lettersAndrzej Pietrasiewicz
All slots in sysrq_key_table[] are either used, reserved or at least commented with their intended use. This patch adds capital letter versions available, which means adding 26 more entries. For already existing SysRq operations the user presses Alt-SysRq-<key>, and for the newly added ones Alt-Shift-SysRq-<key>. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lore.kernel.org/r/20200818112825.6445-2-andrzej.p@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24tty/sysrq: emergency_thaw_all does not depend on CONFIG_BLOCKChristoph Hellwig
We can also thaw non-block file systems. Remove the CONFIG_BLOCK in sysrq.c after making the prototype available unconditionally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-09kernel: rename show_stack_loglvl() => show_stack()Dmitry Safonov
Now the last users of show_stack() got converted to use an explicit log level, show_stack_loglvl() can drop it's redundant suffix and become once again well known show_stack(). Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200418201944.482088-51-dima@arista.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09sysrq: use show_stack_loglvl()Dmitry Safonov
Show the stack trace on a CPU with the same log level as "CPU%d" header. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Link: http://lkml.kernel.org/r/20200418201944.482088-45-dima@arista.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-15tty/sysrq: constify the the sysrq_key_op(s)Emil Velikov
All the users threat them as immutable - annotate them as such. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Link: https://lore.kernel.org/r/20200513214351.2138580-3-emil.l.velikov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15tty/sysrq: constify the sysrq APIEmil Velikov
The user is not supposed to thinker with the underlying sysrq_key_op. Make that explicit by adding a handful of const notations. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Link: https://lore.kernel.org/r/20200513214351.2138580-2-emil.l.velikov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15tty/sysrq: alpha: export and use __sysrq_get_key_op()Emil Velikov
Export a pointer to the sysrq_get_key_op(). This way we can cleanly unregister it, instead of the current solutions of modifuing it inplace. Since __sysrq_get_key_op() is no longer used externally, let's make it a static function. This patch will allow us to limit access to each and every sysrq op and constify the sysrq handling. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-kernel@vger.kernel.org Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-alpha@vger.kernel.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Link: https://lore.kernel.org/r/20200513214351.2138580-1-emil.l.velikov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-20tty/sysrq: Export sysrq_mask(), sysrq_toggle_support()Dmitry Safonov
Build fix for serial_core being module: ERROR: modpost: "sysrq_toggle_support" [drivers/tty/serial/serial_core.ko] undefined! ERROR: modpost: "sysrq_mask" [drivers/tty/serial/serial_core.ko] undefined! Fixes: eaee41727e6d ("sysctl/sysrq: Remove __sysrq_enabled copy") Cc: Jiri Slaby <jslaby@suse.com> Reported-by: "kernelci.org bot" <bot@kernelci.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dmitry Safonov <dima@arista.com> Link: https://lore.kernel.org/r/20200420172317.599611-1-dima@arista.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-07sysctl/sysrq: Remove __sysrq_enabled copyDmitry Safonov
Many embedded boards have a disconnected TTL level serial which can generate some garbage that can lead to spurious false sysrq detects. Currently, sysrq can be either completely disabled for serial console or always disabled (with CONFIG_MAGIC_SYSRQ_SERIAL), since commit 732dbf3a6104 ("serial: do not accept sysrq characters via serial port") At Arista, we have such boards that can generate BREAK and random garbage. While disabling sysrq for serial console would solve the problem with spurious false sysrq triggers, it's also desirable to have a way to enable sysrq back. Having the way to enable sysrq was beneficial to debug lockups with a manual investigation in field and on the other side preventing false sysrq detections. As a preparation to add sysrq_toggle_support() call into uart, remove a private copy of sysrq_enabled from sysctl - it should reflect the actual status of sysrq. Furthermore, the private copy isn't correct already in case sysrq_always_enabled is true. So, remove __sysrq_enabled and use a getter-helper sysrq_mask() to check sysrq_key_op enabled status. Cc: Iurii Zaikin <yzaikin@google.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Link: https://lore.kernel.org/r/20200302175135.269397-2-dima@arista.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-04proc: convert everything to "struct proc_ops"Alexey Dobriyan
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in seq_file.h. Conversion rule is: llseek => proc_lseek unlocked_ioctl => proc_ioctl xxx => proc_xxx delete ".owner = THIS_MODULE" line [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c] [sfr@canb.auug.org.au: fix kernel/sched/psi.c] Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-17sysrq: Remove sysrq_handler_registeredDmitry Safonov
sysrq_toggle_support() can be called in parallel, in return calling input_(un)register_handler(), which fortunately is safe to call in parallel and regardless of registered/unregistered status of sysrq_handler. Remove sysrq_handler_registered as it doesn't have any function there and may confuse reader about possible race. Signed-off-by: Dmitry Safonov <dima@arista.com> Link: https://lore.kernel.org/r/20191213000657.931618-2-dima@arista.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-14panic: avoid the extra noise dmesgFeng Tang
When kernel panic happens, it will first print the panic call stack, then the ending msg like: [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception [ 35.749975] ------------[ cut here ]------------ The above message are very useful for debugging. But if system is configured to not reboot on panic, say the "panic_timeout" parameter equals 0, it will likely print out many noisy message like WARN() call stack for each and every CPU except the panic one, messages like below: WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190 Call Trace: <IRQ> try_to_wake_up default_wake_function autoremove_wake_function __wake_up_common __wake_up_common_lock __wake_up wake_up_klogd_work_func irq_work_run_list irq_work_tick update_process_times tick_sched_timer __hrtimer_run_queues hrtimer_interrupt smp_apic_timer_interrupt apic_timer_interrupt For people working in console mode, the screen will first show the panic call stack, but immediately overridden by these noisy extra messages, which makes debugging much more difficult, as the original context gets lost on screen. Also these noisy messages will confuse some users, as I have seen many bug reporters posted the noisy message into bugzilla, instead of the real panic call stack and context. Adding a flag "suppress_printk" which gets set in panic() to avoid those noisy messages, without changing current kernel behavior that both panic blinking and sysrq magic key can work as is, suggested by Petr Mladek. To verify this, make sure kernel is not configured to reboot on panic and in console # echo c > /proc/sysrq-trigger to see if console only prints out the panic call stack. Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-28tty/sysrq: Convert show_lock to raw_spinlock_tJulien Grall
Systems which don't provide arch_trigger_cpumask_backtrace() will invoke showacpu() from a smp_call_function() function which is invoked with disabled interrupts even on -RT systems. The function acquires the show_lock lock which only purpose is to ensure that the CPUs don't print simultaneously. Otherwise the output would clash and it would be hard to tell the output from CPUx apart from CPUy. On -RT the spin_lock() can not be acquired from this context. A raw_spin_lock() is required. It will introduce the system's latency by performing the sysrq request and other CPUs will block on the lock until the request is done. This is okay because the user asked for a backtrace of all active CPUs and under "normal circumstances in production" this path should not be triggered. Signed-off-by: Julien Grall <julien.grall@arm.com> [bigeasy@linuxtronix.de: commit description] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22sysrq: Remove duplicated sysrq messagePetr Mladek
The commit 97f5f0cd8cd0a0544 ("Input: implement SysRq as a separate input handler") added pr_fmt() definition. It caused a duplicated message prefix in the sysrq header messages, for example: [ 177.053931] sysrq: SysRq : Show backtrace of all active CPUs [ 742.864776] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) Fixes: 97f5f0cd8cd0a05 ("Input: implement SysRq as a separate input handler") Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22sysrq: Restore original console_loglevel when sysrq disabledPetr Mladek
The sysrq header line is printed with an increased loglevel to provide users some positive feedback. The original loglevel is not restored when the sysrq operation is disabled. This bug was introduced in 2.6.12 (pre-git-history) by the commit ("Allow admin to enable only some of the Magic-Sysrq functions"). Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05tty/sysrq: Do not call sync directly from sysrq_do_reset()Mark Tomlinson
sysrq_do_reset() is called in softirq context, so it cannot call sync() directly. Instead, call orderly_reboot(), which creates a work item to run /sbin/reboot, or do emergency_sync and restart if the command fails. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-27sysrq: Use panic() to force a crashMatthias Kaehlcke
sysrq_handle_crash() currently forces a crash by dereferencing a NULL pointer, which is undefined behavior in C. Just call panic() instead, which is simpler and doesn't depend on compiler specific handling of the undefined behavior. Remove the comment on why the RCU lock needs to be released, it isn't accurate anymore since the crash now isn't handled by the page fault handler (for reference: the comment was added by commit 984cf355aeaa ("sysrq: Fix warning in sysrq generated crash.")). Releasing the lock is still good practice though. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-27tty/sysrq: add of_node_put()Yangtao Li
of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. bl_idle_init() doesn't do that, so fix it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-11signal: send_sig_all no longer needs SEND_SIG_FORCEDEric W. Biederman
Now that send_signal always delivers SEND_SIG_PRIV signals to a pid namespace init it is no longer necessary to use SEND_SIG_FORCED when calling do_send_sig_info to ensure that pid namespace inits are signaled and possibly killed. Using SEND_SIG_PRIV is sufficient. So use SEND_SIG_PRIV so that userspace when it receives a SIGTERM can tell that the kernel sent the signal and not some random userspace application. Fixes: b82c32872db2 ("sysrq: use SEND_SIG_FORCED instead of force_sig()") Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21signal: Pass pid type into do_send_sig_infoEric W. Biederman
This passes the information we already have at the call sight into do_send_sig_info. Ultimately allowing for better handling of signals sent to a group of processes during fork. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-02fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()Dominik Brodowski
Using this helper allows us to avoid the in-kernel calls to the sys_sync() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_sync(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2017-11-13Merge tag 'tty-4.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big tty/serial driver pull request for 4.15-rc1. Lots of serial driver updates in here, some small vt cleanups, and a raft of SPDX and license boilerplate cleanups, messing up the diffstat a bit. Nothing major, with no realy functional changes except better hardware support for some platforms. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (110 commits) tty: ehv_bytechan: fix spelling mistake tty: serial: meson: allow baud-rates lower than 9600 serial: 8250_fintek: Fix crash with baud rate B0 serial: 8250_fintek: Disable delays for ports != 0 serial: 8250_fintek: Return -EINVAL on invalid configuration tty: Remove redundant license text tty: serdev: Remove redundant license text tty: hvc: Remove redundant license text tty: serial: Remove redundant license text tty: add SPDX identifiers to all remaining files in drivers/tty/ tty: serial: jsm: remove redundant pointer ts tty: serial: jsm: add space before the open parenthesis '(' tty: serial: jsm: fix coding style tty: serial: jsm: delete space between function name and '(' tty: serial: jsm: add blank line after declarations tty: serial: jsm: change the type of local variable tty: serial: imx: remove dead code imx_dma_rxint tty: serial: imx: disable ageing timer interrupt if dma in use serial: 8250: fix potential deadlock in rs485-mode serial: m32r_sio: Drop redundant .data assignment ...
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-20tty/sysrq: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jiri Slaby <jslaby@suse.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-04sysrq : fix Show Regs call trace on ARMJibin Xu
When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [<ffffff8008088e80>] dump_backtrace+0x0/0x1d0 [<ffffff8008089074>] show_stack+0x24/0x30 [<ffffff8008447970>] dump_stack+0x90/0xb0 [<ffffff8008463950>] check_preemption_disabled+0x100/0x108 [<ffffff8008463998>] __this_cpu_preempt_check+0x18/0x20 [<ffffff80084c9194>] sysrq_handle_showregs+0x1c/0x40 [<ffffff80084c9c7c>] __handle_sysrq+0x12c/0x1a0 [<ffffff80084ca140>] write_sysrq_trigger+0x60/0x70 [<ffffff8008251e00>] proc_reg_write+0x90/0xd0 [<ffffff80081f1788>] __vfs_write+0x48/0x90 [<ffffff80081f241c>] vfs_write+0xa4/0x190 [<ffffff80081f3354>] SyS_write+0x54/0xb0 [<ffffff80080833f0>] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu <jibin.xu@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03oom: improve oom disable handlingMichal Hocko
Tetsuo has reported that sysrq triggered OOM killer will print a misleading information when no tasks are selected: sysrq: SysRq : Manual OOM execution Out of memory: Kill process 4468 ((agetty)) score 0 or sacrifice child Killed process 4468 ((agetty)) total-vm:43704kB, anon-rss:1760kB, file-rss:0kB, shmem-rss:0kB sysrq: SysRq : Manual OOM execution Out of memory: Kill process 4469 (systemd-cgroups) score 0 or sacrifice child Killed process 4469 (systemd-cgroups) total-vm:10704kB, anon-rss:120kB, file-rss:0kB, shmem-rss:0kB sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled The real reason is that there are no eligible tasks for the OOM killer to select but since commit 7c5f64f84483 ("mm: oom: deduplicate victim selection code for memcg and global oom") the semantic of out_of_memory has changed without updating moom_callback. This patch updates moom_callback to tell that no task was eligible which is the case for both oom killer disabled and no eligible tasks. In order to help distinguish first case from the second add printk to both oom_killer_{enable,disable}. This information is useful on its own because it might help debugging potential memory allocation failures. Fixes: 7c5f64f84483 ("mm: oom: deduplicate victim selection code for memcg and global oom") Link: http://lkml.kernel.org/r/20170404134705.6361-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-05net: ibm: emac: remove unused sysrq handler for 'c' keyEric Biggers
Since commit d6580a9f1523 ("kexec: sysrq: simplify sysrq-c handler"), the sysrq handler for the 'c' key has been sysrq_crash_op. Debugging code in the ibm_emac driver also tries to register a handler for the 'c' key, but this has no effect because register_sysrq_key() doesn't replace existing handlers. Since evidently no one has cared enough to fix this in the last 8 years, and it's very rare for drivers to register sysrq handlers (for good reason), just remove the dead code. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/task.h> We are going to split <linux/sched/task.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/debug.h> We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/debug.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/signal.h> We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-22lib/show_mem.c: teach show_mem to work with the given nodemaskMichal Hocko
show_mem() allows to filter out node specific data which is irrelevant to the allocation request via SHOW_MEM_FILTER_NODES. The filtering is done in skip_free_areas_node which skips all nodes which are not in the mems_allowed of the current process. This works most of the time as expected because the nodemask shouldn't be outside of the allocating task but there are some exceptions. E.g. memory hotplug might want to request allocations from outside of the allowed nodes (see new_node_page). Get rid of this hardcoded behavior and push the allocation mask down the show_mem path and use it instead of cpuset_current_mems_allowed. NULL nodemask is interpreted as cpuset_current_mems_allowed. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170117091543.25850-5-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-11sysrq: attach sysrq handler correctly for 32-bit kernelAkinobu Mita
The sysrq input handler should be attached to the input device which has a left alt key. On 32-bit kernels, some input devices which has a left alt key cannot attach sysrq handler. Because the keybit bitmap in struct input_device_id for sysrq is not correctly initialized. KEY_LEFTALT is 56 which is greater than BITS_PER_LONG on 32-bit kernels. I found this problem when using a matrix keypad device which defines a KEY_LEFTALT (56) but doesn't have a KEY_O (24 == 56%32). Cc: Jiri Slaby <jslaby@suse.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-26mm: oom: add memcg to oom_controlVladimir Davydov
It's a part of oom context just like allocation order and nodemask, so let's move it to oom_control instead of passing it in the argument list. Link: http://lkml.kernel.org/r/40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29sysrq: Fix warning in sysrq generated crash.Ani Sinha
Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[<ffffffff81484339>] printk+0x48/0x4a To fix this, we release the RCU read lock before we crash. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha <ani@arista.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-04drivers/tty: make sysrq.c slightly more explicitly non-modularPaul Gortmaker
The Kconfig currently controlling compilation of this code is: config.debug:config MAGIC_SYSRQ bool "Magic SysRq key" ...meaning that it currently is not being built as a module by anyone. Lets remove the traces of modularity we can so that when reading the driver there is less doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We don't delete the module.h include since other parts of the file are using content from there. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-08mm, oom: pass an oom order of -1 when triggered by sysrqDavid Rientjes
The force_kill member of struct oom_control isn't needed if an order of -1 is used instead. This is the same as order == -1 in struct compact_control which requires full memory compaction. This patch introduces no functional change. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08mm, oom: organize oom context into structDavid Rientjes
There are essential elements to an oom context that are passed around to multiple functions. Organize these elements into a new struct, struct oom_control, that specifies the context for an oom condition. This patch introduces no functional change. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-01Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...
2015-06-27Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS updates from Ralf Baechle: - Improvements to the tlb_dump code - KVM fixes - Add support for appended DTB - Minor improvements to the R12000 support - Minor improvements to the R12000 support - Various platform improvments for BCM47xx - The usual pile of minor cleanups - A number of BPF fixes and improvments - Some improvments to the support for R3000 and DECstations - Some improvments to the ATH79 platform support - A major patchset for the JZ4740 SOC adding support for the CI20 platform - Add support for the Pistachio SOC - Minor BMIPS/BCM63xx platform support improvments. - Avoid "SYNC 0" as memory barrier when unlocking spinlocks - Add support for the XWR-1750 board. - Paul's __cpuinit/__cpuinitdata cleanups. - New Malta CPU board support large memory so enable ZONE_DMA32. * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (131 commits) MIPS: spinlock: Adjust arch_spin_lock back-off time MIPS: asmmacro: Ensure 64-bit FP registers are used with MSA MIPS: BCM47xx: Simplify handling SPROM revisions MIPS: Cobalt Don't use module_init in non-modular MTD registration. MIPS: BCM47xx: Move NVRAM driver to the drivers/firmware/ MIPS: use for_each_sg() MIPS: BCM47xx: Don't select BCMA_HOST_PCI MIPS: BCM47xx: Add helper variable for storing NVRAM length MIPS: IRQ/IP27: Move IRQ allocation API to platform code. MIPS: Replace smp_mb with release barrier function in unlocks. MIPS: i8259: DT support MIPS: Malta: Basic DT plumbing MIPS: include errno.h for ENODEV in mips-cm.h MIPS: Define GCR_GIC_STATUS register fields MIPS: BPF: Introduce BPF ASM helpers MIPS: BPF: Use BPF register names to describe the ABI MIPS: BPF: Move register definition to the BPF header MIPS: net: BPF: Replace RSIZE with SZREG MIPS: BPF: Free up some callee-saved registers MIPS: Xtalk: Update xwidget.h with known Xtalk device numbers ...
2015-06-24Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge first patchbomb from Andrew Morton: - a few misc things - ocfs2 udpates - kernel/watchdog.c feature work (took ages to get right) - most of MM. A few tricky bits are held up and probably won't make 4.2. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (91 commits) mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc() mm, thp: respect MPOL_PREFERRED policy with non-local node tmpfs: truncate prealloc blocks past i_size mm/memory hotplug: print the last vmemmap region at the end of hot add memory mm/mmap.c: optimization of do_mmap_pgoff function mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan mm: kmemleak: avoid deadlock on the kmemleak object insertion error path mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup() mm: kmemleak: fix delete_object_*() race when called on the same memory block mm: kmemleak: allow safe memory scanning during kmemleak disabling memcg: convert mem_cgroup->under_oom from atomic_t to int memcg: remove unused mem_cgroup->oom_wakeups frontswap: allow multiple backends x86, mirror: x86 enabling - find mirrored memory ranges mm/memblock: allocate boot time data structures from mirrored memory mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute mm: do not ignore mapping_gfp_mask in page cache allocation paths mm/cma.c: fix typos in comments mm/oom_kill.c: print points as unsigned int mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages ...
2015-06-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem updates from Dmitry Torokhov: "Thanks to Samuel Thibault input device (keyboard) LEDs are no longer hardwired within the input core but use LED subsystem and so allow use of different triggers; Hans de Goede did a large update for the ALPS touchpad driver; we have new TI drv2665 haptics driver and DA9063 OnKey driver, and host of other drivers got various fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (55 commits) Input: pixcir_i2c_ts - fix receive error MAINTAINERS: remove non existent input mt git tree Input: improve usage of gpiod API tty/vt/keyboard: define LED triggers for VT keyboard lock states tty/vt/keyboard: define LED triggers for VT LED states Input: export LEDs as class devices in sysfs Input: cyttsp4 - use swap() in cyttsp4_get_touch() Input: goodix - do not explicitly set evbits in input device Input: goodix - export id and version read from device Input: goodix - fix variable length array warning Input: goodix - fix alignment issues Input: add OnKey driver for DA9063 MFD part Input: elan_i2c - add product IDs FW names Input: elan_i2c - add support for multi IC type and iap format Input: focaltech - report finger width to userspace tty: remove platform_sysrq_reset_seq Input: synaptics_i2c - use proper boolean values Input: psmouse - use true instead of 1 for boolean values Input: cyapa - fix a few typos in comments Input: stmpe-ts - enforce device tree only mode ...
2015-06-24mm: oom_kill: simplify OOM killer lockingJohannes Weiner
The zonelist locking and the oom_sem are two overlapping locks that are used to serialize global OOM killing against different things. The historical zonelist locking serializes OOM kills from allocations with overlapping zonelists against each other to prevent killing more tasks than necessary in the same memory domain. Only when neither tasklists nor zonelists from two concurrent OOM kills overlap (tasks in separate memcgs bound to separate nodes) are OOM kills allowed to execute in parallel. The younger oom_sem is a read-write lock to serialize OOM killing against the PM code trying to disable the OOM killer altogether. However, the OOM killer is a fairly cold error path, there is really no reason to optimize for highly performant and concurrent OOM kills. And the oom_sem is just flat-out redundant. Replace both locking schemes with a single global mutex serializing OOM kills regardless of context. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-21MIPS: Add SysRq operation to dump TLBs on all CPUsJames Hogan
Add a MIPS specific SysRq operation to dump the TLB entries on all CPUs, using the 'x' trigger key. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10072/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-06-02tty: remove platform_sysrq_reset_seqArnd Bergmann
The platform_sysrq_reset_seq code was intended as a way for an embedded platform to provide its own sysrq sequence at compile time. After over two years, nobody has started using it in an upstream kernel, and the platforms that were interested in it have moved on to devicetree, which can be used to configure the sequence without requiring kernel changes. The method is also incompatible with the way that most architectures build support for multiple platforms into a single kernel. Now the code is producing warnings when built with gcc-5.1: drivers/tty/sysrq.c: In function 'sysrq_init': drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds] key = platform_sysrq_reset_seq[i]; We could fix this, but it seems unlikely that it will ever be used, so let's just remove the code instead. We still have the option to pass the sequence either in DT, using the kernel command line, or using the /sys/module/sysrq/parameters/reset_seq file. Fixes: 154b7a489a ("Input: sysrq - allow specifying alternate reset sequence") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2015-05-28kernel/params: constify struct kernel_param_ops usesLuis R. Rodriguez
Most code already uses consts for the struct kernel_param_ops, sweep the kernel for the last offending stragglers. Other than include/linux/moduleparam.h and kernel/params.c all other changes were generated with the following Coccinelle SmPL patch. Merge conflicts between trees can be handled with Coccinelle. In the future git could get Coccinelle merge support to deal with patch --> fail --> grammar --> Coccinelle --> new patch conflicts automatically for us on patches where the grammar is available and the patch is of high confidence. Consider this a feature request. Test compiled on x86_64 against: * allnoconfig * allmodconfig * allyesconfig @ const_found @ identifier ops; @@ const struct kernel_param_ops ops = { }; @ const_not_found depends on !const_found @ identifier ops; @@ -struct kernel_param_ops ops = { +const struct kernel_param_ops ops = { }; Generated-by: Coccinelle SmPL Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Junio C Hamano <gitster@pobox.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: cocci@systeme.lip6.fr Cc: linux-kernel@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-09workqueue: dump workqueues on sysrq-tTejun Heo
Workqueues are used extensively throughout the kernel but sometimes it's difficult to debug stalls involving work items because visibility into its inner workings is fairly limited. Although sysrq-t task dump annotates each active worker task with the information on the work item being executed, it is challenging to find out which work items are pending or delayed on which queues and how pools are being managed. This patch implements show_workqueue_state() which dumps all busy workqueues and pools and is called from the sysrq-t handler. At the end of sysrq-t dump, something like the following is printed. Showing busy workqueues and worker pools: ... workqueue filler_wq: flags=0x0 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256 in-flight: 491:filler_workfn, 507:filler_workfn pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 in-flight: 501:filler_workfn pending: filler_workfn ... workqueue test_wq: flags=0x8 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1 in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500) delayed: test_workfn1 BAR(492), test_workfn2 ... pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137 pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469 pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16 pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62 The above shows that test_wq is executing test_workfn() on pid 510 which is the rescuer and also that there are two tasks 69 and 500 waiting for the work item to finish in flush_work(). As test_wq has max_active of 1, there are two work items for test_workfn1() and test_workfn2() which are delayed till the current work item is finished. In addition, pid 492 is flushing test_workfn1(). The work item for test_workfn() is being executed on pwq of pool 2 which is the normal priority per-cpu pool for CPU 1. The pool has three workers, two of which are executing filler_workfn() for filler_wq and the last one is assuming the manager role trying to create more workers. This extra workqueue state dump will hopefully help chasing down hangs involving workqueues. v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting. v2: As suggested by Andrew, minor formatting change in pr_cont_work(), printk()'s replaced with pr_info()'s, and cpumask printing now uses cpulist_pr_cont(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@redhat.com>
2015-02-11oom, PM: make OOM detection in the freezer path racelessMichal Hocko
Commit 5695be142e20 ("OOM, PM: OOM killed task shouldn't escape PM suspend") has left a race window when OOM killer manages to note_oom_kill after freeze_processes checks the counter. The race window is quite small and really unlikely and partial solution deemed sufficient at the time of submission. Tejun wasn't happy about this partial solution though and insisted on a full solution. That requires the full OOM and freezer's task freezing exclusion, though. This is done by this patch which introduces oom_sem RW lock and turns oom_killer_disable() into a full OOM barrier. oom_killer_disabled check is moved from the allocation path to the OOM level and we take oom_sem for reading for both the check and the whole OOM invocation. oom_killer_disable() takes oom_sem for writing so it waits for all currently running OOM killer invocations. Then it disable all the further OOMs by setting oom_killer_disabled and checks for any oom victims. Victims are counted via mark_tsk_oom_victim resp. unmark_oom_victim. The last victim wakes up all waiters enqueued by oom_killer_disable(). Therefore this function acts as the full OOM barrier. The page fault path is covered now as well although it was assumed to be safe before. As per Tejun, "We used to have freezing points deep in file system code which may be reacheable from page fault." so it would be better and more robust to not rely on freezing points here. Same applies to the memcg OOM killer. out_of_memory tells the caller whether the OOM was allowed to trigger and the callers are supposed to handle the situation. The page allocation path simply fails the allocation same as before. The page fault path will retry the fault (more on that later) and Sysrq OOM trigger will simply complain to the log. Normally there wouldn't be any unfrozen user tasks after try_to_freeze_tasks so the function will not block. But if there was an OOM killer racing with try_to_freeze_tasks and the OOM victim didn't finish yet then we have to wait for it. This should complete in a finite time, though, because - the victim cannot loop in the page fault handler (it would die on the way out from the exception) - it cannot loop in the page allocator because all the further allocation would fail and __GFP_NOFAIL allocations are not acceptable at this stage - it shouldn't be blocked on any locks held by frozen tasks (try_to_freeze expects lockless context) and kernel threads and work queues are not frozen yet Signed-off-by: Michal Hocko <mhocko@suse.cz> Suggested-by: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11sysrq: convert printk to pr_* equivalentMichal Hocko
While touching this area let's convert printk to pr_*. This also makes the printing of continuation lines done properly. Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06mm, oom: ensure memoryless node zonelist always includes zonesDavid Rientjes
With memoryless node support being worked on, it's possible that for optimizations that a node may not have a non-NULL zonelist. When CONFIG_NUMA is enabled and node 0 is memoryless, this means the zonelist for first_online_node may become NULL. The oom killer requires a zonelist that includes all memory zones for the sysrq trigger and pagefault out of memory handler. Ensure that a non-NULL zonelist is always passed to the oom killer. [akpm@linux-foundation.org: fix non-numa build] Signed-off-by: David Rientjes <rientjes@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06sysrq,rcu: suppress RCU stall warnings while sysrq runsRik van Riel
Some sysrq handlers can run for a long time, because they dump a lot of data onto a serial console. Having RCU stall warnings pop up in the middle of them only makes the problem worse. This patch temporarily disables RCU stall warnings while a sysrq request is handled. Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Madper Xie <cxie@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>