summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2014-08-07acct: serialize acct_on()Al Viro
brute-force - on a global mutex that isn't nested into anything. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07acct() should honour the limits from the very beginningAl Viro
We need to check free space on the first write to freshly opened log. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07split the slow path in acct_process() offAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07separate namespace-independent parts of filling acct_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07acct: switch to __kernel_write()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07acct: encode_comp_t(0) is 0, fortunately...Al Viro
There was an amusing bogosity in ac_rw calculation - it tried to do encode_comp_t(encode_comp_t(0) / 1024). Seeing that comp_t is a 3-bit exponent + 13-bit mantissa... it's a good thing that 0 is represented by all-bits-clear. The history of that one is interesting - it was introduced in 2.1.68pre1, when acct.c had been reworked and moved to separate file. Two months later (2.1.86) somebody has noticed that the sucker won't compile - there was no task_struct::io_usage. At which point the ac_io calculation had changed from encode_comp_t(current->io_usage) to encode_comp_t(0) and the bug in the next line (absolutely real back then, had it ever managed to compile) become a harmless bogosity. Looks like nobody has ever noticed until now. Anyway, let's bury that idiocy now that it got noticed. 17 years is long enough... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07Merge commit 'ccbf62d8a284cf181ac28c8e8407dd077d90dd4b' into for-nextAl Viro
backmerge to avoid kernel/acct.c conflict
2014-08-06Merge branch 'akpm' (patchbomb from Andrew Morton)Linus Torvalds
Merge incoming from Andrew Morton: - Various misc things. - arch/sh updates. - Part of ocfs2. Review is slow. - Slab updates. - Most of -mm. - printk updates. - lib/ updates. - checkpatch updates. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (226 commits) checkpatch: update $declaration_macros, add uninitialized_var checkpatch: warn on missing spaces in broken up quoted checkpatch: fix false positives for --strict "space after cast" test checkpatch: fix false positive MISSING_BREAK warnings with --file checkpatch: add test for native c90 types in unusual order checkpatch: add signed generic types checkpatch: add short int to c variable types checkpatch: add for_each tests to indentation and brace tests checkpatch: fix brace style misuses of else and while checkpatch: add --fix option for a couple OPEN_BRACE misuses checkpatch: use the correct indentation for which() checkpatch: add fix_insert_line and fix_delete_line helpers checkpatch: add ability to insert and delete lines to patch/file checkpatch: add an index variable for fixed lines checkpatch: warn on break after goto or return with same tab indentation checkpatch: emit a warning on file add/move/delete checkpatch: add test for commit id formatting style in commit log checkpatch: emit fewer kmalloc_array/kcalloc conversion warnings checkpatch: improve "no space after cast" test checkpatch: allow multiple const * types ...
2014-08-06Merge tag 'pm+acpi-3.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "Again, ACPICA leads the pack (47 commits), followed by cpufreq (18 commits) and system suspend/hibernation (9 commits). From the new code perspective, the ACPICA update brings ACPI 5.1 to the table, including a new device configuration object called _DSD (Device Specific Data) that will hopefully help us to operate device properties like Device Trees do (at least to some extent) and changes related to supporting ACPI on ARM. Apart from that we have hibernation changes making it use radix trees to store memory bitmaps which should speed up some operations carried out by it quite significantly. We also have some power management changes related to suspend-to-idle (the "freeze" sleep state) support and more preliminary changes needed to support ACPI on ARM (outside of ACPICA). The rest is fixes and cleanups pretty much everywhere. Specifics: - ACPICA update to upstream version 20140724. That includes ACPI 5.1 material (support for the _CCA and _DSD predefined names, changes related to the DMAR and PCCT tables and ARM support among other things) and cleanups related to using ACPICA's header files. A major part of it is related to acpidump and the core code used by that utility. Changes from Bob Moore, David E Box, Lv Zheng, Sascha Wildner, Tomasz Nowicki, Hanjun Guo. - Radix trees for memory bitmaps used by the hibernation core from Joerg Roedel. - Support for waking up the system from suspend-to-idle (also known as the "freeze" sleep state) using ACPI-based PCI wakeup signaling (Rafael J Wysocki). - Fixes for issues related to ACPI button events (Rafael J Wysocki). - New device ID for an ACPI-enumerated device included into the Wildcat Point PCH from Jie Yang. - ACPI video updates related to backlight handling from Hans de Goede and Linus Torvalds. - Preliminary changes needed to support ACPI on ARM from Hanjun Guo and Graeme Gregory. - ACPI PNP core cleanups from Arjun Sreedharan and Zhang Rui. - Cleanups related to ACPI_COMPANION() and ACPI_HANDLE() macros (Rafael J Wysocki). - ACPI-based device hotplug cleanups from Wei Yongjun and Rafael J Wysocki. - Cleanups and improvements related to system suspend from Lan Tianyu, Randy Dunlap and Rafael J Wysocki. - ACPI battery cleanup from Wei Yongjun. - cpufreq core fixes from Viresh Kumar. - Elimination of a deadband effect from the cpufreq ondemand governor and intel_pstate driver cleanups from Stratos Karafotis. - 350MHz CPU support for the powernow-k6 cpufreq driver from Mikulas Patocka. - Fix for the imx6 cpufreq driver from Anson Huang. - cpuidle core and governor cleanups from Daniel Lezcano, Sandeep Tripathy and Mohammad Merajul Islam Molla. - Build fix for the big_little cpuidle driver from Sachin Kamat. - Configuration fix for the Operation Performance Points (OPP) framework from Mark Brown. - APM cleanup from Jean Delvare. - cpupower utility fixes and cleanups from Peter Senna Tschudin, Andrey Utkin, Himangi Saraogi, Rickard Strandqvist, Thomas Renninger" * tag 'pm+acpi-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (118 commits) ACPI / LPSS: add LPSS device for Wildcat Point PCH ACPI / PNP: Replace faulty is_hex_digit() by isxdigit() ACPICA: Update version to 20140724. ACPICA: ACPI 5.1: Update for PCCT table changes. ACPICA/ARM: ACPI 5.1: Update for GTDT table changes. ACPICA/ARM: ACPI 5.1: Update for MADT changes. ACPICA/ARM: ACPI 5.1: Update for FADT changes. ACPICA: ACPI 5.1: Support for the _CCA predifined name. ACPICA: ACPI 5.1: New notify value for System Affinity Update. ACPICA: ACPI 5.1: Support for the _DSD predefined name. ACPICA: Debug object: Add current value of Timer() to debug line prefix. ACPICA: acpihelp: Add UUID support, restructure some existing files. ACPICA: Utilities: Fix local printf issue. ACPICA: Tables: Update for DMAR table changes. ACPICA: Remove some extraneous printf arguments. ACPICA: Update for comments/formatting. No functional changes. ACPICA: Disassembler: Add support for the ToUUID opererator (macro). ACPICA: Remove a redundant cast to acpi_size for ACPI_OFFSET() macro. ACPICA: Work around an ancient GCC bug. ACPI / processor: Make it possible to get local x2apic id via _MAT ...
2014-08-06Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This patch set consists of the usual driver updates (ufs, storvsc, pm8001 hpsa). It also has removal of the user space target driver code (everyone is using LIO now), a partial PCI MSI-X update, more multi-queue updates, conversion to 64 bit LUNs (so we could theoretically cope with any LUN returned by a device) and placeholder support for the ZBC device type (Shingle drives), plus an assortment of minor updates and bug fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (143 commits) scsi: do not issue SCSI RSOC command to Promise Vtrak E610f vmw_pvscsi: Use pci_enable_msix_exact() instead of pci_enable_msix() pm8001: Fix invalid return when request_irq() failed lpfc: Remove superfluous call to pci_disable_msix() isci: Use pci_enable_msix_exact() instead of pci_enable_msix() bfa: Use pci_enable_msix_exact() instead of pci_enable_msix() bfa: Cleanup bfad_setup_intr() function bfa: Do not call pci_enable_msix() after it failed once fnic: Use pci_enable_msix_exact() instead of pci_enable_msix() scsi: use short driver name for per-driver cmd slab caches scsi_debug: support scsi-mq, queues and locks Drivers: add blist flags scsi: ufs: fix endianness sparse warnings scsi: ufs: make undeclared functions static bnx2i: Update driver version to 2.7.10.1 pm8001: fix a memory leak in nvmd_resp pm8001: fix update_flash pm8001: fix a memory leak in flash_update pm8001: Cleaning up uninitialized variables pm8001: Fix to remove null pointer checks that could never happen ...
2014-08-06kernel/printk/printk.c: fix bool assignementsNeil Zhang
Fix coccinelle warnings. Signed-off-by: Neil Zhang <zhangwm@marvell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: enable interrupts before calling console_trylock_for_printk()Jan Kara
We need interrupts disabled when calling console_trylock_for_printk() only so that cpu id we pass to can_use_console() remains valid (for other things console_sem provides all the exclusion we need and deadlocks on console_sem due to interrupts are impossible because we use down_trylock()). However if we are rescheduled, we are guaranteed to run on an online cpu so we can easily just get the cpu id in can_use_console(). We can lose a bit of performance when we enable interrupts in vprintk_emit() and then disable them again in console_unlock() but OTOH it can somewhat reduce interrupt latency caused by console_unlock(). We differ from (reverted) commit 939f04bec1a4 in that we avoid calling console_unlock() from vprintk_emit() with lockdep enabled as that has unveiled quite some bugs leading to system freezes during boot (e.g. https://lkml.org/lkml/2014/5/30/242, https://lkml.org/lkml/2014/6/28/521). Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Andreas Bombe <aeb@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: miscellaneous cleanupsAlex Elder
Some small cleanups to kernel/printk/printk.c. None of them should cause any change in behavior. - When CONFIG_PRINTK is defined, parenthesize the value of LOG_LINE_MAX. - When CONFIG_PRINTK is *not* defined, there is an extra LOG_LINE_MAX definition; delete it. - Pull an assignment out of a conditional expression in console_setup(). - Use isdigit() in console_setup() rather than open coding it. - In update_console_cmdline(), drop a NUL-termination assignment; the strlcpy() call that precedes it guarantees it's not needed. - Simplify some logic in printk_timed_ratelimit(). Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jan Kara <jack@suse.cz> Cc: John Stultz <john.stultz@linaro.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: use a clever macroAlex Elder
Use the IS_ENABLED() macro rather than #ifdef blocks to set certain global values. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: John Stultz <john.stultz@linaro.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: fix some commentsAlex Elder
Fix a few comments that don't accurately describe their corresponding code. It also fixes some minor typographical errors. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jan Kara <jack@suse.cz> Cc: John Stultz <john.stultz@linaro.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: rename DEFAULT_MESSAGE_LOGLEVELAlex Elder
Commit a8fe19ebfbfd ("kernel/printk: use symbolic defines for console loglevels") makes consistent use of symbolic values for printk() log levels. The naming scheme used is different from the one used for DEFAULT_MESSAGE_LOGLEVEL though. Change that symbol name to be MESSAGE_LOGLEVEL_DEFAULT for consistency. And because the value of that symbol comes from a similarly-named config option, rename CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jan Kara <jack@suse.cz> Cc: John Stultz <john.stultz@linaro.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: tweak do_syslog() to match commentsAlex Elder
In do_syslog() there's a path used by kmsg_poll() and kmsg_read() that only needs to know whether there's any data available to read (and not its size). These callers only check for non-zero return. As a shortcut, do_syslog() returns the difference between what has been logged and what has been "seen." The comments say that the "count of records" should be returned but it's not. Instead it returns (log_next_idx - syslog_idx), which is a difference between buffer offsets--and the result could be negative. The behavior is the same (it'll be zero or not in the same cases), but the count of records is more meaningful and it matches what the comments say. So change the code to return that. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: allow increasing the ring buffer depending on the number of CPUsLuis R. Rodriguez
The default size of the ring buffer is too small for machines with a large amount of CPUs under heavy load. What ends up happening when debugging is the ring buffer overlaps and chews up old messages making debugging impossible unless the size is passed as a kernel parameter. An idle system upon boot up will on average spew out only about one or two extra lines but where this really matters is on heavy load and that will vary widely depending on the system and environment. There are mechanisms to help increase the kernel ring buffer for tracing through debugfs, and those interfaces even allow growing the kernel ring buffer per CPU. We also have a static value which can be passed upon boot. Relying on debugfs however is not ideal for production, and relying on the value passed upon bootup is can only used *after* an issue has creeped up. Instead of being reactive this adds a proactive measure which lets you scale the amount of contributions you'd expect to the kernel ring buffer under load by each CPU in the worst case scenario. We use num_possible_cpus() to avoid complexities which could be introduced by dynamically changing the ring buffer size at run time, num_possible_cpus() lets us use the upper limit on possible number of CPUs therefore avoiding having to deal with hotplugging CPUs on and off. This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT which is used to specify the maximum amount of contributions to the kernel ring buffer in the worst case before the kernel ring buffer flips over, the size is specified as a power of 2. The total amount of contributions made by each CPU must be greater than half of the default kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger an increase upon bootup. The kernel ring buffer is increased to the next power of two that would fit the required minimum kernel ring buffer size plus the additional CPU contribution. For example if LOG_BUF_SHIFT is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs in order to trigger an increase of the kernel ring buffer. With a LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64 possible CPUs to trigger an increase. If you had 128 possible CPUs the amount of minimum required kernel ring buffer bumps to: ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB Since we require the ring buffer to be a power of two the new required size would be 1024 KB. This CPU contributions are ignored when the "log_buf_len" kernel parameter is used as it forces the exact size of the ring buffer to an expected power of two value. [pmladek@suse.cz: fix build] Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.cz> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Tested-by: Petr Mladek <pmladek@suse.cz> Reviewed-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Arun KS <arunks.linux@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: make dynamic units clear for the kernel ring bufferLuis R. Rodriguez
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Suggested-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Arun KS <arunks.linux@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: move power of 2 practice of ring buffer size to a helperLuis R. Rodriguez
In practice the power of 2 practice of the size of the kernel ring buffer remains purely historical but not a requirement, specially now that we have LOG_ALIGN and use it for both static and dynamic allocations. It could have helped with implicit alignment back in the days given the even the dynamically sized ring buffer was guaranteed to be aligned so long as CONFIG_LOG_BUF_SHIFT was set to produce a __LOG_BUF_LEN which is architecture aligned, since log_buf_len=n would be allowed only if it was > __LOG_BUF_LEN and we always ended up rounding the log_buf_len=n to the next power of 2 with roundup_pow_of_two(), any multiple of 2 then should be also architecture aligned. These assumptions of course relied heavily on CONFIG_LOG_BUF_SHIFT producing an aligned value but users can always change this. We now have precise alignment requirements set for the log buffer size for both static and dynamic allocations, but lets upkeep the old practice of using powers of 2 for its size to help with easy expected scalable values and the allocators for dynamic allocations. We'll reuse this later so move this into a helper. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Arun KS <arunks.linux@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06printk: make dynamic kernel ring buffer alignment explicitLuis R. Rodriguez
We have to consider alignment for the ring buffer both for the default static size, and then also for when an dynamic allocation is made when the log_buf_len=n kernel parameter is passed to set the size specifically to a size larger than the default size set by the architecture through CONFIG_LOG_BUF_SHIFT. The default static kernel ring buffer can be aligned properly if architectures set CONFIG_LOG_BUF_SHIFT properly, we provide ranges for the size though so even if CONFIG_LOG_BUF_SHIFT has a sensible aligned value it can be reduced to a non aligned value. Commit 6ebb017de9 ("printk: Fix alignment of buf causing crash on ARM EABI") by Andrew Lunn ensures the static buffer is always aligned and the decision of alignment is done by the compiler by using __alignof__(struct log). When log_buf_len=n is used we allocate the ring buffer dynamically. Dynamic allocation varies, for the early allocation called before setup_arch() memblock_virt_alloc() requests a page aligment and for the default kernel allocation memblock_virt_alloc_nopanic() requests no special alignment, which in turn ends up aligning the allocation to SMP_CACHE_BYTES, which is L1 cache aligned. Since we already have the required alignment for the kernel ring buffer though we can do better and request explicit alignment for LOG_ALIGN. This does that to be safe and make dynamic allocation alignment explicit. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Tested-by: Petr Mladek <pmladek@suse.cz> Acked-by: Petr Mladek <pmladek@suse.cz> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Arun KS <arunks.linux@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06kernel/smp.c:on_each_cpu_cond(): fix warning in fallback pathSasha Levin
The rarely-executed memry-allocation-failed callback path generates a WARN_ON_ONCE() when smp_call_function_single() succeeds. Presumably it's supposed to warn on failures. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@gentwo.org> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <htejun@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06mm, oom: remove unnecessary exit_state checkDavid Rientjes
The oom killer scans each process and determines whether it is eligible for oom kill or whether the oom killer should abort because of concurrent memory freeing. It will abort when an eligible process is found to have TIF_MEMDIE set, meaning it has already been oom killed and we're waiting for it to exit. Processes with task->mm == NULL should not be considered because they are either kthreads or have already detached their memory and killing them would not lead to memory freeing. That memory is only freed after exit_mm() has returned, however, and not when task->mm is first set to NULL. Clear TIF_MEMDIE after exit_mm()'s mmput() so that an oom killed process is no longer considered for oom kill, but only until exit_mm() has returned. This was fragile in the past because it relied on exit_notify() to be reached before no longer considering TIF_MEMDIE processes. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06mm, hugetlb: remove hugetlb_zero and hugetlb_infinityDavid Rientjes
They are unnecessary: "zero" can be used in place of "hugetlb_zero" and passing extra2 == NULL is equivalent to infinity. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06kernel/watchdog.c: convert printk/pr_warning to pr_foo()Fabian Frederick
Replace some obsolete functions. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06kernel/auditfilter.c: replace count*size kmalloc by kcallocFabian Frederick
kcalloc manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06PM / hibernate: avoid unsafe pages in e820 reserved regionsLee, Chun-Yi
When the machine doesn't well handle the e820 persistent when hibernate resuming, then it may cause page fault when writing image to snapshot buffer: [ 17.929495] BUG: unable to handle kernel paging request at ffff880069d4f000 [ 17.933469] IP: [<ffffffff810a1cf0>] load_image_lzo+0x810/0xe40 [ 17.933469] PGD 2194067 PUD 77ffff067 PMD 2197067 PTE 0 [ 17.933469] Oops: 0002 [#1] SMP ... The ffff880069d4f000 page is in e820 reserved region of resume boot kernel: [ 0.000000] BIOS-e820: [mem 0x0000000069d4f000-0x0000000069e12fff] reserved ... [ 0.000000] PM: Registered nosave memory: [mem 0x69d4f000-0x69e12fff] So snapshot.c mark the pfn to forbidden pages map. But, this page is also in the memory bitmap in snapshot image because it's an original page used by image kernel, so it will also mark as an unsafe(free) page in prepare_image(). That means the page in e820 when resuming mark as "forbidden" and "free", it causes get_buffer() treat it as an allocated unsafe page. Then snapshot_write_next() return this page to load_image, load_image writing content to this address, but this page didn't really allocated . So, we got page fault. Although the root cause is from BIOS, I think aggressive check and significant message in kernel will better then a page fault for issue tracking, especially when serial console unavailable. This patch adds code in mark_unsafe_pages() for check does free pages in nosave region. If so, then it print message and return fault to stop whole S4 resume process: [ 8.166004] PM: Image loading progress: 0% [ 8.658717] PM: 0x6796c000 in e820 nosave region: [mem 0x6796c000-0x6796cfff] [ 8.918737] PM: Read 2511940 kbytes in 1.04 seconds (2415.32 MB/s) [ 8.926633] PM: Error -14 resuming [ 8.933534] PM: Failed to load hibernation image, recovering. Reviewed-by: Takashi Iwai <tiwai@suse.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Lee, Chun-Yi <jlee@suse.com> [rjw: Subject] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-06ring-buffer: Always reset iterator to reader pageSteven Rostedt (Red Hat)
When performing a consuming read, the ring buffer swaps out a page from the ring buffer with a empty page and this page that was swapped out becomes the new reader page. The reader page is owned by the reader and since it was swapped out of the ring buffer, writers do not have access to it (there's an exception to that rule, but it's out of scope for this commit). When reading the "trace" file, it is a non consuming read, which means that the data in the ring buffer will not be modified. When the trace file is opened, a ring buffer iterator is allocated and writes to the ring buffer are disabled, such that the iterator will not have issues iterating over the data. Although the ring buffer disabled writes, it does not disable other reads, or even consuming reads. If a consuming read happens, then the iterator is reset and starts reading from the beginning again. My tests would sometimes trigger this bug on my i386 box: WARNING: CPU: 0 PID: 5175 at kernel/trace/trace.c:1527 __trace_find_cmdline+0x66/0xaa() Modules linked in: CPU: 0 PID: 5175 Comm: grep Not tainted 3.16.0-rc3-test+ #8 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 00000000 00000000 f09c9e1c c18796b3 c1b5d74c f09c9e4c c103a0e3 c1b5154b f09c9e78 00001437 c1b5d74c 000005f7 c10bd85a c10bd85a c1cac57c f09c9eb0 ed0e0000 f09c9e64 c103a185 00000009 f09c9e5c c1b5154b f09c9e78 f09c9e80^M Call Trace: [<c18796b3>] dump_stack+0x4b/0x75 [<c103a0e3>] warn_slowpath_common+0x7e/0x95 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa [<c103a185>] warn_slowpath_fmt+0x33/0x35 [<c10bd85a>] __trace_find_cmdline+0x66/0xaa^M [<c10bed04>] trace_find_cmdline+0x40/0x64 [<c10c3c16>] trace_print_context+0x27/0xec [<c10c4360>] ? trace_seq_printf+0x37/0x5b [<c10c0b15>] print_trace_line+0x319/0x39b [<c10ba3fb>] ? ring_buffer_read+0x47/0x50 [<c10c13b1>] s_show+0x192/0x1ab [<c10bfd9a>] ? s_next+0x5a/0x7c [<c112e76e>] seq_read+0x267/0x34c [<c1115a25>] vfs_read+0x8c/0xef [<c112e507>] ? seq_lseek+0x154/0x154 [<c1115ba2>] SyS_read+0x54/0x7f [<c188488e>] syscall_call+0x7/0xb ---[ end trace 3f507febd6b4cc83 ]--- >>>> ##### CPU 1 buffer started #### Which was the __trace_find_cmdline() function complaining about the pid in the event record being negative. After adding more test cases, this would trigger more often. Strangely enough, it would never trigger on a single test, but instead would trigger only when running all the tests. I believe that was the case because it required one of the tests to be shutting down via delayed instances while a new test started up. After spending several days debugging this, I found that it was caused by the iterator becoming corrupted. Debugging further, I found out why the iterator became corrupted. It happened with the rb_iter_reset(). As consuming reads may not read the full reader page, and only part of it, there's a "read" field to know where the last read took place. The iterator, must also start at the read position. In the rb_iter_reset() code, if the reader page was disconnected from the ring buffer, the iterator would start at the head page within the ring buffer (where writes still happen). But the mistake there was that it still used the "read" field to start the iterator on the head page, where it should always start at zero because readers never read from within the ring buffer where writes occur. I originally wrote a patch to have it set the iter->head to 0 instead of iter->head_page->read, but then I questioned why it wasn't always setting the iter to point to the reader page, as the reader page is still valid. The list_empty(reader_page->list) just means that it was successful in swapping out. But the reader_page may still have data. There was a bug report a long time ago that was not reproducible that had something about trace_pipe (consuming read) not matching trace (iterator read). This may explain why that happened. Anyway, the correct answer to this bug is to always use the reader page an not reset the iterator to inside the writable ring buffer. Cc: stable@vger.kernel.org # 2.6.28+ Fixes: d769041f8653 "ring_buffer: implement new locking" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-08-06ring-buffer: Up rb_iter_peek() loop count to 3Steven Rostedt (Red Hat)
After writting a test to try to trigger the bug that caused the ring buffer iterator to become corrupted, I hit another bug: WARNING: CPU: 1 PID: 5281 at kernel/trace/ring_buffer.c:3766 rb_iter_peek+0x113/0x238() Modules linked in: ipt_MASQUERADE sunrpc [...] CPU: 1 PID: 5281 Comm: grep Tainted: G W 3.16.0-rc3-test+ #143 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 0000000000000000 ffffffff81809a80 ffffffff81503fb0 0000000000000000 ffffffff81040ca1 ffff8800796d6010 ffffffff810c138d ffff8800796d6010 ffff880077438c80 ffff8800796d6010 ffff88007abbe600 0000000000000003 Call Trace: [<ffffffff81503fb0>] ? dump_stack+0x4a/0x75 [<ffffffff81040ca1>] ? warn_slowpath_common+0x7e/0x97 [<ffffffff810c138d>] ? rb_iter_peek+0x113/0x238 [<ffffffff810c138d>] ? rb_iter_peek+0x113/0x238 [<ffffffff810c14df>] ? ring_buffer_iter_peek+0x2d/0x5c [<ffffffff810c6f73>] ? tracing_iter_reset+0x6e/0x96 [<ffffffff810c74a3>] ? s_start+0xd7/0x17b [<ffffffff8112b13e>] ? kmem_cache_alloc_trace+0xda/0xea [<ffffffff8114cf94>] ? seq_read+0x148/0x361 [<ffffffff81132d98>] ? vfs_read+0x93/0xf1 [<ffffffff81132f1b>] ? SyS_read+0x60/0x8e [<ffffffff8150bf9f>] ? tracesys+0xdd/0xe2 Debugging this bug, which triggers when the rb_iter_peek() loops too many times (more than 2 times), I discovered there's a case that can cause that function to legitimately loop 3 times! rb_iter_peek() is different than rb_buffer_peek() as the rb_buffer_peek() only deals with the reader page (it's for consuming reads). The rb_iter_peek() is for traversing the buffer without consuming it, and as such, it can loop for one more reason. That is, if we hit the end of the reader page or any page, it will go to the next page and try again. That is, we have this: 1. iter->head > iter->head_page->page->commit (rb_inc_iter() which moves the iter to the next page) try again 2. event = rb_iter_head_event() event->type_len == RINGBUF_TYPE_TIME_EXTEND rb_advance_iter() try again 3. read the event. But we never get to 3, because the count is greater than 2 and we cause the WARNING and return NULL. Up the counter to 3. Cc: stable@vger.kernel.org # 2.6.37+ Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-08-06cpuidle: menu: Lookup CPU runqueues lessMel Gorman
The menu governer makes separate lookups of the CPU runqueue to get load and number of IO waiters but it can be done with a single lookup. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Steady transitioning of the BPF instructure to a generic spot so all kernel subsystems can make use of it, from Alexei Starovoitov. 2) SFC driver supports busy polling, from Alexandre Rames. 3) Take advantage of hash table in UDP multicast delivery, from David Held. 4) Lighten locking, in particular by getting rid of the LRU lists, in inet frag handling. From Florian Westphal. 5) Add support for various RFC6458 control messages in SCTP, from Geir Ola Vaagland. 6) Allow to filter bridge forwarding database dumps by device, from Jamal Hadi Salim. 7) virtio-net also now supports busy polling, from Jason Wang. 8) Some low level optimization tweaks in pktgen from Jesper Dangaard Brouer. 9) Add support for ipv6 address generation modes, so that userland can have some input into the process. From Jiri Pirko. 10) Consolidate common TCP connection request code in ipv4 and ipv6, from Octavian Purdila. 11) New ARP packet logger in netfilter, from Pablo Neira Ayuso. 12) Generic resizable RCU hash table, with intial users in netlink and nftables. From Thomas Graf. 13) Maintain a name assignment type so that userspace can see where a network device name came from (enumerated by kernel, assigned explicitly by userspace, etc.) From Tom Gundersen. 14) Automatic flow label generation on transmit in ipv6, from Tom Herbert. 15) New packet timestamping facilities from Willem de Bruijn, meant to assist in measuring latencies going into/out-of the packet scheduler, latency from TCP data transmission to ACK, etc" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits) cxgb4 : Disable recursive mailbox commands when enabling vi net: reduce USB network driver config options. tg3: Modify tg3_tso_bug() to handle multiple TX rings amd-xgbe: Perform phy connect/disconnect at dev open/stop amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask net: sun4i-emac: fix memory leak on bad packet sctp: fix possible seqlock seadlock in sctp_packet_transmit() Revert "net: phy: Set the driver when registering an MDIO bus device" cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine team: Simplify return path of team_newlink bridge: Update outdated comment on promiscuous mode net-timestamp: ACK timestamp for bytestreams net-timestamp: TCP timestamping net-timestamp: SCHED timestamp on entering packet scheduler net-timestamp: add key to disambiguate concurrent datagrams net-timestamp: move timestamp flags out of sk_flags net-timestamp: extend SCM_TIMESTAMPING ancillary data struct cxgb4i : Move stray CPL definitions to cxgb4 driver tcp: reduce spurious retransmits due to transient SACK reneging qlcnic: Initialize dcbnl_ops before register_netdev ...
2014-08-06Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this release: - PKCS#7 parser for the key management subsystem from David Howells - appoint Kees Cook as seccomp maintainer - bugfixes and general maintenance across the subsystem" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits) X.509: Need to export x509_request_asymmetric_key() netlabel: shorter names for the NetLabel catmap funcs/structs netlabel: fix the catmap walking functions netlabel: fix the horribly broken catmap functions netlabel: fix a problem when setting bits below the previously lowest bit PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1 tpm: simplify code by using %*phN specifier tpm: Provide a generic means to override the chip returned timeouts tpm: missing tpm_chip_put in tpm_get_random() tpm: Properly clean sysfs entries in error path tpm: Add missing tpm_do_selftest to ST33 I2C driver PKCS#7: Use x509_request_asymmetric_key() Revert "selinux: fix the default socket labeling in sock_graft()" X.509: x509_request_asymmetric_keys() doesn't need string length arguments PKCS#7: fix sparse non static symbol warning KEYS: revert encrypted key change ima: add support for measuring and appraising firmware firmware_class: perform new LSM checks security: introduce kernel_fw_from_file hook PKCS#7: Missing inclusion of linux/err.h ...
2014-08-06Rip out get_signal_to_deliver()Richard Weinberger
Now we can turn get_signal() to the main function. Signed-off-by: Richard Weinberger <richard@nod.at>
2014-08-06Clean up signal_delivered()Richard Weinberger
- Pass a ksignal struct to it - Remove unused regs parameter - Make it private as it's nowhere outside of kernel/signal.c is used Signed-off-by: Richard Weinberger <richard@nod.at>
2014-08-06tracehook_signal_handler: Remove sig, info, ka and regsRichard Weinberger
These parameters are nowhere used, so we can remove them. Signed-off-by: Richard Weinberger <richard@nod.at>
2014-08-05Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and time updates from Thomas Gleixner: "A rather large update of timers, timekeeping & co - Core timekeeping code is year-2038 safe now for 32bit machines. Now we just need to fix all in kernel users and the gazillion of user space interfaces which rely on timespec/timeval :) - Better cache layout for the timekeeping internal data structures. - Proper nanosecond based interfaces for in kernel users. - Tree wide cleanup of code which wants nanoseconds but does hoops and loops to convert back and forth from timespecs. Some of it definitely belongs into the ugly code museum. - Consolidation of the timekeeping interface zoo. - A fast NMI safe accessor to clock monotonic for tracing. This is a long standing request to support correlated user/kernel space traces. With proper NTP frequency correction it's also suitable for correlation of traces accross separate machines. - Checkpoint/restart support for timerfd. - A few NOHZ[_FULL] improvements in the [hr]timer code. - Code move from kernel to kernel/time of all time* related code. - New clocksource/event drivers from the ARM universe. I'm really impressed that despite an architected timer in the newer chips SoC manufacturers insist on inventing new and differently broken SoC specific timers. [ Ed. "Impressed"? I don't think that word means what you think it means ] - Another round of code move from arch to drivers. Looks like most of the legacy mess in ARM regarding timers is sorted out except for a few obnoxious strongholds. - The usual updates and fixlets all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) timekeeping: Fixup typo in update_vsyscall_old definition clocksource: document some basic timekeeping concepts timekeeping: Use cached ntp_tick_length when accumulating error timekeeping: Rework frequency adjustments to work better w/ nohz timekeeping: Minor fixup for timespec64->timespec assignment ftrace: Provide trace clocks monotonic timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC seqcount: Add raw_write_seqcount_latch() seqcount: Provide raw_read_seqcount() timekeeping: Use tk_read_base as argument for timekeeping_get_ns() timekeeping: Create struct tk_read_base and use it in struct timekeeper timekeeping: Restructure the timekeeper some more clocksource: Get rid of cycle_last clocksource: Move cycle_last validation to core code clocksource: Make delta calculation a function wireless: ath9k: Get rid of timespec conversions drm: vmwgfx: Use nsec based interfaces drm: i915: Use nsec based interfaces timekeeping: Provide ktime_get_raw() hangcheck-timer: Use ktime_get_ns() ...
2014-08-05Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Nothing spectacular from the irq department this time: - overhaul of the crossbar chip driver - overhaul of the spear shirq chip driver - support for the atmel-aic chip - code move from arch to drivers - the usual tiny fixlets - two reverts worth to mention which undo the too simple attempt of supporting wakeup interrupts on shared interrupt lines" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) Revert "irq: Warn when shared interrupts do not match on NO_SUSPEND" Revert "PM / sleep / irq: Do not suspend wakeup interrupts" irq: Warn when shared interrupts do not match on NO_SUSPEND irqchip: atmel-aic: Define irq fixups for atmel SoCs irqchip: atmel-aic: Implement RTC irq fixup irqchip: atmel-aic: Add irq fixup infrastructure irqchip: atmel-aic: Add atmel AIC/AIC5 drivers irqchip: atmel-aic: Move binding doc to interrupt-controller directory genirq: generic chip: Export irq_map_generic_chip function PM / sleep / irq: Do not suspend wakeup interrupts irqchip: or1k-pic: Migrate from arch/openrisc/ irqchip: crossbar: Allow for quirky hardware with direct hardwiring of GIC documentation: dt: omap: crossbar: Add description for interrupt consumer irqchip: crossbar: Introduce centralized check for crossbar write irqchip: crossbar: Introduce ti, max-crossbar-sources to identify valid crossbar mapping irqchip: crossbar: Add kerneldoc for crossbar_domain_unmap callback irqchip: crossbar: Set cb pointer to null in case of error irqchip: crossbar: Change the goto naming irqchip: crossbar: Return proper error value irqchip: crossbar: Fix kerneldoc warning ...
2014-08-05Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: PM / Hibernate: Touch Soft Lockup Watchdog in rtree_next_node PM / Hibernate: Remove the old memory-bitmap implementation PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free() PM / Hibernate: Implement position keeping in radix tree PM / Hibernate: Add memory_rtree_find_bit function PM / Hibernate: Create a Radix-Tree to store memory bitmap PM / sleep: fix kernel-doc warnings in drivers/base/power/main.c
2014-08-05Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: cpuidle: Remove time measurement in poll state cpuidle: Remove manual selection of the multiple driver support cpuidle: ladder governor - use macro instead of hardcoded value cpuidle: big_little: Fix build error cpuidle: menu governor - remove unused macro STDDEV_THRESH cpuidle: fix permission for driver name sysfs node cpuidle: move idle traces to cpuidle_enter_state()
2014-08-04Merge tag 'staging-3.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver updates from Greg KH: "Here's the big pull request for the staging driver tree for 3.17-rc1. Lots of things in here, over 2000 patches, but the best part is this: 1480 files changed, 39070 insertions(+), 254659 deletions(-) Thanks to the great work of Kristina Martšenko, 14 different staging drivers have been removed from the tree as they were obsolete and no one was willing to work on cleaning them up. Other than the driver removals, loads of cleanups are in here (comedi, lustre, etc.) as well as the usual IIO driver updates and additions. All of this has been in the linux-next tree for a while" * tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (2199 commits) staging: comedi: addi_apci_1564: remove diagnostic interrupt support code staging: comedi: addi_apci_1564: add subdevice to check diagnostic status staging: wlan-ng: coding style problem fix staging: wlan-ng: fixing coding style problems staging: comedi: ii_pci20kc: request and ioremap memory staging: lustre: bitwise vs logical typo staging: dgnc: Remove unneeded dgnc_trace.c and dgnc_trace.h staging: dgnc: rephrase comment staging: comedi: ni_tio: remove some dead code staging: rtl8723au: Fix static symbol sparse warning staging: rtl8723au: usb_dvobj_init(): Remove unused variable 'pdev_desc' staging: rtl8723au: Do not duplicate kernel provided USB macros staging: rtl8723au: Remove never set struct pwrctrl_priv.bHWPowerdown staging: rtl8723au: Remove two never set variables staging: rtl8723au: RSSI_test is never set staging:r8190: coding style: Fixed checkpatch reported Error staging:r8180: coding style: Fixed too long lines staging:r8180: coding style: Fixed commenting style staging: lustre: ptlrpc: lproc_ptlrpc.c - fix dereferenceing user space buffer staging: lustre: ldlm: ldlm_resource.c - fix dereferenceing user space buffer ...
2014-08-04Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Move the nohz kick code out of the scheduler tick to a dedicated IPI, from Frederic Weisbecker. This necessiated quite some background infrastructure rework, including: * Clean up some irq-work internals * Implement remote irq-work * Implement nohz kick on top of remote irq-work * Move full dynticks timer enqueue notification to new kick * Move multi-task notification to new kick * Remove unecessary barriers on multi-task notification - Remove proliferation of wait_on_bit() action functions and allow wait_on_bit_action() functions to support a timeout. (Neil Brown) - Another round of sched/numa improvements, cleanups and fixes. (Rik van Riel) - Implement fast idling of CPUs when the system is partially loaded, for better scalability. (Tim Chen) - Restructure and fix the CPU hotplug handling code that may leave cfs_rq and rt_rq's throttled when tasks are migrated away from a dead cpu. (Kirill Tkhai) - Robustify the sched topology setup code. (Peterz Zijlstra) - Improve sched_feat() handling wrt. static_keys (Jason Baron) - Misc fixes. * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) sched/fair: Fix 'make xmldocs' warning caused by missing description sched: Use macro for magic number of -1 for setparam sched: Robustify topology setup sched: Fix sched_setparam() policy == -1 logic sched: Allow wait_on_bit_action() functions to support a timeout sched: Remove proliferation of wait_on_bit() action functions sched/numa: Revert "Use effective_load() to balance NUMA loads" sched: Fix static_key race with sched_feat() sched: Remove extra static_key*() function indirection sched/rt: Fix replenish_dl_entity() comments to match the current upstream code sched: Transform resched_task() into resched_curr() sched/deadline: Kill task_struct->pi_top_task sched: Rework check_for_tasks() sched/rt: Enqueue just unthrottled rt_rq back on the stack in __disable_runtime() sched/fair: Disable runtime_enabled on dying rq sched/numa: Change scan period code to match intent sched/numa: Rework best node setting in task_numa_migrate() sched/numa: Examine a task move when examining a task swap sched/numa: Simplify task_numa_compare() sched/numa: Use effective_load() to balance NUMA loads ...
2014-08-04Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Kernel side changes: - Consolidate the PMU interrupt-disabled code amongst architectures (Vince Weaver) - misc fixes Tooling changes (new features, user visible changes): - Add support for pagefault tracing in 'trace', please see multiple examples in the changeset messages (Stanislav Fomichev). - Add pagefault statistics in 'trace' (Stanislav Fomichev) - Add header for columns in 'top' and 'report' TUI browsers (Jiri Olsa) - Add pagefault statistics in 'trace' (Stanislav Fomichev) - Add IO mode into timechart command (Stanislav Fomichev) - Fallback to syscalls:* when raw_syscalls:* is not available in the perl and python perf scripts. (Daniel Bristot de Oliveira) - Add --repeat global option to 'perf bench' to be used in benchmarks such as the existing 'futex' one, that was modified to use it instead of a local option. (Davidlohr Bueso) - Fix fd -> pathname resolution in 'trace', be it using /proc or a vfs_getname probe point. (Arnaldo Carvalho de Melo) - Add suggestion of how to set perf_event_paranoid sysctl, to help non-root users trying tools like 'trace' to get a working environment. (Arnaldo Carvalho de Melo) - Updates from trace-cmd for traceevent plugin_kvm plus args cleanup (Steven Rostedt, Jan Kiszka) - Support S/390 in 'perf kvm stat' (Alexander Yarygin) Tooling infrastructure changes: - Allow reserving a row for header purposes in the hists browser (Arnaldo Carvalho de Melo) - Various fixes and prep work related to supporting Intel PT (Adrian Hunter) - Introduce multiple debug variables control (Jiri Olsa) - Add callchain and additional sample information for python scripts (Joseph Schuchart) - More prep work to support Intel PT: (Adrian Hunter) - Polishing 'script' BTS output - 'inject' can specify --kallsym - VDSO is per machine, not a global var - Expose data addr lookup functions previously private to 'script' - Large mmap fixes in events processing - Include standard stringify macros in power pc code (Sukadev Bhattiprolu) Tooling cleanups: - Convert open coded equivalents to asprintf() (Andy Shevchenko) - Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo) - Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de Melo) - No need to reimplement err() in 'perf bench sched-messaging', drop barf(). (Davidlohr Bueso). - Remove ev_name argument from perf_evsel__hists_browse, can be obtained from the other parameters. (Jiri Olsa) Tooling fixes: - Fix memory leak in the 'sched-messaging' perf bench test. (Davidlohr Bueso) - The -o and -n 'perf bench mem' options are mutually exclusive, emit error when both are specified. (Davidlohr Bueso) - Fix scrollbar refresh row index in the ui browser, problem exposed now that headers will be added and will be allowed to be switched on/off. (Jiri Olsa) - Handle the num array type in python properly (Sebastian Andrzej Siewior) - Fix wrong condition for allocation failure (Jiri Olsa) - Adjust callchain based on DWARF debug info on powerpc (Sukadev Bhattiprolu) - Fix a risk for doing free on uninitialized pointer in traceevent lib (Rickard Strandqvist) - Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa) - Enable close-on-exec flag on perf file descriptor (Yann Droneaud) - Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo) - Event ordering fixes (Jiri Olsa)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits) Revert "perf tools: Fix jump label always changing during tracing" perf tools: Fix perf usage string leftover perf: Check permission only for parent tracepoint event perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds perf record: Always force PERF_RECORD_FINISHED_ROUND event perf inject: Add --kallsyms parameter perf tools: Expose 'addr' functions so they can be reused perf session: Fix accounting of ordered samples queue perf powerpc: Include util/util.h and remove stringify macros perf tools: Fix build on gcc 4.4.7 perf tools: Add thread parameter to vdso__dso_findnew() perf tools: Add dso__type() perf tools: Separate the VDSO map name from the VDSO dso name perf tools: Add vdso__new() perf machine: Fix the lifetime of the VDSO temporary file perf tools: Group VDSO global variables into a structure perf session: Add ability to skip 4GiB or more perf session: Add ability to 'skip' a non-piped event stream perf tools: Pass machine to vdso__dso_findnew() perf tools: Add dso__data_size() ...
2014-08-04Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - big rtmutex and futex cleanup and robustification from Thomas Gleixner - mutex optimizations and refinements from Jason Low - arch_mutex_cpu_relax() removal and related cleanups - smaller lockdep tweaks" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) arch, locking: Ciao arch_mutex_cpu_relax() locking/lockdep: Only ask for /proc/lock_stat output when available locking/mutexes: Optimize mutex trylock slowpath locking/mutexes: Try to acquire mutex only if it is unlocked locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro locking/mutexes: Correct documentation on mutex optimistic spinning rtmutex: Make the rtmutex tester depend on BROKEN futex: Simplify futex_lock_pi_atomic() and make it more robust futex: Split out the first waiter attachment from lookup_pi_state() futex: Split out the waiter check from lookup_pi_state() futex: Use futex_top_waiter() in lookup_pi_state() futex: Make unlock_pi more robust rtmutex: Avoid pointless requeueing in the deadlock detection chain walk rtmutex: Cleanup deadlock detector debug logic rtmutex: Confine deadlock logic to futex rtmutex: Simplify remove_waiter() rtmutex: Document pi chain walk rtmutex: Clarify the boost/deboost part rtmutex: No need to keep task ref for lock owner check rtmutex: Simplify and document try_to_take_rtmutex() ...
2014-08-04Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molar: "The main changes: - torture-test updates - callback-offloading changes - maintainership changes - update RCU documentation - miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) rcu: Allow for NULL tick_nohz_full_mask when nohz_full= missing rcu: Fix a sparse warning in rcu_report_unblock_qs_rnp() rcu: Fix a sparse warning in rcu_initiate_boost() rcu: Fix __rcu_reclaim() to use true/false for bool rcu: Remove CONFIG_PROVE_RCU_DELAY rcu: Use __this_cpu_read() instead of per_cpu_ptr() rcu: Don't use NMIs to dump other CPUs' stacks rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs rcu: Simplify priority boosting by putting rt_mutex in rcu_node rcu: Check both root and current rcu_node when setting up future grace period rcu: Allow post-unlock reference for rt_mutex rcu: Loosen __call_rcu()'s rcu_head alignment constraint rcu: Eliminate read-modify-write ACCESS_ONCE() calls rcu: Remove redundant ACCESS_ONCE() from tick_do_timer_cpu rcu: Make rcu node arrays static const char * const signal: Explain local_irq_save() call rcu: Handle obsolete references to TINY_PREEMPT_RCU rcu: Document deadlock-avoidance information for rcu_read_unlock() scripts: Teach get_maintainer.pl about the new "R:" tag rcu: Update rcu torture maintainership filename patterns ...
2014-08-04Merge tag 'trace-3.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing filter cleanups from Steven Rostedt: "Oleg Nesterov did several clean ups with the tracing filter code. As he found some small bugs that went into 3.16, and these changes were based on that, I had to apply his changes to a separate branch than my main development branch. This was based on work that was already pulled into 3.16, and is a separate pull request to keep from having local merges in my pull request" * tag 'trace-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Kill "filter_string" arg of replace_preds() tracing: Change apply_subsystem_event_filter() paths to check file->system == dir tracing: Kill ftrace_event_call->files tracing/uprobes: Kill the dead TRACE_EVENT_FL_USE_CALL_FILTER logic tracing: Kill call_filter_disable() tracing: Kill destroy_call_preds() tracing: Kill destroy_preds() and destroy_file_preds()
2014-08-04Merge tag 'trace-3.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This pull request has a lot of work done. The main thing is the changes to the ftrace function callback infrastructure. It's introducing a way to allow different functions to call directly different trampolines instead of all calling the same "mcount" one. The only user of this for now is the function graph tracer, which always had a different trampoline, but the function tracer trampoline was called and did basically nothing, and then the function graph tracer trampoline was called. The difference now, is that the function graph tracer trampoline can be called directly if a function is only being traced by the function graph trampoline. If function tracing is also happening on the same function, the old way is still done. The accounting for this takes up more memory when function graph tracing is activated, as it needs to keep track of which functions it uses. I have a new way that wont take as much memory, but it's not ready yet for this merge window, and will have to wait for the next one. Another big change was the removal of the ftrace_start/stop() calls that were used by the suspend/resume code that stopped function tracing when entering into suspend and resume paths. The stop of ftrace was done because there was some function that would crash the system if one called smp_processor_id()! The stop/start was a big hammer to solve the issue at the time, which was when ftrace was first introduced into Linux. Now ftrace has better infrastructure to debug such issues, and I found the problem function and labeled it with "notrace" and function tracing can now safely be activated all the way down into the guts of suspend and resume Other changes include clean ups of uprobe code, clean up of the trace_seq() code, and other various small fixes and clean ups to ftrace and tracing" * tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits) ftrace: Add warning if tramp hash does not match nr_trampolines ftrace: Fix trampoline hash update check on rec->flags ring-buffer: Use rb_page_size() instead of open coded head_page size ftrace: Rename ftrace_ops field from trampolines to nr_trampolines tracing: Convert local function_graph functions to static ftrace: Do not copy old hash when resetting tracing: let user specify tracing_thresh after selecting function_graph ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on() tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST s390/ftrace: remove check of obsolete variable function_trace_stop arm64, ftrace: Remove check of obsolete variable function_trace_stop Blackfin: ftrace: Remove check of obsolete variable function_trace_stop metag: ftrace: Remove check of obsolete variable function_trace_stop microblaze: ftrace: Remove check of obsolete variable function_trace_stop MIPS: ftrace: Remove check of obsolete variable function_trace_stop parisc: ftrace: Remove check of obsolete variable function_trace_stop sh: ftrace: Remove check of obsolete variable function_trace_stop sparc64,ftrace: Remove check of obsolete variable function_trace_stop tile: ftrace: Remove check of obsolete variable function_trace_stop ftrace: x86: Remove check of obsolete variable function_trace_stop ...
2014-08-04Merge branch 'for-3.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "Mostly changes to get the v2 interface ready. The core features are mostly ready now and I think it's reasonable to expect to drop the devel mask in one or two devel cycles at least for a subset of controllers. - cgroup added a controller dependency mechanism so that block cgroup can depend on memory cgroup. This will be used to finally support IO provisioning on the writeback traffic, which is currently being implemented. - The v2 interface now uses a separate table so that the interface files for the new interface are explicitly declared in one place. Each controller will explicitly review and add the files for the new interface. - cpuset is getting ready for the hierarchical behavior which is in the similar style with other controllers so that an ancestor's configuration change doesn't change the descendants' configurations irreversibly and processes aren't silently migrated when a CPU or node goes down. All the changes are to the new interface and no behavior changed for the multiple hierarchies" * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (29 commits) cpuset: fix the WARN_ON() in update_nodemasks_hier() cgroup: initialize cgrp_dfl_root_inhibit_ss_mask from !->dfl_files test cgroup: make CFTYPE_ONLY_ON_DFL and CFTYPE_NO_ internal to cgroup core cgroup: distinguish the default and legacy hierarchies when handling cftypes cgroup: replace cgroup_add_cftypes() with cgroup_add_legacy_cftypes() cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes cgroup: split cgroup_base_files[] into cgroup_{dfl|legacy}_base_files[] cpuset: export effective masks to userspace cpuset: allow writing offlined masks to cpuset.cpus/mems cpuset: enable onlined cpu/node in effective masks cpuset: refactor cpuset_hotplug_update_tasks() cpuset: make cs->{cpus, mems}_allowed as user-configured masks cpuset: apply cs->effective_{cpus,mems} cpuset: initialize top_cpuset's configured masks at mount cpuset: use effective cpumask to build sched domains cpuset: inherit ancestor's masks if effective_{cpus, mems} becomes empty cpuset: update cs->effective_{cpus, mems} when config changes cpuset: update cpuset->effective_{cpus,mems} at hotplug cpuset: add cs->effective_cpus and cs->effective_mems cgroup: clean up sane_behavior handling ...
2014-08-04Merge branch 'for-3.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: - Major reorganization of percpu header files which I think makes things a lot more readable and logical than before. - percpu-refcount is updated so that it requires explicit destruction and can be reinitialized if necessary. This was pulled into the block tree to replace the custom percpu refcnting implemented in blk-mq. - In the process, percpu and percpu-refcount got cleaned up a bit * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits) percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() percpu-refcount: require percpu_ref to be exited explicitly percpu-refcount: use unsigned long for pcpu_count pointer percpu-refcount: add helpers for ->percpu_count accesses percpu-refcount: one bit is enough for REF_STATUS percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc() workqueue: stronger test in process_one_work() workqueue: clear POOL_DISASSOCIATED in rebind_workers() percpu: Use ALIGN macro instead of hand coding alignment calculation percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations percpu: preffity percpu header files percpu: use raw_cpu_*() to define __this_cpu_*() percpu: reorder macros in percpu header files percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops percpu: reorganize include/linux/percpu-defs.h percpu: move accessors from include/linux/percpu.h to percpu-defs.h percpu: include/asm-generic/percpu.h should contain only arch-overridable parts percpu: introduce arch_raw_cpu_ptr() ...
2014-08-04Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: "Lai has been doing a lot of cleanups of workqueue and kthread_work. No significant behavior change. Just a lot of cleanups all over the place. Some are a bit invasive but overall nothing too dangerous" * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: kthread_work: remove the unused wait_queue_head kthread_work: wake up worker only when the worker is idle workqueue: use nr_node_ids instead of wq_numa_tbl_len workqueue: remove the misnamed out_unlock label in get_unbound_pool() workqueue: remove the stale comment in pwq_unbound_release_workfn() workqueue: move rescuer pool detachment to the end workqueue: unfold start_worker() into create_worker() workqueue: remove @wakeup from worker_set_flags() workqueue: remove an unneeded UNBOUND test before waking up the next worker workqueue: wake regular worker if need_more_worker() when rescuer leave the pool workqueue: alloc struct worker on its local node workqueue: reuse the already calculated pwq in try_to_grab_pending() workqueue: stronger test in process_one_work() workqueue: clear POOL_DISASSOCIATED in rebind_workers() workqueue: sanity check pool->cpu in wq_worker_sleeping() workqueue: clear leftover flags when detached workqueue: remove useless WARN_ON_ONCE() workqueue: use schedule_timeout_interruptible() instead of open code workqueue: remove the empty check in too_many_workers() workqueue: use "pool->cpu < 0" to stand for an unbound pool
2014-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto update from Herbert Xu: - CTR(AES) optimisation on x86_64 using "by8" AVX. - arm64 support to ccp - Intel QAT crypto driver - Qualcomm crypto engine driver - x86-64 assembly optimisation for 3DES - CTR(3DES) speed test - move FIPS panic from module.c so that it only triggers on crypto modules - SP800-90A Deterministic Random Bit Generator (drbg). - more test vectors for ghash. - tweak self tests to catch partial block bugs. - misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (94 commits) crypto: drbg - fix failure of generating multiple of 2**16 bytes crypto: ccp - Do not sign extend input data to CCP crypto: testmgr - add missing spaces to drbg error strings crypto: atmel-tdes - Switch to managed version of kzalloc crypto: atmel-sha - Switch to managed version of kzalloc crypto: testmgr - use chunks smaller than algo block size in chunk tests crypto: qat - Fixed SKU1 dev issue crypto: qat - Use hweight for bit counting crypto: qat - Updated print outputs crypto: qat - change ae_num to ae_id crypto: qat - change slice->regions to slice->region crypto: qat - use min_t macro crypto: qat - remove unnecessary parentheses crypto: qat - remove unneeded header crypto: qat - checkpatch blank lines crypto: qat - remove unnecessary return codes crypto: Resolve shadow warnings crypto: ccp - Remove "select OF" from Kconfig crypto: caam - fix DECO RSR polling crypto: qce - Let 'DEV_QCE' depend on both HAS_DMA and HAS_IOMEM ...