summaryrefslogtreecommitdiff
path: root/init/main.c
AgeCommit message (Collapse)Author
2012-11-19pidns: Consolidate initialzation of special init task stateEric W. Biederman
Instead of setting child_reaper and SIGNAL_UNKILLABLE one way for the system init process, and another way for pid namespace init processes test pid->nr == 1 and use the same code for both. For the global init this results in SIGNAL_UNKILLABLE being set much earlier in the initialization process. This is a small cleanup and it paves the way for allowing unshare and enter of the pid namespace as that path like our global init also will not set CLONE_NEWPID. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-11-02FRV: gcc-4.1.2 also inlines weak functionsDavid Howells
gcc-4.1.2 inlines weak functions, which causes FRV to fail when the dummy thread_info_cache_init() gets inlined into start_kernel(). Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-30x86-64/efi: Use EFI to deal with platform wall clock (again)Jan Beulich
Other than ix86, x86-64 on EFI so far didn't set the {g,s}et_wallclock accessors to the EFI routines, thus incorrectly using raw RTC accesses instead. Simply removing the #ifdef around the respective code isn't enough, however: While so far early get-time calls were done in physical mode, this doesn't work properly for x86-64, as virtual addresses would still need to be set up for all runtime regions (which wasn't the case on the system I have access to), so instead the patch moves the call to efi_enter_virtual_mode() ahead (which in turn allows to drop all code related to calling efi-get-time in physical mode). Additionally the earlier calling of efi_set_executable() requires the CPA code to cope, i.e. during early boot it must be avoided to call cpa_flush_array(), as the first thing this function does is a BUG_ON(irqs_disabled()). Also make the two EFI functions in question here static - they're not being referenced elsewhere. History: This commit was originally merged as bacef661acdb ("x86-64/efi: Use EFI to deal with platform wall clock") but it resulted in some ASUS machines no longer booting due to a firmware bug, and so was reverted in f026cfa82f62. A pre-emptive fix for the buggy ASUS firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable mapping for virtual EFI calls") so now this patch can be reapplied. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Matthew Garrett <mjg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]
2012-10-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of kernel_execve() patches from Al Viro: "The last bits of infrastructure for kernel_thread() et.al., with alpha/arm/x86 use of those. Plus sanitizing the asm glue and do_notify_resume() on alpha, fixing the "disabled irq while running task_work stuff" breakage there. At that point the rest of kernel_thread/kernel_execve/sys_execve work can be done independently for different architectures. The only pending bits that do depend on having all architectures converted are restrictred to fs/* and kernel/* - that'll obviously have to wait for the next cycle. I thought we'd have to wait for all of them done before we start eliminating the longjump-style insanity in kernel_execve(), but it turned out there's a very simple way to do that without flagday-style changes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to saner kernel_execve() semantics arm: switch to saner kernel_execve() semantics x86, um: convert to saner kernel_execve() semantics infrastructure for saner ret_from_kernel_thread semantics make sure that kernel_thread() callbacks call do_exit() themselves make sure that we always have a return path from kernel_execve() ppc: eeh_event should just use kthread_run() don't bother with kernel_thread/kernel_execve for launching linuxrc alpha: get rid of switch_stack argument of do_work_pending() alpha: don't bother passing switch_stack separately from regs alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c alpha: simplify TIF_NEED_RESCHED handling
2012-10-12infrastructure for saner ret_from_kernel_thread semanticsAl Viro
* allow kernel_execve() leave the actual return to userland to caller (selected by CONFIG_GENERIC_KERNEL_EXECVE). Callers updated accordingly. * architecture that does select GENERIC_KERNEL_EXECVE in its Kconfig should have its ret_from_kernel_thread() do this: call schedule_tail call the callback left for it by copy_thread(); if it ever returns, that's because it has just done successful kernel_execve() jump to return from syscall IOW, its only difference from ret_from_fork() is that it does call the callback. * such an architecture should also get rid of ret_from_kernel_execve() and __ARCH_WANT_KERNEL_EXECVE This is the last part of infrastructure patches in that area - from that point on work on different architectures can live independently. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-11make sure that we always have a return path from kernel_execve()Al Viro
The only place where kernel_execve() is called without a way to return to the caller of kernel_thread() callback is kernel_post(). Reorganize kernel_init()/kernel_post() - instead of the former calling the latter in the end (and getting freed by it), have the latter *begin* with calling the former (and turn the latter into kernel_thread() callback, of course). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-09prio_tree: removeMichel Lespinasse
After both prio_tree users have been converted to use red-black trees, there is no need to keep around the prio tree library anymore. Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-29efi: Fix the ACPI BGRT driver for images located in EFI boot services memoryJosh Triplett
The ACPI BGRT driver accesses the BIOS logo image when it initializes. However, ACPI 5.0 (which introduces the BGRT) recommends putting the logo image in EFI boot services memory, so that the OS can reclaim that memory. Production systems follow this recommendation, breaking the ACPI BGRT driver. Move the bulk of the BGRT code to run during a new EFI late initialization phase, which occurs after switching EFI to virtual mode, and after initializing ACPI, but before freeing boot services memory. Copy the BIOS logo image to kernel memory at that point, and make it accessible to the BGRT driver. Rework the existing ACPI BGRT driver to act as a simple wrapper exposing that image (and the properties from the BGRT) via sysfs. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/93ce9f823f1c1f3bb88bdd662cce08eee7a17f5d.1348876882.git.josh@joshtriplett.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-29efi: Defer freeing boot services memory until after ACPI initJosh Triplett
Some new ACPI 5.0 tables reference resources stored in boot services memory, so keep that memory around until we have ACPI and can extract data from it. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/baaa6d44bdc4eb0c58e5d1b4ccd2c729f854ac55.1348876882.git.josh@joshtriplett.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-08-14Revert "x86-64/efi: Use EFI to deal with platform wall clock"H. Peter Anvin
This reverts commit bacef661acdb634170a8faddbc1cf28e8f8b9eee. This commit has been found to cause serious regressions on a number of ASUS machines at the least. We probably need to provide a 1:1 map in addition to the EFI virtual memory map in order for this to work. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reported-and-bisected-by: Jérôme Carretero <cJ-ko@zougloub.eu> Cc: Jan Beulich <jbeulich@suse.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120805172903.5f8bb24c@zougloub.eu
2012-07-31mm/hotplug: correctly setup fallback zonelists when creating new pgdatJiang Liu
When hotadd_new_pgdat() is called to create new pgdat for a new node, a fallback zonelist should be created for the new node. There's code to try to achieve that in hotadd_new_pgdat() as below: /* * The node we allocated has no zone fallback lists. For avoiding * to access not-initialized zonelist, build here. */ mutex_lock(&zonelists_mutex); build_all_zonelists(pgdat, NULL); mutex_unlock(&zonelists_mutex); But it doesn't work as expected. When hotadd_new_pgdat() is called, the new node is still in offline state because node_set_online(nid) hasn't been called yet. And build_all_zonelists() only builds zonelists for online nodes as: for_each_online_node(nid) { pg_data_t *pgdat = NODE_DATA(nid); build_zonelists(pgdat); build_zonelist_cache(pgdat); } Though we hope to create zonelist for the new pgdat, but it doesn't. So add a new parameter "pgdat" the build_all_zonelists() to build pgdat for the new pgdat too. Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Keping Chen <chenkeping@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-27Merge tag 'cpumask-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus Pull cpumask changes from Rusty Russell: "Trivial comment changes to cpumask code. I guess it's getting boring." Boring is good. * tag 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: cpulist_parse() comments correction init: add comments to keep initcall-names in sync with initcall levels cpumask: add a few comments of cpumask functions
2012-07-27init: add comments to keep initcall-names in sync with initcall levelsJim Cromie
main.c has initcall_level_names[] for parse_args to print in debug messages, add comments to keep them in sync with initcalls defined in init.h. Also add "loadable" into comment re not using *_initcall macros in modules, to disambiguate from kernel/params.c and other builtins. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-07-26Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pul x86/efi changes from Ingo Molnar: "This tree adds an EFI bootloader handover protocol, which, once supported on the bootloader side, will make bootup faster and might result in simpler bootloaders. The other change activates the EFI wall clock time accessors on x86-64 as well, instead of the legacy RTC readout." * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Handover Protocol x86-64/efi: Use EFI to deal with platform wall clock
2012-07-22switch fput to task_work_addAl Viro
... and schedule_work() for interrupt/kernel_thread callers (and yes, now it *is* OK to call from interrupt). We are guaranteed that __fput() will be done before we return to userland (or exit). Note that for fput() from a kernel thread we get an async behaviour; it's almost always OK, but sometimes you might need to have __fput() completed before you do anything else. There are two mechanisms for that - a general barrier (flush_delayed_fput()) and explicit __fput_sync(). Both should be used with care (as was the case for fput() from kernel threads all along). See comments in fs/file_table.c for details. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-08init: Drop initcall level outputBorislav Petkov
9fb48c744ba6a ("params: add 3rd arg to option handler callback signature") added similar lines to dmesg: initlevel:0=early, 4 registered initcalls initlevel:1=core, 31 registered initcalls initlevel:2=postcore, 11 registered initcalls initlevel:3=arch, 7 registered initcalls initlevel:4=subsys, 40 registered initcalls initlevel:5=fs, 30 registered initcalls initlevel:6=device, 250 registered initcalls initlevel:7=late, 35 registered initcalls but they don't contain any info for the general user staring at dmesg. I'm very doubtful the count of initcalls registered per level helps anyone so drop that output completely. Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-08module_param: stop double-calling parameters.Rusty Russell
Commit 026cee0086fe1df4cf74691cf273062cc769617d "params: <level>_initcall-like kernel parameters" set old-style module parameters to level 0. And we call those level 0 calls where we used to, early in start_kernel(). We also loop through the initcall levels and call the levelled module_params before the corresponding initcall. Unfortunately level 0 is early_init(), so we call the standard module_param calls twice. (Turns out most things don't care, but at least ubi.mtd does). Change the level to -1 for standard module_param calls. Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2012-06-06x86-64/efi: Use EFI to deal with platform wall clockJan Beulich
Other than ix86, x86-64 on EFI so far didn't set the {g,s}et_wallclock accessors to the EFI routines, thus incorrectly using raw RTC accesses instead. Simply removing the #ifdef around the respective code isn't enough, however: While so far early get-time calls were done in physical mode, this doesn't work properly for x86-64, as virtual addresses would still need to be set up for all runtime regions (which wasn't the case on the system I have access to), so instead the patch moves the call to efi_enter_virtual_mode() ahead (which in turn allows to drop all code related to calling efi-get-time in physical mode). Additionally the earlier calling of efi_set_executable() requires the CPA code to cope, i.e. during early boot it must be avoided to call cpa_flush_array(), as the first thing this function does is a BUG_ON(irqs_disabled()). Also make the two EFI functions in question here static - they're not being referenced elsewhere. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Matthew Garrett <mjg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-22Merge tag 'driver-core-3.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg Kroah-Hartman: "Here's the driver core, and other driver subsystems, pull request for the 3.5-rc1 merge window. Outside of a few minor driver core changes, we ended up with the following different subsystem and core changes as well, due to interdependancies on the driver core: - hyperv driver updates - drivers/memory being created and some drivers moved into it - extcon driver subsystem created out of the old Android staging switch driver code - dynamic debug updates - printk rework, and /dev/kmsg changes All of this has been tested in the linux-next releases for a few weeks with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fix up conflicts in drivers/extcon/extcon-max8997.c where git noticed that a patch to the deleted drivers/misc/max8997-muic.c driver needs to be applied to this one. * tag 'driver-core-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (90 commits) uio_pdrv_genirq: get irq through platform resource if not set otherwise memory: tegra{20,30}-mc: Remove empty *_remove() printk() - isolate KERN_CONT users from ordinary complete lines sysfs: get rid of some lockdep false positives Drivers: hv: util: Properly handle version negotiations. Drivers: hv: Get rid of an unnecessary check in vmbus_prep_negotiate_resp() memory: tegra{20,30}-mc: Use dev_err_ratelimited() driver core: Add dev_*_ratelimited() family Driver Core: don't oops with unregistered driver in driver_find_device() printk() - restore prefix/timestamp printing for multi-newline strings printk: add stub for prepend_timestamp() ARM: tegra30: Make MC optional in Kconfig ARM: tegra20: Make MC optional in Kconfig ARM: tegra30: MC: Remove unnecessary BUG*() ARM: tegra20: MC: Remove unnecessary BUG*() printk: correctly align __log_buf ARM: tegra30: Add Tegra Memory Controller(MC) driver ARM: tegra20: Add Tegra Memory Controller(MC) driver printk() - restore timestamp printing at console output printk() - do not merge continuation lines of different threads ...
2012-05-21Fix blocking allocations called very early during bootupLinus Torvalds
During early boot, when the scheduler hasn't really been fully set up, we really can't do blocking allocations because with certain (dubious) configurations the "might_resched()" calls can actually result in scheduling events. We could just make such users always use GFP_ATOMIC, but quite often the code that does the allocation isn't really aware of the fact that the scheduler isn't up yet, and forcing that kind of random knowledge on the initialization code is just annoying and not good for anybody. And we actually have a the 'gfp_allowed_mask' exactly for this reason: it's just that the kernel init sequence happens to set it to allow blocking allocations much too early. So move the 'gfp_allowed_mask' initialization from 'start_kernel()' (which is some of the earliest init code, and runs with preemption disabled for good reasons) into 'kernel_init()'. kernel_init() is run in the newly created thread that will become the 'init' process, as opposed to the early startup code that runs within the context of what will be the first idle thread. So by the time we reach 'kernel_init()', we know that the scheduler must be at least limping along, because we've already scheduled from the idle thread into the init thread. Reported-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Rientjes <rientjes@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-02Merge 3.4-rc5 into driver-core-nextGreg Kroah-Hartman
This was done to resolve a merge issue with the init/main.c file. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-30params: add 3rd arg to option handler callback signatureJim Cromie
Add a 3rd arg, named "doing", to unknown-options callbacks invoked from parse_args(). The arg is passed as: "Booting kernel" from start_kernel(), initcall_level_names[i] from do_initcall_level(), mod->name from load_module(), via parse_args(), parse_one() parse_args() already has the "name" parameter, which is renamed to "doing" to better reflect current uses 1,2 above. parse_args() passes it to an altered parse_one(), which now passes it down into the unknown option handler callbacks. The mod->name will be needed to handle dyndbg for loadable modules, since params passed by modprobe are not qualified (they do not have a "$modname." prefix), and by the time the unknown-param callback is called, the module name is not otherwise available. Minor tweaks: Add param-name to parse_one's pr_debug(), current message doesnt identify the param being handled, add it. Add a pr_info to print current level and level_name of the initcall, and number of registered initcalls at that level. This adds 7 lines to dmesg output, like: initlevel:6=device, 172 registered initcalls Drop "parameters" from initcall_level_names[], its unhelpful in the pr_info() added above. This array is passed into parse_args() by do_initcall_level(). CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-25init: fix bug where environment vars can't be passed via boot argsChris Metcalf
Commit 026cee0086f had the side-effect of dropping the '=' from the unknown boot arguments that are passed to init as environment variables. This is because parse_args() puts a NUL in the string where the '=' was when it passes the "param" and "val" pointers to the parsing subfunctions. Previously, unknown_bootoption() was the last parse_args() subfunction to run, and it carefully put back the '=' character. Now the ignore_unknown_bootoption() is the last one to run, and it wasn't doing the necessary repair, so the envp params ended up with the embedded NUL and were no longer seen as valid environment variables by init. Tested-by: Woody Suwalski <terraluna977@gmail.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-03-28Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
2012-03-28Move all declarations of free_initmem() to linux/mm.hDavid Howells
Move all declarations of free_initmem() to linux/mm.h so that there's only one and it's used by everything. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-c6x-dev@linux-c6x.org cc: microblaze-uclinux@itee.uq.edu.au cc: linux-sh@vger.kernel.org cc: sparclinux@vger.kernel.org cc: x86@kernel.org cc: linux-mm@kvack.org
2012-03-26params: <level>_initcall-like kernel parametersPawel Moll
This patch adds a set of macros that can be used to declare kernel parameters to be parsed _before_ initcalls at a chosen level are executed. We rename the now-unused "flags" field of struct kernel_param as the level. It's signed, for when we use this for early params as well, in future. Linker macro collating init calls had to be modified in order to add additional symbols between levels that are later used by the init code to split the calls into blocks. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...
2012-03-15init: fix wording error in mm_init commentJim Cromie
s/countinuous/contiguous/, reword sentence. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-03-01sched/rt: Use schedule_preempt_disabled()Thomas Gleixner
Coccinelle based conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-24swm5zut3h9c4a6s46x8rws@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-13module_param: make bool parameters really bool (core code)Rusty Russell
module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-11Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numa: Add constraints check for nid parameters mm, x86: Remove debug_pagealloc_enabled x86/mm: Initialize high mem before free_all_bootmem() arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map() x86: Fix mmap random address range x86, mm: Unify zone_sizes_init() x86, mm: Prepare zone_sizes_init() for unification x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32 x86, mm: Use max_pfn instead of highend_pfn x86, mm: Move zone init from paging_init() on 64-bit x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit
2011-12-06mm, x86: Remove debug_pagealloc_enabledStanislaw Gruszka
When (no)bootmem finish operation, it pass pages to buddy allocator. Since debug_pagealloc_enabled is not set, we will do not protect pages, what is not what we want with CONFIG_DEBUG_PAGEALLOC=y. To fix remove debug_pagealloc_enabled. That variable was introduced by commit 12d6f21e "x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y" to get more CPA (change page attribude) code testing. But currently we have CONFIG_CPA_DEBUG, which test CPA. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1322582711-14571-1-git-send-email-sgruszka@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06init/main.c: Execute lockdep_init() as early as possibleMing Lei
This patch fixes a lockdep warning on ARM platforms: [ 0.000000] WARNING: lockdep init error! Arch code didn't call lockdep_init() early enough? [ 0.000000] Call stack leading to lockdep invocation was: [ 0.000000] [<c00164bc>] save_stack_trace_tsk+0x0/0x90 [ 0.000000] [<ffffffff>] 0xffffffff The warning is caused by printk inside smp_setup_processor_id(). It is safe to do this because lockdep_init() doesn't depend on smp_setup_processor_id(), so improve things that printk can be called as early as possible without lockdep complaint. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321508072-23853-1-git-send-email-tom.leiming@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-06Merge branch 'upstream/jump-label-noearly' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: jump-label: initialize jump-label subsystem much earlier x86/jump_label: add arch_jump_label_transform_static() s390/jump-label: add arch_jump_label_transform_static() jump_label: add arch_jump_label_transform_static() to optimise non-live code updates sparc/jump_label: drop arch_jump_label_text_poke_early() x86/jump_label: drop arch_jump_label_text_poke_early() jump_label: if a key has already been initialized, don't nop it out stop_machine: make stop_machine safe and efficient to call early jump_label: use proper atomic_t initializer Conflicts: - arch/x86/kernel/jump_label.c Added __init_or_module to arch_jump_label_text_poke_early vs removal of that function entirely - kernel/stop_machine.c same patch ("stop_machine: make stop_machine safe and efficient to call early") merged twice, with whitespace fix in one version
2011-10-26params: make dashes and underscores in parameter names truly equalMichal Schmidt
The user may use "foo-bar" for a kernel parameter defined as "foo_bar". Make sure it works the other way around too. Apply the equality of dashes and underscores on early_params and __setup params as well. The example given in Documentation/kernel-parameters.txt indicates that this is the intended behaviour. With the patch the kernel accepts "log-buf-len=1M" as expected. https://bugzilla.redhat.com/show_bug.cgi?id=744545 Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (neatened implementations)
2011-10-25jump-label: initialize jump-label subsystem much earlierJeremy Fitzhardinge
Initialize jump_labels much, much earlier, so they're available for use during system setup. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-09-29bootup: move 'usermodehelper_enable()' a little earlierwangyanqing
Commit d5767c53535a ("bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()") moved 'usermodehelper_enable()' to end of do_basic_setup() to after the initcalls. But then I get failed to let uvesafb work on my computer, and lose the splash boot. So maybe we could start usermodehelper_enable a little early to make some task work that need eary init with the help of user mode. [ I would *really* prefer that initcalls not call into user space - even the real 'init' hasn't been execve'd yet, after all! But for uvesafb it really does look like we don't have much choice. I considered doing this when we mount the root filesystem, but depending on config options that is in multiple places. We could do the usermode helper enable as a rootfs_initcall().. So I'm just using wang yanqing's trivial patch. It's not wonderful, but it's simple and should work. We should revisit this some day, though. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-28bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()Linus Torvalds
Doing it just before starting to call into cpu_idle() made a sick kind of sense only because the original bug we fixed (see commit 288d5abec831: "Boot up with usermodehelper disabled") was about problems with some scheduler data structures not being initialized, and they had better be initialized at that point. But it really didn't make any other conceptual sense, and doing it after the initial "schedule()" call for the idle thread actually opened up a race: what if the main initialization thread did everything without needing to sleep, and got all the way into user land too? Without actually having scheduled back to the idle thread? Now, in normal circumstances that doesn't ever happen, but it looks like Richard Cochran triggered exactly that on his ARM IXP4xx machines: "I have some ARM IXP4xx based machines that use the two on chip MAC ports (aka NPEs). The NPE needs a firmware in order to function. Ever since the following commit [that 288d5abec831 one], it is no longer possible to bring up the interfaces during the init scripts." with a call trace showing an ioctl coming from user space. Richard says: "The init is busybox, and the startup script does mount, syslogd, and then ifup, so that all can go by quickly." The fix is to move the usermodehelper_enable() into the main 'init' thread, and just put it after we've done all our initcalls. By then, everything really should be up, but we've obviously not actually started the user-mode portion of init yet. Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-21init: carefully handle loglevel option on kernel cmdline.Alexander Sverdlin
When a malformed loglevel value (for example "${abc}") is passed on the kernel cmdline, the loglevel itself is being set to 0. That then suppresses all following messages, including all the errors and crashes caused by other malformed cmdline options. This could make debugging process quite tricky. This patch leaves the previous value of loglevel if the new value is incorrect and reports an error code in this case. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@sysgo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03Boot up with usermodehelper disabledLinus Torvalds
The core device layer sends tons of uevent notifications for each device it finds, and if the kernel has been built with a non-empty CONFIG_UEVENT_HELPER_PATH that will make us try to execute the usermode helper binary for all these events very early in the boot. Not only won't the root filesystem even be mounted at that point, we literally won't have necessarily even initialized all the process handling data structures at that point, which causes no end of silly problems even when the usermode helper doesn't actually succeed in executing. So just use our existing infrastructure to disable the usermodehelpers to make the kernel start out with them disabled. We enable them when we've at least initialized stuff a bit. Problems related to an uninitialized init_ipc_ns.ids[IPC_SHM_IDS].rw_mutex reported by various people. Reported-by: Manuel Lauss <manuel.lauss@googlemail.com> Reported-by: Richard Weinberger <richard@nod.at> Reported-by: Marc Zyngier <maz@misterjones.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03tmpfs: miscellaneous trivial cleanupsHugh Dickins
While it's at its least, make a number of boring nitpicky cleanups to shmem.c, mostly for consistency of variable naming. Things like "swap" instead of "entry", "pgoff_t index" instead of "unsigned long idx". And since everything else here is prefixed "shmem_", better change init_tmpfs() to shmem_init(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-17generic-ipi: Fix kexec boot crash by initializing call_single_queue before ↵Takao Indoh
enabling interrupts There is a problem that kdump(2nd kernel) sometimes hangs up due to a pending IPI from 1st kernel. Kernel panic occurs because IPI comes before call_single_queue is initialized. To fix the crash, rename init_call_single_data() to call_function_init() and call it in start_kernel() so that call_single_queue can be initialized before enabling interrupts. The details of the crash are: (1) 2nd kernel boots up (2) A pending IPI from 1st kernel comes when irqs are first enabled in start_kernel(). (3) Kernel tries to handle the interrupt, but call_single_queue is not initialized yet at this point. As a result, in the generic_smp_call_function_single_interrupt(), NULL pointer dereference occurs when list_replace_init() tries to access &q->list.next. Therefore this patch changes the name of init_call_single_data() to call_function_init() and calls it before local_irq_enable() in start_kernel(). Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Milton Miller <miltonm@bga.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-29mm: Fix boot crash in mm_alloc()Linus Torvalds
Thomas Gleixner reports that we now have a boot crash triggered by CONFIG_CPUMASK_OFFSTACK=y: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c11ae035>] find_next_bit+0x55/0xb0 Call Trace: [<c11addda>] cpumask_any_but+0x2a/0x70 [<c102396b>] flush_tlb_mm+0x2b/0x80 [<c1022705>] pud_populate+0x35/0x50 [<c10227ba>] pgd_alloc+0x9a/0xf0 [<c103a3fc>] mm_init+0xec/0x120 [<c103a7a3>] mm_alloc+0x53/0xd0 which was introduced by commit de03c72cfce5 ("mm: convert mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of mm_init() vs mm_init_cpumask Thomas wrote a patch to just fix the ordering of initialization, but I hate the new double allocation in the fork path, so I ended up instead doing some more radical surgery to clean it all up. Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Ingo Molnar <mingo@elte.hu> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25printk: allocate kernel log buffer earlierMike Travis
On larger systems, because of the numerous ACPI, Bootmem and EFI messages, the static log buffer overflows before the larger one specified by the log_buf_len param is allocated. Minimize the overflow by allocating the new log buffer as soon as possible. On kernels without memblock, a later call to setup_log_buf from kernel/init.c is the fallback. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build] Signed-off-by: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25mm: convert mm->cpu_vm_cpumask into cpumask_var_tKOSAKI Motohiro
cpumask_t is very big struct and cpu_vm_mask is placed wrong position. It might lead to reduce cache hit ratio. This patch has two change. 1) Move the place of cpumask into last of mm_struct. Because usually cpumask is accessed only front bits when the system has cpu-hotplug capability 2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory footprint if cpumask_size() will use nr_cpumask_bits properly in future. In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var. It may help to detect out of tree cpu_vm_mask users. This patch has no functional change. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19kmemleak: Initialise kmemleak after debug_objects_mem_init()Catalin Marinas
Kmemleak frees objects via RCU and when CONFIG_DEBUG_OBJECTS_RCU_HEAD is enabled, the RCU callback triggers a call to free_object() in lib/debugobjects.c. Since kmemleak is initialised before debug objects initialisation, it may result in a kernel panic during booting. This patch moves the kmemleak_init() call after debug_objects_mem_init(). Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com> Tested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@kernel.org>
2011-03-23pid: remove the child_reaper special case in init/main.cEric W. Biederman
This patchset is a cleanup and a preparation to unshare the pid namespace. These prerequisites prepare for Eric's patchset to give a file descriptor to a namespace and join an existing namespace. This patch: It turns out that the existing assignment in copy_process of the child_reaper can handle the initial assignment of child_reaper we just need to generalize the test in kernel/fork.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22smp: move smp setup functions to kernel/smp.cAmerigo Wang
Move setup_nr_cpu_ids(), smp_init() and some other SMP boot parameter setup functions from init/main.c to kenrel/smp.c, saves some #ifdef CONFIG_SMP. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Rakib Mullick <rakib.mullick@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20lockdep: Move early boot local IRQ enable/disable status to init/main.cTejun Heo
During early boot, local IRQ is disabled until IRQ subsystem is properly initialized. During this time, no one should enable local IRQ and some operations which usually are not allowed with IRQ disabled, e.g. operations which might sleep or require communications with other processors, are allowed. lockdep tracked this with early_boot_irqs_off/on() callbacks. As other subsystems need this information too, move it to init/main.c and make it generally available. While at it, toggle the boolean to early_boot_irqs_disabled instead of enabled so that it can be initialized with %false and %true indicates the exceptional condition. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110120110635.GB6036@htj.dyndns.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.