path: root/kernel/module.c
Age | Commit message | Author
2012-12-14 | module: add flags arg to sys_finit_module() | Rusty Russell
Thanks to Michael Kerrisk for keeping us honest. These flags are actually useful for eliminating the only case where kmod has to mangle a module's internals: for overriding module versioning. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Acked-by: Kees Cook <keescook@chromium.org>
2012-12-14 | module: add syscall to load module from fd | Kees Cook
As part of the effort to create a stronger boundary between root and kernel, Chrome OS wants to be able to enforce that kernel modules are being loaded only from our read-only crypto-hash verified (dm_verity) root filesystem. Since the init_module syscall hands the kernel a module as a memory blob, no reasoning about the origin of the blob can be made. Earlier proposals for appending signatures to kernel modules would not be useful in Chrome OS, since it would involve adding an additional set of keys to our kernel and builds for no good reason: we already trust the contents of our root filesystem. We don't need to verify those kernel modules a second time. Having to do signature checking on module loading would slow us down and be redundant. All we need to know is where a module is coming from so we can say yes/no to loading it. If a file descriptor is used as the source of a kernel module, many more things can be reasoned about. In Chrome OS's case, we could enforce that the module lives on the filesystem we expect it to live on. In the case of IMA (or other LSMs), it would be possible, for example, to examine extended attributes that may contain signatures over the contents of the module. This introduces a new syscall (on x86), similar to init_module, that has only two arguments. The first argument is used as a file descriptor to the module and the second argument is a pointer to the NULL terminated string of module arguments. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
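The calling convention described above can be sketched from userspace. This is a minimal sketch, assuming a C library whose `<sys/syscall.h>` defines `SYS_finit_module` and a kernel that already carries the third `flags` argument added by the later patch; the helper names `load_module_from_fd` and `load_module_from_path` are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to load a module from an open fd.
 * 'params' is the NUL-terminated module argument string ("" for none).
 * 'flags' is the argument added by the follow-up patch; 0 means defaults.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int load_module_from_fd(int fd, const char *params, int flags)
{
    assert(params != NULL);
    return (int)syscall(SYS_finit_module, fd, params, flags);
}

/* Hypothetical convenience wrapper: open the file read-only, then load it. */
static int load_module_from_path(const char *path, const char *params)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    int ret = load_module_from_fd(fd, params, 0);
    int saved_errno = errno;   /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Because the kernel receives a descriptor rather than a memory blob, a policy layer can inspect where the file lives before accepting it, which is the point the commit message makes.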
2012-10-31 | module: fix out-by-one error in kallsyms | Rusty Russell
Masaki found and patched a kallsyms issue: the last symbol in a module's symtab wasn't transferred. This is because we manually copy the zero'th entry (which is always empty) then copy the rest in a loop starting at 1, though from src[0]. His fix was minimal, I prefer to rewrite the loops in more standard form. There are two loops: one to get the size, and one to copy. Make these identical: always count entry 0 and any defined symbol in an allocated non-init section. This bug exists since the following commit was introduced. module: reduce symbol table for loaded modules (v2) commit: 4a4962263f07d14660849ec134ee42b63e95ea9a LKML: http://lkml.org/lkml/2012/10/24/27 Reported-by: Masaki Kimura <masaki.kimura.kz@hitachi.com> Cc: stable@kernel.org
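The "make the two loops identical" idea can be illustrated outside the kernel. This is a simplified sketch, not the code in kernel/module.c: `struct sym` and its `keep` flag are hypothetical stand-ins for an ELF symbol and for the test "defined symbol in an allocated non-init section".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ELF symbol table entry. */
struct sym {
    const char *name;
    bool keep;      /* models: defined, in an allocated non-init section */
};

/* One shared predicate: entry 0 is always kept, others only if flagged. */
static bool is_core_symbol(const struct sym *src, size_t i)
{
    return i == 0 || src[i].keep;
}

/* Sizing pass: walks every entry, including the last one. */
static size_t count_core_symbols(const struct sym *src, size_t n)
{
    size_t ndst = 0;

    for (size_t i = 0; i < n; i++)
        if (is_core_symbol(src, i))
            ndst++;
    return ndst;
}

/* Copy pass: same bounds and same predicate as the sizing pass. */
static size_t copy_core_symbols(struct sym *dst, const struct sym *src, size_t n)
{
    size_t ndst = 0;

    assert(dst != NULL);
    for (size_t i = 0; i < n; i++)
        if (is_core_symbol(src, i))
            dst[ndst++] = src[i];
    return ndst;
}
```

Sharing the predicate and the loop bounds means the buffer sized by the first pass always matches what the second pass writes, so the final entry cannot be dropped by a mismatch between the two.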
2012-10-19 | MODSIGN: Move the magic string to the end of a module and eliminate the search | David Howells
Emit the magic string that indicates a module has a signature after the signature data instead of before it. This allows module_sig_check() to be made simpler and faster by the elimination of the search for the magic string. Instead we just need to do a single memcmp(). This works because at the end of the signature data there is the fixed-length signature information block. This block then falls immediately prior to the magic number. From the contents of the information block, it is trivial to calculate the size of the signature data and thus the size of the actual module data. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
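The layout described above (module data, signature data, fixed-length information block, magic string) can be parsed from the end with one comparison. The sketch below is an assumption-laden simplification: the information block is reduced to a hypothetical 4-byte big-endian signature length, whereas the real block carries more fields, and `split_signed_module` is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIG_MAGIC     "~Module signature appended~\n"
#define SIG_MAGIC_LEN (sizeof(SIG_MAGIC) - 1)
#define SIG_INFO_LEN  4   /* hypothetical: only a big-endian signature length */

/*
 * Returns the length of the module payload, or -1 if the buffer is
 * unsigned or malformed. On success *sig_len receives the signature size.
 */
static long split_signed_module(const unsigned char *buf, size_t len,
                                size_t *sig_len)
{
    assert(buf != NULL && sig_len != NULL);

    if (len < SIG_MAGIC_LEN + SIG_INFO_LEN)
        return -1;

    /* The magic sits at a fixed offset from the end: one memcmp, no search. */
    if (memcmp(buf + len - SIG_MAGIC_LEN, SIG_MAGIC, SIG_MAGIC_LEN) != 0)
        return -1;
    len -= SIG_MAGIC_LEN;

    /* The fixed-length information block falls immediately before the magic. */
    const unsigned char *info = buf + len - SIG_INFO_LEN;
    size_t n = ((size_t)info[0] << 24) | ((size_t)info[1] << 16) |
               ((size_t)info[2] << 8) | (size_t)info[3];
    len -= SIG_INFO_LEN;

    if (n > len)
        return -1;   /* claimed signature is larger than what remains */

    *sig_len = n;
    return (long)(len - n);
}
```

Everything before the returned offset is module data; the `n` bytes after it are the signature.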
2012-10-10 | MODSIGN: Add FIPS policy | David Howells
If we're in FIPS mode, we should panic if we fail to verify the signature on a module or we're asked to load an unsigned module in signature enforcing mode. Possibly FIPS mode should automatically enable enforcing mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-10 | module: signature checking hook | Rusty Russell
We do a very simple search for a particular string appended to the module (which is cache-hot and about to be SHA'd anyway). There's both a config option and a boot parameter which control whether we accept or fail with unsigned modules and modules that are signed with an unknown key. If module signing is enabled, the kernel will be tainted if a module is loaded that is unsigned or has a signature for which we don't have the key. (Useful feedback and tweaks by David Howells <dhowells@redhat.com>) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 | module: wait when loading a module which is currently initializing. | Rusty Russell
The original module-init-tools module loader used a fcntl lock on the .ko file to avoid attempts to simultaneously load a module. Unfortunately, you can't get an exclusive fcntl lock on a read-only fd, making this not work for read-only mounted filesystems. module-init-tools has a hacky sleep-and-loop for this now. It's not that hard to wait in the kernel, and only return -EEXIST once the first module has finished loading (or continue loading the module if the first one failed to initialize for some reason). It's also consistent with what we do for dependent modules which are still loading. Suggested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 | module: fix symbol waiting when module fails before init | Rusty Russell
We use resolve_symbol_wait(), which blocks if the module containing the symbol is still loading. However: 1) The module_wq we use is only woken after calling the modules' init function, but there are other failure paths after the module is placed in the linked list where we need to do the same thing. 2) wake_up() only wakes one waiter, and our waitqueue is shared by all modules, so we need to wake them all. 3) wake_up_all() doesn't imply a memory barrier: I feel happier calling it after we've grabbed and dropped the module_mutex, not just after the state assignment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 | Make most arch asm/module.h files use asm-generic/module.h | David Howells
Use the mapping of Elf_[SPE]hdr, Elf_Addr, Elf_Sym, Elf_Dyn, Elf_Rel/Rela, ELF_R_TYPE() and ELF_R_SYM() to either the 32-bit version or the 64-bit version into asm-generic/module.h for all arches bar MIPS. Also, use the generic definition mod_arch_specific where possible. To this end, I've defined three new config bools:
 (*) HAVE_MOD_ARCH_SPECIFIC
     Arches define this if they don't want to use the empty generic mod_arch_specific struct.
 (*) MODULES_USE_ELF_RELA
     Arches define this if their modules can contain RELA records. This causes the Elf_Rela mapping to be emitted and allows apply_relocate_add() to be defined by the arch rather than have the core emit an error message.
 (*) MODULES_USE_ELF_REL
     Arches define this if their modules can contain REL records. This causes the Elf_Rel mapping to be emitted and allows apply_relocate() to be defined by the arch rather than have the core emit an error message.
Note that it is possible to allow both REL and RELA records: m68k and mips are two arches that do this. With this, some arch asm/module.h files can be deleted entirely and replaced with a generic-y marker in the arch Kbuild file. Additionally, I have removed the bits from m32r and score that handle the unsupported type of relocation record as that's now handled centrally. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 | module: taint kernel when lve module is loaded | Matthew Garrett
Cloudlinux have a product called lve that includes a kernel module. This was previously GPLed but is now under a proprietary license, but the module continues to declare MODULE_LICENSE("GPL") and makes use of some EXPORT_SYMBOL_GPL symbols. Forcibly taint it in order to avoid this. Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Alex Lyashkov <umka@cloudlinux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2012-05-23 | Guard check in module loader against integer overflow | David Howells
The check: if (len < hdr->e_shoff + hdr->e_shnum * sizeof(Elf_Shdr)) may not work if there's an overflow in the right-hand side of the condition. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
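To see why the quoted check is fragile, note that both the product and the sum on its right-hand side can wrap around, making a huge table look small. One possible overflow-safe formulation is sketched below with plain `size_t` parameters instead of the ELF header fields; `shdr_table_fits` is a hypothetical name and this is not claimed to be the exact form used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when a table of 'shnum' entries of 'shentsize' bytes each,
 * starting at byte offset 'shoff', lies entirely within 'len' bytes.
 */
static bool shdr_table_fits(size_t len, size_t shoff, size_t shnum,
                            size_t shentsize)
{
    assert(shentsize != 0);

    /* Reject the offset first so the subtraction below cannot underflow. */
    if (shoff >= len)
        return false;

    /* Divide instead of multiplying: no product or sum that could wrap. */
    return shnum <= (len - shoff) / shentsize;
}
```

With `len = 1000`, `shoff = 100` and 64-byte entries, 14 entries fit (896 bytes) and 15 do not, and an absurd `shnum` is rejected rather than wrapping to a small byte count.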
2012-04-30 | dynamic_debug: make dynamic-debug work for module initialization | Jim Cromie
This introduces a fake module param $module.dyndbg. It's based upon Thomas Renninger's $module.ddebug boot-time debugging patch from https://lkml.org/lkml/2010/9/15/397 The 'fake' module parameter is provided for all modules, whether or not they need it. It is not explicitly added to each module, but is implemented in callbacks invoked from parse_args. For builtin modules, dynamic_debug_init() now directly calls parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params undeclared in the modules, just after the ddebug tables are processed. While it's slightly weird to reprocess the boot params, parse_args() is already called repeatedly by do_initcall_levels(). More importantly, the dyndbg queries (given in ddebug_query or dyndbg params) cannot be activated until after the ddebug tables are ready, and reusing parse_args is cleaner than doing an ad-hoc parse. This reparse would break options like inc_verbosity, but they probably should be params, like verbosity=3. ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka: ddebug_query) and module-prefixed dyndbg params, and ignores all other parameters. For example, the following will enable pr_debug()s in 4 builtin modules, in the order given: dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg For loadable modules, parse_args() in load_module() calls ddebug_dyndbg_module_params_cb(). This handles bare dyndbg params as passed from modprobe, and errors on other unknown params. Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all foo.params, strips the "foo.", and passes these to the kernel. ddebug_dyndbg_module_params_cb() is again called for the unknown params; it handles dyndbg, and errors on others. The "doing" arg added previously contains the module name. For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts and ignores $module.dyndbg params, other unknowns get -ENOENT.
If no param value is given (as in pci.dyndbg example above), "+p" is assumed, which enables all pr_debug callsites in the module. The dyndbg fake parameter is not shown in /sys/module/*/parameters, thus it does not use any resources. Changes to it are made via the control file. Also change pr_info in ddebug_exec_queries to vpr_info, no need to see it all the time. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> CC: Thomas Renninger <trenn@suse.de> CC: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
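The parameter handling described in this message (bare `dyndbg`, module-prefixed `$module.dyndbg`, and a default of "+p" when no value is given) can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the kernel callback: `struct dyndbg_request` and `parse_dyndbg_param` are hypothetical names, and the 64-byte module name buffer is an arbitrary choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: which module the query targets, and the query. */
struct dyndbg_request {
    char modname[64];     /* empty string for a bare "dyndbg" param */
    const char *query;    /* the given value, or "+p" when none was given */
};

/*
 * Returns true if 'param' is "dyndbg" or "<module>.dyndbg" and fills 'out'.
 * Returns false for any other parameter so the caller can ignore it or
 * report an error, depending on context.
 */
static bool parse_dyndbg_param(const char *param, const char *val,
                               struct dyndbg_request *out)
{
    assert(param != NULL && out != NULL);

    const char *sep = strchr(param, '.');
    const char *key = sep ? sep + 1 : param;

    if (strcmp(key, "dyndbg") != 0)
        return false;

    size_t n = sep ? (size_t)(sep - param) : 0;
    if (n >= sizeof(out->modname))
        return false;                  /* module name too long for sketch */

    memcpy(out->modname, param, n);
    out->modname[n] = '\0';
    out->query = val ? val : "+p";     /* no value: enable all callsites */
    return true;
}
```

Under this model, `pci.dyndbg` with no value yields module "pci" and query "+p", matching the pci.dyndbg example in the message.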
2012-03-26 | module: Remove module size limit | Sasha Levin
Module size was limited to 64MB; this was a legacy limitation due to vmalloc() which was removed a while ago. Limiting module size to 64MB is both pointless and affects real world use cases. Cc: Tim Abbott <tim.abbott@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-26 | module: move __module_get and try_module_get() out of line. | Steven Rostedt
With the preempt, tracepoint and everything, it's getting a bit chubby. For an Ubuntu-based config:
Before:
  $ size -t `find * -name '*.ko'` | grep TOTAL
  56199906 3870760 1606616 61677282 3ad1ee2 (TOTALS)
  $ size vmlinux
     text    data     bss      dec    hex filename
  8509342  850368 3358720 12718430 c2115e vmlinux
After:
  $ size -t `find * -name '*.ko'` | grep TOTAL
  56183760 3867892 1606616 61658268 3acd49c (TOTALS)
  $ size vmlinux
     text    data     bss      dec    hex filename
  8501842  849088 3358720 12709650 c1ef12 vmlinux
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (made all out-of-line)
2012-03-26 | params: <level>_initcall-like kernel parameters | Pawel Moll
This patch adds a set of macros that can be used to declare kernel parameters to be parsed _before_ initcalls at a chosen level are executed. We rename the now-unused "flags" field of struct kernel_param as the level. It's signed, for when we use this for early params as well, in future. Linker macro collating init calls had to be modified in order to add additional symbols between levels that are later used by the init code to split the calls into blocks. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-26 | module: add kernel param to force disable module load | Dave Young
Sometimes we need to test a kernel of the same version with code or config option changes. We already have a sysctl to disable module loading, but a kernel parameter is more convenient. Since modules_disabled is an int, use the bint type in core_param. TODO: make sysctl accept bool and change modules_disabled to bool Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-15 | error: implicit declaration of function 'module_flags_taint' | Kevin Winchester
Recent changes to kernel/module.c caused the following compile error: kernel/module.c: In function ‘show_taint’: kernel/module.c:1024:2: error: implicit declaration of function ‘module_flags_taint’ [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors Correct this error by moving the definition of module_flags_taint outside of the #ifdef CONFIG_MODULE_UNLOAD section. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13 | modules: sysfs - export: taint, coresize, initsize | Kay Sievers
Recent tools do not want to use /proc to retrieve module information. A few values are currently missing from sysfs to replace the information available in /proc/modules. This adds /sys/module/*/{coresize,initsize,taint} attributes. TAINT_PROPRIETARY_MODULE (P) and TAINT_OOT_MODULE (O) flags are both always shown now, and no longer exclude each other, including in /proc/modules. Replace the open-coded sysfs attribute initializers with the __ATTR() macro. Add the new attributes to Documentation/ABI. Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 | module: replace DEBUGP with pr_debug | Jim Cromie
Use more flexible pr_debug. This allows: echo "module module +p" > /dbg/dynamic_debug/control to turn on debug messages when needed. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 | module: struct module_ref should contains long fields | Eric Dumazet
module_ref contains two "unsigned int" fields. That's now too small, since some machines can open more than 2^32 files. Check commit 518de9b39e8 (fs: allow for more than 2^31 files) for reference. We can add an aligned(2 * sizeof(unsigned long)) attribute to force alloc_percpu() allocating module_ref areas in single cache lines. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Tejun Heo <tj@kernel.org> CC: Robin Holt <holt@sgi.com> CC: David Miller <davem@davemloft.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 | module: Fix performance regression on modules with large symbol tables | Kevin Cernekee
Looking at /proc/kallsyms, one starts to ponder whether all of the extra strtab-related complexity in module.c is worth the memory savings. Instead of making the add_kallsyms() loop even more complex, I tried the other route of deleting the strmap logic and naively copying each string into core_strtab with no consideration for consolidating duplicates. Performance on an "already exists" insmod of nvidia.ko (runs add_kallsyms() but does not actually initialize the module): Original scheme: 1.230s With naive copying: 0.058s Extra space used: 35k (of a 408k module). Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <73defb5e4bca04a6431392cc341112b1@localhost>
2012-01-13 | module: Add comments describing how the "strmap" logic works | Kevin Cernekee
Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-11-06 | Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux | Linus Torvalds
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
2011-11-07 | module,bug: Add TAINT_OOT_MODULE flag for modules not built in-tree | Ben Hutchings
Use of the GPL or a compatible licence doesn't necessarily make the code any good. We already consider staging modules to be suspect, and this should also be true for out-of-tree modules which may receive very little review. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Dave Jones <davej@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (patched oops-tracing.txt)
2011-11-07 | module: Enable dynamic debugging regardless of taint | Ben Hutchings
Dynamic debugging is currently disabled for tainted modules, except for TAINT_CRAP. This prevents use of dynamic debugging for out-of-tree modules once the next patch is applied. This condition was apparently intended to avoid a crash if a force-loaded module has an incompatible definition of dynamic debug structures. However, an administrator that forces us to load a module is claiming that it *is* compatible even though it fails our version checks. If they are mistaken, there are any number of ways the module could crash the system. As a side-effect, proprietary and other tainted modules can now use dynamic_debug. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-10-31 | kernel: Map most files to use export.h instead of module.h | Paul Gortmaker
The changed files were only including linux/module.h for the EXPORT_SYMBOL infrastructure, and nothing else. Revector them onto the isolated export header for faster compile times. Nothing to see here but a whole lot of instances of:
  -#include <linux/module.h>
  +#include <linux/export.h>
This commit is only changing the kernel dir; next targets will probably be mm, fs, the arch dirs, etc. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-08-10 | Tracepoint: Dissociate from module mutex | Mathieu Desnoyers
Copy the information needed from struct module into a local module list held within tracepoint.c from within the module coming/going notifier. This vastly simplifies locking of tracepoint registration / unregistration, because we don't have to take the module mutex to register and unregister tracepoints anymore. Steven Rostedt ran into dependency problems related to modules mutex vs kprobes mutex vs ftrace mutex vs tracepoint mutex that seem to be hard to fix without removing this dependency between tracepoint and module mutex. (note: it should be investigated whether kprobes could benefit from being dissociated from the modules mutex too.) This also fixes module handling of tracepoint list iterators, because it was expecting the list to be sorted by pointer address. Given we have control on our own list now, it's OK to sort this list which has tracepoints as its only purpose. The reason why this sorting is required is to handle the fact that seq files (and any read() operation from user-space) cannot hold the tracepoint mutex across multiple calls, so list entries may vanish between calls. With sorting, the tracepoint iterator becomes usable even if the list doesn't contain the exact item pointed to by the iterator anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Jason Baron <jbaron@redhat.com> CC: Ingo Molnar <mingo@elte.hu> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Thomas Gleixner <tglx@linutronix.de> CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-24 | module: add /sys/module/<name>/uevent files | Kay Sievers
Userspace wants to manage module parameters with udev rules. This currently only works for loaded modules, but not for built-in ones. To allow access to the built-in modules we need to re-trigger all module load events that happened before any userspace was running. We already do the same thing for all devices, subsystems(buses) and drivers. This adds the currently missing /sys/module/<name>/uevent files to all module entries. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split & trivial fix)
2011-07-24 | module: change attr callbacks to take struct module_kobject | Kay Sievers
This simplifies the next patch, where we have an attribute on a builtin module (ie. module == NULL). Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split into 2)
2011-07-24 | modules: add default loader hook implementations | Jonas Bonn
The module loader code allows architectures to hook into the code by providing a small number of entry points that each arch must implement. This patch provides __weakly linked generic implementations of these entry points for architectures that don't need to do anything special. Signed-off-by: Jonas Bonn <jonas@southpole.se> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-23 | Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 | Linus Torvalds
* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits) staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_ staging:iio: Trivial kconfig reorganization and uniformity improvements. staging:iio:documenation partial update. staging:iio: use pollfunc allocation helpers in remaining drivers. staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out. staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev. staging:iio:meter:ade7758: Use private data space from iio_allocate_device staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value. staging:iio: ring core cleanups + check if read_last available in lis3l02dq staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name. staging:iio: poll func allocation clean up. staging:iio:ad7780 trivial unused header cleanup. staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes staging:iio:adc:AD7780: Convert to new channel registration method staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv() staging:iio:adc: AD7606: Consitently use indio_dev staging:iio: Rip out helper for software rings. staging:iio:adc:AD7298: Use private data space from iio_allocate_device staging:iio: rationalization of different buffer implementation hooks. staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev. ... Fix up trivial conflicts in - drivers/staging/intel_sst/intelmid.c: patches applied in both branches - drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling - drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion
2011-05-19 | module: Use binary search in lookup_symbol() | Alessio Igor Bogani
The function is_exported() and its helper function lookup_symbol() are used to verify whether a provided symbol is actually exported by the kernel or by the modules. Now that both have their symbols sorted we can replace a linear search with a binary search, which provides a considerable speed-up. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
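The switch from linear to binary search can be sketched in userspace with the standard `bsearch()`. This is an illustration of the technique only: `struct ksym` and `lookup_symbol_sorted` are hypothetical names, and the table is assumed to be sorted by name with `strcmp()` ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an exported-symbol table entry. */
struct ksym {
    unsigned long value;
    const char *name;
};

/* bsearch() comparator: the key is a name, the element is a table entry. */
static int cmp_name(const void *key, const void *elem)
{
    const struct ksym *sym = elem;

    return strcmp(key, sym->name);
}

/*
 * Look up 'name' in a table of 'count' entries sorted by name.
 * Returns the matching entry or NULL; O(log n) instead of O(n).
 */
static const struct ksym *lookup_symbol_sorted(const char *name,
                                               const struct ksym *start,
                                               size_t count)
{
    assert(name != NULL);

    if (count == 0)
        return NULL;
    return bsearch(name, start, count, sizeof(*start), cmp_name);
}
```

The speed-up depends entirely on the table being sorted with the same ordering the comparator uses; an unsorted table would make the search miss entries silently.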
2011-05-19 | module: Use the binary search for symbols resolution | Alessio Igor Bogani
Takes advantage of the order and locates symbols using binary search. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Dirk Behme <dirk.behme@googlemail.com>
2011-05-19 | module: each_symbol_section instead of each_symbol | Rusty Russell
Instead of having a callback function for each symbol in the kernel, have a callback for each array of symbols. This eases the logic when we move to sorted symbols and binary search. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
2011-05-19 | module: split unset_section_ro_nx function. | Jan Glauber
Split the unprotect function into a function per section to make the code more readable and add the missing static declaration. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: undo module RONX protection correctly. | Jan Glauber
While debugging I stumbled over two problems in the code that protects module pages. First issue is that disabling the protection before freeing init or unload of a module is not symmetric with the enablement. For instance, if pages are set to RO the page range from module_core to module_core + core_ro_size is protected. If a module is unloaded the page range from module_core to module_core + core_size is set back to RW. So pages that were not set to RO are also changed to RW. This is not critical but IMHO it should be symmetric. Second issue is that while set_memory_rw & set_memory_ro are used for RO/RW changes only set_memory_nx is involved for NX/X. One would expect that the inverse function is called when the NX protection should be removed, which is not the case here, unless I'm missing something. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: zero mod->init_ro_size after init is freed. | Jan Glauber
Reset mod->init_ro_size to zero after the init part of a module is unloaded. Otherwise we need to check if module->init is NULL in the unprotect functions in the next patch. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | minor ANSI prototype sparse fix | Daniel J Blueman
Fix function prototype to be ANSI-C compliant, consistent with other function prototypes, addressing a sparse warning. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-04-25 | modules: Enabled dynamic debugging for staging modules | Roland Vossen
Driver modules from the staging directory are marked 'tainted' by module.c. Subsequently, tainted modules are denied dynamic debugging. This is unwanted behavior, since staging modules should be able to use the dynamic debugging mechanism. Please merge this also into the staging-linus branch. Signed-off-by: Roland Vossen <rvossen@broadcom.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-31 | Fix common misspellings | Lucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-22 | printk: use %pK for /proc/kallsyms and /proc/modules | Kees Cook
In an effort to reduce kernel address leaks that might be used to help target kernel privilege escalation exploits, this patch uses %pK when displaying addresses in /proc/kallsyms, /proc/modules, and /sys/module/*/sections/*. Note that this changes %x to %p, so some legitimately 0 values in /proc/kallsyms would have changed from 00000000 to "(null)". To avoid this, "(null)" is not used when using the "K" format. Anything that was already successfully parsing "(null)" in addition to full hex digits should have no problem with this change. (Thanks to Joe Perches for the suggestion.) Due to the %x to %p, "void *" casts are needed since these addresses are already "unsigned long" everywhere internally, due to their starting life as ELF section offsets. Signed-off-by: Kees Cook <kees.cook@canonical.com> Cc: Eugene Teo <eugene@redhat.com> Cc: Dan Rosenberg <drosenberg@vsecurity.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-03 | tracepoints: Fix section alignment using pointer array | Mathieu Desnoyers
Make the tracepoints more robust, making them solid enough to handle compiler changes by not relying on anything based on compiler-specific behavior with respect to structure alignment. Implement an approach proposed by David Miller: use an array of const pointers to refer to the individual structures, and export this pointer array through the linker script rather than the structures per se. It will consume 32 extra bytes per tracepoint (24 for structure padding and 8 for the pointers), but is less likely to break due to compiler changes. History: commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() added the aligned(32) type and variable attribute to the tracepoint structures to deal with gcc happily aligning statically defined structures on 32-byte multiples. One attempt was to use a 8-byte alignment for tracepoint structures by applying both the variable and type attribute to tracepoint structures definitions and declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5. The reason is that the "aligned" attribute only specifies the _minimum_ alignment for a structure, leaving both the compiler and the linker free to align on larger multiples. Because tracepoint.c expects the structures to be placed as an array within each section, up-alignment causes NULL-pointer exceptions due to the extra unexpected padding. (this patch applies on top of -tip) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: David S. Miller <davem@davemloft.net> LKML-Reference: <20110126222622.GA10794@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
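The pointer-array approach can be shown in miniature. In the sketch below the array is built by hand rather than collected by a linker script, and `struct tracepoint_sketch`, `tp_ptrs` and `enable_range` are hypothetical names; the point is only that the iterator steps over pointers, whose stride is always `sizeof(void *)`, instead of over structures whose placement the compiler may pad.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced tracepoint structure. */
struct tracepoint_sketch {
    const char *name;
    int enabled;
};

/* The structures themselves may be placed and padded however the toolchain likes. */
static struct tracepoint_sketch tp_sched = { "sched_switch", 0 };
static struct tracepoint_sketch tp_irq = { "irq_entry", 0 };

/* What gets exported is a dense array of const pointers to them. */
static struct tracepoint_sketch *const tp_ptrs[] = { &tp_sched, &tp_irq };

/* Walk [begin, end) over the pointer array; returns how many were enabled. */
static size_t enable_range(struct tracepoint_sketch *const *begin,
                           struct tracepoint_sketch *const *end)
{
    size_t n = 0;

    assert(begin != NULL && end != NULL && begin <= end);
    for (struct tracepoint_sketch *const *iter = begin; iter < end; iter++) {
        (*iter)->enabled = 1;   /* one extra dereference per element */
        n++;
    }
    return n;
}
```

The cost is the pointer itself plus one indirection per access, which is the trade-off the commit message quantifies.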
2010-12-23 module: Move RO/NX module protection to after ftrace module update Steven Rostedt
The commit: 84e1c6bb38eb318e456558b610396d9f1afaabf0 x86: Add RO/NX protection for loadable kernel modules Broke the function tracer with this output: ------------[ cut here ]------------ WARNING: at kernel/trace/ftrace.c:1014 ftrace_bug+0x114/0x171() Hardware name: Precision WorkStation 470 Modules linked in: i2c_core(+) Pid: 86, comm: modprobe Not tainted 2.6.37-rc2+ #68 Call Trace: [<ffffffff8104e957>] warn_slowpath_common+0x85/0x9d [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff8104e989>] warn_slowpath_null+0x1a/0x1c [<ffffffff810a9dfe>] ftrace_bug+0x114/0x171 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff810aa0db>] ftrace_process_locs+0x1ae/0x274 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff810aa29e>] ftrace_module_notify+0x39/0x44 [<ffffffff814405cf>] notifier_call_chain+0x37/0x63 [<ffffffff8106e054>] __blocking_notifier_call_chain+0x46/0x5b [<ffffffff8106e07d>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff8107ffde>] sys_init_module+0x73/0x1f3 [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b ---[ end trace 2aff4f4ca53ec746 ]--- ftrace faulted on writing [<ffffffffa00026db>] __process_new_adapter+0x7/0x34 [i2c_core] The cause was that the module text was set to read only before ftrace could convert the calls to mcount to nops. Thus, the conversions failed due to not being able to write to the text locations. The simple fix is to move setting the module to read only after the module notifiers are called (where ftrace sets the module mcounts to nops). Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
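The ordering problem can be illustrated with a toy model. Everything here is hypothetical (the struct, the function names, and the 0x90 byte standing in for a no-op); it only shows why patching must happen before the text is marked read-only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_TEXT_SIZE 8

/* Toy module: a few bytes of "text" plus a read-only flag. */
struct demo_module {
    unsigned char text[DEMO_TEXT_SIZE];
    bool text_ro;
};

/* Stand-in for the tracer rewriting call sites; fails on read-only text. */
int demo_patch_to_nops(struct demo_module *mod)
{
    assert(mod != NULL);
    if (mod->text_ro)
        return -1;
    memset(mod->text, 0x90, sizeof(mod->text));
    return 0;
}

/* Stand-in for applying the read-only protection. */
void demo_set_text_ro(struct demo_module *mod)
{
    mod->text_ro = true;
}

/* Order before the fix: protect first, so the patch step fails. */
int demo_load_protect_first(struct demo_module *mod)
{
    demo_set_text_ro(mod);
    return demo_patch_to_nops(mod);
}

/* Order after the fix: let the notifier patch, then protect. */
int demo_load_notify_first(struct demo_module *mod)
{
    int ret = demo_patch_to_nops(mod);

    demo_set_text_ro(mod);
    return ret;
}
```

In both orders the text ends up read-only; only the second order leaves it patched.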
2010-12-23 Merge commit 'v2.6.37-rc7' into x86/security Ingo Molnar
2010-11-18 x86: Add RO/NX protection for loadable kernel modules matthieu castet
This patch is a logical extension of the protection provided by CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting module_core and module_init into three logical parts each and setting appropriate page access permissions for each individual section: 1. Code: RO+X 2. RO data: RO+NX 3. RW data: RW+NX In order to achieve proper protection, layout_sections() has been modified to align each of the three parts mentioned above onto a page boundary. Next, the corresponding page access permissions are set right before successful exit from load_module(). Further, free_module() and sys_init_module have been modified to set module_core and module_init as RW+NX right before calling module_free(). By default, the original section layout and access flags are preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y, the patch will page-align each group of sections to ensure that each page contains only one type of content and will enforce RO/NX for each group of pages. -v1: Initial proof-of-concept patch. -v2: The patch has been re-written to reduce the number of #ifdefs and to make it architecture-agnostic. Code formatting has also been corrected. -v3: Opportunistic RO/NX protection is now unconditional. Section page-alignment is enabled when CONFIG_DEBUG_RODATA=y. -v4: Removed most macros and improved coding style. -v5: Changed page-alignment and RO/NX section size calculation -v6: Fixed comments. Restricted RO/NX enforcement to x86 only -v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added calls to set_all_modules_text_rw() and set_all_modules_text_ro() in ftrace -v8: updated for compatibility with linux 2.6.33-rc5 -v9: coding style fixes -v10: more coding style fixes -v11: minor adjustments for -tip -v12: minor adjustments for v2.6.35-rc2-tip -v13: minor adjustments for v2.6.37-rc1-tip Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F914.9070106@free.fr> [ minor cleanliness edits, -v14: build failure fix ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
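The page-alignment step can be sketched as plain arithmetic. This is a hypothetical model, not the kernel's layout_sections(): it assumes the three groups are laid out back to back and that each group's start is rounded up to a page boundary so that no page mixes content types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result: where each group starts and the total size. */
struct demo_layout {
    size_t text_off;  /* code:    RO+X  */
    size_t ro_off;    /* RO data: RO+NX */
    size_t rw_off;    /* RW data: RW+NX */
    size_t total;
};

/* Round v up to the next multiple of page. */
static size_t demo_align_up(size_t v, size_t page)
{
    return (v + page - 1) / page * page;
}

/* Place the three groups so that each begins on its own page. */
struct demo_layout demo_layout_sections(size_t text_size, size_t ro_size,
                                        size_t rw_size, size_t page)
{
    struct demo_layout l;

    assert(page != 0);
    l.text_off = 0;
    l.ro_off = demo_align_up(l.text_off + text_size, page);
    l.rw_off = demo_align_up(l.ro_off + ro_size, page);
    l.total = demo_align_up(l.rw_off + rw_size, page);
    return l;
}
```

With 4096-byte pages and group sizes of 100, 5000, and 10 bytes, the groups start at offsets 0, 4096, and 12288, and the image occupies 16384 bytes. The cost of the protection is the padding at the end of each group.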
2010-11-10 tracing: Fix module use of trace_bprintk() Steven Rostedt
On use of trace_printk() there's a macro that determines if the format is static or a variable. If it is static, it defaults to __trace_bprintk(); otherwise it uses __trace_printk(). A while ago, Lai Jiangshan added __trace_bprintk(). In that patch, we discussed a way to allow modules to use it. The difference between __trace_bprintk() and __trace_printk() is that for faster processing, just the format and args are stored in the trace instead of running it through a sprintf function. In order to do this, the format used by __trace_bprintk() had to be persistent. See commit 1ba28e02a18cbdbea123836f6c98efb09cbf59ec The problem comes with trace_bprintk() when the module is unloaded. The pointer left in the buffer is still pointing to the format. To solve this issue, the formats in the module were copied into kernel core. If the same format was used, they would use the same copy (to prevent a memory leak). This all worked well until we tried to merge everything. At the time this was written, Lai Jiangshan, Frederic Weisbecker, Ingo Molnar and myself were all touching the same code. When this was merged, we lost the part of it that was in module.c. This left out the copying of the formats, so unloading the module could leave bad pointers in the ring buffer. This patch adds back (with updates required for the current kernel) the module code that sets up the necessary pointers. Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
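The "copy the format into core, reuse an existing copy" scheme can be sketched as a small string-interning list. The names demo_fmt and demo_hold_format() are hypothetical, and this user-space version uses malloc() where the kernel would use its own allocator and locking.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One persistent copy of a format string. */
struct demo_fmt {
    struct demo_fmt *next;
    char *fmt;
};

static struct demo_fmt *demo_fmt_list;

/*
 * Return a copy of fmt that outlives the caller's storage. If an identical
 * format was already copied, hand back that copy instead of allocating a
 * new one. Returns NULL on allocation failure.
 */
const char *demo_hold_format(const char *fmt)
{
    struct demo_fmt *e;
    size_t len;

    assert(fmt != NULL);

    for (e = demo_fmt_list; e; e = e->next)
        if (strcmp(e->fmt, fmt) == 0)
            return e->fmt;

    e = malloc(sizeof(*e));
    if (!e)
        return NULL;

    len = strlen(fmt) + 1;
    e->fmt = malloc(len);
    if (!e->fmt) {
        free(e);
        return NULL;
    }
    memcpy(e->fmt, fmt, len);

    e->next = demo_fmt_list;
    demo_fmt_list = e;
    return e->fmt;
}
```

A trace record that stores the returned pointer stays valid after the original string (here, the module's memory) goes away, and repeated loads of the same module do not grow the list.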
2010-10-27 (trivial) Fix compiler warning in kernel/modules.c Michał Mirosław
Building with CONFIG_KALLSYMS=n gives the following warning: /mnt/src/linux-git/kernel/module.c: In function ‘post_relocation’: /mnt/src/linux-git/kernel/module.c:2534:2: warning: passing argument 2 of ‘add_kallsyms’ discards qualifiers from pointer target type /mnt/src/linux-git/kernel/module.c:2038:13: note: expected ‘struct load_info *’ but argument is of type ‘const struct load_info *’ Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-10-08 Merge commit 'v2.6.36-rc7' into perf/core Ingo Molnar
Conflicts: arch/x86/kernel/module.c Merge reason: Resolve the conflict, pick up fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-05 modules: Fix module_bug_list list corruption race Linus Torvalds
With all the recent module loading cleanups, we've minimized the code that sits under module_mutex, fixing various deadlocks and making it possible to do most of the module loading in parallel. However, that whole conversion totally missed the rather obscure code that adds a new module to the list for BUG() handling. That code was doubly obscure because (a) the code itself lives in lib/bug.c (for dubious reasons) and (b) it gets called from the architecture-specific "module_finalize()" rather than from generic code. Calling it from arch-specific code makes no sense whatsoever to begin with, and is now actively wrong since that code isn't protected by the module loading lock any more. So this commit moves the "module_bug_{finalize,cleanup}()" calls away from the arch-specific code, and into the generic code - and in the process protects it with the module_mutex so that the list operations are now safe. Future fixups: - move the module list handling code into kernel/module.c where it belongs. - get rid of 'module_bug_list' and just use the regular list of modules (called 'modules' - imagine that) that we already create and maintain for other reasons. Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Adrian Bunk <bunk@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
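The fix amounts to doing the list insertion and removal under the lock. A user-space sketch with a pthread mutex follows; the demo_* names are hypothetical and the singly linked list is a simplification of the kernel's list handling.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-module entry on the BUG() bookkeeping list. */
struct demo_bug_entry {
    struct demo_bug_entry *next;
    const char *mod_name;
};

static pthread_mutex_t demo_module_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_bug_entry *demo_bug_list;

/* Add an entry; the lock keeps concurrent loaders from corrupting the list. */
void demo_bug_finalize(struct demo_bug_entry *e)
{
    assert(e != NULL);
    pthread_mutex_lock(&demo_module_mutex);
    e->next = demo_bug_list;
    demo_bug_list = e;
    pthread_mutex_unlock(&demo_module_mutex);
}

/* Remove an entry under the same lock. */
void demo_bug_cleanup(struct demo_bug_entry *e)
{
    struct demo_bug_entry **pp;

    assert(e != NULL);
    pthread_mutex_lock(&demo_module_mutex);
    for (pp = &demo_bug_list; *pp; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            break;
        }
    }
    pthread_mutex_unlock(&demo_module_mutex);
}

/* Count entries, also under the lock, so the walk sees a stable list. */
size_t demo_bug_count(void)
{
    struct demo_bug_entry *e;
    size_t n = 0;

    pthread_mutex_lock(&demo_module_mutex);
    for (e = demo_bug_list; e; e = e->next)
        n++;
    pthread_mutex_unlock(&demo_module_mutex);
    return n;
}
```

Keeping both calls in one generic place, rather than in each architecture's code, is what makes it practical to guarantee the lock is always held.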
2010-09-22 jump label: Base patch for jump label Jason Baron
Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline assembly gcc mechanism, we can now branch to labels from an 'asm goto' statement. This allows us to create a 'no-op' fastpath, which can subsequently be patched with a jump to the slowpath code. This is useful for code which might be rarely used, but which we'd like to be able to call, if needed. Tracepoints are the current use case for which these are being implemented. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com> [ cleaned up some formatting ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>