summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2014-05-23lib/test_bpf.c: don't use gcc union shortcutAndrew Morton
Older gcc's (mine is gcc-4.4.4) make a mess of this. lib/test_bpf.c:74: error: unknown field 'insns' specified in initializer lib/test_bpf.c:75: warning: missing braces around initializer lib/test_bpf.c:75: warning: (near initialization for 'tests[0].<anonymous>.insns[0]') lib/test_bpf.c:76: error: extra brace group at end of initializer lib/test_bpf.c:76: error: (near initialization for 'tests[0].<anonymous>') lib/test_bpf.c:76: warning: excess elements in union initializer lib/test_bpf.c:76: warning: (near initialization for 'tests[0].<anonymous>') lib/test_bpf.c:77: error: extra brace group at end of initializer Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21net: filter: cleanup invocation of internal BPFAlexei Starovoitov
Kernel API for classic BPF socket filters is: sk_unattached_filter_create() - validate classic BPF, convert, JIT SK_RUN_FILTER() - run it sk_unattached_filter_destroy() - destroy socket filter Cleanup internal BPF kernel API as following: sk_filter_select_runtime() - final step of internal BPF creation. Try to JIT internal BPF program, if JIT is not available select interpreter SK_RUN_FILTER() - run it sk_filter_free() - free internal BPF program Disallow direct calls to BPF interpreter. Execution of the BPF program should be done with SK_RUN_FILTER() macro. Example of internal BPF create, run, destroy: struct sk_filter *fp; fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL); memcpy(fp->insni, prog, prog_len * sizeof(fp->insni[0])); fp->len = prog_len; sk_filter_select_runtime(fp); SK_RUN_FILTER(fp, ctx); sk_filter_free(fp); Sockets, seccomp, testsuite, tracing are using different ways to populate sk_filter, so first steps of program creation are not common. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13net: fix test_bpf build to depend on NETRandy Dunlap
Fix build when CONFIG_NET is not enabled. Fixes these build errors: WARNING: "sk_unattached_filter_destroy" [lib/test_bpf.ko] undefined! WARNING: "kfree_skb" [lib/test_bpf.ko] undefined! WARNING: "sk_unattached_filter_create" [lib/test_bpf.ko] undefined! WARNING: "sk_run_filter_int_skb" [lib/test_bpf.ko] undefined! WARNING: "__alloc_skb" [lib/test_bpf.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12net: filter: additional BPF testsAlexei Starovoitov
All tests should pass with and without JIT. Example output: test_bpf: #0 TAX 35 16 16 PASS test_bpf: #1 TXA 7 7 7 PASS test_bpf: #2 ADD_SUB_MUL_K 10 PASS test_bpf: #3 DIV_KX 33 PASS test_bpf: #4 AND_OR_LSH_K 10 10 PASS test_bpf: #5 LD_IND 8 8 8 PASS test_bpf: #6 LD_ABS 8 8 8 PASS test_bpf: #7 LD_ABS_LL 13 14 PASS test_bpf: #8 LD_IND_LL 12 12 12 PASS test_bpf: #9 LD_ABS_NET 10 12 PASS test_bpf: #10 LD_IND_NET 11 12 12 PASS ... Numbers are times in nsec per filter for given input data. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12net: filter: BPF testsuiteAlexei Starovoitov
The testsuite covers classic and internal BPF instructions. It is particularly useful for JIT compiler developers. Adds to "net" selftest target. The testsuite can be used as a set of micro-benchmarks. It measures execution time of each BPF program in nsec. This patch adds core framework. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18mm: fix CONFIG_DEBUG_VM_RB descriptionDavidlohr Bueso
This appears to be a copy/paste error. Update the description to reflect extra rbtree debug and checks for the config option instead of duplicating CONFIG_DEBUG_VM. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The first vfs pile, with deep apologies for being very late in this window. Assorted cleanups and fixes, plus a large preparatory part of iov_iter work. There's a lot more of that, but it'll probably go into the next merge window - it *does* shape up nicely, removes a lot of boilerplate, gets rid of locking inconsistencie between aio_write and splice_write and I hope to get Kent's direct-io rewrite merged into the same queue, but some of the stuff after this point is having (mostly trivial) conflicts with the things already merged into mainline and with some I want more testing. This one passes LTP and xfstests without regressions, in addition to usual beating. BTW, readahead02 in ltp syscalls testsuite has started giving failures since "mm/readahead.c: fix readahead failure for memoryless NUMA nodes and limit readahead pages" - might be a false positive, might be a real regression..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) missing bits of "splice: fix racy pipe->buffers uses" cifs: fix the race in cifs_writev() ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure kill generic_file_buffered_write() ocfs2_file_aio_write(): switch to generic_perform_write() ceph_aio_write(): switch to generic_perform_write() xfs_file_buffered_aio_write(): switch to generic_perform_write() export generic_perform_write(), start getting rid of generic_file_buffer_write() generic_file_direct_write(): get rid of ppos argument btrfs_file_aio_write(): get rid of ppos kill the 5th argument of generic_file_buffered_write() kill the 4th argument of __generic_file_aio_write() lustre: don't open-code kernel_recvmsg() ocfs2: don't open-code kernel_recvmsg() drbd: don't open-code kernel_recvmsg() constify blk_rq_map_user_iov() and friends lustre: switch to kernel_sendmsg() ocfs2: don't open-code kernel_sendmsg() take iov_iter stuff to mm/iov_iter.c process_vm_access: tidy up a bit ...
2014-04-12Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris. * git://git.infradead.org/users/eparis/audit: (28 commits) AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range audit: do not cast audit_rule_data pointers pointlesly AUDIT: Allow login in non-init namespaces audit: define audit_is_compat in kernel internal header kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c sched: declare pid_alive as inline audit: use uapi/linux/audit.h for AUDIT_ARCH declarations syscall_get_arch: remove useless function arguments audit: remove stray newline from audit_log_execve_info() audit_panic() call audit: remove stray newlines from audit_log_lost messages audit: include subject in login records audit: remove superfluous new- prefix in AUDIT_LOGIN messages audit: allow user processes to log from another PID namespace audit: anchor all pid references in the initial pid namespace audit: convert PPIDs to the inital PID namespace. pid: get pid_t ppid of task in init_pid_ns audit: rename the misleading audit_get_context() to audit_take_context() audit: Add generic compat syscall support audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL ...
2014-04-08lib/percpu_counter.c: fix bad percpu counter state during suspendJens Axboe
I got a bug report yesterday from Laszlo Ersek in which he states that his kvm instance fails to suspend. Laszlo bisected it down to this commit 1cf7e9c68fe8 ("virtio_blk: blk-mq support") where virtio-blk is converted to use the blk-mq infrastructure. After digging a bit, it became clear that the issue was with the queue drain. blk-mq tracks queue usage in a percpu counter, which is incremented on request alloc and decremented when the request is freed. The initial hunt was for an inconsistency in blk-mq, but everything seemed fine. In fact, the counter only returned crazy values when suspend was in progress. When a CPU is unplugged, the percpu counters merges that CPU state with the general state. blk-mq takes care to register a hotcpu notifier with the appropriate priority, so we know it runs after the percpu counter notifier. However, the percpu counter notifier only merges the state when the CPU is fully gone. This leaves a state transition where the CPU going away is no longer in the online mask, yet it still holds private values. This means that in this state, percpu_counter_sum() returns invalid results, and the suspend then hangs waiting for abs(dead-cpu-value) requests to complete which of course will never happen. Fix this by clearing the state earlier, so we never have a case where the CPU isn't in online mask but still holds private state. This bug has been there since forever, I guess we don't have a lot of users where percpu counters needs to be reliable during the suspend cycle. Signed-off-by: Jens Axboe <axboe@fb.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07percpu: add preemption checks to __this_cpu opsChristoph Lameter
We define a check function in order to avoid trouble with the include files. Then the higher level __this_cpu macros are modified to invoke the preemption check. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAPUwe Kleine-König
If the renamed symbol is defined lib/iomap.c implements ioport_map and ioport_unmap and currently (nearly) all platforms define the port accessor functions outb/inb and friend unconditionally. So HAS_IOPORT_MAP is the better name for this. Consequently NO_IOPORT is renamed to NO_IOPORT_MAP. The motivation for this change is to reintroduce a symbol HAS_IOPORT that signals if outb/int et al are available. I will address that at least one merge window later though to keep surprises to a minimum and catch new introductions of (HAS|NO)_IOPORT. The changes in this commit were done using: $ git grep -l -E '(NO|HAS)_IOPORT' | xargs perl -p -i -e 's/\b((?:CONFIG_)?(?:NO|HAS)_IOPORT)\b/$1_MAP/' Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07initramfs: debug detected compression methodDaniel M. Weeks
This can greatly aid in narrowing down the real source of initramfs problems such as failures related to the compression of the in-kernel initramfs when an external initramfs is in use as well. Existing errors are ambiguous as to which initramfs is a problem and why. [akpm@linux-foundation.org: use pr_debug()] Signed-off-by: Daniel M. Weeks <dan@danweeks.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07lib/idr.c: use RCU_INIT_POINTER(x, NULL)Monam Agarwal
Replace rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07idr: remove dead codeStephen Hemminger
Remove no longer used deprecated code, and make local functions static. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jean Delvare <jdelvare@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Cc: Jeff Layton <jlayton@redhat.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: George Spelvin <linux@horizon.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib/decompress_inflate.c: include appropriate header fileRashika Kheria
Include appropriate header file include/linux/decompress/inflate.h in lib/decompress_inflate.c because it has prototype declaration of function defined in lib/decompress_inflate.c. Also, fix the guard around the header file include/linux/decompress/inflate.h to use a more unique guard symbol. This avoids conflict with the INFLATE_H defined by zlib_inflate/inflate.h. This eliminates the following warning in lib/decompress_inflate.c: lib/decompress_inflate.c:35:17: warning: no previous prototype for `gunzip' [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib/clz_ctz.c: add prototype declarations in lib/clz_ctz.cRashika Kheria
Add prototype declarations of functions in lib/clz_ctz.c. These functions are required by GCC builtins and hence can not be removed despite of their unreferenced appearance in kernel source. This eliminates the following warning in lib/clz_ctz.c: lib/clz_ctz.c:16:12: warning: no previous prototype for `__ctzsi2' [-Wmissing-prototypes] lib/clz_ctz.c:22:12: warning: no previous prototype for `__clzsi2' [-Wmissing-prototypes] lib/clz_ctz.c:44:12: warning: no previous prototype for `__clzdi2' [-Wmissing-prototypes] lib/clz_ctz.c:50:12: warning: no previous prototype for `__ctzdi2' [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib/random32.c: minor cleanups and kdoc fixDaniel Borkmann
These are just some very minor and misc cleanups in the PRNG. In prandom_u32() we store the result in an unsigned long which is unnecessary as it should be u32 instead that we get from prandom_u32_state(). prandom_bytes_state()'s comment is in kdoc format, so change it into such as it's done everywhere else. Also, use the normal comment style for the header comment. Last but not least for readability, add some newlines. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib/devres.c: fix some sparse warningsSteven Rostedt
Having a discussion about sparse warnings in the kernel, and that we should clean them up, I decided to pick a random file to do so. This happened to be devres.c which gives the following warnings: CHECK lib/devres.c lib/devres.c:83:9: warning: cast removes address space of expression lib/devres.c:117:31: warning: incorrect type in return expression (different address spaces) lib/devres.c:117:31: expected void [noderef] <asn:2>* lib/devres.c:117:31: got void * lib/devres.c:125:31: warning: incorrect type in return expression (different address spaces) lib/devres.c:125:31: expected void [noderef] <asn:2>* lib/devres.c:125:31: got void * lib/devres.c:136:26: warning: incorrect type in assignment (different address spaces) lib/devres.c:136:26: expected void [noderef] <asn:2>*[assigned] dest_ptr lib/devres.c:136:26: got void * lib/devres.c:226:9: warning: cast removes address space of expression Mostly it's just the use of typecasting to void * without adding __force, or returning ERR_PTR(-ESOMEERR) without typecasting to a __iomem type. I added a helper macro IOMEM_ERR_PTR() that does the typecast to make the code a little nicer than adding ugly typecasts to the code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03vsprintf: remove %n handlingRyan Mallon
All in-kernel users of %n in format strings have now been removed and the %n directive is ignored. Remove the handling of %n so that it is treated the same as any other invalid format string directive. Keep a warning in place to deter new instances of %n in format strings. Signed-off-by: Ryan Mallon <rmallon@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib/syscall.c: unexport task_current_syscall()Andrew Morton
It is only used by procfs and procfs cannot be a module. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03kobject: don't block for each kobject_ueventVladimir Davydov
Currently kobject_uevent has somewhat unpredictable semantics. The point is, since it may call a usermode helper and wait for it to execute (UMH_WAIT_EXEC), it is impossible to say for sure what lock dependencies it will introduce for the caller - strictly speaking it depends on what fs the binary is located on and the set of locks fork may take. There are quite a few kobject_uevent's users that do not take this into account and call it with various mutexes taken, e.g. rtnl_mutex, net_mutex, which might potentially lead to a deadlock. Since there is actually no reason to wait for the usermode helper to execute there, let's make kobject_uevent start the helper asynchronously with the aid of the UMH_NO_WAIT flag. Personally, I'm interested in this, because I really want kobject_uevent to be called under the slab_mutex in the slub implementation as it used to be some time ago, because it greatly simplifies synchronization and automatically fixes a kmemcg-related race. However, there was a deadlock detected on an attempt to call kobject_uevent under the slab_mutex (see https://lkml.org/lkml/2012/1/14/45), which was reported to be fixed by releasing the slab_mutex for kobject_uevent. Unfortunately, there was no information about who exactly blocked on the slab_mutex causing the usermode helper to stall, neither have I managed to find this out or reproduce the issue. BTW, this is not the first attempt to make kobject_uevent use UMH_NO_WAIT. Previous one was made by commit f520360d93cd ("kobject: don't block for each kobject_uevent"), but it was wrong (it passed arguments allocated on stack to async thread) so it was reverted in 05f54c13cd0c ("Revert "kobject: don't block for each kobject_uevent"."). It targeted on speeding up the boot process though. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Greg KH <greg@kroah.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03mm: keep page cache radix tree nodes in checkJohannes Weiner
Previously, page cache radix tree nodes were freed after reclaim emptied out their page pointers. But now reclaim stores shadow entries in their place, which are only reclaimed when the inodes themselves are reclaimed. This is problematic for bigger files that are still in use after they have a significant amount of their cache reclaimed, without any of those pages actually refaulting. The shadow entries will just sit there and waste memory. In the worst case, the shadow entries will accumulate until the machine runs out of memory. To get this under control, the VM will track radix tree nodes exclusively containing shadow entries on a per-NUMA node list. Per-NUMA rather than global because we expect the radix tree nodes themselves to be allocated node-locally and we want to reduce cross-node references of otherwise independent cache workloads. A simple shrinker will then reclaim these nodes on memory pressure. A few things need to be stored in the radix tree node to implement the shadow node LRU and allow tree deletions coming from the list: 1. There is no index available that would describe the reverse path from the node up to the tree root, which is needed to perform a deletion. To solve this, encode in each node its offset inside the parent. This can be stored in the unused upper bits of the same member that stores the node's height at no extra space cost. 2. The number of shadow entries needs to be counted in addition to the regular entries, to quickly detect when the node is ready to go to the shadow node LRU list. The current entry count is an unsigned int but the maximum number of entries is 64, so a shadow counter can easily be stored in the unused upper bits. 3. Tree modification needs tree lock and tree root, which are located in the address space, so store an address_space backpointer in the node. The parent pointer of the node is in a union with the 2-word rcu_head, so the backpointer comes at no extra cost as well. 4. The node needs to be linked to an LRU list, which requires a list head inside the node. This does increase the size of the node, but it does not change the number of objects that fit into a slab page. [akpm@linux-foundation.org: export the right function] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Metin Doslu <metin@citusdata.com> Cc: Michel Lespinasse <walken@google.com> Cc: Ozgun Erdogan <ozgun@citusdata.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib: radix_tree: tree node interfaceJohannes Weiner
Make struct radix_tree_node part of the public interface and provide API functions to create, look up, and delete whole nodes. Refactor the existing insert, look up, delete functions on top of these new node primitives. This will allow the VM to track and garbage collect page cache radix tree nodes. [sasha.levin@oracle.com: return correct error code on insertion failure] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Metin Doslu <metin@citusdata.com> Cc: Michel Lespinasse <walken@google.com> Cc: Ozgun Erdogan <ozgun@citusdata.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03mm: filemap: move radix tree hole searching hereJohannes Weiner
The radix tree hole searching code is only used for page cache, for example the readahead code trying to get a a picture of the area surrounding a fault. It sufficed to rely on the radix tree definition of holes, which is "empty tree slot". But this is about to change, though, as shadow page descriptors will be stored in the page cache after the actual pages get evicted from memory. Move the functions over to mm/filemap.c and make them native page cache operations, where they can later be adapted to handle the new definition of "page cache hole". Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Metin Doslu <metin@citusdata.com> Cc: Michel Lespinasse <walken@google.com> Cc: Ozgun Erdogan <ozgun@citusdata.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03lib: radix-tree: add radix_tree_delete_item()Johannes Weiner
Provide a function that does not just delete an entry at a given index, but also allows passing in an expected item. Delete only if that item is still located at the specified index. This is handy when lockless tree traversals want to delete entries as well because they don't have to do an second, locked lookup to verify the slot has not changed under them before deleting the entry. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Luigi Semenzato <semenzato@google.com> Cc: Metin Doslu <metin@citusdata.com> Cc: Michel Lespinasse <walken@google.com> Cc: Ozgun Erdogan <ozgun@citusdata.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Here is my initial pull request for the networking subsystem during this merge window: 1) Support for ESN in AH (RFC 4302) from Fan Du. 2) Add full kernel doc for ethtool command structures, from Ben Hutchings. 3) Add BCM7xxx PHY driver, from Florian Fainelli. 4) Export computed TCP rate information in netlink socket dumps, from Eric Dumazet. 5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas Dichtel. 6) Convert many drivers to pci_enable_msix_range(), from Alexander Gordeev. 7) Record SKB timestamps more efficiently, from Eric Dumazet. 8) Switch to microsecond resolution for TCP round trip times, also from Eric Dumazet. 9) Clean up and fix 6lowpan fragmentation handling by making use of the existing inet_frag api for it's implementation. 10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss. 11) Auto size SKB lengths when composing netlink messages based upon past message sizes used, from Eric Dumazet. 12) qdisc dumps can take a long time, add a cond_resched(), From Eric Dumazet. 13) Sanitize netpoll core and drivers wrt. SKB handling semantics. Get rid of never-used-in-tree netpoll RX handling. From Eric W Biederman. 14) Support inter-address-family and namespace changing in VTI tunnel driver(s). From Steffen Klassert. 15) Add Altera TSE driver, from Vince Bridgers. 16) Optimizing csum_replace2() so that it doesn't adjust the checksum by checksumming the entire header, from Eric Dumazet. 17) Expand BPF internal implementation for faster interpreting, more direct translations into JIT'd code, and much cleaner uses of BPF filtering in non-socket ocntexts. From Daniel Borkmann and Alexei Starovoitov" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits) netpoll: Use skb_irq_freeable to make zap_completion_queue safe. net: Add a test to see if a skb is freeable in irq context qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port' net: ptp: move PTP classifier in its own file net: sxgbe: make "core_ops" static net: sxgbe: fix logical vs bitwise operation net: sxgbe: sxgbe_mdio_register() frees the bus Call efx_set_channels() before efx->type->dimension_resources() xen-netback: disable rogue vif in kthread context net/mlx4: Set proper build dependancy with vxlan be2net: fix build dependency on VxLAN mac802154: make csma/cca parameters per-wpan mac802154: allow only one WPAN to be up at any given time net: filter: minor: fix kdoc in __sk_run_filter netlink: don't compare the nul-termination in nla_strcmp can: c_can: Avoid led toggling for every packet. can: c_can: Simplify TX interrupt cleanup can: c_can: Store dlc private can: c_can: Reduce register access can: c_can: Make the code readable ...
2014-04-01get rid of DEBUG_WRITECOUNTAl Viro
it only makes control flow in __fput() and friends more convoluted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01Merge branch 'for-3.15/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver update from Jens Axboe: "On top of the core pull request, here's the pull request for the driver related changes for 3.15. It contains: - Improvements for msi-x registration for block drivers (mtip32xx, skd, cciss, nvme) from Alexander Gordeev. - A round of cleanups and improvements for drbd from Andreas Gruenbacher and Rashika Kheria. - A round of clanups and improvements for bcache from Kent. - Removal of sleep_on() and friends in DAC960, ataflop, swim3 from Arnd Bergmann. - Bug fix for a bug in the mtip32xx async completion code from Sam Bradshaw. - Bug fix for accidentally bouncing IO on 32-bit platforms with mtip32xx from Felipe Franciosi" * 'for-3.15/drivers' of git://git.kernel.dk/linux-block: (103 commits) bcache: remove nested function usage bcache: Kill bucket->gc_gen bcache: Kill unused freelist bcache: Rework btree cache reserve handling bcache: Kill btree_io_wq bcache: btree locking rework bcache: Fix a race when freeing btree nodes bcache: Add a real GC_MARK_RECLAIMABLE bcache: Add bch_keylist_init_single() bcache: Improve priority_stats bcache: Better alloc tracepoints bcache: Kill dead cgroup code bcache: stop moving_gc marking buckets that can't be moved. bcache: Fix moving_pred() bcache: Fix moving_gc deadlocking with a foreground write bcache: Fix discard granularity bcache: Fix another bug recovering from unclean shutdown bcache: Fix a bug recovering from unclean shutdown bcache: Fix a journalling reclaim after recovery bug bcache: Fix a null ptr deref in journal replay ...
2014-04-01Merge tag 'driver-core-3.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and sysfs updates from Greg KH: "Here's the big driver core / sysfs update for 3.15-rc1. Lots of kernfs updates to make it useful for other subsystems, and a few other tiny driver core patches. All have been in linux-next for a while" * tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (42 commits) Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" kernfs: cache atomic_write_len in kernfs_open_file numa: fix NULL pointer access and memory leak in unregister_one_node() Revert "driver core: synchronize device shutdown" kernfs: fix off by one error. kernfs: remove duplicate dir.c at the top dir x86: align x86 arch with generic CPU modalias handling cpu: add generic support for CPU feature based module autoloading sysfs: create bin_attributes under the requested group driver core: unexport static function create_syslog_header firmware: use power efficient workqueue for unloading and aborting fw load firmware: give a protection when map page failed firmware: google memconsole driver fixes firmware: fix google/gsmi duplicate efivars_sysfs_init() drivers/base: delete non-required instances of include <linux/init.h> kernfs: fix kernfs_node_from_dentry() ACPI / platform: drop redundant ACPI_HANDLE check kernfs: fix hash calculation in kernfs_rename_ns() kernfs: add CONFIG_KERNFS sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() ...
2014-04-01Merge tag 'pci-v3.15-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Enumeration - Increment max correctly in pci_scan_bridge() (Andreas Noever) - Clarify the "scan anyway" comment in pci_scan_bridge() (Andreas Noever) - Assign CardBus bus number only during the second pass (Andreas Noever) - Use request_resource_conflict() instead of insert_ for bus numbers (Andreas Noever) - Make sure bus number resources stay within their parents bounds (Andreas Noever) - Remove pci_fixup_parent_subordinate_busnr() (Andreas Noever) - Check for child busses which use more bus numbers than allocated (Andreas Noever) - Don't scan random busses in pci_scan_bridge() (Andreas Noever) - x86: Drop pcibios_scan_root() check for bus already scanned (Bjorn Helgaas) - x86: Use pcibios_scan_root() instead of pci_scan_bus_with_sysdata() (Bjorn Helgaas) - x86: Use pcibios_scan_root() instead of pci_scan_bus_on_node() (Bjorn Helgaas) - x86: Merge pci_scan_bus_on_node() into pcibios_scan_root() (Bjorn Helgaas) - x86: Drop return value of pcibios_scan_root() (Bjorn Helgaas) NUMA - x86: Add x86_pci_root_bus_node() to look up NUMA node from PCI bus (Bjorn Helgaas) - x86: Use x86_pci_root_bus_node() instead of get_mp_bus_to_node() (Bjorn Helgaas) - x86: Remove mp_bus_to_node[], set_mp_bus_to_node(), get_mp_bus_to_node() (Bjorn Helgaas) - x86: Use NUMA_NO_NODE, not -1, for unknown node (Bjorn Helgaas) - x86: Remove acpi_get_pxm() usage (Bjorn Helgaas) - ia64: Use NUMA_NO_NODE, not MAX_NUMNODES, for unknown node (Bjorn Helgaas) - ia64: Remove acpi_get_pxm() usage (Bjorn Helgaas) - ACPI: Fix acpi_get_node() prototype (Bjorn Helgaas) Resource management - i2o: Fix and refactor PCI space allocation (Bjorn Helgaas) - Add resource_contains() (Bjorn Helgaas) - Add %pR support for IORESOURCE_UNSET (Bjorn Helgaas) - Mark resources as IORESOURCE_UNSET if we can't assign them (Bjorn Helgaas) - Don't clear IORESOURCE_UNSET when updating BAR (Bjorn Helgaas) - Check IORESOURCE_UNSET before updating BAR (Bjorn Helgaas) - Don't try to claim IORESOURCE_UNSET resources (Bjorn Helgaas) - Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit (Bjorn Helgaas) - Don't enable decoding if BAR hasn't been assigned an address (Bjorn Helgaas) - Add "weak" generic pcibios_enable_device() implementation (Bjorn Helgaas) - alpha, microblaze, sh, sparc, tile: Use default pcibios_enable_device() (Bjorn Helgaas) - s390: Use generic pci_enable_resources() (Bjorn Helgaas) - Don't check resource_size() in pci_bus_alloc_resource() (Bjorn Helgaas) - Set type in __request_region() (Bjorn Helgaas) - Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() (Bjorn Helgaas) - Change pci_bus_alloc_resource() type_mask to unsigned long (Bjorn Helgaas) - Log IDE resource quirk in dmesg (Bjorn Helgaas) - Revert "[PATCH] Insert GART region into resource map" (Bjorn Helgaas) PCI device hotplug - Make check_link_active() non-static (Rajat Jain) - Use link change notifications for hot-plug and removal (Rajat Jain) - Enable link state change notifications (Rajat Jain) - Don't disable the link permanently during removal (Rajat Jain) - Don't check adapter or latch status while disabling (Rajat Jain) - Disable link notification across slot reset (Rajat Jain) - Ensure very fast hotplug events are also processed (Rajat Jain) - Add hotplug_lock to serialize hotplug events (Rajat Jain) - Remove a non-existent card, regardless of "surprise" capability (Rajat Jain) - Don't turn slot off when hot-added device already exists (Yijing Wang) MSI - Keep pci_enable_msi() documentation (Alexander Gordeev) - ahci: Fix broken single MSI fallback (Alexander Gordeev) - ahci, vfio: Use pci_enable_msi_range() (Alexander Gordeev) - Check kmalloc() return value, fix leak of name (Greg Kroah-Hartman) - Fix leak of msi_attrs (Greg Kroah-Hartman) - Fix pci_msix_vec_count() htmldocs failure (Masanari Iida) Virtualization - Device-specific ACS support (Alex Williamson) Freescale i.MX6 - Wait for retraining (Marek Vasut) Marvell MVEBU - Use Device ID and revision from underlying endpoint (Andrew Lunn) - Fix incorrect size for PCI aperture resources (Jason Gunthorpe) - Call request_resource() on the apertures (Jason Gunthorpe) - Fix potential issue in range parsing (Jean-Jacques Hiblot) Renesas R-Car - Check platform_get_irq() return code (Ben Dooks) - Add error interrupt handling (Ben Dooks) - Fix bridge logic configuration accesses (Ben Dooks) - Register each instance independently (Magnus Damm) - Break out window size handling (Magnus Damm) - Make the Kconfig dependencies more generic (Magnus Damm) Synopsys DesignWare - Fix RC BAR to be single 64-bit non-prefetchable memory (Mohit Kumar) Miscellaneous - Remove unused SR-IOV VF Migration support (Bjorn Helgaas) - Enable INTx if BIOS left them disabled (Bjorn Helgaas) - Fix hex vs decimal typo in cpqhpc_probe() (Dan Carpenter) - Clean up par-arch object file list (Liviu Dudau) - Set IORESOURCE_ROM_SHADOW only for the default VGA device (Sander Eikelenboom) - ACPI, ARM, drm, powerpc, pcmcia, PCI: Use list_for_each_entry() for bus traversal (Yijing Wang) - Fix pci_bus_b() build failure (Paul Gortmaker)" * tag 'pci-v3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (108 commits) Revert "[PATCH] Insert GART region into resource map" PCI: Log IDE resource quirk in dmesg PCI: Change pci_bus_alloc_resource() type_mask to unsigned long PCI: Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() resources: Set type in __request_region() PCI: Don't check resource_size() in pci_bus_alloc_resource() s390/PCI: Use generic pci_enable_resources() tile PCI RC: Use default pcibios_enable_device() sparc/PCI: Use default pcibios_enable_device() (Leon only) sh/PCI: Use default pcibios_enable_device() microblaze/PCI: Use default pcibios_enable_device() alpha/PCI: Use default pcibios_enable_device() PCI: Add "weak" generic pcibios_enable_device() implementation PCI: Don't enable decoding if BAR hasn't been assigned an address PCI: Enable INTx in pci_reenable_device() only when MSI/MSI-X not enabled PCI: Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit PCI: Don't try to claim IORESOURCE_UNSET resources PCI: Check IORESOURCE_UNSET before updating BAR PCI: Don't clear IORESOURCE_UNSET when updating BAR PCI: Mark resources as IORESOURCE_UNSET if we can't assign them ... Conflicts: arch/x86/include/asm/topology.h drivers/ata/ahci.c
2014-04-01netlink: don't compare the nul-termination in nla_strcmpPablo Neira
nla_strcmp compares the string length plus one, so it's implicitly including the nul-termination in the comparison. int nla_strcmp(const struct nlattr *nla, const char *str) { int len = strlen(str) + 1; ... d = memcmp(nla_data(nla), str, len); However, if NLA_STRING is used, userspace can send us a string without the nul-termination. This is a problem since the string comparison will not match as the last byte may be not the nul-termination. Fix this by skipping the comparison of the nul-termination if the attribute data is nul-terminated. Suggested by Thomas Graf. Cc: Florian Westphal <fw@strlen.de> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31Merge branch 'x86-asmlinkage-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 LTO changes from Peter Anvin: "More infrastructure work in preparation for link-time optimization (LTO). Most of these changes is to make sure symbols accessed from assembly code are properly marked as visible so the linker doesn't remove them. My understanding is that the changes to support LTO are still not upstream in binutils, but are on the way there. This patchset should conclude the x86-specific changes, and remaining patches to actually enable LTO will be fed through the Kbuild tree (other than keeping up with changes to the x86 code base, of course), although not necessarily in this merge window" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) Kbuild, lto: Handle basic LTO in modpost Kbuild, lto: Disable LTO for asm-offsets.c Kbuild, lto: Add a gcc-ld script to let run gcc as ld Kbuild, lto: add ld-version and ld-ifversion macros Kbuild, lto: Drop .number postfixes in modpost Kbuild, lto, workaround: Don't warn for initcall_reference in modpost lto: Disable LTO for sys_ni lto: Handle LTO common symbols in module loader lto, workaround: Add workaround for initcall reordering lto: Make asmlinkage __visible x86, lto: Disable LTO for the x86 VDSO initconst, x86: Fix initconst mistake in ts5500 code initconst: Fix initconst mistake in dcdbas asmlinkage: Make trace_hardirqs_on/off_caller visible asmlinkage, x86: Fix 32bit memcpy for LTO asmlinkage Make __stack_chk_failed and memcmp visible asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage asmlinkage: Make main_extable_sort_needed visible asmlinkage, mutex: Mark __visible asmlinkage: Make trace_hardirq visible ...
2014-03-31Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "Main changes: - Torture-test changes, including refactoring of rcutorture and introduction of a vestigial locktorture. - Real-time latency fixes. - Documentation updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits) rcu: Provide grace-period piggybacking API rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c notifier: Substitute rcu_access_pointer() for rcu_dereference_raw() Documentation/memory-barriers.txt: Clarify release/acquire ordering rcutorture: Save kvm.sh output to log rcutorture: Add a lock_busted to test the test rcutorture: Place kvm-test-1-run.sh output into res directory rcutorture: Rename TREE_RCU-Kconfig.txt locktorture: Add kvm-recheck.sh plug-in for locktorture rcutorture: Gracefully handle NULL cleanup hooks locktorture: Add vestigial locktorture configuration rcutorture: Introduce "rcu" directory level underneath configs rcutorture: Rename kvm-test-1-rcu.sh rcutorture: Remove RCU dependencies from ver_functions.sh API rcutorture: Create CFcommon file for common Kconfig parameters rcutorture: Create config files for scripted test-the-test testing rcutorture: Add an rcu_busted to test the test locktorture: Add a lock-torture kernel module rcutorture: Abstract kvm-recheck.sh ...
2014-03-28random32: avoid attempt to late reseed if in the middle of seedingSasha Levin
Commit 4af712e8df ("random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized") has added a late reseed stage that happens as soon as the nonblocking pool is marked as initialized. This fails in the case that the nonblocking pool gets initialized during __prandom_reseed()'s call to get_random_bytes(). In that case we'd double back into __prandom_reseed() in an attempt to do a late reseed - deadlocking on 'lock' early on in the boot process. Instead, just avoid even waiting to do a reseed if a reseed is already occuring. Fixes: 4af712e8df99 ("random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-23partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over ↵Helge Deller
built-in ROM fonts STI console is used on parisc and m68k HP machines. This patch partly reverts my previous commit and as such restores the fonts for the m68k machines. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v3.13
2014-03-20audit: Add generic compat syscall supportAKASHI Takahiro
lib/audit.c provides a generic function for auditing system calls. This patch extends it for compat syscall support on bi-architectures (32/64-bit) by adding lib/compat_audit.c. What is required to support this feature are: * add asm/unistd32.h for compat system call names * select CONFIG_AUDIT_ARCH_COMPAT_GENERIC Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-03-04lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlockHugh Dickins
Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of BUG: sleeping function called from invalid context at kernel/fork.c:606 in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff 1 lock held by swapoff/1394: #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6 followed by ================================================ [ BUG: lock held when returning to user space! ] 3.14.0-rc1 #3 Not tainted ------------------------------------------------ swapoff/1394 is leaving the kernel with locks still held! 1 lock held by swapoff/1394: #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6 after which the system recovered nicely. Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch. Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04dma debug: account for cachelines and read-only mappings in overlap trackingDan Williams
While debug_dma_assert_idle() checks if a given *page* is actively undergoing dma the valid granularity of a dma mapping is a *cacheline*. Sander's testing shows that the warning message "DMA-API: exceeded 7 overlapping mappings of pfn..." is falsely triggering. The test is simply mapping multiple cachelines in a given page. Ultimately we want overlap tracking to be valid as it is a real api violation, so we need to track active mappings by cachelines. Update the active dma tracking to use the page-frame-relative cacheline of the mapping as the key, and update debug_dma_assert_idle() to check for all possible mapped cachelines for a given page. However, the need to track active mappings is only relevant when the dma-mapping is writable by the device. In fact it is fairly standard for read-only mappings to have hundreds or thousands of overlapping mappings at once. Limiting the overlap tracking to writable (!DMA_TO_DEVICE) eliminates this class of false-positive overlap reports. Note, the radix gang lookup is sub-optimal. It would be best if it stopped fetching entries once the search passed a page boundary. Nevertheless, this implementation does not perturb the original net_dma failing case. That is to say the extra overhead does not show up in terms of making the failing case pass due to a timing change. References: http://marc.info/?l=linux-netdev&m=139232263419315&w=2 http://marc.info/?l=linux-netdev&m=139217088107122&w=2 Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-26vsprintf: Add support for IORESOURCE_UNSET in %pRBjorn Helgaas
Sometimes we have a struct resource where we know the type (MEM/IO/etc.) and the size, but we haven't assigned address space for it. The IORESOURCE_UNSET flag is a way to indicate this situation. For these "unset" resources, the start address is meaningless, so print only the size, e.g., - pci 0000:0c:00.0: reg 184: [mem 0x00000000-0x00001fff 64bit] + pci 0000:0c:00.0: reg 184: [mem size 0x2000 64bit] For %pr (printing with raw flags), we still print the address range, because %pr is mostly used for debugging anyway. Thanks to Fengguang Wu <fengguang.wu@intel.com> for suggesting resource_size(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-02-23locktorture: Add a lock-torture kernel modulePaul E. McKenney
This commit adds the locking counterpart to rcutorture. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Make n_lock_torture_errors and torture_spinlock static as suggested by Fengguang Wu. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-23rcutorture: Abstract rcu_torture_random()Paul E. McKenney
Because rcu_torture_random() will be used by the locking equivalent to rcutorture, pull it out into its own module. This new module cannot be separately configured, instead, use the Kconfig "select" statement from the Kconfig options of tests depending on it. Suggested-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-02-18Merge 3.14-rc3 into driver-core-nextGreg Kroah-Hartman
We want those fixes here for testing and development. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-17idr: Add new function idr_is_empty()Andreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-14Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block IO fixes from Jens Axboe: "Second round of updates and fixes for 3.14-rc2. Most of this stuff has been queued up for a while. The notable exception is the blk-mq changes, which are naturally a bit more in flux still. The pull request contains: - Two bug fixes for the new immutable vecs, causing crashes with raid or swap. From Kent. - Various blk-mq tweaks and fixes from Christoph. A fix for integrity bio's from Nic. - A few bcache fixes from Kent and Darrick Wong. - xen-blk{front,back} fixes from David Vrabel, Matt Rushton, Nicolas Swenson, and Roger Pau Monne. - Fix for a vec miscount with integrity vectors from Martin. - Minor annotations or fixes from Masanari Iida and Rashika Kheria. - Tweak to null_blk to do more normal FIFO processing of requests from Shlomo Pongratz. - Elevator switching bypass fix from Tejun. - Softlockup in blkdev_issue_discard() fix when !CONFIG_PREEMPT from me" * 'for-linus' of git://git.kernel.dk/linux-block: (31 commits) block: add cond_resched() to potentially long running ioctl discard loop xen-blkback: init persistent_purge_work work_struct blk-mq: pair blk_mq_start_request / blk_mq_requeue_request blk-mq: dont assume rq->errors is set when returning an error from ->queue_rq block: Fix cloning of discard/write same bios block: Fix type mismatch in ssize_t_blk_mq_tag_sysfs_show blk-mq: rework flush sequencing logic null_blk: use blk_complete_request and blk_mq_complete_request virtio_blk: use blk_mq_complete_request blk-mq: rework I/O completions fs: Add prototype declaration to appropriate header file include/linux/bio.h fs: Mark function as static in fs/bio-integrity.c block/null_blk: Fix completion processing from LIFO to FIFO block: Explicitly handle discard/write same segments block: Fix nr_vecs for inline integrity vectors blk-mq: Add bio_integrity setup to blk_mq_make_request blk-mq: initialize sg_reserved_size blk-mq: handle dma_drain_size blk-mq: divert __blk_put_request for MQ ops blk-mq: support at_head inserations for blk_execute_rq ...
2014-02-13asmlinkage Make __stack_chk_failed and memcmp visibleAndi Kleen
In LTO symbols implicitely referenced by the compiler need to be visible. Earlier these symbols were visible implicitely from being exported, but we disabled implicit visibility fo EXPORTs when modules are disabled to improve code size. So now these symbols have to be marked visible explicitely. Do this for __stack_chk_fail (with stack protector) and memcmp. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-10-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-10Merge branch 'master' into driver-core-next-test-merge-rc2Tejun Heo
da9846ae1518 ("kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag") in driver-core-linus conflicts with kernfs_drain() updates in driver-core-next. The former just adds the missing KERNFS_LOCKDEP checks which are already handled by kernfs_lockdep() checks in driver-core-next. The conflict can be resolved by taking code from driver-core-next. Conflicts: fs/kernfs/dir.c
2014-02-08Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Quite a varied little collection of fixes. Most of them are relatively small or isolated; the biggest one is Mel Gorman's fixes for TLB range flushing. A couple of AMD-related fixes (including not crashing when given an invalid microcode image) and fix a crash when compiled with gcov" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode, AMD: Unify valid container checks x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y x86/efi: Allow mapping BGRT on x86-32 x86: Fix the initialization of physnode_map x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable() x86/intel/mid: Fix X86_INTEL_MID dependencies arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge x86: mm: change tlb_flushall_shift for IvyBridge x86/mm: Eliminate redundant page table walk during TLB range flushing x86/mm: Clean up inconsistencies when flushing TLB ranges mm, x86: Account for TLB flushes only when debugging x86/AMD/NB: Fix amd_set_subcaches() parameter type x86/quirks: Add workaround for AMD F16h Erratum792 x86, doc, kconfig: Fix dud URL for Microcode data
2014-02-07sysfs, kobject: add sysfs wrapper for kernfs_enable_ns()Tejun Heo
Currently, kobject is invoking kernfs_enable_ns() directly. This is fine now as sysfs and kernfs are enabled and disabled together. If sysfs is disabled, kernfs_enable_ns() is switched to dummy implementation too and everything is fine; however, kernfs will soon have its own config option CONFIG_KERNFS and !SYSFS && KERNFS will be possible, which can make kobject call into non-dummy kernfs_enable_ns() with NULL kernfs_node pointers leading to an oops. Introduce sysfs_enable_ns() which is a wrapper around kernfs_enable_ns() so that it can be made a noop depending only on CONFIG_SYSFS regardless of the planned CONFIG_KERNFS. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07Merge tag 'efi-urgent' into x86/urgentH. Peter Anvin
* Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-06x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=yPeter Oberparleiter
Commit d61931d89b, "x86: Add optimized popcnt variants" introduced compile flag -fcall-saved-rdi for lib/hweight.c. When combined with options -fprofile-arcs and -O2, this flag causes gcc to generate broken constructor code. As a result, a 64 bit x86 kernel compiled with CONFIG_GCOV_PROFILE_ALL=y prints message "gcov: could not create file" and runs into sproadic BUGs during boot. The gcc people indicate that these kinds of problems are endemic when using ad hoc calling conventions. It is therefore best to treat any file compiled with ad hoc calling conventions as an isolated environment and avoid things like profiling or coverage analysis, since those subsystems assume a "normal" calling conventions. This patch avoids the bug by excluding lib/hweight.o from coverage profiling. Reported-by: Meelis Roos <mroos@linux.ee> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/52F3A30C.7050205@linux.vnet.ibm.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>