summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2017-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
There were quite a few overlapping sets of changes here. Daniel's bug fix for off-by-ones in the new BPF branch instructions, along with the added allowances for "data_end > ptr + x" forms collided with the metadata additions. Along with those three changes came veritifer test cases, which in their final form I tried to group together properly. If I had just trimmed GIT's conflict tags as-is, this would have split up the meta tests unnecessarily. In the socketmap code, a set of preemption disabling changes overlapped with the rename of bpf_compute_data_end() to bpf_compute_data_pointers(). Changes were made to the mv88e6060.c driver set addr method which got removed in net-next. The hyperv transport socket layer had a locking change in 'net' which overlapped with a change of socket state macro usage in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "A little more than usual this time around. Been travelling, so that is part of it. Anyways, here are the highlights: 1) Deal with memcontrol races wrt. listener dismantle, from Eric Dumazet. 2) Handle page allocation failures properly in nfp driver, from Jaku Kicinski. 3) Fix memory leaks in macsec, from Sabrina Dubroca. 4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault. 5) Several fixes in bnxt_en driver, including preventing potential NVRAM parameter corruption from Michael Chan. 6) Fix for KRACK attacks in wireless, from Johannes Berg. 7) rtnetlink event generation fixes from Xin Long. 8) Deadlock in mlxsw driver, from Ido Schimmel. 9) Disallow arithmetic operations on context pointers in bpf, from Jakub Kicinski. 10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from Xin Long. 11) Only TCP is supported for sockmap, make that explicit with a check, from John Fastabend. 12) Fix IP options state races in DCCP and TCP, from Eric Dumazet. 13) Fix panic in packet_getsockopt(), also from Eric Dumazet. 14) Add missing locked in hv_sock layer, from Dexuan Cui. 15) Various aquantia bug fixes, including several statistics handling cures. From Igor Russkikh et al. 16) Fix arithmetic overflow in devmap code, from John Fastabend. 17) Fix busted socket memory accounting when we get a fault in the tcp zero copy paths. From Willem de Bruijn. 18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) stmmac: Don't access tx_q->dirty_tx before netif_tx_lock ipv6: flowlabel: do not leave opt->tot_len with garbage of_mdio: Fix broken PHY IRQ in case of probe deferral textsearch: fix typos in library helpers rxrpc: Don't release call mutex on error pointer net: stmmac: Prevent infinite loop in get_rx_timestamp_status() net: stmmac: Fix stmmac_get_rx_hwtstamp() net: stmmac: Add missing call to dev_kfree_skb() mlxsw: spectrum_router: Configure TIGCR on init mlxsw: reg: Add Tunneling IPinIP General Configuration Register net: ethtool: remove error check for legacy setting transceiver type soreuseport: fix initialization race net: bridge: fix returning of vlan range op errors sock: correct sk_wmem_queued accounting on efault in tcp zerocopy bpf: add test cases to bpf selftests to cover all access tests bpf: fix pattern matches for direct packet access bpf: fix off by one for range markings with L{T, E} patterns bpf: devmap fix arithmetic overflow in bitmap_size calculation net: aquantia: Bad udp rate on default interrupt coalescing net: aquantia: Enable coalescing management via ethtool interface ...
2017-10-22textsearch: fix typos in library helpersRandy Dunlap
Fix spellos (typos) in textsearch library helpers. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19dql: make dql_init return voidStephen Hemminger
dql_init always returned 0, and the only place that uses it in network core code didn't care about the return value anyway. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19Merge commit 'tags/keys-fixes-20171018' into fixes-v4.14-rc5James Morris
2017-10-14Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two lockdep fixes for bugs introduced by the cross-release dependency tracking feature - plus a commit that disables it because performance regressed in an absymal fashion on some systems" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/lockdep: Disable cross-release features for now locking/selftest: Avoid false BUG report locking/lockdep: Fix stacktrace mess
2017-10-14locking/lockdep: Disable cross-release features for nowIngo Molnar
Johan Hovold reported a big lockdep slowdown on his system, caused by lockdep: > I had noticed that the BeagleBone Black boot time appeared to have > increased significantly with 4.14 and yesterday I finally had time to > investigate it. > > Boot time (from "Linux version" to login prompt) had in fact doubled > since 4.13 where it took 17 seconds (with my current config) compared to > the 35 seconds I now see with 4.14-rc4. > > I quick bisect pointed to lockdep and specifically the following commit: > > 28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition of a crosslock") Because the final v4.14 release is close, disable the cross-release lockdep features for now. Bisected-by: Johan Hovold <johan@kernel.org> Debugged-by: Johan Hovold <johan@kernel.org> Reported-by: Johan Hovold <johan@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Lindgren <tony@atomide.com> Cc: kernel-team@lge.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mm@kvack.org Cc: linux-omap@vger.kernel.org Link: http://lkml.kernel.org/r/20171014072659.f2yr6mhm5ha3eou7@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-13lib/Kconfig.debug: kernel hacking menu: runtime testing: keep tests togetherRandy Dunlap
Expand the "Runtime testing" menu by including more entries inside it instead of after it. This is just Kconfig symbol movement. This causes the (arch-independent) Runtime tests to be presented (listed) all in one place instead of in multiple places. Link: http://lkml.kernel.org/r/c194e5c4-2042-bf94-a2d8-7aa13756e257@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-12lib/digsig: fix dereference of NULL user_key_payloadEric Biggers
digsig_verify() requests a user key, then accesses its payload. However, a revoked key has a NULL payload, and we failed to check for this. request_key() *does* skip revoked keys, but there is still a window where the key can be revoked before we acquire its semaphore. Fix it by checking for a NULL payload, treating it like a key which was already revoked at the time it was requested. Fixes: 051dbb918c7f ("crypto: digital signature verification support") Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: <stable@vger.kernel.org> [v3.3+] Cc: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com>
2017-10-10locking/selftest: Avoid false BUG reportPeter Zijlstra
The work-around for the expected failure is providing another failure :/ Only when CONFIG_PROVE_LOCKING=y do we increment unexpected_testcase_failures, so only then do we need to decrement, otherwise we'll end up with a negative number and that will again trigger a BUG (printout, not crash). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d82fed752942 ("locking/lockdep/selftests: Fix mixed read-write ABBA tests") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-09once: switch to new jump label APIEric Biggers
Switch the DO_ONCE() macro from the deprecated jump label API to the new one. The new one is more readable, and for DO_ONCE() it also makes the generated code more icache-friendly: now the one-time initialization code is placed out-of-line at the jump target, rather than at the inline fallthrough case. Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Just simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03lib/ratelimit.c: use deferred printk() versionSergey Senozhatsky
printk_ratelimit() invokes ___ratelimit() which may invoke a normal printk() (pr_warn() in this particular case) to warn about suppressed output. Given that printk_ratelimit() may be called from anywhere, that pr_warn() is dangerous - it may end up deadlocking the system. Fix ___ratelimit() by using deferred printk(). Sasha reported the following lockdep error: : Unregister pv shared memory for cpu 8 : select_fallback_rq: 3 callbacks suppressed : process 8583 (trinity-c78) no longer affine to cpu8 : : ====================================================== : WARNING: possible circular locking dependency detected : 4.14.0-rc2-next-20170927+ #252 Not tainted : ------------------------------------------------------ : migration/8/62 is trying to acquire lock: : (&port_lock_key){-.-.}, at: serial8250_console_write() : : but task is already holding lock: : (&rq->lock){-.-.}, at: sched_cpu_dying() : : which lock already depends on the new lock. : : : the existing dependency chain (in reverse order) is: : : -> #3 (&rq->lock){-.-.}: : __lock_acquire() : lock_acquire() : _raw_spin_lock() : task_fork_fair() : sched_fork() : copy_process.part.31() : _do_fork() : kernel_thread() : rest_init() : start_kernel() : x86_64_start_reservations() : x86_64_start_kernel() : verify_cpu() : : -> #2 (&p->pi_lock){-.-.}: : __lock_acquire() : lock_acquire() : _raw_spin_lock_irqsave() : try_to_wake_up() : default_wake_function() : woken_wake_function() : __wake_up_common() : __wake_up_common_lock() : __wake_up() : tty_wakeup() : tty_port_default_wakeup() : tty_port_tty_wakeup() : uart_write_wakeup() : serial8250_tx_chars() : serial8250_handle_irq.part.25() : serial8250_default_handle_irq() : serial8250_interrupt() : __handle_irq_event_percpu() : handle_irq_event_percpu() : handle_irq_event() : handle_level_irq() : handle_irq() : do_IRQ() : ret_from_intr() : native_safe_halt() : default_idle() : arch_cpu_idle() : default_idle_call() : do_idle() : cpu_startup_entry() : rest_init() : start_kernel() : x86_64_start_reservations() : x86_64_start_kernel() : verify_cpu() : : -> #1 (&tty->write_wait){-.-.}: : __lock_acquire() : lock_acquire() : _raw_spin_lock_irqsave() : __wake_up_common_lock() : __wake_up() : tty_wakeup() : tty_port_default_wakeup() : tty_port_tty_wakeup() : uart_write_wakeup() : serial8250_tx_chars() : serial8250_handle_irq.part.25() : serial8250_default_handle_irq() : serial8250_interrupt() : __handle_irq_event_percpu() : handle_irq_event_percpu() : handle_irq_event() : handle_level_irq() : handle_irq() : do_IRQ() : ret_from_intr() : native_safe_halt() : default_idle() : arch_cpu_idle() : default_idle_call() : do_idle() : cpu_startup_entry() : rest_init() : start_kernel() : x86_64_start_reservations() : x86_64_start_kernel() : verify_cpu() : : -> #0 (&port_lock_key){-.-.}: : check_prev_add() : __lock_acquire() : lock_acquire() : _raw_spin_lock_irqsave() : serial8250_console_write() : univ8250_console_write() : console_unlock() : vprintk_emit() : vprintk_default() : vprintk_func() : printk() : ___ratelimit() : __printk_ratelimit() : select_fallback_rq() : sched_cpu_dying() : cpuhp_invoke_callback() : take_cpu_down() : multi_cpu_stop() : cpu_stopper_thread() : smpboot_thread_fn() : kthread() : ret_from_fork() : : other info that might help us debug this: : : Chain exists of: : &port_lock_key --> &p->pi_lock --> &rq->lock : : Possible unsafe locking scenario: : : CPU0 CPU1 : ---- ---- : lock(&rq->lock); : lock(&p->pi_lock); : lock(&rq->lock); : lock(&port_lock_key); : : *** DEADLOCK *** : : 4 locks held by migration/8/62: : #0: (&p->pi_lock){-.-.}, at: sched_cpu_dying() : #1: (&rq->lock){-.-.}, at: sched_cpu_dying() : #2: (printk_ratelimit_state.lock){....}, at: ___ratelimit() : #3: (console_lock){+.+.}, at: vprintk_emit() : : stack backtrace: : CPU: 8 PID: 62 Comm: migration/8 Not tainted 4.14.0-rc2-next-20170927+ #252 : Call Trace: : dump_stack() : print_circular_bug() : check_prev_add() : ? add_lock_to_list.isra.26() : ? check_usage() : ? kvm_clock_read() : ? kvm_sched_clock_read() : ? sched_clock() : ? check_preemption_disabled() : __lock_acquire() : ? __lock_acquire() : ? add_lock_to_list.isra.26() : ? debug_check_no_locks_freed() : ? memcpy() : lock_acquire() : ? serial8250_console_write() : _raw_spin_lock_irqsave() : ? serial8250_console_write() : serial8250_console_write() : ? serial8250_start_tx() : ? lock_acquire() : ? memcpy() : univ8250_console_write() : console_unlock() : ? __down_trylock_console_sem() : vprintk_emit() : vprintk_default() : vprintk_func() : printk() : ? show_regs_print_info() : ? lock_acquire() : ___ratelimit() : __printk_ratelimit() : select_fallback_rq() : sched_cpu_dying() : ? sched_cpu_starting() : ? rcutree_dying_cpu() : ? sched_cpu_starting() : cpuhp_invoke_callback() : ? cpu_disable_common() : take_cpu_down() : ? trace_hardirqs_off_caller() : ? cpuhp_invoke_callback() : multi_cpu_stop() : ? __this_cpu_preempt_check() : ? cpu_stop_queue_work() : cpu_stopper_thread() : ? cpu_stop_create() : smpboot_thread_fn() : ? sort_range() : ? schedule() : ? __kthread_parkme() : kthread() : ? sort_range() : ? kthread_create_on_node() : ret_from_fork() : process 9121 (trinity-c78) no longer affine to cpu8 : smpboot: CPU 8 is now offline Link: http://lkml.kernel.org/r/20170928120405.18273-1-sergey.senozhatsky@gmail.com Fixes: 6b1d174b0c27b ("ratelimit: extend to print suppressed messages on release") Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Sasha Levin <levinsasha928@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Borislav Petkov <bp@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03lib/idr.c: fix comment for idr_replace()Eric Biggers
idr_replace() returns the old value on success, not 0. Link: http://lkml.kernel.org/r/20170918162642.37511-1-ebiggers3@gmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03lib/lz4: make arrays static const, reduces object code sizeColin Ian King
Don't populate the read-only arrays dec32table and dec64table on the stack, instead make them both static const. Makes the object code smaller by over 10K bytes: Before: text data bss dec hex filename 31500 0 0 31500 7b0c lib/lz4/lz4_decompress.o After: text data bss dec hex filename 20237 176 0 20413 4fbd lib/lz4/lz4_decompress.o (gcc version 7.2.0 x86_64) Link: http://lkml.kernel.org/r/20170921221939.20820-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03Merge tag 'driver-core-4.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are a few small fixes for 4.14-rc4. The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one straggler made it in through some other tree (odds are, one of mine...) So there's a simple removal of the last user, and then finally the macro is removed from the tree. There's a fix for old crazy udev instances that insist on reloading a module when it is removed from the kernel due to the new uevents for bind/unbind. This fixes the reported regression, hopefully some year in the future we can drop the workaround, once users update to the latest version, but I'm not holding my breath. And then there's a build fix for a linker warning, and a buffer overflow fix to match the PCI fixes you took through the PCI tree in the same area. All of these have been in linux-next for a few weeks while I've been traveling, sorry for the delay" * tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: remove DRIVER_ATTR fpga: altera-cvp: remove DRIVER_ATTR() usage driver core: platform: Don't read past the end of "driver_override" buffer base: arch_topology: fix section mismatch build warnings driver core: suppress sending MODALIAS in UNBIND uevents
2017-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-09-23Merge branch 'parisc-4.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert multiple byte-accesses into one word-access. - Add missing HWPOISON page fault handler code. I completely missed that when I added HWPOISON support during this merge window and it only showed up now with the madvise07 LTP test case. - Fix backtrace unwinding to stop when stack start has been reached. - Issue warning if initrd has been loaded into memory regions with broken RAM modules. - Fix HPMC handler (parisc hardware fault handler) to comply with architecture specification. - Avoid compiler warnings about too large frame sizes. - Minor init-section fixes. * 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Unbreak bootloader due to gcc-7 optimizations parisc: Reintroduce option to gzip-compress the kernel parisc: Add HWPOISON page fault handler code parisc: Move init_per_cpu() into init section parisc: Check if initrd was loaded into broken RAM parisc: Add PDCE_CHECK instruction to HPMC handler parisc: Add wrapper for pdc_instr() firmware function parisc: Move start_parisc() into init section parisc: Stop unwinding at start of stack parisc: Fix too large frame size warnings
2017-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix NAPI poll list corruption in enic driver, from Christian Lamparter. 2) Fix route use after free, from Eric Dumazet. 3) Fix regression in reuseaddr handling, from Josef Bacik. 4) Assert the size of control messages in compat handling since we copy it in from userspace twice. From Meng Xu. 5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.) from Ursula Braun. 6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn. 7) Don't use ARRAY_SIZE on spinlock array which might have zero entries, from Geert Uytterhoeven. 8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov Kasiviswanathan. 9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin Long. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits) inet: fix improper empty comparison net: use inet6_rcv_saddr to compare sockets net: set tb->fast_sk_family net: orphan frags on stand-alone ptype in dev_queue_xmit_nit MAINTAINERS: update git tree locations for ieee802154 subsystem net: prevent dst uses after free net: phy: Fix truncation of large IRQ numbers in phy_attached_print() net/smc: no close wait in case of process shut down net/smc: introduce a delay net/smc: terminate link group if out-of-sync is received net/smc: longer delay for client link group removal net/smc: adapt send request completion notification net/smc: adjust net_device refcount net/smc: take RCU read lock for routing cache lookup net/smc: add receive timeout check net/smc: add missing dev_put net: stmmac: Cocci spatch "of_table" lan78xx: Use default values loaded from EEPROM/OTP after reset lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE lan78xx: Fix for eeprom read/write when device auto suspend ...
2017-09-22parisc: Fix too large frame size warningsHelge Deller
The parisc architecture has larger stack frames than most other architectures on 32-bit kernels. Increase the maximum allowed stack frame to 1280 bytes for parisc to avoid warnings in the do_sys_poll() and pat_memconfig() functions. Signed-off-by: Helge Deller <deller@gmx.de>
2017-09-21test_rhashtable: remove initdata annotationFlorian Westphal
kbuild test robot reported a section mismatch warning w. gcc 4.x: WARNING: lib/test_rhashtable.o(.text+0x139e): Section mismatch in reference from the function rhltable_insert.clone.3() to the variable .init.data:rhlt so remove this annotation. Fixes: cdd4de372ea06 ("test_rhashtable: add test case for rhl_table interface") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-20iov_iter: fix page_copy_sane for compound pagesPetar Penkov
Issue is that if the data crosses a page boundary inside a compound page, this check will incorrectly trigger a WARN_ON. To fix this, compute the order using the head of the compound page and adjust the offset to be relative to that head. Fixes: 72e809ed81ed ("iov_iter: sanity checks for copy to/from page primitives") Signed-off-by: Petar Penkov <ppenkov@google.com> CC: Al Viro <viro@zeniv.linux.org.uk> CC: Eric Dumazet <edumazet@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-19kobject: factorize skb setup in kobject_uevent_net_broadcast()Eric Dumazet
We can build one skb and let it be cloned in netlink. This is much faster, and use less memory (all clones will share the same skb->head) Tested: time perf record (for f in `seq 1 3000` ; do ip netns add tast$f; done) [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 4.110 MB perf.data (~179584 samples) ] real 0m24.227s # instead of 0m52.554s user 0m0.329s sys 0m23.753s # instead of 0m51.375s 14.77% ip [kernel.kallsyms] [k] __ip6addrlbl_add 14.56% ip [kernel.kallsyms] [k] netlink_broadcast_filtered 11.65% ip [kernel.kallsyms] [k] netlink_has_listeners 6.19% ip [kernel.kallsyms] [k] _raw_spin_lock_irqsave 5.66% ip [kernel.kallsyms] [k] kobject_uevent_env 4.97% ip [kernel.kallsyms] [k] memset_erms 4.67% ip [kernel.kallsyms] [k] refcount_sub_and_test 4.41% ip [kernel.kallsyms] [k] _raw_read_lock 3.59% ip [kernel.kallsyms] [k] refcount_inc_not_zero 3.13% ip [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.55% ip [kernel.kallsyms] [k] __wake_up 1.20% ip [kernel.kallsyms] [k] strlen 1.03% ip [kernel.kallsyms] [k] __wake_up_common 0.93% ip [kernel.kallsyms] [k] consume_skb 0.92% ip [kernel.kallsyms] [k] netlink_trim 0.87% ip [kernel.kallsyms] [k] insert_header 0.63% ip [kernel.kallsyms] [k] unmap_page_range Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19kobject: copy env blob in one goEric Dumazet
No need to iterate over strings, just copy in one efficient memcpy() call. Tested: time perf record "(for f in `seq 1 3000` ; do ip netns add tast$f; done)" [ perf record: Woken up 10 times to write data ] [ perf record: Captured and wrote 8.224 MB perf.data (~359301 samples) ] real 0m52.554s # instead of 1m7.492s user 0m0.309s sys 0m51.375s # instead of 1m6.875s 9.88% ip [kernel.kallsyms] [k] netlink_broadcast_filtered 8.86% ip [kernel.kallsyms] [k] string 7.37% ip [kernel.kallsyms] [k] __ip6addrlbl_add 5.68% ip [kernel.kallsyms] [k] netlink_has_listeners 5.52% ip [kernel.kallsyms] [k] memcpy_erms 4.76% ip [kernel.kallsyms] [k] __alloc_skb 4.54% ip [kernel.kallsyms] [k] vsnprintf 3.94% ip [kernel.kallsyms] [k] format_decode 3.80% ip [kernel.kallsyms] [k] kmem_cache_alloc_node_trace 3.71% ip [kernel.kallsyms] [k] kmem_cache_alloc_node 3.66% ip [kernel.kallsyms] [k] kobject_uevent_env 3.38% ip [kernel.kallsyms] [k] strlen 2.65% ip [kernel.kallsyms] [k] _raw_spin_lock_irqsave 2.20% ip [kernel.kallsyms] [k] kfree 2.09% ip [kernel.kallsyms] [k] memset_erms 2.07% ip [kernel.kallsyms] [k] ___cache_free 1.95% ip [kernel.kallsyms] [k] kmem_cache_free 1.91% ip [kernel.kallsyms] [k] _raw_read_lock 1.45% ip [kernel.kallsyms] [k] ksize 1.25% ip [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.00% ip [kernel.kallsyms] [k] widen_string Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19kobject: add kobject_uevent_net_broadcast()Eric Dumazet
This removes some #ifdef pollution and will ease follow up patches. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19test_rhashtable: add test case for rhl_table interfaceFlorian Westphal
also test rhltable. rhltable remove operations are slow as deletions require a list walk, thus test with 1/16th of the given entry count number to get a run duration similar to rhashtable one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19test_rhashtable: add a check for max_sizeFlorian Westphal
add a test that tries to insert more than max_size elements. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19test_rhashtable: don't use global entries variableFlorian Westphal
pass the entries to test as an argument instead. Followup patch will add an rhlist test case; rhlist delete opererations are slow so we need to use a smaller number to test it. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19test_rhashtable: don't allocate huge static arrayFlorian Westphal
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-19rhashtable: Documentation tweakAndreas Gruenbacher
Clarify that rhashtable_walk_{stop,start} will not reset the iterator to the beginning of the hash table. Confusion between rhashtable_walk_enter and rhashtable_walk_start has already lead to a bug. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-18driver core: suppress sending MODALIAS in UNBIND ueventsDmitry Torokhov
The current udev rules cause modules to be loaded on all device events save for "remove". With the introduction of KOBJ_BIND/KOBJ_UNBIND this causes issues, as driver modules that have devices bound to their drivers get immediately reloaded, and it appears to the user that module unloading doe snot work. The standard udev matching rule is foillowing: ENV{MODALIAS}=="?*", RUN{builtin}+="kmod load $env{MODALIAS}" Given that MODALIAS data is not terribly useful for UNBIND event, let's zap it from the generated uevent environment until we get userspace updated with the correct udev rule that only loads modules on "add" event. Reported-by: Jakub Kicinski <kubakici@wp.pl> Tested-by: Jakub Kicinski <kubakici@wp.pl> Fixes: 1455cf8dbfd0 ("driver core: emit uevents when device is bound ...") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-14Merge branch 'zstd-minimal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull zstd support from Chris Mason: "Nick Terrell's patch series to add zstd support to the kernel has been floating around for a while. After talking with Dave Sterba, Herbert and Phillip, we decided to send the whole thing in as one pull request. zstd is a big win in speed over zlib and in compression ratio over lzo, and the compression team here at FB has gotten great results using it in production. Nick will continue to update the kernel side with new improvements from the open source zstd userland code. Nick has a number of benchmarks for the main zstd code in his lib/zstd commit: I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is 211,988,480 B large. Run the following commands for the benchmark: sudo modprobe zstd_compress_test sudo mknod zstd_compress_test c 245 0 sudo cp silesia.tar zstd_compress_test The time is reported by the time of the userland `cp`. The MB/s is computed with 1,536,217,008 B / time(buffer size, hash) which includes the time to copy from userland. The Adjusted MB/s is computed with 1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)). The memory reported is the amount of memory the compressor requests. | Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) | |----------|----------|----------|-------|---------|----------|----------| | none | 11988480 | 0.100 | 1 | 2119.88 | - | - | | zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 | | zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 | | zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 | | zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 | | zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 | | zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 | | zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 | | zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 | | zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 | | zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 | I benchmarked zstd decompression using the same method on the same machine. The benchmark file is located in the upstream zstd repo under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is the amount of memory required to decompress data compressed with the given compression level. If you know the maximum size of your input, you can reduce the memory usage of decompression irrespective of the compression level. | Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) | |----------|----------|---------|---------------|-------------| | none | 0.025 | 8479.54 | - | - | | zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 | | zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 | | zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 | | zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 | | zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 | | zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 | | zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 | | zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 | | zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 | | zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 | I ran a long series of tests and benchmarks on the btrfs side and the gains are very similar to the core benchmarks Nick ran" * 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: squashfs: Add zstd support btrfs: Add zstd support lib: Add zstd modules lib: Add xxhash module
2017-09-13mm: treewide: remove GFP_TEMPORARY allocation flagMichal Hocko
GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing caller provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect shorterm allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13lib/test_bitmap.c: use ULL suffix for 64-bit constantsGeert Uytterhoeven
With gcc 4.1.2: lib/test_bitmap.c:189: warning: integer constant is too large for `long' type lib/test_bitmap.c:190: warning: integer constant is too large for `long' type lib/test_bitmap.c:194: warning: integer constant is too large for `long' type lib/test_bitmap.c:195: warning: integer constant is too large for `long' type Add the missing "ULL" suffix to fix this. Link: http://lkml.kernel.org/r/1505040523-31230-1-git-send-email-geert@linux-m68k.org Fixes: 60ef690018b262dd ("bitmap: introduce BITMAP_FROM_U64()") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13idr: remove WARN_ON_ONCE() when trying to replace negative IDEric Biggers
IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id < 0)' in idr_replace(), but it was intentionally removed by commit 2e1c9b286765 ("idr: remove WARN_ON_ONCE() on negative IDs"). Then it was added back by commit 0a835c4f090a ("Reimplement IDR and IDA using the radix tree"). However it seems that adding it back was a mistake, given that some users such as drm_gem_handle_delete() (DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(), allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete() actually just wants idr_replace() to return an error code if the ID is not allocated, including in the case where the ID is invalid (negative). So once again remove the bogus WARN_ON_ONCE(). This bug was found by syzkaller, which encountered the following warning: WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157 Kernel panic - not syncing: panic_on_warn set ... CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930 RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157 RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297 RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78 RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000 R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8 drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297 drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671 drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729 drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe Here is a C reproducer: #include <fcntl.h> #include <stddef.h> #include <stdint.h> #include <sys/ioctl.h> #include <drm/drm.h> int main(void) { int cardfd = open("/dev/dri/card0", O_RDONLY); ioctl(cardfd, DRM_IOCTL_GEM_CLOSE, &(struct drm_gem_close) { .handle = -1 } ); } Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [v4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-11Merge tag 'libnvdimm-for-4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm from Dan Williams: "A rework of media error handling in the BTT driver and other updates. It has appeared in a few -next releases and collected some late- breaking build-error and warning fixups as a result. Summary: - Media error handling support in the Block Translation Table (BTT) driver is reworked to address sleeping-while-atomic locking and memory-allocation-context conflicts. - The dax_device lookup overhead for xfs and ext4 is moved out of the iomap hot-path to a mount-time lookup. - A new 'ecc_unit_size' sysfs attribute is added to advertise the read-modify-write boundary property of a persistent memory range. - Preparatory fix-ups for arm and powerpc pmem support are included along with other miscellaneous fixes" * tag 'libnvdimm-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (26 commits) libnvdimm, btt: fix format string warnings libnvdimm, btt: clean up warning and error messages ext4: fix null pointer dereference on sbi libnvdimm, nfit: move the check on nd_reserved2 to the endpoint dax: fix FS_DAX=n BLOCK=y compilation libnvdimm: fix integer overflow static analysis warning libnvdimm, nd_blk: remove mmio_flush_range() libnvdimm, btt: rework error clearing libnvdimm: fix potential deadlock while clearing errors libnvdimm, btt: cache sector_size in arena_info libnvdimm, btt: ensure that flags were also unchanged during a map_read libnvdimm, btt: refactor map entry operations with macros libnvdimm, btt: fix a missed NVDIMM_IO_ATOMIC case in the write path libnvdimm, nfit: export an 'ecc_unit_size' sysfs attribute ext4: perform dax_device lookup at mount ext2: perform dax_device lookup at mount xfs: perform dax_device lookup at mount dax: introduce a fs_dax_get_by_bdev() helper libnvdimm, btt: check memory allocation failure libnvdimm, label: fix index block size calculation ...
2017-09-08cpumask: make cpumask_next() out-of-lineAlexey Dobriyan
Every for_each_XXX_cpu() invocation calls cpumask_next() which is an inline function: static inline unsigned int cpumask_next(int n, const struct cpumask *srcp) { /* -1 is a legal arg here. */ if (n != -1) cpumask_check(n); return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1); } However! find_next_bit() is regular out-of-line function which means "nr_cpu_ids" load and increment happen at the caller resulting in a lot of bloat x86_64 defconfig: add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513) x86_64 allyesconfig-ish: add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!! Some archs redefine find_next_bit() but it is OK: m68k inline but SMP is not supported arm out-of-line unicore32 out-of-line Function call will happen anyway, so move load and increment into callee. Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08test_kmod: flip INT checks to be consistentDan Carpenter
Most checks will check for min and then max, except the int check. Flip the checks to be consistent with the other code. [mcgrof@kernel.org: massaged commit log] Link: http://lkml.kernel.org/r/20170802211707.28020-3-mcgrof@kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michal Marek <mmarek@suse.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: David Binderman <dcb314@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08test_kmod: remove paranoid UINT_MAX check on uint range processingDan Carpenter
The UINT_MAX comparison is not needed because "max" is already an unsigned int, and we expect developer C code max value input to have a sensible 0 - UINT_MAX range. Note that if it so happens to be UINT_MAX + 1 it would lead to an issue, but we expect the developer to know this. [mcgrof@kernel.org: massaged commit log] Link: http://lkml.kernel.org/r/20170802211707.28020-2-mcgrof@kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michal Marek <mmarek@suse.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: David Binderman <dcb314@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/oid_registry.c: X.509: fix the buffer overflow in the utility function ↵Takashi Iwai
for OID string The sprint_oid() utility function doesn't properly check the buffer size that it causes that the warning in vsnprintf() be triggered. For example on v4.1 kernel: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2357 at lib/vsprintf.c:1867 vsnprintf+0x5a7/0x5c0() ... We can trigger this issue by injecting maliciously crafted x509 cert in DER format. Just using hex editor to change the length of OID to over the length of the SEQUENCE container. For example: 0:d=0 hl=4 l= 980 cons: SEQUENCE 4:d=1 hl=4 l= 700 cons: SEQUENCE 8:d=2 hl=2 l= 3 cons: cont [ 0 ] 10:d=3 hl=2 l= 1 prim: INTEGER :02 13:d=2 hl=2 l= 9 prim: INTEGER :9B47FAF791E7D1E3 24:d=2 hl=2 l= 13 cons: SEQUENCE 26:d=3 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption 37:d=3 hl=2 l= 0 prim: NULL 39:d=2 hl=2 l= 121 cons: SEQUENCE 41:d=3 hl=2 l= 22 cons: SET 43:d=4 hl=2 l= 20 cons: SEQUENCE <=== the SEQ length is 20 45:d=5 hl=2 l= 3 prim: OBJECT :organizationName <=== the original length is 3, change the length of OID to over the length of SEQUENCE Pawel Wieczorkiewicz reported this problem and Takashi Iwai provided patch to fix it by checking the bufsize in sprint_oid(). Link: http://lkml.kernel.org/r/20170903021646.2080-1-jlee@suse.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> Reported-by: Pawel Wieczorkiewicz <pwieczorkiewicz@suse.com> Cc: David Howells <dhowells@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Pawel Wieczorkiewicz <pwieczorkiewicz@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08radix-tree: must check __radix_tree_preload() return valueEric Dumazet
__radix_tree_preload() only disables preemption if no error is returned. So we really need to make sure callers always check the return value. idr_preload() contract is to always disable preemption, so we need to add a missing preempt_disable() if an error happened. Similarly, ida_pre_get() only needs to call preempt_enable() in the case no error happened. Link: http://lkml.kernel.org/r/1504637190.15310.62.camel@edumazet-glaptop3.roam.corp.google.com Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Fixes: 7ad3d4d85c7a ("ida: Move ida_bitmap to a percpu variable") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/cmdline.c: remove meaningless commentBaoquan He
One line of code was commented out by c++ style comment for debugging, but forgot removing it. Clean it up. Link: http://lkml.kernel.org/r/1503312113-11843-1-git-send-email-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/string.c: check for kmalloc() failureDan Carpenter
This is mostly to keep the number of static checker warnings down so we can spot new bugs instead of them being drowned in noise. This function doesn't return normal kernel error codes but instead the return value is used to display exactly which memory failed. I chose -1 as hopefully that's a helpful thing to print. Link: http://lkml.kernel.org/r/20170817115420.uikisjvfmtrqkzjn@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Kees Cook <keescook@chromium.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08bitmap: introduce BITMAP_FROM_U64()Yury Norov
The macro is the compile-time analogue of bitmap_from_u64() with the same purpose: convert the 64-bit number to the properly ordered pair of 32-bit parts, suitable for filling the bitmap in 32-bit BE environment. Use it to make test_bitmap_parselist() correct for 32-bit BE ABIs. Tested on BE mips/qemu. [akpm@linux-foundation.org: tweak code comment] Link: http://lkml.kernel.org/r/20170810172916.24144-1-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Cc: Noam Camus <noamca@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/test_bitmap.c: add test for bitmap_parselist()Yury Norov
Do some basic checks for bitmap_parselist(). [akpm@linux-foundation.org: fix printk warning] Link: http://lkml.kernel.org/r/20170807225438.16161-2-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Cc: Noam Camus <noamca@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/bitmap.c: make bitmap_parselist() thread-safe and much fasterYury Norov
Current implementation of bitmap_parselist() uses a static variable to save local state while setting bits in the bitmap. It is obviously wrong if we assume execution in multiprocessor environment. Fortunately, it's possible to rewrite this portion of code to avoid using the static variable. It is also possible to set bits in the mask per-range with bitmap_set(), not per-bit, as it is implemented now, with set_bit(); which is way faster. The important side effect of this change is that setting bits in this function from now is not per-bit atomic and less memory-ordered. This is because set_bit() guarantees the order of memory accesses, while bitmap_set() does not. I think that it is the advantage of the new approach, because the bitmap_parselist() is intended to initialise bit arrays, and user should protect the whole bitmap during initialisation if needed. So protecting individual bits looks expensive and useless. Also, other range-oriented functions in lib/bitmap.c don't worry much about atomicity. With all that, setting 2k bits in map with the pattern like 0-2047:128/256 becomes ~50 times faster after applying the patch in my testing environment (arm64 hosted on qemu). The second patch of the series adds the test for bitmap_parselist(). It's not intended to cover all tricky cases, just to make sure that I didn't screw up during rework. Link: http://lkml.kernel.org/r/20170807225438.16161-1-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Cc: Noam Camus <noamca@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib: add test module for CONFIG_DEBUG_VIRTUALFlorian Fainelli
Add a test module that allows testing that CONFIG_DEBUG_VIRTUAL works correctly, at least that it can catch invalid calls to virt_to_phys() against the non-linear kernel virtual address map. Link: http://lkml.kernel.org/r/20170808164035.26725-1-f.fainelli@gmail.com Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/hexdump.c: return -EINVAL in case of error in hex2bin()Andy Shevchenko
In some cases caller would like to use error code directly without shadowing. -EINVAL feels a rightful code to return in case of error in hex2bin(). Link: http://lkml.kernel.org/r/20170731135510.68023-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/interval_tree: fast overlap detectionDavidlohr Bueso
Allow interval trees to quickly check for overlaps to avoid unnecesary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Doug Ledford <dledford@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Jason Wang <jasowang@redhat.com> Cc: Christian Benvenuti <benve@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08lib/rbtree_test.c: support rb_root_cachedDavidlohr Bueso
We can work with a single rb_root_cached root to test both cached and non-cached rbtrees. In addition, also add a test to measure latencies between rb_first and its fast counterpart. Link: http://lkml.kernel.org/r/20170719014603.19029-7-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>