summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2018-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Support offloading wireless authentication to userspace via NL80211_CMD_EXTERNAL_AUTH, from Srinivas Dasari. 2) A lot of work on network namespace setup/teardown from Kirill Tkhai. Setup and cleanup of namespaces now all run asynchronously and thus performance is significantly increased. 3) Add rx/tx timestamping support to mv88e6xxx driver, from Brandon Streiff. 4) Support zerocopy on RDS sockets, from Sowmini Varadhan. 5) Use denser instruction encoding in x86 eBPF JIT, from Daniel Borkmann. 6) Support hw offload of vlan filtering in mvpp2 dreiver, from Maxime Chevallier. 7) Support grafting of child qdiscs in mlxsw driver, from Nogah Frankel. 8) Add packet forwarding tests to selftests, from Ido Schimmel. 9) Deal with sub-optimal GSO packets better in BBR congestion control, from Eric Dumazet. 10) Support 5-tuple hashing in ipv6 multipath routing, from David Ahern. 11) Add path MTU tests to selftests, from Stefano Brivio. 12) Various bits of IPSEC offloading support for mlx5, from Aviad Yehezkel, Yossi Kuperman, and Saeed Mahameed. 13) Support RSS spreading on ntuple filters in SFC driver, from Edward Cree. 14) Lots of sockmap work from John Fastabend. Applications can use eBPF to filter sendmsg and sendpage operations. 15) In-kernel receive TLS support, from Dave Watson. 16) Add XDP support to ixgbevf, this is significant because it should allow optimized XDP usage in various cloud environments. From Tony Nguyen. 17) Add new Intel E800 series "ice" ethernet driver, from Anirudh Venkataramanan et al. 18) IP fragmentation match offload support in nfp driver, from Pieter Jansen van Vuuren. 19) Support XDP redirect in i40e driver, from Björn Töpel. 20) Add BPF_RAW_TRACEPOINT program type for accessing the arguments of tracepoints in their raw form, from Alexei Starovoitov. 21) Lots of striding RQ improvements to mlx5 driver with many performance improvements, from Tariq Toukan. 22) Use rhashtable for inet frag reassembly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1678 commits) net: mvneta: improve suspend/resume net: mvneta: split rxq/txq init and txq deinit into SW and HW parts ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh net: bgmac: Fix endian access in bgmac_dma_tx_ring_free() net: bgmac: Correctly annotate register space route: check sysctl_fib_multipath_use_neigh earlier than hash fix typo in command value in drivers/net/phy/mdio-bitbang. sky2: Increase D3 delay to sky2 stops working after suspend net/mlx5e: Set EQE based as default TX interrupt moderation mode ibmvnic: Disable irqs before exiting reset from closed state net: sched: do not emit messages while holding spinlock vlan: also check phy_driver ts_info for vlan's real device Bluetooth: Mark expected switch fall-throughs Bluetooth: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for BTUSB_QCA_ROME Bluetooth: btrsi: remove unused including <linux/version.h> Bluetooth: hci_bcm: Remove DMI quirk for the MINIX Z83-4 sh_eth: kill useless check in __sh_eth_get_regs() sh_eth: add sh_eth_cpu_data::no_xdfar flag ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data() ipv4: factorize sk_wmem_alloc updates done by __ip_append_data() ...
2018-04-02Merge tag 'arch-removal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02Merge branch 'x86-dma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dma mapping updates from Ingo Molnar: "This tree, by Christoph Hellwig, switches over the x86 architecture to the generic dma-direct and swiotlb code, and also unifies more of the dma-direct code between architectures. The now unused x86-only primitives are removed" * 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS dma/swiotlb: Remove swiotlb_{alloc,free}_coherent() dma/direct: Handle force decryption for DMA coherent buffers in common code dma/direct: Handle the memory encryption bit in common code dma/swiotlb: Remove swiotlb_set_mem_attributes() set_memory.h: Provide set_memory_{en,de}crypted() stubs x86/dma: Remove dma_alloc_coherent_gfp_flags() iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() x86/dma/amd_gart: Use dma_direct_{alloc,free}() x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA x86/dma: Use generic swiotlb_ops x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) x86/dma: Remove dma_alloc_coherent_mask()
2018-04-02Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in the locking subsystem in this cycle were: - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem, which is an an array of tools in tools/memory-model/ that formally describe the Linux memory coherency model (a.k.a. Documentation/memory-barriers.txt), and also produce 'litmus tests' in form of kernel code which can be directly executed and tested. Here's a high level background article about an earlier version of this work on LWN.net: https://lwn.net/Articles/718628/ The design principles: "There is reason to believe that Documentation/memory-barriers.txt could use some help, and a major purpose of this patch is to provide that help in the form of a design-time tool that can produce all valid executions of a small fragment of concurrent Linux-kernel code, which is called a "litmus test". This tool's functionality is roughly similar to a full state-space search. Please note that this is a design-time tool, not useful for regression testing. However, we hope that the underlying Linux-kernel memory model will be incorporated into other tools capable of analyzing large bodies of code for regression-testing purposes." [...] "A second tool is klitmus7, which converts litmus tests to loadable kernel modules for direct testing. As with herd7, the klitmus7 code is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7)" [...] Credits go to: "This patch was the result of a most excellent collaboration founded by Jade Alglave and also including Alan Stern, Andrea Parri, and Luc Maranget." ... and to the gents listed in the MAINTAINERS entry: LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM) M: Alan Stern <stern@rowland.harvard.edu> M: Andrea Parri <parri.andrea@gmail.com> M: Will Deacon <will.deacon@arm.com> M: Peter Zijlstra <peterz@infradead.org> M: Boqun Feng <boqun.feng@gmail.com> M: Nicholas Piggin <npiggin@gmail.com> M: David Howells <dhowells@redhat.com> M: Jade Alglave <j.alglave@ucl.ac.uk> M: Luc Maranget <luc.maranget@inria.fr> M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> The LKMM project already found several bugs in Linux locking primitives and improved the understanding and the documentation of the Linux memory model all around. - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov) - Add RWSEM API debugging and reorganize the lock debugging Kconfig (Waiman Long) - ... misc cleanups and other smaller changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) locking/Kconfig: Restructure the lock debugging menu locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches lockdep: Make the lock debug output more useful locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() locking/atomic, asm-generic, x86: Add comments for atomic instrumentation locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference() tools/memory-model: Add documentation of new litmus test tools/memory-model: Remove mention of docker/gentoo image locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more locking/lockdep: Show unadorned pointers mutex: Drop linkage.h from mutex.h tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference tools/memory-model: Convert underscores to hyphens tools/memory-model: Add a S lock-based external-view litmus test tools/memory-model: Add required herd7 version to README file ...
2018-04-02Merge branch 'core-debugobjects-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects updates from Ingo Molnar: "Misc improvements: - add better instrumentation/debugging - optimize the freeing logic improve performance" * 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: debugobjects: Avoid another unused variable warning debugobjects: Fix debug_objects_freed accounting debugobjects: Use global free list in __debug_check_no_obj_freed() debugobjects: Use global free list in free_object() debugobjects: Add global free list and the counter debugobjects: Export max loops counter
2018-03-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-31 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add raw BPF tracepoint API in order to have a BPF program type that can access kernel internal arguments of the tracepoints in their raw form similar to kprobes based BPF programs. This infrastructure also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which returns an anon-inode backed fd for the tracepoint object that allows for automatic detach of the BPF program resp. unregistering of the tracepoint probe on fd release, from Alexei. 2) Add new BPF cgroup hooks at bind() and connect() entry in order to allow BPF programs to reject, inspect or modify user space passed struct sockaddr, and as well a hook at post bind time once the port has been allocated. They are used in FB's container management engine for implementing policy, replacing fragile LD_PRELOAD wrapper intercepting bind() and connect() calls that only works in limited scenarios like glibc based apps but not for other runtimes in containerized applications, from Andrey. 3) BPF_F_INGRESS flag support has been added to sockmap programs for their redirect helper call bringing it in line with cls_bpf based programs. Support is added for both variants of sockmap programs, meaning for tx ULP hooks as well as recv skb hooks, from John. 4) Various improvements on BPF side for the nfp driver, besides others this work adds BPF map update and delete helper call support from the datapath, JITing of 32 and 64 bit XADD instructions as well as offload support of bpf_get_prandom_u32() call. Initial implementation of nfp packet cache has been tackled that optimizes memory access (see merge commit for further details), from Jakub and Jiong. 5) Removal of struct bpf_verifier_env argument from the print_bpf_insn() API has been done in order to prepare to use print_bpf_insn() soon out of perf tool directly. This makes the print_bpf_insn() API more generic and pushes the env into private data. bpftool is adjusted as well with the print_bpf_insn() argument removal, from Jiri. 6) Couple of cleanups and prep work for the upcoming BTF (BPF Type Format). The latter will reuse the current BPF verifier log as well, thus bpf_verifier_log() is further generalized, from Martin. 7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read and write support has been added in similar fashion to existing IPv6 IPV6_TCLASS socket option we already have, from Nikita. 8) Fixes in recent sockmap scatterlist API usage, which did not use sg_init_table() for initialization thus triggering a BUG_ON() in scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and uses a small helper sg_init_marker() to properly handle the affected cases, from Prashant. 9) Let the BPF core follow IDR code convention and therefore use the idr_preload() and idr_preload_end() helpers, which would also help idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory pressure, from Shaohua. 10) Last but not least, a spelling fix in an error message for the BPF cookie UID helper under BPF sample code, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31rhashtable: add schedule pointsEric Dumazet
Rehashing and destroying large hash table takes a lot of time, and happens in process context. It is safe to add cond_resched() in rhashtable_rehash_table() and rhashtable_free_and_destroy() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31locking/Kconfig: Restructure the lock debugging menuWaiman Long
Two config options in the lock debugging menu that are probably the most frequently used, as far as I am concerned, is the PROVE_LOCKING and LOCK_STAT. From a UI perspective, they should be front and center. So these two options are now moved to the top of the lock debugging menu. The DEBUG_WW_MUTEX_SLOWPATH option is also added to the PROVE_LOCKING umbrella. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-4-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readableWaiman Long
There are a couples of lock debugging Kconfig options that depends on the following support options: - TRACE_IRQFLAGS_SUPPORT - STACKTRACE_SUPPORT - LOCKDEP_SUPPORT That makes those lock debugging options harder to read and understand. So a new LOCK_DEBUGGING_SUPPORT option is added that is equivalent to the above three options together. That makes the Kconfig.debug file more readable. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-3-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatchesWaiman Long
For a rwsem, locking can either be exclusive or shared. The corresponding exclusive or shared unlock must be used. Otherwise, the protected data structures may get corrupted or the lock may be in an inconsistent state. In order to detect such anomaly, a new configuration option DEBUG_RWSEMS is added which can be enabled to look for such mismatches and print warnings that that happens. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-30lib/scatterlist: add sg_init_marker() helperPrashant Bhole
sg_init_marker initializes sg_magic in the sg table and calls sg_mark_end() on the last entry of the table. This can be useful to avoid memset in sg_init_table() when scatterlist is already zeroed out For example: when scatterlist is embedded inside other struct and that container struct is zeroed out Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-29test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()Dan Carpenter
The skb_segment() function returns error pointers on error. It never returns NULL. Fixes: 76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-28dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrsChristoph Hellwig
Revert the clearing of __GFP_ZERO in dma_alloc_attrs and move it to dma_direct_alloc for now. While most common architectures always zero dma cohereny allocations (and x86 did so since day one) this is not documented and at least arc and s390 do not zero without the explicit __GFP_ZERO argument. Fixes: 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in common code") Reported-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com> Cc: iommu@lists.linux-foundation.org Link: https://lkml.kernel.org/r/20180328133535.17302-2-hch@lst.de
2018-03-28Merge 4.16-rc7 into staging-nextGreg Kroah-Hartman
We want the IIO and staging driver fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28Merge 4.16-rc7 into char-misc-nextGreg Kroah-Hartman
We want the hyperv fix in here for merging and testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-27net: Drop pernet_operations::asyncKirill Tkhai
Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27kfifo: fix inaccurate commentMartin Kelly
The comment in __kfifo_alloc says we round down, but we actually round up, so correct it. Signed-off-by: Martin Kelly <mkelly@xevo.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-26raid: remove tile specific raid6 implementationArnd Bergmann
The Tile architecture is getting removed, so we no longer need this either. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26treewide: simplify Kconfig dependencies for removed archsArnd Bergmann
A lot of Kconfig symbols have architecture specific dependencies. In those cases that depend on architectures we have already removed, they can be omitted. Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26libcrc32c: Add crc32c_impl functionNikolay Borisov
This function returns a string with the currently in-use implementation of the crc32c algorithm, i.e crc32c-generic (for unoptimised, generic implementation) or crc32c-intel for the sse optimised version. This will be used by btrfs. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [ use crypto_shash_driver_name as suggested by Herbert ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26treewide: Align function definition open/close bracesJoe Perches
Some functions definitions have either the initial open brace and/or the closing brace outside of column 1. Move those braces to column 1. This allows various function analyzers like gnu complexity to work properly for these modified functions. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-25net: bpf: add a test for skb_segment in test_bpf moduleYonghong Song
Without the previous commit, "modprobe test_bpf" will have the following errors: ... [ 98.149165] ------------[ cut here ]------------ [ 98.159362] kernel BUG at net/core/skbuff.c:3667! [ 98.169756] invalid opcode: 0000 [#1] SMP PTI [ 98.179370] Modules linked in: [ 98.179371] test_bpf(+) ... which triggers the bug the previous commit intends to fix. The skbs are constructed to mimic what mlx5 may generate. The packet size/header may not mimic real cases in production. But the processing flow is similar. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26lib: zstd: clean up Makefile for simpler composite object handlingMasahiro Yamada
Now, Kbuild nicely handles composite objects to avoid multiple definition. Makefiles can simply add the same objects multiple times across composite objects. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26kbuild: rename built-in.o to built-in.aNicholas Piggin
Incremental linking is gone, so rename built-in.o to built-in.a, which is the usual extension for archive files. This patch does two things, first is a simple search/replace: git grep -l 'built-in\.o' | xargs sed -i 's/built-in\.o/built-in\.a/g' The second is to invert nesting of nested text manipulations to avoid filtering built-in.a out from libs-y2: -libs-y2 := $(filter-out %.a, $(patsubst %/, %/built-in.a, $(libs-y))) +libs-y2 := $(patsubst %/, %/built-in.a, $(filter-out %.a, $(libs-y))) Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-24Merge branch 'linus' into x86/dma, to resolve a conflict with upstreamIngo Molnar
Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-23swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPSChristoph Hellwig
Otherwise this causes unused symbol warnings for configs that build swiotlb.c only for use by xen-swiotlb.c and that don't otherwise select CONFIG_DMA_DIRECT_OPS, which is possible on arm. Fixes: 16e73adbca76 ("dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: iommu@lists.linux-foundation.org Cc: konrad.wilk@oracle.com Link: https://lkml.kernel.org/r/20180323174930.17767-1-hch@lst.de
2018-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, thp: do not cause memcg oom for thp mm/vmscan: wake up flushers for legacy cgroups too Revert "mm: page_alloc: skip over regions of invalid pfns where possible" mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() mm/thp: do not wait for lock_page() in deferred_split_scan() mm/khugepaged.c: convert VM_BUG_ON() to collapse fail x86/mm: implement free pmd/pte page interfaces mm/vmalloc: add interfaces to free unmapped page table h8300: remove extraneous __BIG_ENDIAN definition hugetlbfs: check for pgoff value overflow lockdep: fix fs_reclaim warning MAINTAINERS: update Mark Fasheh's e-mail mm/mempolicy.c: avoid use uninitialized preferred_node
2018-03-22mm/vmalloc: add interfaces to free unmapped page tableToshi Kani
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may create pud/pmd mappings. A kernel panic was observed on arm64 systems with Cortex-A75 in the following steps as described by Hanjun Guo. 1. ioremap a 4K size, valid page table will build, 2. iounmap it, pte0 will set to 0; 3. ioremap the same address with 2M size, pgd/pmd is unchanged, then set the a new value for pmd; 4. pte0 is leaked; 5. CPU may meet exception because the old pmd is still in TLB, which will lead to kernel panic. This panic is not reproducible on x86. INVLPG, called from iounmap, purges all levels of entries associated with purged address on x86. x86 still has memory leak. The patch changes the ioremap path to free unmapped page table(s) since doing so in the unmap path has the following issues: - The iounmap() path is shared with vunmap(). Since vmap() only supports pte mappings, making vunmap() to free a pte page is an overhead for regular vmap users as they do not need a pte page freed up. - Checking if all entries in a pte page are cleared in the unmap path is racy, and serializing this check is expensive. - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges. Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB purge. Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which clear a given pud/pmd entry and free up a page for the lower level entries. This patch implements their stub functions on x86 and arm64, which work as workaround. [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub] Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Reported-by: Lei Li <lious.lilei@hisilicon.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Wang Xuefeng <wxf.wang@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Always validate XFRM esn replay attribute, from Florian Westphal. 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long. 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul Triebitz. 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel Axtens. 5) Fix some interrupt handling issues in e1000e driver, from Benjamin Poitier. 6) Use strlcpy() in several ethtool get_strings methods, from Florian Fainelli. 7) Fix rhlist dup insertion, from Paul Blakey. 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev. 9) Fix driver unload crash when link is up in smsc911x, from Jeremy Linton. 10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric Dumazet. 11) Need to purge the write queue when TCP connections are aborted, otherwise userspace using MSG_ZEROCOPY can't close the fd. From Soheil Hassas Yeganeh. 12) Fix double free in error path of team driver, from Arkadi Sharshevsky. 13) Filter fixes for hv_netvsc driver, from Stephen Hemminger. 14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo Bianconi. 15) Properly filter out unsupported feature flags in macvlan driver, from Shannon Nelson. 16) Don't request loading the diag module for a protocol if the protocol itself is not even registered. From Xin Long. 17) If datagram connect fails in ipv6, make sure the socket state is consistent afterwards. From Paolo Abeni. 18) Use after free in qed driver, from Dan Carpenter. 19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the entry. From Sabrina Dubroca. 20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins. 21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita. 22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel. 23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti. 24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli. 25) Memory leak in gemini driver, from Igor Pylypiv. 26) IDR leaks in error paths of tcf_*_init() functions, from Davide Caratti. 27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun. 28) Missing dev_put() in error path of macsec_newlink(), from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits) macsec: missing dev_put() on error in macsec_newlink() net: dsa: Fix functional dsa-loop dependency on FIXED_PHY hv_netvsc: common detach logic hv_netvsc: change GPAD teardown order on older versions hv_netvsc: use RCU to fix concurrent rx and queue changes hv_netvsc: disable NAPI before channel close net/ipv6: Handle onlink flag with multipath routes ppp: avoid loop in xmit recursion detection code ipv6: sr: fix NULL pointer dereference when setting encap source address ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state net: aquantia: driver version bump net: aquantia: Implement pci shutdown callback net: aquantia: Allow live mac address changes net: aquantia: Add tx clean budget and valid budget handling logic net: aquantia: Change inefficient wait loop on fw data reads net: aquantia: Fix a regression with reset on old firmware net: aquantia: Fix hardware reset when SPI may rarely hangup s390/qeth: on channel error, reject further cmd requests s390/qeth: lock read device while queueing next buffer s390/qeth: when thread completes, wake up all waiters ...
2018-03-22netns: send uevent messagesChristian Brauner
This patch adds a receive method to NETLINK_KOBJECT_UEVENT netlink sockets to allow sending uevent messages into the network namespace the socket belongs to. Currently non-initial network namespaces are already isolated and don't receive uevents. There are a number of cases where it is beneficial for a sufficiently privileged userspace process to send a uevent into a network namespace. One such use case would be debugging and fuzzing of a piece of software which listens and reacts to uevents. By running a copy of that software inside a network namespace, specific uevents could then be presented to it. More concretely, this would allow for easy testing of udevd/ueventd. This will also allow some piece of software to run components inside a separate network namespace and then effectively filter what that software can receive. Some examples of software that do directly listen to uevents and that we have in the past attempted to run inside a network namespace are rbd (CEPH client) or the X server. Implementation: The implementation has been kept as simple as possible from the kernel's perspective. Specifically, a simple input method uevent_net_rcv() is added to NETLINK_KOBJECT_UEVENT sockets which completely reuses existing af_netlink infrastructure and does neither add an additional netlink family nor requires any user-visible changes. For example, by using netlink_rcv_skb() we can make use of existing netlink infrastructure to report back informative error messages to userspace. Furthermore, this implementation does not introduce any overhead for existing uevent generating codepaths. The struct netns got a new uevent socket member that records the uevent socket associated with that network namespace including its position in the uevent socket list. Since we record the uevent socket for each network namespace in struct net we don't have to walk the whole uevent socket list. Instead we can directly retrieve the relevant uevent socket and send the message. At exit time we can now also trivially remove the uevent socket from the uevent socket list. This keeps the codepath very performant without introducing needless overhead and even makes older codepaths faster. Uevent sequence numbers are kept global. When a uevent message is sent to another network namespace the implementation will simply increment the global uevent sequence number and append it to the received uevent. This has the advantage that the kernel will never need to parse the received uevent message to replace any existing uevent sequence numbers. Instead it is up to the userspace process to remove any existing uevent sequence numbers in case the uevent message to be sent contains any. Security: In order for a caller to send uevent messages to a target network namespace the caller must have CAP_SYS_ADMIN in the owning user namespace of the target network namespace. Additionally, any received uevent message is verified to not exceed size UEVENT_BUFFER_SIZE. This includes the space needed to append the uevent sequence number. Testing: This patch has been tested and verified to work with the following udev implementations: 1. CentOS 6 with udevd version 147 2. Debian Sid with systemd-udevd version 237 3. Android 7.1.1 with ueventd Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22net: add uevent socket memberChristian Brauner
This commit adds struct uevent_sock to struct net. Since struct uevent_sock records the position of the uevent socket in the uevent socket list we can trivially remove it from the uevent socket list during cleanup. This speeds up the old removal codepath. Note, list_del() will hit __list_del_entry_valid() in its call chain which will validate that the element is a member of the list. If it isn't it will take care that the list is not modified. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-21lib: Add generic PIO mapping methodZhichang Yuan
41f8bba7f555 ("of/pci: Add pci_register_io_range() and pci_pio_to_address()") added support for PCI I/O space mapped into CPU physical memory space. With that support, the I/O ranges configured for PCI/PCIe hosts on some architectures can be mapped to logical PIO and converted easily between CPU address and the corresponding logical PIO. Based on this, PCI I/O port space can be accessed via in/out accessors that use memory read/write. But on some platforms, there are bus hosts that access I/O port space with host-local I/O port addresses rather than memory addresses. Add a more generic I/O mapping method to support those devices. With this patch, both the CPU addresses and the host-local port can be mapped into the logical PIO space with different logical/fake PIOs. After this, all the I/O accesses to either PCI MMIO devices or host-local I/O peripherals can be unified into the existing I/O accessors defined in asm-generic/io.h and be redirected to the right device-specific hooks based on the input logical PIO. Tested-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Zhichang Yuan <yuanzhichang@hisilicon.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> [bhelgaas: remove -EFAULT return from logic_pio_register_range() per https://lkml.kernel.org/r/20180403143909.GA21171@ulmo, fix NULL pointer checking per https://lkml.kernel.org/r/20180403211505.GA29612@embeddedor.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2018-03-20test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other archesThadeu Lima de Souza Cascardo
Function bpf_fill_maxinsns11 is designed to not be able to be JITed on x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and commit 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that case. However, it does not fail on other architectures, which have a different JIT compiler design. So, test_bpf has started to fail to load on those. After this fix, test_bpf loads fine on both x86_64 and ppc64el. Fixes: 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-20dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()Christoph Hellwig
Unused now that everyone uses swiotlb_{alloc,free}(). Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-15-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/direct: Handle force decryption for DMA coherent buffers in common codeChristoph Hellwig
With that in place the generic DMA-direct routines can be used to allocate non-encrypted bounce buffers, and the x86 SEV case can use the generic swiotlb ops including nice features such as using CMA allocations. Note that I'm not too happy about using sev_active() in DMA-direct, but I couldn't come up with a good enough name for a wrapper to make it worth adding. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-14-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/direct: Handle the memory encryption bit in common codeChristoph Hellwig
Give the basic phys_to_dma() and dma_to_phys() helpers a __-prefix and add the memory encryption mask to the non-prefixed versions. Use the __-prefixed versions directly instead of clearing the mask again in various places. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-13-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/swiotlb: Remove swiotlb_set_mem_attributes()Christoph Hellwig
Now that set_memory_decrypted() is always available we can just call it directly. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-12-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20lib/raid6: Build proper raid6test files on powerpcMatt Brown
Previously the raid6 test Makefile did not build the POWER specific files (altivec and vpermxor). This patch fixes the bug, so that all appropriate files for powerpc are built. This patch also fixes the missing and mismatched ifdef statements to allow the altivec.uc file to be built correctly. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-20lib/raid6/altivec: Add vpermxor implementation for raid6 Q syndromeMatt Brown
This patch uses the vpermxor instruction to optimise the raid6 Q syndrome. This instruction was made available with POWER8, ISA version 2.07. It allows for both vperm and vxor instructions to be done in a single instruction. This has been tested for correctness on a ppc64le vm with a basic RAID6 setup containing 5 drives. The performance benchmarks are from the raid6test in the /lib/raid6/test directory. These results are from an IBM Firestone machine with ppc64le architecture. The benchmark results show a 35% speed increase over the best existing algorithm for powerpc (altivec). The raid6test has also been run on a big-endian ppc64 vm to ensure it also works for big-endian architectures. Performance benchmarks: raid6: altivecx4 gen() 18773 MB/s raid6: altivecx8 gen() 19438 MB/s raid6: vpermxor4 gen() 25112 MB/s raid6: vpermxor8 gen() 26279 MB/s Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> [mpe: Add VPERMXOR macro so we can build with old binutils] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-19Merge branch 'for-4.16-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fixes from Tejun Heo: "Late percpu pull request for v4.16-rc6. - percpu allocator pool replenishing no longer triggers OOM or warning messages. Also, the alloc interface now understands __GFP_NORETRY and __GFP_NOWARN. This is to allow avoiding OOMs from userland triggered actions like bpf map creation. Also added cond_resched() in alloc loop. - perpcu allocation now can be interrupted by kill sigs to avoid deadlocking OOM killer. - Added Dennis Zhou as a co-maintainer. He has rewritten the area map allocator, understands most of the code base and has been responsive for all bug reports" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn() percpu: include linux/sched.h for cond_resched() percpu: add a schedule point in pcpu_balance_workfn() percpu: allow select gfp to be passed to underlying allocators percpu: add __GFP_NORETRY semantics to the percpu balancing path percpu: match chunk allocator declarations with definitions percpu: add Dennis Zhou as a percpu co-maintainer
2018-03-19percpu_ref: Update doc to dissuade users from depending on internal RCU ↵Tejun Heo
grace periods percpu_ref internally uses sched-RCU to implement the percpu -> atomic mode switching and the documentation suggested that this could be depended upon. This doesn't seem like a good idea. * percpu_ref uses sched-RCU which has different grace periods regular RCU. Users may combine percpu_ref with regular RCU usage and incorrectly believe that regular RCU grace periods are performed by percpu_ref. This can lead to, for example, use-after-free due to premature freeing. * percpu_ref has a grace period when switching from percpu to atomic mode. It doesn't have one between the last put and release. This distinction is subtle and can lead to surprising bugs. * percpu_ref allows starting in and switching to atomic mode manually for debugging and other purposes. This means that there may not be any grace periods from kill to release. This patch makes it clear that the grace periods are percpu_ref's internal implementation detail and can't be depended upon by the users. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19Merge 4.16-rc6 into staging-nextGreg Kroah-Hartman
We want the staging fixes in here as well to handle merge/test issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15devres: combine function devm_ioremap*Yisheng Xie
When I tried to use devm_ioremap function and review related code, I found devm_ioremap_* almost have the similar realize with each other, which can be combined. In the former version, I have tried to kill ioremap_cache to reduce the size of devres, which can not work for ioremap is not the same as ioremap_nocache in some ARCHs likes ia64. Therefore, as the suggestion of Christophe, I introduce a help function __devm_ioremap, let devm_ioremap* inline and call __devm_ioremap with different devm_ioremap_type. After apply the patch, the size of devres.o can be reduce from 8216 Bytes to 8052 Bytes in my compile environment. Suggested-by: Greg KH <gregkh@linuxfoundation.org> Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Reviewed-by: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15lib/kobject: Join string literals backAndy Shevchenko
There is no need to split string literals. Moreover, it would be simpler to grep for an actual code line, when debugging, by using almost any part of the string literal in question. While here, replace printk(LEVEL) by pr_lvl() macros. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15printk: move dump stack related code to lib/dump_stack.cDave Young
dump_stack related stuff should belong to lib/dump_stack.c thus move them there. Also conditionally compile lib/dump_stack.c since dump_stack code does not make sense if printk is disabled. Link: http://lkml.kernel.org/r/20180213072834.GA24784@dhcp-128-65.nay.redhat.com To: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org Cc: akpm@linux-foundation.org Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Dave Young <dyoung@redhat.com> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-03-14btree: avoid variable-length allocationsJoern Engel
geo->keylen cannot be larger than 4. So we might as well make fixed-size allocations. Given the one remaining user, geo->keylen cannot even be larger than 1. Logfs used to have 64bit and 128bit keys, tcm_qla2xxx only has 32bit keys. But let's not break the code if we don't have to. Signed-off-by: Joern Engel <joern@purestorage.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-14debugobjects: Avoid another unused variable warningArnd Bergmann
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(), and only read in debug_objects_maxchecked, unfortunately both of these are optional and depend on different Kconfig symbols. When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this warning is emitted: lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable] Rather than trying to add more complex #ifdef protections, mark the variable as __maybe_unused so it can be silently dropped when usused. Fixes: bd9dcd046509 ("debugobjects: Export max loops counter") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
2018-03-09lib/test_kmod.c: fix limit check on number of test devices createdLuis R. Rodriguez
As reported by Dan the parentheses is in the wrong place, and since unlikely() call returns either 0 or 1 it's never less than zero. The second issue is that signed integer overflows like "INT_MAX + 1" are undefined behavior. Since num_test_devs represents the number of devices, we want to stop prior to hitting the max, and not rely on the wrap arround at all. So just cap at num_test_devs + 1, prior to assigning a new device. Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()Kees Cook
Commit b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") changed the ordering of fixups, and did not take into account the case of x86 processing non-WARN() and non-BUG() exceptions. This would lead to output of a false BUG line with no other information. In the case of a refcount exception, it would be immediately followed by the refcount WARN(), producing very strange double-"cut here": lkdtm: attempting bad refcount_inc() overflow ------------[ cut here ]------------ Kernel BUG at 0000000065f29de5 [verbose debug info unavailable] ------------[ cut here ]------------ refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0 WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4 ... In the prior ordering, exceptions were searched first: do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str, ... if (fixup_exception(regs, trapnr)) return 0; - if (fixup_bug(regs, trapnr)) - return 0; - As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account needing to search the exception list first, since that had already happened. So, instead of searching the exception list twice (once in is_valid_bugaddr() and then again in fixup_exception()), just add a simple sanity check to report_bug() that will immediately bail out if a BUG() (or WARN()) entry is not found. Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>