summaryrefslogtreecommitdiff
path: root/arch/alpha
AgeCommit message (Collapse)Author
2018-08-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: - Gustavo A. R. Silva keeps working on the implicit switch fallthru changes. - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From Luca Coelho. - Re-enable ASPM in r8169, from Kai-Heng Feng. - Add virtual XFRM interfaces, which avoids all of the limitations of existing IPSEC tunnels. From Steffen Klassert. - Convert GRO over to use a hash table, so that when we have many flows active we don't traverse a long list during accumluation. - Many new self tests for routing, TC, tunnels, etc. Too many contributors to mention them all, but I'm really happy to keep seeing this stuff. - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu. - Lots of cleanups and fixes in L2TP code from Guillaume Nault. - Add IPSEC offload support to netdevsim, from Shannon Nelson. - Add support for slotting with non-uniform distribution to netem packet scheduler, from Yousuk Seung. - Add UDP GSO support to mlx5e, from Boris Pismenny. - Support offloading of Team LAG in NFP, from John Hurley. - Allow to configure TX queue selection based upon RX queue, from Amritha Nambiar. - Support ethtool ring size configuration in aquantia, from Anton Mikaev. - Support DSCP and flowlabel per-transport in SCTP, from Xin Long. - Support list based batching and stack traversal of SKBs, this is very exciting work. From Edward Cree. - Busyloop optimizations in vhost_net, from Toshiaki Makita. - Introduce the ETF qdisc, which allows time based transmissions. IGB can offload this in hardware. From Vinicius Costa Gomes. - Add parameter support to devlink, from Moshe Shemesh. - Several multiplication and division optimizations for BPF JIT in nfp driver, from Jiong Wang. - Lots of prepatory work to make more of the packet scheduler layer lockless, when possible, from Vlad Buslov. - Add ACK filter and NAT awareness to sch_cake packet scheduler, from Toke Høiland-Jørgensen. - Support regions and region snapshots in devlink, from Alex Vesker. - Allow to attach XDP programs to both HW and SW at the same time on a given device, with initial support in nfp. From Jakub Kicinski. - Add TLS RX offload and support in mlx5, from Ilya Lesokhin. - Use PHYLIB in r8169 driver, from Heiner Kallweit. - All sorts of changes to support Spectrum 2 in mlxsw driver, from Ido Schimmel. - PTP support in mv88e6xxx DSA driver, from Andrew Lunn. - Make TCP_USER_TIMEOUT socket option more accurate, from Jon Maxwell. - Support for templates in packet scheduler classifier, from Jiri Pirko. - IPV6 support in RDS, from Ka-Cheong Poon. - Native tproxy support in nf_tables, from Máté Eckl. - Maintain IP fragment queue in an rbtree, but optimize properly for in-order frags. From Peter Oskolkov. - Improvde handling of ACKs on hole repairs, from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits) bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT" hv/netvsc: Fix NULL dereference at single queue mode fallback net: filter: mark expected switch fall-through xen-netfront: fix warn message as irq device name has '/' cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0 net: dsa: mv88e6xxx: missing unlock on error path rds: fix building with IPV6=m inet/connection_sock: prefer _THIS_IP_ to current_text_addr net: dsa: mv88e6xxx: bitwise vs logical bug net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd() ieee802154: hwsim: using right kind of iteration net: hns3: Add vlan filter setting by ethtool command -K net: hns3: Set tx ring' tc info when netdev is up net: hns3: Remove tx ring BD len register in hns3_enet net: hns3: Fix desc num set to default when setting channel net: hns3: Fix for phy link issue when using marvell phy driver net: hns3: Fix for information of phydev lost problem when down/up net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero net: hns3: Add support for serdes loopback selftest bnxt_en: take coredump_record structure off stack ...
2018-08-15Merge tag 'kconfig-v4.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig consolidation from Masahiro Yamada: "Consolidation of Kconfig files by Christoph Hellwig. Move the source statements of arch-independent Kconfig files instead of duplicating the includes in every arch/$(SRCARCH)/Kconfig" * tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: add a Memory Management options" menu kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt kconfig: use a menu in arch/Kconfig to reduce clutter kconfig: include kernel/Kconfig.preempt from init/Kconfig Kconfig: consolidate the "Kernel hacking" menu kconfig: include common Kconfig files from top-level Kconfig kconfig: remove duplicate SWAP symbol defintions um: create a proper drivers Kconfig um: cleanup Kconfig files um: stop abusing KBUILD_KCONFIG
2018-08-15Merge tag 'kbuild-v4.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOST*FLAGS, and rename internal variables to KBUILD_HOST*FLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) coccicheck: return proper error code on fail Coccinelle: doubletest: reduce side effect false positives kbuild: remove deprecated host-progs variable kbuild: make samples really depend on headers_install um: clean up archheaders recipe kbuild: add %asm-generic to no-dot-config-targets um: fix parallel building with O= option scripts: Add Python 3 support to tracing/draw_functrace.py builddeb: Add automatic support for sh{3,4}{,eb} architectures builddeb: Add automatic support for riscv* architectures builddeb: Add automatic support for m68k architecture builddeb: Add automatic support for or1k architecture builddeb: Add automatic support for sparc64 architecture builddeb: Add automatic support for mips{,64}r6{,el} architectures builddeb: Add automatic support for mips64el architecture builddeb: Add automatic support for ppc64 and powerpcspe architectures builddeb: Introduce functions to simplify kconfig tests in set_debarch builddeb: Drop check for 32-bit s390 builddeb: Change architecture detection fallback to use dpkg-architecture builddeb: Skip architecture detection when KBUILD_DEBARCH is set ...
2018-08-13Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking/atomics update from Thomas Gleixner: "The locking, atomics and memory model brains delivered: - A larger update to the atomics code which reworks the ordering barriers, consolidates the atomic primitives, provides the new atomic64_fetch_add_unless() primitive and cleans up the include hell. - Simplify cmpxchg() instrumentation and add instrumentation for xchg() and cmpxchg_double(). - Updates to the memory model and documentation" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) locking/atomics: Rework ordering barriers locking/atomics: Instrument cmpxchg_double*() locking/atomics: Instrument xchg() locking/atomics: Simplify cmpxchg() instrumentation locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation tools/memory-model: Rename litmus tests to comply to norm7 tools/memory-model/Documentation: Fix typo, smb->smp sched/Documentation: Update wake_up() & co. memory-barrier guarantees locking/spinlock, sched/core: Clarify requirements for smp_mb__after_spinlock() sched/core: Use smp_mb() in wake_woken_function() tools/memory-model: Add informal LKMM documentation to MAINTAINERS locking/atomics/Documentation: Describe atomic_set() as a write operation tools/memory-model: Make scripts executable tools/memory-model: Remove ACCESS_ONCE() from model tools/memory-model: Remove ACCESS_ONCE() from recipes locking/memory-barriers.txt/kokr: Update Korean translation to fix broken DMA vs. MMIO ordering example MAINTAINERS: Add Daniel Lustig as an LKMM reviewer tools/memory-model: Fix ISA2+pooncelock+pooncelock+pombonce name tools/memory-model: Add litmus test for full multicopy atomicity locking/refcount: Always allow checked forms ...
2018-08-02kconfig: include kernel/Kconfig.preempt from init/KconfigChristoph Hellwig
Almost all architectures include it. Add a ARCH_NO_PREEMPT symbol to disable preempt support for alpha, hexagon, non-coldfire m68k and user mode Linux. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02Kconfig: consolidate the "Kernel hacking" menuChristoph Hellwig
Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to the top-level Kconfig. For two architectures that means moving their arch-specific symbols in that menu into a new arch Kconfig.debug file, and for a few more creating a dummy file so that we can include it unconditionally. Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where it belongs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02kconfig: include common Kconfig files from top-level KconfigChristoph Hellwig
Instead of duplicating the source statements in every architecture just do it once in the toplevel Kconfig file. Note that with this the inclusion of arch/$(SRCARCH/Kconfig moves out of the top-level Kconfig into arch/Kconfig so that don't violate ordering constraits while keeping a sensible menu structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-07-25locking/atomics: Rework ordering barriersMark Rutland
Currently architectures can override __atomic_op_*() to define the barriers used before/after a relaxed atomic when used to build acquire/release/fence variants. This has the unfortunate property of requiring the architecture to define the full wrapper for the atomics, rather than just the barriers they care about, and gets in the way of generating atomics which can be easily read. Instead, this patch has architectures define an optional set of barriers: * __atomic_acquire_fence() * __atomic_release_fence() * __atomic_pre_full_fence() * __atomic_post_full_fence() ... which <linux/atomic.h> uses to build the wrappers. It would be nice if we could undef these, along with the __atomic_op_*() wrappers, but that would break the cmpxchg() wrappers, which are written in preprocessor. Undefs would have been nice, but alas. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andy.shevchenko@gmail.com Cc: arnd@arndb.de Cc: aryabinin@virtuozzo.com Cc: catalin.marinas@arm.com Cc: dvyukov@google.com Cc: glider@google.com Cc: linux-arm-kernel@lists.infradead.org Cc: peter@hurleysoftware.com Link: http://lkml.kernel.org/r/20180716113017.3909-7-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-24Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-07-22alpha: fix osf_wait4() breakageAl Viro
kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-18kbuild: Rename HOSTCFLAGS to KBUILD_HOSTCFLAGSLaura Abbott
In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to KBUILD_HOSTCFLAGS as the internal use only flags. This should not have any visible effects. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-07-17Merge tag 'v4.18-rc5' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-04net: Add a new socket option for a future transmit time.Richard Cochran
This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is a 8-bytes long struct provided by the uapi header net_tstamp.h defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-bytes hole found in the struct. For that reason, neither the struct size or number of cachelines were altered. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-21atomics/treewide: Make conditional inc/dec ops optionalMark Rutland
The conditional inc/dec ops differ for atomic_t and atomic64_t: - atomic_inc_unless_positive() is optional for atomic_t, and doesn't exist for atomic64_t. - atomic_dec_unless_negative() is optional for atomic_t, and doesn't exist for atomic64_t. - atomic_dec_if_positive is optional for atomic_t, and is mandatory for atomic64_t. Let's make these consistently optional for both. At the same time, let's clean up the existing fallbacks to use atomic_try_cmpxchg(). The instrumented atomics are updated accordingly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-18-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/treewide: Make unconditional inc/dec ops optionalMark Rutland
Many of the inc/dec ops are mandatory, but for most architectures inc/dec are simply trivial wrappers around their corresponding add/sub ops. Let's make all the inc/dec ops optional, so that we can get rid of these boilerplate wrappers. The instrumented atomics are updated accordingly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@sifive.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-17-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/treewide: Make test ops optionalMark Rutland
Some of the atomics return the result of a test applied after the atomic operation, and almost all architectures implement these as trivial wrappers around the underlying atomic. Specifically: * <atomic>_inc_and_test(v) is (<atomic>_inc_return(v) == 0) * <atomic>_dec_and_test(v) is (<atomic>_dec_return(v) == 0) * <atomic>_sub_and_test(i, v) is (<atomic>_sub_return(i, v) == 0) * <atomic>_add_negative(i, v) is (<atomic>_add_return(i, v) < 0) Rather than have these definitions duplicated in all architectures, with minor inconsistencies in formatting and documentation, let's make these operations optional, with default fallbacks as above. Implementations must now provide a preprocessor symbol. The instrumented atomics are updated accordingly. Both x86 and m68k have custom implementations, which are left as-is, given preprocessor symbols to avoid being overridden. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@sifive.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-16-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/alpha: Define atomic64_fetch_add_unless()Mark Rutland
As a step towards unifying the atomic/atomic64/atomic_long APIs, this patch converts the arch/alpha implementation of atomic64_add_unless() into an implementation of atomic64_fetch_add_unless(). A wrapper in <linux/atomic.h> will build atomic_add_unless() atop of this, provided it is given a preprocessor definition. No functional change is intended as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-10-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/treewide: Make atomic_fetch_add_unless() optionalMark Rutland
Several architectures these have a near-identical implementation based on atomic_read() and atomic_cmpxchg() which we can instead define in <linux/atomic.h>, so let's do so, using something close to the existing x86 implementation with try_cmpxchg(). Where an architecture provides its own atomic_fetch_add_unless(), it must define a preprocessor symbol for it. The instrumented atomics are updated accordingly. Note that arch/arc's existing atomic_fetch_add_unless() had redundant barriers, as these are already present in its atomic_cmpxchg() implementation. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@sifive.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vineet Gupta <vgupta@synopsys.com> Link: https://lore.kernel.org/lkml/20180621121321.4761-7-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/treewide: Make atomic64_inc_not_zero() optionalMark Rutland
We define a trivial fallback for atomic_inc_not_zero(), but don't do the same for atomic64_inc_not_zero(), leading most architectures to define the same boilerplate. Let's add a fallback in <linux/atomic.h>, and remove the redundant implementations. Note that atomic64_add_unless() is always defined in <linux/atomic.h>, and promotes its arguments to the requisite types, so we need not do this explicitly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@sifive.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-6-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21atomics/treewide: Rename __atomic_add_unless() => atomic_fetch_add_unless()Mark Rutland
While __atomic_add_unless() was originally intended as a building-block for atomic_add_unless(), it's now used in a number of places around the kernel. It's the only common atomic operation named __atomic*(), rather than atomic_*(), and for consistency it would be better named atomic_fetch_add_unless(). This lack of consistency is slightly confusing, and gets in the way of scripting atomics. Given that, let's clean things up and promote it to an official part of the atomics API, in the form of atomic_fetch_add_unless(). This patch converts definitions and invocations over to the new name, including the instrumented version, using the following script: ---- git grep -w __atomic_add_unless | while read line; do sed -i '{s/\<__atomic_add_unless\>/atomic_fetch_add_unless/}' "${line%%:*}"; done git grep -w __arch_atomic_add_unless | while read line; do sed -i '{s/\<__arch_atomic_add_unless\>/arch_atomic_fetch_add_unless/}' "${line%%:*}"; done ---- Note that we do not have atomic{64,_long}_fetch_add_unless(), which will be introduced by later patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@sifive.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20180621121321.4761-2-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-12alpha: Remove custom dec_and_lock() implementationSebastian Andrzej Siewior
Alpha provides a custom implementation of dec_and_lock(). The functions is split into two parts: - atomic_add_unless() + return 0 (fast path in assembly) - remaining part including locking (slow path in C) Comparing the result of the alpha implementation with the generic implementation compiled by gcc it looks like the fast path is optimized by avoiding a stack frame (and reloading the GP), register store and all this. This is only done in the slowpath. After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I noticed differences in the resulting assembly: - the GP is still reloaded - atomic_add_unless() adds more memory barriers compared to the custom assembly - the custom assembly here does "load, sub, beq" while atomic_add_unless() does "load, cmpeq, add, bne". This is okay because it compares against zero after subtraction while the generic code compares against 1 before. I'm not sure if avoiding the stack frame (and GP reloading) brings a lot in terms of performance. Regarding the different barriers, Peter Zijlstra says: |refcount decrement needs to be a RELEASE operation, such that all the |load/stores to the object happen before we decrement the refcount. | |Otherwise things like: | | obj->foo = 5; | refcnt_dec(&obj->ref); | |can be re-ordered, which then allows fun scenarios like: | | CPU0 CPU1 | | refcnt_dec(&obj->ref); | if (dec_and_test(&obj->ref)) | free(obj); | obj->foo = 5; // oops UaF | | |This means (for alpha) that there should be a memory barrier _before_ |the decrement, however the dec_and_lock asm thing only has one _after_, |which, per the above, is too late. | |The generic version using add_unless will result in memory barrier |before and after (because that is the rule for atomic ops with a return |value) which is strictly too many barriers for the refcount story, but |who knows what other ordering requirements code has. Remove the custom alpha implementation of dec_and_lock() and if it is an issue (performance wise) then the fast path could still be inlined. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-alpha@vger.kernel.org Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de
2018-06-06Merge tag 'kbuild-v4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - improve fixdep to coalesce consecutive slashes in dep-files - fix some issues of the maintainer string generation in deb-pkg script - remove unused CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX and clean-up several tools and linker scripts - clean-up modpost - allow to enable the dead code/data elimination for PowerPC in EXPERT mode - improve two coccinelle scripts for better performance - pass endianness and machine size flags to sparse for all architecture - misc fixes * tag 'kbuild-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) kbuild: add machine size to CHECKFLAGS kbuild: add endianness flag to CHEKCFLAGS kbuild: $(CHECK) doesnt need NOSTDINC_FLAGS twice scripts: Fixed printf format mismatch scripts/tags.sh: use `find` for $ALLSOURCE_ARCHS generation coccinelle: deref_null: improve performance coccinelle: mini_lock: improve performance powerpc: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selected kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabled kbuild: LD_DEAD_CODE_DATA_ELIMINATION no -ffunction-sections/-fdata-sections for module build kbuild: Fix asm-generic/vmlinux.lds.h for LD_DEAD_CODE_DATA_ELIMINATION modpost: constify *modname function argument where possible modpost: remove redundant is_vmlinux() test modpost: use strstarts() helper more widely modpost: pass struct elf_info pointer to get_modinfo() checkpatch: remove VMLINUX_SYMBOL() check vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL() kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX export.h: remove code for prefixing symbols with underscore depmod.sh: remove symbol prefix support ...
2018-06-04Merge branch 'timers-2038-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull time/Y2038 updates from Thomas Gleixner: - Consolidate SySV IPC UAPI headers - Convert SySV IPC to the new COMPAT_32BIT_TIME mechanism - Cleanup the core interfaces and standardize on the ktime_get_* naming convention. - Convert the X86 platform ops to timespec64 - Remove the ugly temporary timespec64 hack * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) x86: Convert x86_platform_ops to timespec64 timekeeping: Add more coarse clocktai/boottime interfaces timekeeping: Add ktime_get_coarse_with_offset timekeeping: Standardize on ktime_get_*() naming timekeeping: Clean up ktime_get_real_ts64 timekeeping: Remove timespec64 hack y2038: ipc: Redirect ipc(SEMTIMEDOP, ...) to compat_ksys_semtimedop y2038: ipc: Enable COMPAT_32BIT_TIME y2038: ipc: Use __kernel_timespec y2038: ipc: Report long times to user space y2038: ipc: Use ktime_get_real_seconds consistently y2038: xtensa: Extend sysvipc data structures y2038: powerpc: Extend sysvipc data structures y2038: sparc: Extend sysvipc data structures y2038: parisc: Extend sysvipc data structures y2038: mips: Extend sysvipc data structures y2038: arm64: Extend sysvipc compat data structures y2038: s390: Remove unneeded ipc uapi header files y2038: ia64: Remove unneeded ipc uapi header files y2038: alpha: Remove unneeded ipc uapi header files ...
2018-06-04Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: - Core infrastucture work for Y2038 to address the COMPAT interfaces: + Add a new Y2038 safe __kernel_timespec and use it in the core code + Introduce config switches which allow to control the various compat mechanisms + Use the new config switch in the posix timer code to control the 32bit compat syscall implementation. - Prevent bogus selection of CPU local clocksources which causes an endless reselection loop - Remove the extra kthread in the clocksource code which has no value and just adds another level of indirection - The usual bunch of trivial updates, cleanups and fixlets all over the place - More SPDX conversions * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) clocksource/drivers/mxs_timer: Switch to SPDX identifier clocksource/drivers/timer-imx-tpm: Switch to SPDX identifier clocksource/drivers/timer-imx-gpt: Switch to SPDX identifier clocksource/drivers/timer-imx-gpt: Remove outdated file path clocksource/drivers/arc_timer: Add comments about locking while read GFRC clocksource/drivers/mips-gic-timer: Add pr_fmt and reword pr_* messages clocksource/drivers/sprd: Fix Kconfig dependency clocksource: Move inline keyword to the beginning of function declarations timer_list: Remove unused function pointer typedef timers: Adjust a kernel-doc comment tick: Prefer a lower rating device only if it's CPU local device clocksource: Remove kthread time: Change nanosleep to safe __kernel_* types time: Change types to new y2038 safe __kernel_* types time: Fix get_timespec64() for y2038 safe compat interfaces time: Add new y2038 safe __kernel_timespec posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME time: Introduce CONFIG_COMPAT_32BIT_TIME time: Introduce CONFIG_64BIT_TIME in architectures compat: Enable compat_get/put_timespec64 always ...
2018-06-04Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "This set of changes close the known issues with setting si_code to an invalid value, and with not fully initializing struct siginfo. There remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64 and x86 to get the code that generates siginfo into a simpler and more maintainable state. Most of that work involves refactoring the signal handling code and thus careful code review. Also not included is the work to shrink the in kernel version of struct siginfo. That depends on getting the number of places that directly manipulate struct siginfo under control, as it requires the introduction of struct kernel_siginfo for the in kernel things. Overall this set of changes looks like it is making good progress, and with a little luck I will be wrapping up the siginfo work next development cycle" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) signal/sh: Stop gcc warning about an impossible case in do_divide_error signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions signal/um: More carefully relay signals in relay_signal. signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code signal/signalfd: Add support for SIGSYS signal/signalfd: Remove __put_user from signalfd_copyinfo signal/xtensa: Use force_sig_fault where appropriate signal/xtensa: Consistenly use SIGBUS in do_unaligned_user signal/um: Use force_sig_fault where appropriate signal/sparc: Use force_sig_fault where appropriate signal/sparc: Use send_sig_fault where appropriate signal/sh: Use force_sig_fault where appropriate signal/s390: Use force_sig_fault where appropriate signal/riscv: Replace do_trap_siginfo with force_sig_fault signal/riscv: Use force_sig_fault where appropriate signal/parisc: Use force_sig_fault where appropriate signal/parisc: Use force_sig_mceerr where appropriate signal/openrisc: Use force_sig_fault where appropriate signal/nios2: Use force_sig_fault where appropriate ...
2018-06-04Merge tag 'docs-4.18' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "There's been a fair amount of work in the docs tree this time around, including: - Extensive RST conversions and organizational work in the memory-management docs thanks to Mike Rapoport. - An update of Documentation/features from Andrea Parri and a script to keep it updated. - Various LICENSES updates from Thomas, along with a script to check SPDX tags. - Work to fix dangling references to documentation files; this involved a fair number of one-liner comment changes outside of Documentation/ ... and the usual list of documentation improvements, typo fixes, etc" * tag 'docs-4.18' of git://git.lwn.net/linux: (103 commits) Documentation: document hung_task_panic kernel parameter docs/admin-guide/mm: add high level concepts overview docs/vm: move ksm and transhuge from "user" to "internals" section. docs: Use the kerneldoc comments for memalloc_no*() doc: document scope NOFS, NOIO APIs docs: update kernel versions and dates in tables docs/vm: transhuge: split userspace bits to admin-guide/mm/transhuge docs/vm: transhuge: minor updates docs/vm: transhuge: change sections order Documentation: arm: clean up Marvell Berlin family info Documentation: gpio: driver: Fix a typo and some odd grammar docs: ranoops.rst: fix location of ramoops.txt scripts/documentation-file-ref-check: rewrite it in perl with auto-fix mode docs: uio-howto.rst: use a code block to solve a warning mm, THP, doc: Add document for thp_swpout/thp_swpout_fallback w1: w1_io.c: fix a kernel-doc warning Documentation/process/posting: wrap text at 80 cols docs: admin-guide: add cgroup-v2 documentation Revert "Documentation/features/vm: Remove arch support status file for 'pte_special'" Documentation: refcount-vs-atomic: Update reference to LKMM doc. ...
2018-06-04Merge tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - replace the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits) dma-direct: don't crash on device without dma_mask nds32: use generic dma_noncoherent_ops nds32: implement the unmap_sg DMA operation nds32: consolidate DMA cache maintainance routines x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag x86/pci-dma: remove the explicit nodac and allowdac option x86/pci-dma: remove the experimental forcesac boot option Documentation/x86: remove a stray reference to pci-nommu.c core, dma-direct: add a flag 32-bit dma limits dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs dma-debug: check scatterlist segments c6x: use generic dma_noncoherent_ops arc: use generic dma_noncoherent_ops arc: fix arc_dma_{map,unmap}_page arc: fix arc_dma_sync_sg_for_{cpu,device} arc: simplify arc_dma_sync_single_for_{cpu,device} dma-mapping: provide a generic dma-noncoherent implementation dma-mapping: simplify Kconfig dependencies riscv: add swiotlb support riscv: only enable ZONE_DMA32 for 64-bit ...
2018-06-01kbuild: add machine size to CHECKFLAGSLuc Van Oostenryck
By default, sparse assumes a 64bit machine when compiled on x86-64 and 32bit when compiled on anything else. This can of course create all sort of problems for the other archs, like issuing false warnings ('shift too big (32) for type unsigned long'), or worse, failing to emit legitimate warnings. Fix this by adding the -m32/-m64 flag, depending on CONFIG_64BIT, to CHECKFLAGS in the main Makefile (and so for all archs). Also, remove the now unneeded -m32/-m64 in arch specific Makefiles. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-22alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2Sinan Kaya
memory-barriers.txt has been updated with the following requirement. "When using writel(), a prior wmb() is not needed to guarantee that the cache coherent memory writes have completed before writing to the MMIO region." Current writeX() and iowriteX() implementations on alpha are not satisfying this requirement as the barrier is after the register write. Move mb() in writeX() and iowriteX() functions to guarantee that HW observes memory changes before performing register operations. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22alpha: simplify get_arch_dma_opsChristoph Hellwig
Remove the dma_ops indirection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22alpha: use dma_direct_ops for jensenChristoph Hellwig
The generic dma_direct implementation does the same thing as the alpha pci-noop implementation, just with more bells and whistles. And unlike the current code it at least has a theoretical chance to actually compile. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-09arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/KconfigChristoph Hellwig
Define this symbol if the architecture either uses 64-bit pointers or the PHYS_ADDR_T_64BIT is set. This covers 95% of the old arch magic. We only need an additional select for Xen on ARM (why anyway?), and we now always set ARCH_DMA_ADDR_T_64BIT on mips boards with 64-bit physical addressing instead of only doing it when highmem is set. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: James Hogan <jhogan@kernel.org>
2018-05-09dma-mapping: move the NEED_DMA_MAP_STATE config symbol to lib/KconfigChristoph Hellwig
This way we have one central definition of it, and user can select it as needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG is select, which fixes some incorrect checks in a few network drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
2018-05-09scatterlist: move the NEED_SG_DMA_LENGTH config symbol to lib/KconfigChristoph Hellwig
This way we have one central definition of it, and user can select it as needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
2018-05-09iommu-helper: mark iommu_is_span_boundary as inlineChristoph Hellwig
This avoids selecting IOMMU_HELPER just for this function. And we only use it once or twice in normal builds so this often even is a size reduction. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-07PCI: remove PCI_DMA_BUS_IS_PHYSChristoph Hellwig
This was used by the ide, scsi and networking code in the past to determine if they should bounce payloads. Now that the dma mapping always have to support dma to all physical memory (thanks to swiotlb for non-iommu systems) there is no need to this crude hack any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Palmer Dabbelt <palmer@sifive.com> (for riscv) Reviewed-by: Jens Axboe <axboe@kernel.dk>
2018-04-25signal/alpha: Use force_sig_fault where appropriateEric W. Biederman
Filling in struct siginfo before calling force_sig_info a tedious and error prone process, where once in a great while the wrong fields are filled out, and siginfo has been inconsistently cleared. Simplify this process by using the helper force_sig_fault. Which takes as a parameters all of the information it needs, ensures all of the fiddly bits of filling in struct siginfo are done properly and then calls force_sig_info. In short about a 5 line reduction in code for every time force_sig_info is called, which makes the calling function clearer. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-25signal/alpha: Use send_sig_fault where appropriateEric W. Biederman
Filling in struct siginfo before calling send_sig_info a tedious and error prone process, where once in a great while the wrong fields are filled out, and siginfo has been inconsistently cleared. Simplify this process by using the helper send_sig_fault. Which takes as a parameters all of the information it needs, ensures all of the fiddly bits of filling in struct siginfo are done properly and then calls send_sig_info. In short about a 5 line reduction in code for every time send_sig_info is called, which makes the calling function clearer. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-25signal/alpha: Replace TRAP_FIXME with TRAP_UNKEric W. Biederman
Using an si_code of 0 that aliases with SI_USER is clearly the wrong thing to do, and causes problems in interesting ways. For it really is not clear to me if using TRAP_UNK bugcheck or the default case of gentrap is really the best way to handle things. There is certainly enough information that that a more specific si_code could potentially be used. That said TRAP_UNK is definitely an improvement over 0 as it removes the ambiguiuty of what si_code of 0 with SIGTRAP means on alpha. Recent history suggests no actually cares about crazy corner cases of the kernel behavior like this so I don't expect any regressions from changing this. However if something does happen this change is easy to revert. Cc: Helge Deller <deller@gmx.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-alpha@vger.kernel.org Fixes: 0a635c7a84cf ("Fill in siginfo_t.") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-25signal/alpha: Replace FPE_FIXME with FPE_FLTUNKEric W. Biederman
Using an si_code of 0 that aliases with SI_USER is clearly the wrong thing todo, and causes problems in interesting ways. The newly defined FPE_FLTUNK semantically appears to fit the bill so use it instead. Given recent experience in this area odds are it will not break anything. Fixing it removes a hazard to kernel maintenance. Cc: Helge Deller <deller@gmx.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-alpha@vger.kernel.org History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 0a635c7a84cf ("Fill in siginfo_t.") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-25signal: Ensure every siginfo we send has all bits initializedEric W. Biederman
Call clear_siginfo to ensure every stack allocated siginfo is properly initialized before being passed to the signal sending functions. Note: It is not safe to depend on C initializers to initialize struct siginfo on the stack because C is allowed to skip holes when initializing a structure. The initialization of struct siginfo in tracehook_report_syscall_exit was moved from the helper user_single_step_siginfo into tracehook_report_syscall_exit itself, to make it clear that the local variable siginfo gets fully initialized. In a few cases the scope of struct siginfo has been reduced to make it clear that siginfo siginfo is not used on other paths in the function in which it is declared. Instances of using memset to initialize siginfo have been replaced with calls clear_siginfo for clarity. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-20y2038: alpha: Remove unneeded ipc uapi header filesArnd Bergmann
The alpha ipcbuf/msgbuf/sembuf/shmbuf header files are all identical to the version from asm-generic. This patch removes the files and replaces them with 'generic-y' statements as part of the y2038 series. Since there is no 32-bit syscall support for alpha, we don't need the other changes, but it's good to have clean this up anyway. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-19time: Add an asm-generic/compat.h fileArnd Bergmann
We have a couple of files that try to include asm/compat.h on architectures where this is available. Those should generally use the higher-level linux/compat.h file, but that in turn fails to include asm/compat.h when CONFIG_COMPAT is disabled, unless we can provide that header on all architectures. This adds the asm/compat.h for all remaining architectures to simplify the dependencies. Architectures that are getting removed in linux-4.17 are not changed here, to avoid needless conflicts with the removal patches. Those architectures are broken by this patch, but we have already shown that they have no users. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-17signal/alpha: Document a conflict with SI_USER for SIGFPEEric W. Biederman
Setting si_code to 0 is the same as setting si_code to SI_USER. This is the same si_code as SI_USER. Posix and common sense requires that SI_USER not be a signal specific si_code. As such this use of 0 for the si_code is a pretty horribly broken ABI. Cc: Helge Deller <deller@gmx.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-alpha@vger.kernel.org History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Ref: 0a635c7a84cf ("Fill in siginfo_t.") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-16Merge branch 'mm-rst' into docs-nextJonathan Corbet
Mike Rapoport says: These patches convert files in Documentation/vm to ReST format, add an initial index and link it to the top level documentation. There are no contents changes in the documentation, except few spelling fixes. The relatively large diffstat stems from the indentation and paragraph wrapping changes. I've tried to keep the formatting as consistent as possible, but I could miss some places that needed markup and add some markup where it was not necessary. [jc: significant conflicts in vm/hmm.rst]
2018-04-16docs/vm: rename documentation files to .rstMike Rapoport
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-04-11mm: introduce MAP_FIXED_NOREPLACEMichal Hocko
Patch series "mm: introduce MAP_FIXED_NOREPLACE", v2. This has started as a follow up discussion [3][4] resulting in the runtime failure caused by hardening patch [5] which removes MAP_FIXED from the elf loader because MAP_FIXED is inherently dangerous as it might silently clobber an existing underlying mapping (e.g. stack). The reason for the failure is that some architectures enforce an alignment for the given address hint without MAP_FIXED used (e.g. for shared or file backed mappings). One way around this would be excluding those archs which do alignment tricks from the hardening [6]. The patch is really trivial but it has been objected, rightfully so, that this screams for a more generic solution. We basically want a non-destructive MAP_FIXED. The first patch introduced MAP_FIXED_NOREPLACE which enforces the given address but unlike MAP_FIXED it fails with EEXIST if the given range conflicts with an existing one. The flag is introduced as a completely new one rather than a MAP_FIXED extension because of the backward compatibility. We really want a never-clobber semantic even on older kernels which do not recognize the flag. Unfortunately mmap sucks wrt flags evaluation because we do not EINVAL on unknown flags. On those kernels we would simply use the traditional hint based semantic so the caller can still get a different address (which sucks) but at least not silently corrupt an existing mapping. I do not see a good way around that. Except we won't export expose the new semantic to the userspace at all. It seems there are users who would like to have something like that. Jemalloc has been mentioned by Michael Ellerman [7] Florian Weimer has mentioned the following: : glibc ld.so currently maps DSOs without hints. This means that the kernel : will map right next to each other, and the offsets between them a completely : predictable. We would like to change that and supply a random address in a : window of the address space. If there is a conflict, we do not want the : kernel to pick a non-random address. Instead, we would try again with a : random address. John Hubbard has mentioned CUDA example : a) Searches /proc/<pid>/maps for a "suitable" region of available : VA space. "Suitable" generally means it has to have a base address : within a certain limited range (a particular device model might : have odd limitations, for example), it has to be large enough, and : alignment has to be large enough (again, various devices may have : constraints that lead us to do this). : : This is of course subject to races with other threads in the process. : : Let's say it finds a region starting at va. : : b) Next it does: : p = mmap(va, ...) : : *without* setting MAP_FIXED, of course (so va is just a hint), to : attempt to safely reserve that region. If p != va, then in most cases, : this is a failure (almost certainly due to another thread getting a : mapping from that region before we did), and so this layer now has to : call munmap(), before returning a "failure: retry" to upper layers. : : IMPROVEMENT: --> if instead, we could call this: : : p = mmap(va, ... MAP_FIXED_NOREPLACE ...) : : , then we could skip the munmap() call upon failure. This : is a small thing, but it is useful here. (Thanks to Piotr : Jaroszynski and Mark Hairgrove for helping me get that detail : exactly right, btw.) : : c) After that, CUDA suballocates from p, via: : : q = mmap(sub_region_start, ... MAP_FIXED ...) : : Interestingly enough, "freeing" is also done via MAP_FIXED, and : setting PROT_NONE to the subregion. Anyway, I just included (c) for : general interest. Atomic address range probing in the multithreaded programs in general sounds like an interesting thing to me. The second patch simply replaces MAP_FIXED use in elf loader by MAP_FIXED_NOREPLACE. I believe other places which rely on MAP_FIXED should follow. Actually real MAP_FIXED usages should be docummented properly and they should be more of an exception. [1] http://lkml.kernel.org/r/20171116101900.13621-1-mhocko@kernel.org [2] http://lkml.kernel.org/r/20171129144219.22867-1-mhocko@kernel.org [3] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au [4] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com [5] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org [6] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz [7] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au This patch (of 2): MAP_FIXED is used quite often to enforce mapping at the particular range. The main problem of this flag is, however, that it is inherently dangerous because it unmaps existing mappings covered by the requested range. This can cause silent memory corruptions. Some of them even with serious security implications. While the current semantic might be really desiderable in many cases there are others which would want to enforce the given range but rather see a failure than a silent memory corruption on a clashing range. Please note that there is no guarantee that a given range is obeyed by the mmap even when it is free - e.g. arch specific code is allowed to apply an alignment. Introduce a new MAP_FIXED_NOREPLACE flag for mmap to achieve this behavior. It has the same semantic as MAP_FIXED wrt. the given address request with a single exception that it fails with EEXIST if the requested address is already covered by an existing mapping. We still do rely on get_unmaped_area to handle all the arch specific MAP_FIXED treatment and check for a conflicting vma after it returns. The flag is introduced as a completely new one rather than a MAP_FIXED extension because of the backward compatibility. We really want a never-clobber semantic even on older kernels which do not recognize the flag. Unfortunately mmap sucks wrt. flags evaluation because we do not EINVAL on unknown flags. On those kernels we would simply use the traditional hint based semantic so the caller can still get a different address (which sucks) but at least not silently corrupt an existing mapping. I do not see a good way around that. [mpe@ellerman.id.au: fix whitespace] [fail on clashing range with EEXIST as per Florian Weimer] [set MAP_FIXED before round_hint_to_min as per Khalid Aziz] Link: http://lkml.kernel.org/r/20171213092550.2774-2-mhocko@kernel.org Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: Russell King - ARM Linux <linux@armlinux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Joel Stanley <joel@jms.id.au> Cc: Kees Cook <keescook@chromium.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Jason Evans <jasone@google.com> Cc: David Goldblatt <davidtgoldblatt@gmail.com> Cc: Edward Tomasz Napierała <trasz@FreeBSD.org> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-07alpha: io: reorder barriers to guarantee writeX() and iowriteX() orderingSinan Kaya
memory-barriers.txt has been updated with the following requirement. "When using writel(), a prior wmb() is not needed to guarantee that the cache coherent memory writes have completed before writing to the MMIO region." Current writeX() and iowriteX() implementations on alpha are not satisfying this requirement as the barrier is after the register write. Move mb() in writeX() and iowriteX() functions to guarantee that HW observes memory changes before performing register operations. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-04-07alpha: Implement CPU vulnerabilities sysfs functions.Michael Cree
Implement the CPU vulnerabilty show functions for meltdown, spectre_v1 and spectre_v2 on Alpha. Tests on XP1000 (EV67/667MHz) and ES45 (EV68CB/1.25GHz) show them to be vulnerable to Meltdown and Spectre V1. In the case of Meltdown I saw a 1 to 2% success rate in reading bytes on the XP1000 and 50 to 60% success rate on the ES45. (This compares to 99.97% success reported for Intel CPUs.) Report EV6 and later CPUs as vulnerable. Tests on PWS600au (EV56/600MHz) for Spectre V1 attack were unsuccessful (though I did not try particularly hard) so mark EV4 through to EV56 as not vulnerable. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-04-07alpha: rtc: stop validating rtc_time in .read_timeAlexandre Belloni
The RTC core is always calling rtc_valid_tm after the read_time callback. It is not necessary to call it just before returning from the callback. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Matt Turner <mattst88@gmail.com>