summaryrefslogtreecommitdiff
path: root/include/asm-generic
AgeCommit message (Collapse)Author
2016-03-25arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko
KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-24Merge tag 'asm-generic-4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are only three patches this time, most other changes to files in include/asm-generic tend to go through the tree of whoever depends on the change. Two patches are cleanups for stuff that is no longer needed, the main change is to adapt the generic version of BUG_ON() for CONFIG_BUG=n to make it behave consistently with BUG(). This avoids undefined behavior along with a number of warnings about that undefined behavior in randconfig builds when we keep going on after hitting a BUG_ON()" * tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: remove old nonatomic-io wrapper files asm-generic: default BUG_ON(x) to if(x)BUG() asm-generic: page.h: Remove useless get_user_page and free_user_page
2016-03-24Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Documentation updates and a bitops ordering fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bitops: Do not default to __clear_bit() for __clear_bit_unlock() documentation: Clarify compiler store-fusion example documentation: Transitivity is not cumulativity documentation: Add alternative release-acquire outcome documentation: Distinguish between local and global transitivity documentation: Subsequent writes ordered by rcu_dereference() documentation: Remove obsolete reference to RCU-protected indexes documentation: Fix memory-barriers.txt section references documentation: Fix control dependency and identical stores
2016-03-21bitops: Do not default to __clear_bit() for __clear_bit_unlock()Peter Zijlstra
__clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Noam Camus <noamc@ezchip.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20160309114054.GJ6356@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-20Merge branch 'mm-pkeys-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 protection key support from Ingo Molnar: "This tree adds support for a new memory protection hardware feature that is available in upcoming Intel CPUs: 'protection keys' (pkeys). There's a background article at LWN.net: https://lwn.net/Articles/643797/ The gist is that protection keys allow the encoding of user-controllable permission masks in the pte. So instead of having a fixed protection mask in the pte (which needs a system call to change and works on a per page basis), the user can map a (handful of) protection mask variants and can change the masks runtime relatively cheaply, without having to change every single page in the affected virtual memory range. This allows the dynamic switching of the protection bits of large amounts of virtual memory, via user-space instructions. It also allows more precise control of MMU permission bits: for example the executable bit is separate from the read bit (see more about that below). This tree adds the MM infrastructure and low level x86 glue needed for that, plus it adds a high level API to make use of protection keys - if a user-space application calls: mmap(..., PROT_EXEC); or mprotect(ptr, sz, PROT_EXEC); (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice this special case, and will set a special protection key on this memory range. It also sets the appropriate bits in the Protection Keys User Rights (PKRU) register so that the memory becomes unreadable and unwritable. So using protection keys the kernel is able to implement 'true' PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies PROT_READ as well. Unreadable executable mappings have security advantages: they cannot be read via information leaks to figure out ASLR details, nor can they be scanned for ROP gadgets - and they cannot be used by exploits for data purposes either. We know about no user-space code that relies on pure PROT_EXEC mappings today, but binary loaders could start making use of this new feature to map binaries and libraries in a more secure fashion. There is other pending pkeys work that offers more high level system call APIs to manage protection keys - but those are not part of this pull request. Right now there's a Kconfig that controls this feature (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled (like most x86 CPU feature enablement code that has no runtime overhead), but it's not user-configurable at the moment. If there's any serious problem with this then we can make it configurable and/or flip the default" * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) x86/mm/pkeys: Fix mismerge of protection keys CPUID bits mm/pkeys: Fix siginfo ABI breakage caused by new u64 field x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA mm/core, x86/mm/pkeys: Add execute-only protection keys support x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags x86/mm/pkeys: Allow kernel to modify user pkey rights register x86/fpu: Allow setting of XSAVE state x86/mm: Factor out LDT init from context init mm/core, x86/mm/pkeys: Add arch_validate_pkey() mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits() x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU x86/mm/pkeys: Add Kconfig prompt to existing config option x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps x86/mm/pkeys: Dump PKRU with other kernel registers mm/core, x86/mm/pkeys: Differentiate instruction fetches x86/mm/pkeys: Optimize fault handling in access_error() mm/core: Do not enforce PKEY permissions on remote mm access um, pkeys: Add UML arch_*_access_permitted() methods mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys x86/mm/gup: Simplify get_user_pages() PTE bit handling ...
2016-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...
2016-03-18Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - a couple of hotfixes - the rest of MM - a new timer slack control in procfs - a couple of procfs fixes - a few misc things - some printk tweaks - lib/ updates, notably to radix-tree. - add my and Nick Piggin's old userspace radix-tree test harness to tools/testing/radix-tree/. Matthew said it was a godsend during the radix-tree work he did. - a few code-size improvements, switching to __always_inline where gcc screwed up. - partially implement character sets in sscanf * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) sscanf: implement basic character sets lib/bug.c: use common WARN helper param: convert some "on"/"off" users to strtobool lib: add "on"/"off" support to kstrtobool lib: update single-char callers of strtobool() lib: move strtobool() to kstrtobool() include/linux/unaligned: force inlining of byteswap operations include/uapi/linux/byteorder, swab: force inlining of some byteswap operations include/asm-generic/atomic-long.h: force inlining of some atomic_long operations usb: common: convert to use match_string() helper ide: hpt366: convert to use match_string() helper ata: hpt366: convert to use match_string() helper power: ab8500: convert to use match_string() helper power: charger_manager: convert to use match_string() helper drm/edid: convert to use match_string() helper pinctrl: convert to use match_string() helper device property: convert to use match_string() helper lib/string: introduce match_string() helper radix-tree tests: add test for radix_tree_iter_next radix-tree tests: add regression3 test ...
2016-03-17Merge tag 'gpio-v4.6-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for kernel v4.6. There is quite a lot of interesting stuff going on. The patches to other subsystems and arch-wide are ACKed as far as possible, though I consider things like per-arch <asm/gpio.h> as essentially a part of the GPIO subsystem so it should not be needed. Core changes: - The gpio_chip is now a *real device*. Until now the gpio chips were just piggybacking the parent device or (gasp) floating in space outside of the device model. We now finally make GPIO chips devices. The gpio_chip will create a gpio_device which contains a struct device, and this gpio_device struct is kept private. Anything that needs to be kept private from the rest of the kernel will gradually be moved over to the gpio_device. - As a result of making the gpio_device a real device, we have added resource management, so devm_gpiochip_add_data() will cut down on overhead and reduce code lines. A huge slew of patches convert almost all drivers in the subsystem to use this. - Building on making the GPIO a real device, we add the first step of a new userspace ABI: the GPIO character device. We take small steps here, so we first add a pure *information* ABI and the tool "lsgpio" that will list all GPIO devices on the system and all lines on these devices. We can now discover GPIOs properly from userspace. We still have not come up with a way to actually *use* GPIOs from userspace. - To encourage people to use the character device for the future, we have it always-enabled when using GPIO. The old sysfs ABI is still opt-in (and can be used in parallel), but is marked as deprecated. We will keep it around for the foreseeable future, but it will not be extended to cover ever more use cases. Cleanup: - Bjorn Helgaas removed a whole slew of per-architecture <asm/gpio.h> includes. This dates back to when GPIO was an opt-in feature and no shared library even existed: just a header file with proper prototypes was provided and all semantics were up to the arch to implement. These patches make the GPIO chip even more a proper device and cleans out leftovers of the old in-kernel API here and there. Still some cruft is left but it's very little now. - There is still some clamping of return values for .get() going on, but we now return sane values in the vast majority of drivers and the errorpath is sanitized. Some patches for powerpc, blackfin and unicore still drop in. - We continue to switch the ARM, MIPS, blackfin, m68k local GPIO implementations to use gpiochip_add_data() and cut down on code lines. - MPC8xxx is converted to use the generic GPIO helpers. - ATH79 is converted to use the generic GPIO helpers. New drivers: - WinSystems WS16C48 - Acces 104-DIO-48E - F81866 (a F7188x variant) - Qoric (a MPC8xxx variant) - TS-4800 - SPI serializers (pisosr): simple 74xx shift registers connected to SPI to obtain a dirt-cheap output-only GPIO expander. - Texas Instruments TPIC2810 - Texas Instruments TPS65218 - Texas Instruments TPS65912 - X-Gene (ARM64) standby GPIO controller" * tag 'gpio-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (194 commits) Revert "Share upstreaming patches" gpio: mcp23s08: Fix clearing of interrupt. gpiolib: Fix comment referring to gpio_*() in gpiod_*() gpio: pca953x: Fix pca953x_gpio_set_multiple() on 64-bit gpio: xgene: Fix kconfig for standby GIPO contoller gpio: Add generic serializer DT binding gpio: uapi: use 0xB4 as ioctl() major gpio: tps65912: fix bad merge Revert "gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free" gpio: omap: drop dev field from gpio_bank structure gpio: mpc8xxx: Slightly update the code for better readability gpio: mpc8xxx: Remove *read_reg and *write_reg from struct mpc8xxx_gpio_chip gpio: mpc8xxx: Fixup setting gpio direction output gpio: mcp23s08: Add support for mcp23s18 dt-bindings: gpio: altera: Fix altr,interrupt-type property gpio: add driver for MEN 16Z127 GPIO controller gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free gpio: timberdale: Switch to devm_ioremap_resource() gpio: ts4800: Add IMX51 dependency gpiolib: rewrite gpiodev_add_to_list ...
2016-03-17Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Here are the main arm64 updates for 4.6. There are some relatively intrusive changes to support KASLR, the reworking of the kernel virtual memory layout and initial page table creation. Summary: - Initial page table creation reworked to avoid breaking large block mappings (huge pages) into smaller ones. The ARM architecture requires break-before-make in such cases to avoid TLB conflicts but that's not always possible on live page tables - Kernel virtual memory layout: the kernel image is no longer linked to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of the vmalloc space, allowing the kernel to be loaded (nearly) anywhere in physical RAM - Kernel ASLR: position independent kernel Image and modules being randomly mapped in the vmalloc space with the randomness is provided by UEFI (efi_get_random_bytes() patches merged via the arm64 tree, acked by Matt Fleming) - Implement relative exception tables for arm64, required by KASLR (initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but actual x86 conversion to deferred to 4.7 because of the merge dependencies) - Support for the User Access Override feature of ARMv8.2: this allows uaccess functions (get_user etc.) to be implemented using LDTR/STTR instructions. Such instructions, when run by the kernel, perform unprivileged accesses adding an extra level of protection. The set_fs() macro is used to "upgrade" such instruction to privileged accesses via the UAO bit - Half-precision floating point support (part of ARMv8.2) - Optimisations for CPUs with or without a hardware prefetcher (using run-time code patching) - copy_page performance improvement to deal with 128 bytes at a time - Sanity checks on the CPU capabilities (via CPUID) to prevent incompatible secondary CPUs from being brought up (e.g. weird big.LITTLE configurations) - valid_user_regs() reworked for better sanity check of the sigcontext information (restored pstate information) - ACPI parking protocol implementation - CONFIG_DEBUG_RODATA enabled by default - VDSO code marked as read-only - DEBUG_PAGEALLOC support - ARCH_HAS_UBSAN_SANITIZE_ALL enabled - Erratum workaround Cavium ThunderX SoC - set_pte_at() fix for PROT_NONE mappings - Code clean-ups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits) arm64: kasan: Fix zero shadow mapping overriding kernel image shadow arm64: kasan: Use actual memory node when populating the kernel image shadow arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission arm64: Fix misspellings in comments. arm64: efi: add missing frame pointer assignment arm64: make mrs_s prefixing implicit in read_cpuid arm64: enable CONFIG_DEBUG_RODATA by default arm64: Rework valid_user_regs arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly arm64: KVM: Move kvm_call_hyp back to its original localtion arm64: mm: treat memstart_addr as a signed quantity arm64: mm: list kernel sections in order arm64: lse: deal with clobbered IP registers after branch via PLT arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR arm64: kconfig: add submenu for 8.2 architectural features arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot arm64: Add support for Half precision floating point arm64: Remove fixmap include fragility arm64: Add workaround for Cavium erratum 27456 arm64: mm: Mark .rodata as RO ...
2016-03-17lib/bug.c: use common WARN helperJosh Poimboeuf
The traceoff_on_warning option doesn't have any effect on s390, powerpc, arm64, parisc, and sh because there are two different types of WARN implementations: 1) The above mentioned architectures treat WARN() as a special case of a BUG() exception. They handle warnings in report_bug() in lib/bug.c. 2) All other architectures just call warn_slowpath_*() directly. Their warnings are handled in warn_slowpath_common() in kernel/panic.c. Support traceoff_on_warning on all architectures and prevent any future divergence by using a single common function to emit the warning. Also remove the '()' from '%pS()', because the parentheses look funky: [ 45.607629] WARNING: at /root/warn_mod/warn_mod.c:17 .init_dummy+0x20/0x40 [warn_mod]() Reported-by: Chunyu Hu <chuhu@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17include/asm-generic/atomic-long.h: force inlining of some atomic_long operationsDenys Vlasenko
Sometimes gcc mysteriously doesn't inline very small functions we expect to be inlined. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122 With this .config: http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os, atomic_long_inc(), atomic_long_dec() and atomic_long_add() functions get deinlined about 40 times. Examples of disassembly: <atomic_long_inc> (21 copies, 147 calls): 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 48 ff 07 lock incq (%rdi) 5d pop %rbp c3 retq <atomic_long_dec> (4 copies, 14 calls) is similar to inc. <atomic_long_add> (11 copies, 41 calls): 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 48 01 3e lock add %rdi,(%rsi) 5d pop %rbp c3 retq This patch fixes this via s/inline/__always_inline/. Code size decrease after the patch is ~1.3k: text data bss dec hex filename 92203657 20826112 36417536 149447305 8e86289 vmlinux 92202377 20826112 36417536 149446025 8e85d89 vmlinux4_atomiclong_after Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17bug: set warn variable before calling WARN()Steven Rostedt
This has hit me a couple of times already. I would be debugging code and the system would simply hang and then reboot. Finally, I found that the problem was caused by WARN_ON_ONCE() and friends. The macro WARN_ON_ONCE(condition) is defined as: static bool __section(.data.unlikely) __warned; int __ret_warn_once = !!(condition); if (unlikely(__ret_warn_once)) if (WARN_ON(!__warned)) __warned = true; unlikely(__ret_warn_once); Which looks great and all. But what I have hit, is an issue when WARN_ON() itself hits the same WARN_ON_ONCE() code. Because, the variable __warned is not yet set. Then it too calls WARN_ON() and that triggers the warning again. It keeps doing this until the stack is overflowed and the system crashes. By setting __warned first before calling WARN_ON() makes the original WARN_ON_ONCE() really only warn once, and not an infinite amount of times if the WARN_ON() also triggers the warning. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_rangeAneesh Kumar K.V
We remove one instace of flush_tlb_range here. That was added by commit f714f4f20e59 ("mm: numa: call MMU notifiers on THP migration"). But the pmdp_huge_clear_flush_notify should have done the require flush for us. Hence remove the extra flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17Merge tag 'tty-4.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here's the big tty/serial driver pull request for 4.6-rc1. Lots of changes in here, Peter has been on a tear again, with lots of refactoring and bugs fixes, many thanks to the great work he has been doing. Lots of driver updates and fixes as well, full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (220 commits) serial: 8250: describe CONFIG_SERIAL_8250_RSA serial: samsung: optimize UART rx fifo access routine serial: pl011: add mark/space parity support serial: sa1100: make sa1100_register_uart_fns a function tty: serial: 8250: add MOXA Smartio MUE boards support serial: 8250: convert drivers to use up_to_u8250p() serial: 8250/mediatek: fix building with SERIAL_8250=m serial: 8250/ingenic: fix building with SERIAL_8250=m serial: 8250/uniphier: fix modular build Revert "drivers/tty/serial: make 8250/8250_ingenic.c explicitly non-modular" Revert "drivers/tty/serial: make 8250/8250_mtk.c explicitly non-modular" serial: mvebu-uart: initial support for Armada-3700 serial port serial: mctrl_gpio: Add missing module license serial: ifx6x60: avoid uninitialized variable use tty/serial: at91: fix bad offset for UART timeout register tty/serial: at91: restore dynamic driver binding serial: 8250: Add hardware dependency to RT288X option TTY, devpts: document pty count limiting tty: goldfish: support platform_device with id -1 drivers: tty: goldfish: Add device tree bindings ...
2016-03-16Merge tag 'pci-v4.6-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for v4.6: Enumeration: - Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas) - Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas Resource management: - Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas) - Don't assign or reassign immutable resources (Bjorn Helgaas) - Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas) - Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas) - Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas) - ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas) - ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas) - MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas) - Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas) - Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas) - rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi) - designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi) Virtualization: - Wait for up to 1000ms after FLR reset (Alex Williamson) - Support SR-IOV on any function type (Kelly Zytaruk) - Add ACS quirk for all Cavium devices (Manish Jaggi) AER: - Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas) - Restore pci_ops pointer while calling original pci_ops (David Daney) - Fix aer_inject error codes (Jean Delvare) - Use dev_warn() in aer_inject (Jean Delvare) - Log actual error causes in aer_inject (Jean Delvare) - Log aer_inject error injections (Jean Delvare) VPD: - Prevent VPD access for buggy devices (Babu Moger) - Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas) - Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas) - Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas) - Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas) - Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas) - Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas) - Update VPD definitions (Hannes Reinecke) - Allow access to VPD attributes with size 0 (Hannes Reinecke) - Determine actual VPD size on first access (Hannes Reinecke) Generic host bridge driver: - Move structure definitions to separate header file (David Daney) - Add pci_host_common_probe(), based on gen_pci_probe() (David Daney) - Expose pci_host_common_probe() for use by other drivers (David Daney) Altera host bridge driver: - Fix altera_pcie_link_is_up() (Ley Foon Tan) Cavium ThunderX host bridge driver: - Add PCIe host driver for ThunderX processors (David Daney) - Add driver for ThunderX-pass{1,2} on-chip devices (David Daney) Freescale i.MX6 host bridge driver: - Add DT bindings to configure PHY Tx driver settings (Justin Waters) - Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach) - Move PHY reset into imx6_pcie_establish_link() (Lucas Stach) - Remove broken Gen2 workaround (Lucas Stach) - Move link up check into imx6_pcie_wait_for_link() (Lucas Stach) Freescale Layerscape host bridge driver: - Add "fsl,ls2085a-pcie" compatible ID (Yang Shi) Intel VMD host bridge driver: - Attach VMD resources to parent domain's resource tree (Jon Derrick) - Set bus resource start to 0 (Keith Busch) Microsoft Hyper-V host bridge driver: - Add fwnode_handle to x86 pci_sysdata (Jake Oshins) - Look up IRQ domain by fwnode_handle (Jake Oshins) - Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins) NVIDIA Tegra host bridge driver: - Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding) - Implement ->{add,remove}_bus() callbacks (Thierry Reding) - Remove unused struct tegra_pcie.num_ports field (Thierry Reding) - Track bus -> CPU mapping (Thierry Reding) - Remove misleading PHYS_OFFSET (Thierry Reding) Renesas R-Car host bridge driver: - Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman) Synopsys DesignWare host bridge driver: - ARC: Add PCI support (Joao Pinto) - Add generic dw_pcie_wait_for_link() (Joao Pinto) - Add default link up check if sub-driver doesn't override (Joao Pinto) - Add driver for prototyping kits based on ARC SDP (Joao Pinto) TI Keystone host bridge driver: - Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin) Xilinx AXI host bridge driver: - Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada) - Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada) - Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada) - Update Zynq binding with Microblaze node (Bharat Kumar Gogada) - microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada) Xilinx NWL host bridge driver: - Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada) Miscellaneous: - Check device_attach() return value always (Bjorn Helgaas) - Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas) - Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas) - ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas) - Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas) - Remove includes of asm/pci-bridge.h (Bjorn Helgaas) - Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas) - unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas) - Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler) - Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas) - Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa) - frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig) - Move pci_dma_* helpers to common code (Christoph Hellwig) - Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus) - Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson) - Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)" * tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits) PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition PCI: designware: Add driver for prototyping kits based on ARC SDP PCI: designware: Add default link up check if sub-driver doesn't override PCI: designware: Add generic dw_pcie_wait_for_link() PCI: Cleanup pci/pcie/Kconfig whitespace PCI: Simplify pci_create_attr() control flow PCI: Don't leak memory if sysfs_create_bin_file() fails PCI: Simplify sysfs ROM cleanup PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource MIPS: Loongson 3: Use temporary struct resource * to avoid repetition ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource ia64/PCI: Use ioremap() instead of open-coded equivalent ia64/PCI: Use temporary struct resource * to avoid repetition PCI: Clean up pci_map_rom() whitespace PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices PCI: thunder: Add PCIe host driver for ThunderX processors PCI: generic: Expose pci_host_common_probe() for use by other drivers PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe() ...
2016-03-14Merge branch 'mm-readonly-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull read-only kernel memory updates from Ingo Molnar: "This tree adds two (security related) enhancements to the kernel's handling of read-only kernel memory: - extend read-only kernel memory to a new class of formerly writable kernel data: 'post-init read-only memory' via the __ro_after_init attribute, and mark the ARM and x86 vDSO as such read-only memory. This kind of attribute can be used for data that requires a once per bootup initialization sequence, but is otherwise never modified after that point. This feature was based on the work by PaX Team and Brad Spengler. (by Kees Cook, the ARM vDSO bits by David Brown.) - make CONFIG_DEBUG_RODATA always enabled on x86 and remove the Kconfig option. This simplifies the kernel and also signals that read-only memory is the default model and a first-class citizen. (Kees Cook)" * 'mm-readonly-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ARM/vdso: Mark the vDSO code read-only after init x86/vdso: Mark the vDSO code read-only after init lkdtm: Verify that '__ro_after_init' works correctly arch: Introduce post-init read-only memory x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings asm-generic: Consolidate mark_rodata_ro()
2016-03-13ipv4: Update parameters for csum_tcpudp_magic to their original typesAlexander Duyck
This patch updates all instances of csum_tcpudp_magic and csum_tcpudp_nofold to reflect the types that are usually used as the source inputs. For example the protocol field is populated based on nexthdr which is actually an unsigned 8 bit value. The length is usually populated based on skb->len which is an unsigned integer. This addresses an issue in which the IPv6 function csum_ipv6_magic was generating a checksum using the full 32b of skb->len while csum_tcpudp_magic was only using the lower 16 bits. As a result we could run into issues when attempting to adjust the checksum as there was no protocol agnostic way to update it. With this change the value is still truncated as many architectures use "(len + proto) << 8", however this truncation only occurs for values greater than 16776960 in length and as such is unlikely to occur as we stop the inner headers at ~64K in size. I did have to make a few minor changes in the arm, mn10300, nios2, and score versions of the function in order to support these changes as they were either using things such as an OR to combine the protocol and length, or were using ntohs to convert the length which would have truncated the value. I also updated a few spots in terms of whitespace and type differences for the addresses. Most of this was just to make sure all of the definitions were in sync going forward. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07PCI: Move pci_dma_* helpers to common codeChristoph Hellwig
For a long time all architectures implement the pci_dma_* functions using the generic DMA API, and they all use the same header to do so. Move this header, pci-dma-compat.h, to include/linux and include it from the generic pci.h instead of having each arch duplicate this include. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-01asm-generic: remove old nonatomic-io wrapper filesArnd Bergmann
The two header files got moved to include/linux, and most users were already converted, this changes the remaining drivers and removes the files. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Vinod Koul <vinod.koul@intel.com> Acked-by: Simon Horman <simon.horman@netronome.com> Acked-by: Yisen Zhuang <yisen.zhuang@huawei.com>
2016-03-01asm-generic: default BUG_ON(x) to if(x)BUG()Arnd Bergmann
When CONFIG_BUG is disabled, BUG_ON() will only evaluate the condition, but will not actually stop the current thread. GCC warns about a couple of BUG_ON() users where this actually leads to further undefined behavior: include/linux/ceph/osdmap.h: In function 'ceph_can_shift_osds': include/linux/ceph/osdmap.h:54:1: warning: control reaches end of non-void function fs/ext4/inode.c: In function 'ext4_map_blocks': fs/ext4/inode.c:548:5: warning: 'retval' may be used uninitialized in this function drivers/mfd/db8500-prcmu.c: In function 'prcmu_config_clkout': drivers/mfd/db8500-prcmu.c:762:10: warning: 'div_mask' may be used uninitialized in this function drivers/mfd/db8500-prcmu.c:769:13: warning: 'mask' may be used uninitialized in this function drivers/mfd/db8500-prcmu.c:757:7: warning: 'bits' may be used uninitialized in this function drivers/tty/serial/8250/8250_core.c: In function 'univ8250_release_irq': drivers/tty/serial/8250/8250_core.c:252:18: warning: 'i' may be used uninitialized in this function drivers/tty/serial/8250/8250_core.c:235:19: note: 'i' was declared here There is an obvious conflict of interest here: on the one hand, someone who disables CONFIG_BUG() will want the kernel to be as small as possible and doesn't care about printing error messages to a console that nobody looks at. On the other hand, running into a BUG_ON() condition means that something has gone wrong, and we probably want to also stop doing things that might cause data corruption. This patch picks the second choice, and changes the NOP to BUG(), which normally stops the execution of the current thread in some form (endless loop or a trap). This follows the logic we applied in a4b5d580e078 ("bug: Make BUG() always stop the machine"). For ARM multi_v7_defconfig, the size slightly increases: section CONFIG_BUG=y CONFIG_BUG=n CONFIG_BUG=n+patch .text 8320248 | 8180944 | 8207688 .rodata 3633720 | 3567144 | 3570648 __bug_table 32508 | --- | --- __modver 692 | 1584 | 2176 .init.text 558132 | 548300 | 550088 .exit.text 12380 | 12256 | 12380 .data 1016672 | 1016064 | 1016128 Total 14622556 | 14374510 | 14407326 So instead of saving 1.70% of the total image size, we only save 1.48% by turning off CONFIG_BUG, but in return we can ensure that we don't run into cases of uninitialized variable or return code uses when something bad happens. Aside from that, we significantly reduce the number of warnings in randconfig builds, which makes it easier to fix the warnings about other problems. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-02-29locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.hDan Streetman
Move the __ARCH_SPIN_LOCK_UNLOCKED definition from qspinlock.h into qspinlock_types.h. The definition of __ARCH_SPIN_LOCK_UNLOCKED comes from the build arch's include files; but on x86 when CONFIG_QUEUED_SPINLOCKS=y, it just it's defined in asm-generic/qspinlock.h. In most cases, this doesn't matter because linux/spinlock.h includes asm/spinlock.h, which for x86 includes asm-generic/qspinlock.h. However, any code that only includes linux/mutex.h will break, because it only includes asm/spinlock_types.h. For example, this breaks systemtap, which only includes mutex.h. Signed-off-by: Dan Streetman <dan.streetman@canonical.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <Waiman.Long@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1455907767-17821-1-git-send-email-dan.streetman@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-26asm-generic: page.h: Remove useless get_user_page and free_user_pageChen Gang
They are not symmetric with each other, neither are used in real world (can not be found by grep command in source code root directory), so remove them. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-02-22arch: Introduce post-init read-only memoryKees Cook
One of the easiest ways to protect the kernel from attack is to reduce the internal attack surface exposed when a "write" flaw is available. By making as much of the kernel read-only as possible, we reduce the attack surface. Many things are written to only during __init, and never changed again. These cannot be made "const" since the compiler will do the wrong thing (we do actually need to write to them). Instead, move these items into a memory region that will be made read-only during mark_rodata_ro() which happens after all kernel __init code has finished. This introduces __ro_after_init as a way to mark such memory, and adds some documentation about the existing __read_mostly marking. This improves the security of the Linux kernel by marking formerly read-write memory regions as read-only on a fully booted up system. Based on work by PaX Team and Brad Spengler. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Brown <david.brown@linaro.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: PaX Team <pageexec@freemail.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1455748879-21872-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-20Merge tag 'powerpc-4.5-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix build error on 32-bit with checkpoint restart from Aneesh Kumar - Fix dedotify for binutils >= 2.26 from Andreas Schwab - Don't trace hcalls on offline CPUs from Denis Kirjanov - eeh: Fix stale cached primary bus from Gavin Shan - eeh: Fix stale PE primary bus from Gavin Shan - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy * tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/ioda: Set "read" permission when "write" is set powerpc/mm: Fix Multi hit ERAT cause by recent THP update powerpc/powernv: Fix stale PE primary bus powerpc/eeh: Fix stale cached primary bus powerpc/pseries: Don't trace hcalls on offline CPUs powerpc: Fix dedotify for binutils >= 2.26 powerpc/book3s_32: Fix build error with checkpoint restart
2016-02-19gpio: allow setting ARCH_NR_GPIOS from KconfigArnd Bergmann
The ARM version of asm/gpio.h basically just contains the same definitions as the gpiolib version, with the exception of ARCH_NR_GPIOS. This adds the option for overriding the constant through Kconfig to the architecture-independent header, so we can remove the ARM specific file later. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-18mm/core, x86/mm/pkeys: Differentiate instruction fetchesDave Hansen
As discussed earlier, we attempt to enforce protection keys in software. However, the code checks all faults to ensure that they are not violating protection key permissions. It was assumed that all faults are either write faults where we check PKRU[key].WD (write disable) or read faults where we check the AD (access disable) bit. But, there is a third category of faults for protection keys: instruction faults. Instruction faults never run afoul of protection keys because they do not affect instruction fetches. So, plumb the PF_INSTR bit down in to the arch_vma_access_permitted() function where we do the protection key checks. We also add a new FAULT_FLAG_INSTRUCTION. This is because handle_mm_fault() is not passed the architecture-specific error_code where we keep PF_INSTR, so we need to encode the instruction fetch information in to the arch-generic fault flags. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210224.96928009@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core: Do not enforce PKEY permissions on remote mm accessDave Hansen
We try to enforce protection keys in software the same way that we do in hardware. (See long example below). But, we only want to do this when accessing our *own* process's memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then tried to PTRACE_POKE a target process which just happened to have some mprotect_pkey(pkey=6) memory, we do *not* want to deny the debugger access to that memory. PKRU is fundamentally a thread-local structure and we do not want to enforce it on access to _another_ thread's data. This gets especially tricky when we have workqueues or other delayed-work mechanisms that might run in a random process's context. We can check that we only enforce pkeys when operating on our *own* mm, but delayed work gets performed when a random user context is active. We might end up with a situation where a delayed-work gup fails when running randomly under its "own" task but succeeds when running under another process. We want to avoid that. To avoid that, we use the new GUP flag: FOLL_REMOTE and add a fault flag: FAULT_FLAG_REMOTE. They indicate that we are walking an mm which is not guranteed to be the same as current->mm and should not be subject to protection key enforcement. Thanks to Jerome Glisse for pointing out this scenario. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: iommu@lists.linux-foundation.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keysDave Hansen
Today, for normal faults and page table walks, we check the VMA and/or PTE to ensure that it is compatible with the action. For instance, if we get a write fault on a non-writeable VMA, we SIGSEGV. We try to do the same thing for protection keys. Basically, we try to make sure that if a user does this: mprotect(ptr, size, PROT_NONE); *ptr = foo; they see the same effects with protection keys when they do this: mprotect(ptr, size, PROT_READ|PROT_WRITE); set_pkey(ptr, size, 4); wrpkru(0xffffff3f); // access disable pkey 4 *ptr = foo; The state to do that checking is in the VMA, but we also sometimes have to do it on the page tables only, like when doing a get_user_pages_fast() where we have no VMA. We add two functions and expose them to generic code: arch_pte_access_permitted(pte_flags, write) arch_vma_access_permitted(vma, write) These are, of course, backed up in x86 arch code with checks against the PTE or VMA's protection key. But, there are also cases where we do not want to respect protection keys. When we ptrace(), for instance, we do not want to apply the tracer's PKRU permissions to the PTEs from the process being traced. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20160212210219.14D5D715@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-16asm-generic: Fix local variable shadow in __set_fixmap_offsetMark Rutland
Currently __set_fixmap_offset is a macro function which has a local variable called 'addr'. If a caller passes a 'phys' parameter which is derived from a variable also called 'addr', the local variable will shadow this, and the compiler will complain about the use of an uninitialized variable. To avoid the issue with namespace clashes, 'addr' is prefixed with a liberal sprinkling of underscores. Turning __set_fixmap_offset into a static inline breaks the build for several architectures. Fixing this properly requires updates to a number of architectures to make them agree on the prototype of __set_fixmap (it could be done as a subsequent patch series). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> [catalin.marinas@arm.com: squashed the original function patch and macro fixup] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-15powerpc/mm: Fix Multi hit ERAT cause by recent THP updateAneesh Kumar K.V
With ppc64 we use the deposited pgtable_t to store the hash pte slot information. We should not withdraw the deposited pgtable_t without marking the pmd none. This ensure that low level hash fault handling will skip this huge pte and we will handle them at upper levels. Recent change to pmd splitting changed the above in order to handle the race between pmd split and exit_mmap. The race is explained below. Consider following race: CPU0 CPU1 shrink_page_list() add_to_swap() split_huge_page_to_list() __split_huge_pmd_locked() pmdp_huge_clear_flush_notify() // pmd_none() == true exit_mmap() unmap_vmas() zap_pmd_range() // no action on pmd since pmd_none() == true pmd_populate() As result the THP will not be freed. The leak is detected by check_mm(): BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512 The above required us to not mark pmd none during a pmd split. The fix for ppc is to clear the huge pte of _PAGE_USER, so that low level fault handling code skip this pte. At higher level we do take ptl lock. That should serialze us against the pmd split. Once the lock is acquired we do check the pmd again using pmd_same. That should always return false for us and hence we should retry the access. We do the pmd_same check in all case after taking plt with THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and huge_pmd_set_accessed) Also make sure we wait for irq disable section in other cpus to finish before flipping a huge pte entry with a regular pmd entry. Code paths like find_linux_pte_or_hugepte depend on irq disable to get a stable pte_t pointer. A parallel thp split need to make sure we don't convert a pmd pte to a regular pmd entry without waiting for the irq disable section to finish. Fixes: eef1b3ba053a ("thp: implement split_huge_pmd()") Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-14Merge 4.5-rc4 into tty-nextGreg Kroah-Hartman
We want the fixes in here, and this resolves a merge error in tty_io.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-06earlycon: Use common framework for earlycon declarationsPeter Hurley
Use a single common table of struct earlycon_id for both command line and devicetree. Re-define OF_EARLYCON_DECLARE() macro to instance a unique earlycon declaration (the declaration is only guaranteed to be unique within a compilation unit; separate compilation units must still use unique earlycon names). The semantics of OF_EARLYCON_DECLARE() is different; it declares an earlycon which can matched either on the command line or by devicetree. EARLYCON_DECLARE() is semantically unchanged; it declares an earlycon which is matched by command line only. Remove redundant instances of EARLYCON_DECLARE(). This enables all earlycons to properly initialize struct console with the appropriate name and index, which improves diagnostics and enables direct earlycon-to-console handoff. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-05PCI: Remove empty asm-generic/pci-bridge.hBjorn Helgaas
include/asm-generic/pci-bridge.h is empty, and nobody includes it, so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-02-05PCI: Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.hBjorn Helgaas
The PCI flag management constants and functions were previously declared in include/asm-generic/pci-bridge.h. But they are not specific to bridges, and arches did not include pci-bridge.h consistently. Move the following interfaces and related constants to include/linux/pci.h and remove pci-bridge.h: pci_set_flags() pci_add_flags() pci_clear_flags() pci_has_flag() This fixes these warnings when building for some arches: drivers/pci/host/pcie-designware.c:562:20: error: 'PCI_PROBE_ONLY' undeclared (first use in this function) drivers/pci/host/pcie-designware.c:562:7: error: implicit declaration of function 'pci_has_flag' [-Werror=implicit-function-declaration] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-02-02cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()zengtao
The datatype __kernel_time_t is u32 on 32bit platform, so its subject to overflows in the timeval/timespec to cputime conversion. Currently the following functions are affected: 1. setitimer() 2. timer_create/timer_settime() 3. sys_clock_nanosleep This can happen on MIPS32 and ARM32 with "Full dynticks CPU time accounting" enabled, which is required for CONFIG_NO_HZ_FULL. Enforce u64 conversion to prevent the overflow. Fixes: 31c1fc818715 ("ARM: Kconfig: allow full nohz CPU accounting") Signed-off-by: zengtao <prime.zeng@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: <fweisbec@gmail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1454384314-154784-1-git-send-email-prime.zeng@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-21Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge third patch-bomb from Andrew Morton: "I'm pretty much done for -rc1 now: - the rest of MM, basically - lib/ updates - checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit - cpu_mask simplifications - kexec, rapidio, MAINTAINERS etc, etc. - more dma-mapping cleanups/simplifications from hch" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits) MAINTAINERS: add/fix git URLs for various subsystems mm: memcontrol: add "sock" to cgroup2 memory.stat mm: memcontrol: basic memory statistics in cgroup2 memory controller mm: memcontrol: do not uncharge old page in page cache replacement Documentation: cgroup: add memory.swap.{current,max} description mm: free swap cache aggressively if memcg swap is full mm: vmscan: do not scan anon pages if memcg swap limit is hit swap.h: move memcg related stuff to the end of the file mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online mm: vmscan: pass memcg to get_scan_count() mm: memcontrol: charge swap to cgroup2 mm: memcontrol: clean up alloc, online, offline, free functions mm: memcontrol: flatten struct cg_proto mm: memcontrol: rein in the CONFIG space madness net: drop tcp_memcontrol.c mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM mm: memcontrol: allow to disable kmem accounting for cgroup2 mm: memcontrol: account "kmem" consumers in cgroup2 memory controller mm: memcontrol: move kmem accounting code to CONFIG_MEMCG mm: memcontrol: separate kmem code from legacy tcp accounting code ...
2016-01-20Merge tag 'asm-generic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "The asm-generic tree this time contains one series from Nicolas Pitre that makes the optimized do_div() implementation from the ARM architecture available to all architectures. This also adds stricter type checking for callers of do_div, which has uncovered a number of bugs in existing code, and fixes up the ones we have found" * tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: ARM: asm/div64.h: adjust to generic codde __div64_32(): make it overridable at compile time __div64_const32(): abstract out the actual 128-bit cross product code do_div(): generic optimization for constant divisor on 32-bit machines div64.h: optimize do_div() for power-of-two constant divisors mtd/sm_ftl.c: fix wrong do_div() usage drm/mgag200/mgag200_mode.c: fix wrong do_div() usage hid-sensor-hub.c: fix wrong do_div() usage ti/fapll: fix wrong do_div() usage ti/clkt_dpll: fix wrong do_div() usage tegra/clk-divider: fix wrong do_div() usage imx/clk-pllv2: fix wrong do_div() usage imx/clk-pllv1: fix wrong do_div() usage nouveau/nvkm/subdev/clk/gk20a.c: fix wrong do_div() usage
2016-01-20dma-mapping: remove <asm-generic/dma-coherent.h>Christoph Hellwig
This wasn't an asm-generic header to start with, and can be merged into dma-mapping.h trivially. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20dma-mapping: always provide the dma_map_ops based implementationChristoph Hellwig
Move the generic implementation to <linux/dma-mapping.h> now that all architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now that everyone supports them. [valentinrothberg@gmail.com: remove leftovers in Kconfig] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio barrier rework+fixes from Michael Tsirkin: "This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits) checkpatch: add virt barriers checkpatch: check for __smp outside barrier.h checkpatch.pl: add missing memory barriers virtio: make find_vqs() checkpatch.pl-friendly virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak s390: more efficient smp barriers s390: use generic memory barriers xen/events: use virt_xxx barriers xen/io: use virt_xxx barriers xenbus: use virt_xxx barriers virtio_ring: use virt_store_mb sh: move xchg_cmpxchg to a header by itself sh: support 1 and 2 byte xchg virtio_ring: update weak barriers to use virt_xxx Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb" asm-generic: implement virt_xxx memory barriers x86: define __smp_xxx xtensa: define __smp_xxx tile: define __smp_xxx ...
2016-01-16asm/sections: add helpers to check for section dataThierry Reding
Add a helper to check if an object (given an address and a size) is part of a section (given beginning and end addresses). For convenience, also provide a helper that performs this check for __init data using the __init_begin and __init_end limits. Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, dax: convert vmf_insert_pfn_pmd() to pfn_tDan Williams
Similar to the conversion of vm_insert_mixed() use pfn_t in the vmf_insert_pfn_pmd() to tag the resulting pte with _PAGE_DEVICE when the pfn is backed by a devm_memremap_pages() mapping. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, thp: remove infrastructure for handling splitting PMDsKirill A. Shutemov
With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14mm: add PHYS_PFN, use it in __phys_to_pfn()Chen Gang
__phys_to_pfn and __pfn_to_phys are symmetric, PHYS_PFN and PFN_PHYS are semmetric: - y = (phys_addr_t)x << PAGE_SHIFT - y >> PAGE_SHIFT = (phys_add_t)x - (unsigned long)(y >> PAGE_SHIFT) = x [akpm@linux-foundation.org: use macro arg name `x'] [arnd@arndb.de: include linux/pfn.h for PHYS_PFN definition] Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-12asm-generic: implement virt_xxx memory barriersMichael S. Tsirkin
Guests running within virtual machines might be affected by SMP effects even if the guest itself is compiled without SMP support. This is an artifact of interfacing with an SMP host while running an UP kernel. Using mandatory barriers for this use-case would be possible but is often suboptimal. In particular, virtio uses a bunch of confusing ifdefs to work around this, while xen just uses the mandatory barriers. To better handle this case, low-level virt_mb() etc macros are made available. These are implemented trivially using the low-level __smp_xxx macros, the purpose of these wrappers is to annotate those specific cases. These have the same effect as smp_mb() etc when SMP is enabled, but generate identical code for SMP and non-SMP systems. For example, virtual machine guests should use virt_mb() rather than smp_mb() when synchronizing against a (possibly SMP) host. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12asm-generic: add __smp_xxx wrappersMichael S. Tsirkin
On !SMP, most architectures define their barriers as compiler barriers. On SMP, most need an actual barrier. Make it possible to remove the code duplication for !SMP by defining low-level __smp_xxx barriers which do not depend on the value of SMP, then use them from asm-generic conditionally. Besides reducing code duplication, these low level APIs will also be useful for virtualization, where a barrier is sometimes needed even if !SMP since we might be talking to another kernel on the same SMP system. Both virtio and Xen drivers will benefit. The smp_xxx variants should use __smp_XXX ones or barrier() depending on SMP, identically for all architectures. We keep ifndef guards around them for now - once/if all architectures are converted to use the generic code, we'll be able to remove these. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12asm-generic: guard smp_store_release/load_acquireMichael S. Tsirkin
Allow architectures to override smp_store_release and smp_load_acquire by guarding the defines in asm-generic/barrier.h with ifndef directives. This is in preparation to reusing asm-generic/barrier.h on architectures which have their own definition of these macros. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-01-12lcoking/barriers, arch: Use smp barriers in smp_store_release()Davidlohr Bueso
With commit b92b8b35a2e ("locking/arch: Rename set_mb() to smp_store_mb()") it was made clear that the context of this call (and thus set_mb) is strictly for CPU ordering, as opposed to IO. As such all archs should use the smp variant of mb(), respecting the semantics and saving a mandatory barrier on UP. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-01-11Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - make the debugfs 'kernel_page_tables' file read-only, as it only has read ops. (Borislav Petkov) - micro-optimize clflush_cache_range() (Chris Wilson) - swiotlb enhancements, which fixes certain KVM emulated devices (Igor Mammedov) - fix an LDT related debug message (Jan Beulich) - modularize CONFIG_X86_PTDUMP (Kees Cook) - tone down an overly alarming warning (Laura Abbott) - Mark variable __initdata (Rasmus Villemoes) - PAT additions (Toshi Kani)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Micro-optimise clflush_cache_range() x86/mm/pat: Change free_memtype() to support shrinking case x86/mm/pat: Add untrack_pfn_moved for mremap x86/mm: Drop WARN from multi-BAR check x86/LDT: Print the real LDT base address x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN x86/mm: Introduce max_possible_pfn x86/mm/ptdump: Make (debugfs)/kernel_page_tables read-only x86/mm/mtrr: Mark the 'range_new' static variable in mtrr_calc_range_state() as __initdata x86/mm: Turn CONFIG_X86_PTDUMP into a module
2016-01-11Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "So we have a laundry list of locking subsystem changes: - continuing barrier API and code improvements - futex enhancements - atomics API improvements - pvqspinlock enhancements: in particular lock stealing and adaptive spinning - qspinlock micro-enhancements" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op futex: Cleanup the goto confusion in requeue_pi() futex: Remove pointless put_pi_state calls in requeue() futex: Document pi_state refcounting in requeue code futex: Rename free_pi_state() to put_pi_state() futex: Drop refcount if requeue_pi() acquired the rtmutex locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation lcoking/barriers, arch: Use smp barriers in smp_store_release() locking/cmpxchg, arch: Remove tas() definitions locking/pvqspinlock: Queue node adaptive spinning locking/pvqspinlock: Allow limited lock stealing locking/pvqspinlock: Collect slowpath lock statistics sched/core, locking: Document Program-Order guarantees locking, sched: Introduce smp_cond_acquire() and use it locking/pvqspinlock, x86: Optimize the PV unlock code path locking/qspinlock: Avoid redundant read of next pointer locking/qspinlock: Prefetch the next node cacheline locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg() atomics: Add test for atomic operations with _relaxed variants