summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2018-03-03Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A 4.16 regression fix, three fixes for -stable, and a cleanup fix: - During the merge window support for the new ACPI NVDIMM Platform Capabilities structure disabled support for "deep flush", a force-unit- access like mechanism for persistent memory. Restore that mechanism. - VFIO like RDMA is yet one more memory registration / pinning interface that is incompatible with Filesystem-DAX. Disable long term pins of Filesystem-DAX mappings via VFIO. - The Filesystem-DAX detection to prevent long terms pins mistakenly also disabled Device-DAX pins which are not subject to the same block- map collision concerns. - Similar to the setup path, softlockup warnings can trigger in the shutdown path for large persistent memory namespaces. Teach for_each_device_pfn() to perform cond_resched() in all cases. - Boaz noticed that the might_sleep() in dax_direct_access() is stale as of the v4.15 kernel. These have received a build success notification from the 0day robot, and the longterm pin fixes have appeared in -next. However, I recently rebased the tree to remove some other fixes that need to be reworked after review feedback. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: memremap: fix softlockup reports at teardown libnvdimm: re-enable deep flush for pmem devices via fsync() vfio: disable filesystem-dax page pinning dax: fix vma_is_fsdax() helper dax: ->direct_access does not sleep anymore
2018-03-02Merge tag 'for-linus-20180302' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A collection of fixes for this series. This is a little larger than usual at this time, but that's mainly because I was out on vacation last week. Nothing in here is major in any way, it's just two weeks of fixes. This contains: - NVMe pull from Keith, with a set of fixes from the usual suspects. - mq-deadline zone unlock fix from Damien, fixing an issue with the SMR zone locking added for 4.16. - two bcache fixes sent in by Michael, with changes from Coly and Tang. - comment typo fix from Eric for blktrace. - return-value error handling fix for nbd, from Gustavo. - fix a direct-io case where we don't defer to a completion handler, making us sleep from IRQ device completion. From Jan. - a small series from Jan fixing up holes around handling of bdev references. - small set of regression fixes from Jiufei, mostly fixing problems around the gendisk pointer -> partition index change. - regression fix from Ming, fixing a boundary issue with the discard page cache invalidation. - two-patch series from Ming, fixing both a core blk-mq-sched and kyber issue around token freeing on a requeue condition" * tag 'for-linus-20180302' of git://git.kernel.dk/linux-block: (24 commits) block: fix a typo block: display the correct diskname for bio block: fix the count of PGPGOUT for WRITE_SAME mq-deadline: Make sure to always unlock zones nvmet: fix PSDT field check in command format nvme-multipath: fix sysfs dangerously created links nbd: fix return value in error handling path bcache: fix kcrashes with fio in RAID5 backend dev bcache: correct flash only vols (check all uuids) blktrace_api.h: fix comment for struct blk_user_trace_setup blockdev: Avoid two active bdev inodes for one device genhd: Fix BUG in blkdev_open() genhd: Fix use after free in __blkdev_get() genhd: Add helper put_disk_and_module() genhd: Rename get_disk() to get_disk_and_module() genhd: Fix leaked module reference for NVME devices direct-io: Fix sleep in atomic due to sync AIO nvme-pci: Fix nvme queue cleanup if IRQ setup fails block: kyber: fix domain token leak during requeue blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch ...
2018-03-01block: display the correct diskname for bioJiufei Xue
bio_devname use __bdevname to display the device name, and can only show the major and minor of the part0, Fix this by using disk_name to display the correct name. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-26dax: fix vma_is_fsdax() helperDan Williams
Gerd reports that ->i_mode may contain other bits besides S_IFCHR. Use S_ISCHR() instead. Otherwise, get_user_pages_longterm() may fail on device-dax instances when those are meant to be explicitly allowed. Fixes: 2bb6d2837083 ("mm: introduce get_user_pages_longterm") Cc: <stable@vger.kernel.org> Reported-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Jane Chu <jane.chu@oracle.com> Reported-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-02-26Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Yet another pile of melted spectrum related changes: - sanitize the array_index_nospec protection mechanism: Remove the overengineered array_index_nospec_mask_check() magic and allow const-qualified types as index to avoid temporary storage in a non-const local variable. - make the microcode loader more robust by properly propagating error codes. Provide information about new feature bits after micro code was updated so administrators can act upon. - optimizations of the entry ASM code which reduce code footprint and make the code simpler and faster. - fix the {pmd,pud}_{set,clear}_flags() implementations to work properly on paravirt kernels by removing the address translation operations. - revert the harmful vmexit_fill_RSB() optimization - use IBRS around firmware calls - teach objtool about retpolines and add annotations for indirect jumps and calls. - explicitly disable jumplabel patching in __init code and handle patching failures properly instead of silently ignoring them. - remove indirect paravirt calls for writing the speculation control MSR as these calls are obviously proving the same attack vector which is tried to be mitigated. - a few small fixes which address build issues with recent compiler and assembler versions" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() KVM/x86: Remove indirect MSR op calls from SPEC_CTRL objtool, retpolines: Integrate objtool with retpoline support more closely x86/entry/64: Simplify ENCODE_FRAME_POINTER extable: Make init_kernel_text() global jump_label: Warn on failed jump_label patching attempt jump_label: Explicitly disable jump labels in __init code x86/entry/64: Open-code switch_to_thread_stack() x86/entry/64: Move ASM_CLAC to interrupt_entry() x86/entry/64: Remove 'interrupt' macro x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry() x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP objtool: Add module specific retpoline rules objtool: Add retpoline validation objtool: Use existing global variables for options x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute() x86/boot, objtool: Annotate indirect jump in secondary_startup_64() x86/paravirt, objtool: Annotate indirect calls ...
2018-02-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "s390: - optimization for the exitless interrupt support that was merged in 4.16-rc1 - improve the branch prediction blocking for nested KVM - replace some jump tables with switch statements to improve expoline performance - fixes for multiple epoch facility ARM: - fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs - make sure we can build 32-bit KVM/ARM with gcc-8. x86: - fixes for AMD SEV - fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup - fixes for async page fault migration - small optimization to PV TLB flush (new in 4.16-rc1) - syzkaller fixes Generic: - compiler warning fixes - syzkaller fixes - more improvements to the kvm_stat tool Two more small Spectre fixes are going to reach you via Ingo" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (40 commits) KVM: SVM: Fix SEV LAUNCH_SECRET command KVM: SVM: install RSM intercept KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command include: psp-sev: Capitalize invalid length enum crypto: ccp: Fix sparse, use plain integer as NULL pointer KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled x86/kvm: Make parse_no_xxx __init for kvm KVM: x86: fix backward migration with async_PF kvm: fix warning for non-x86 builds kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds tools/kvm_stat: print 'Total' line for multiple events only tools/kvm_stat: group child events indented after parent tools/kvm_stat: separate drilldown and fields filtering tools/kvm_stat: eliminate extra guest/pid selection dialog tools/kvm_stat: mark private methods as such tools/kvm_stat: fix debugfs handling tools/kvm_stat: print error on invalid regex tools/kvm_stat: fix crash when filtering out all non-child trace events tools/kvm_stat: avoid 'is' for equality checks tools/kvm_stat: use a more pythonic way to iterate over dictionaries ...
2018-02-26genhd: Fix BUG in blkdev_open()Jan Kara
When two blkdev_open() calls for a partition race with device removal and recreation, we can hit BUG_ON(!bd_may_claim(bdev, whole, holder)) in blkdev_open(). The race can happen as follows: CPU0 CPU1 CPU2 del_gendisk() bdev_unhash_inode(part1); blkdev_open(part1, O_EXCL) blkdev_open(part1, O_EXCL) bdev = bd_acquire() bdev = bd_acquire() blkdev_get(bdev) bd_start_claiming(bdev) - finds old inode 'whole' bd_prepare_to_claim() -> 0 bdev_unhash_inode(whole); <device removed> <new device under same number created> blkdev_get(bdev); bd_start_claiming(bdev) - finds new inode 'whole' bd_prepare_to_claim() - this also succeeds as we have different 'whole' here... - bad things happen now as we have two exclusive openers of the same bdev The problem here is that block device opens can see various intermediate states while gendisk is shutting down and then being recreated. We fix the problem by introducing new lookup_sem in gendisk that synchronizes gendisk deletion with get_gendisk() and furthermore by making sure that get_gendisk() does not return gendisk that is being (or has been) deleted. This makes sure that once we ever manage to look up newly created bdev inode, we are also guaranteed that following get_gendisk() will either return failure (and we fail open) or it returns gendisk for the new device and following bdget_disk() will return new bdev inode (i.e., blkdev_open() follows the path as if it is completely run after new device is created). Reported-and-analyzed-by: Hou Tao <houtao1@huawei.com> Tested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-26genhd: Add helper put_disk_and_module()Jan Kara
Add a proper counterpart to get_disk_and_module() - put_disk_and_module(). Currently it is opencoded in several places. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-26genhd: Rename get_disk() to get_disk_and_module()Jan Kara
Rename get_disk() to get_disk_and_module() to make sure what the function does. It's not a great name but at least it is now clear that put_disk() is not it's counterpart. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-25Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Three patches to fix memory ordering issues on ALPHA and a comment to clarify the usage scope of a mutex internal function" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs locking/xchg/alpha: Clean up barrier usage by using smp_mb() in place of __ASM__MB locking/xchg/alpha: Add unconditional memory barrier to cmpxchg() locking/mutex: Add comment to __mutex_owner() to deter usage
2018-02-24kvm: fix warning for non-x86 buildsSebastian Ott
Fix the following sparse warning by moving the prototype of kvm_arch_mmu_notifier_invalidate_range() to linux/kvm_host.h . CHECK arch/s390/kvm/../../../virt/kvm/kvm_main.c arch/s390/kvm/../../../virt/kvm/kvm_main.c:138:13: warning: symbol 'kvm_arch_mmu_notifier_invalidate_range' was not declared. Should it be static? Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-24kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD buildsSebastian Ott
Move the kvm_arch_irq_routing_update() prototype outside of ifdef CONFIG_HAVE_KVM_EVENTFD guards to fix the following sparse warning: arch/s390/kvm/../../../virt/kvm/irqchip.c:171:28: warning: symbol 'kvm_arch_irq_routing_update' was not declared. Should it be static? Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-23Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "arm64 and perf fixes: - build error when accessing MPIDR_HWID_BITMASK from .S - fix CTR_EL0 field definitions - remove/disable some kernel messages on user faults (unhandled signals, unimplemented syscalls) - fix kernel page fault in unwind_frame() with function graph tracing - fix perf sleeping while atomic errors when booting with ACPI" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fix unwind_frame() for filtered out fn for function graph tracing arm64: Enforce BBM for huge IO/VMAP mappings arm64: perf: correct PMUVer probing arm_pmu: acpi: request IRQs up-front arm_pmu: note IRQs and PMUs per-cpu arm_pmu: explicitly enable/disable SPIs at hotplug arm_pmu: acpi: check for mismatched PPIs arm_pmu: add armpmu_alloc_atomic() arm_pmu: fold platform helpers into platform code arm_pmu: kill arm_pmu_platdata ARM: ux500: remove PMU IRQ bouncer arm64: __show_regs: Only resolve kernel symbols when running at EL1 arm64: Remove unimplemented syscall log message arm64: Disable unhandled signal log messages by default arm64: cpufeature: Fix CTR_EL0 field definitions arm64: uaccess: Formalise types for access_ok() arm64: Fix compilation error while accessing MPIDR_HWID_BITMASK from .S files
2018-02-23Merge tag 'drm-fixes-for-v4.16-rc3' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A bunch of fixes for rc3: Exynos: - fixes for using monotonic timestamps - register definitions - removal of unused file ipu-v3L - minor changes - make some register arrays const+static - fix some leaks meson: - fix for vsync atomic: - fix for memory leak EDID parser: - add quirks for some more non-desktop devices - 6-bit panel fix. drm_mm: - fix a bug in the core drm mm hole handling cirrus: - fix lut loading regression Lastly there is a deadlock fix around runtime suspend for secondary GPUs. There was a deadlock between one thread trying to wait for a workqueue job to finish in the runtime suspend path, and the workqueue job it was waiting for in turn waiting for a runtime_get_sync to return. The fixes avoids it by not doing the runtime sync in the workqueue as then we always wait for all those tasks to complete before we runtime suspend" * tag 'drm-fixes-for-v4.16-rc3' of git://people.freedesktop.org/~airlied/linux: (25 commits) drm/tve200: fix kernel-doc documentation comment include drm/edid: quirk Sony PlayStation VR headset as non-desktop drm/edid: quirk Windows Mixed Reality headsets as non-desktop drm/edid: quirk Oculus Rift headsets as non-desktop drm/meson: fix vsync buffer update drm: Handle unexpected holes in color-eviction drm: exynos: Use proper macro definition for HDMI_I2S_PIN_SEL_1 drm/exynos: remove exynos_drm_rotator.h drm/exynos: g2d: Delete an error message for a failed memory allocation in two functions drm/exynos: fix comparison to bitshift when dealing with a mask drm/exynos: g2d: use monotonic timestamps drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA gpu: ipu-csi: add 10/12-bit grayscale support to mbus_code_to_bus_cfg gpu: ipu-cpmem: add 16-bit grayscale support to ipu_cpmem_set_image gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle drm/amdgpu: Fix deadlock on runtime suspend drm/radeon: Fix deadlock on runtime suspend drm/nouveau: Fix deadlock on runtime suspend drm: Allow determining if current task is output poll worker ...
2018-02-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: don't defer struct page initialization for Xen pv guests lib/Kconfig.debug: enable RUNTIME_TESTING_MENU vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems selftests/memfd: add run_fuse_test.sh to TEST_FILES bug.h: work around GCC PR82365 in BUG() mm/swap.c: make functions and their kernel-doc agree (again) mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc ida: do zeroing in ida_pre_get() mm, swap, frontswap: fix THP swap if frontswap enabled certs/blacklist_nohashes.c: fix const confusion in certs blacklist kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE mm, mlock, vmscan: no more skipping pagevecs mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats Kbuild: always define endianess in kconfig.h include/linux/sched/mm.h: re-inline mmdrop() tools: fix cross-compile var clobbering
2018-02-22efivarfs: Limit the rate for non-root to read filesLuck, Tony
Each read from a file in efivarfs results in two calls to EFI (one to get the file size, another to get the actual data). On X86 these EFI calls result in broadcast system management interrupts (SMI) which affect performance of the whole system. A malicious user can loop performing reads from efivarfs bringing the system to its knees. Linus suggested per-user rate limit to solve this. So we add a ratelimit structure to "user_struct" and initialize it for the root user for no limit. When allocating user_struct for other users we set the limit to 100 per second. This could be used for other places that want to limit the rate of some detrimental user action. In efivarfs if the limit is exceeded when reading, we take an interruptible nap for 50ms and check the rate limit again. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-22kconfig.h: Include compiler types to avoid missed struct attributesKees Cook
The header files for some structures could get included in such a way that struct attributes (specifically __randomize_layout from path.h) would be parsed as variable names instead of attributes. This could lead to some instances of a structure being unrandomized, causing nasty GPFs, etc. This patch makes sure the compiler_types.h header is included in kconfig.h so that we've always got types and struct attributes defined, since kconfig.h is included from the compiler command line. Reported-by: Patrick McLean <chutzpah@gentoo.org> Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21bug.h: work around GCC PR82365 in BUG()Arnd Bergmann
Looking at functions with large stack frames across all architectures led me discovering that BUG() suffers from the same problem as fortify_panic(), which I've added a workaround for already. In short, variables that go out of scope by calling a noreturn function or __builtin_unreachable() keep using stack space in functions afterwards. A workaround that was identified is to insert an empty assembler statement just before calling the function that doesn't return. I'm adding a macro "barrier_before_unreachable()" to document this, and insert calls to that in all instances of BUG() that currently suffer from this problem. The files that saw the largest change from this had these frame sizes before, and much less with my patch: fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] In case of ARC and CRIS, it turns out that the BUG() implementation actually does return (or at least the compiler thinks it does), resulting in lots of warnings about uninitialized variable use and leaving noreturn functions, such as: block/cfq-iosched.c: In function 'cfq_async_queue_prio': block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] include/linux/dmaengine.h: In function 'dma_maxpq': include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] This makes them call __builtin_trap() instead, which should normally dump the stack and kill the current process, like some of the other architectures already do. I tried adding barrier_before_unreachable() to panic() and fortify_panic() as well, but that had very little effect, so I'm not submitting that patch. Vineet said: : For ARC, it is double win. : : 1. Fixes 3 -Wreturn-type warnings : : | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function : [-Wreturn-type] : | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function : [-Wreturn-type] : | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of : non-void function [-Wreturn-type] : : 2. bloat-o-meter reports code size improvements as gcc elides the : generated code for stack return. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc] Tested-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc] Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christopher Li <sparse@chrisli.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21mm, mlock, vmscan: no more skipping pagevecsShakeel Butt
When a thread mlocks an address space backed either by file pages which are currently not present in memory or swapped out anon pages (not in swapcache), a new page is allocated and added to the local pagevec (lru_add_pvec), I/O is triggered and the thread then sleeps on the page. On I/O completion, the thread can wake on a different CPU, the mlock syscall will then sets the PageMlocked() bit of the page but will not be able to put that page in unevictable LRU as the page is on the pagevec of a different CPU. Even on drain, that page will go to evictable LRU because the PageMlocked() bit is not checked on pagevec drain. The page will eventually go to right LRU on reclaim but the LRU stats will remain skewed for a long time. This patch puts all the pages, even unevictable, to the pagevecs and on the drain, the pages will be added on their LRUs correctly by checking their evictability. This resolves the mlocked pages on pagevec of other CPUs issue because when those pagevecs will be drained, the mlocked file pages will go to unevictable LRU. Also this makes the race with munlock easier to resolve because the pagevec drains happen in LRU lock. However there is still one place which makes a page evictable and does PageLRU check on that page without LRU lock and needs special attention. TestClearPageMlocked() and isolate_lru_page() in clear_page_mlock(). #0: __pagevec_lru_add_fn #1: clear_page_mlock SetPageLRU() if (!TestClearPageMlocked()) return smp_mb() // <--required // inside does PageLRU if (!PageMlocked()) if (isolate_lru_page()) move to evictable LRU putback_lru_page() else move to unevictable LRU In '#1', TestClearPageMlocked() provides full memory barrier semantics and thus the PageLRU check (inside isolate_lru_page) can not be reordered before it. In '#0', without explicit memory barrier, the PageMlocked() check can be reordered before SetPageLRU(). If that happens, '#0' can put a page in unevictable LRU and '#1' might have just cleared the Mlocked bit of that page but fails to isolate as PageLRU fails as '#0' still hasn't set PageLRU bit of that page. That page will be stranded on the unevictable LRU. There is one (good) side effect though. Without this patch, the pages allocated for System V shared memory segment are added to evictable LRUs even after shmctl(SHM_LOCK) on that segment. This patch will correctly put such pages to unevictable LRU. Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Shaohua Li <shli@fb.com> Cc: Jan Kara <jack@suse.cz> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21mm: memcontrol: fix NR_WRITEBACK leak in memcg and system statsJohannes Weiner
After commit a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting"), we observed slowly upward creeping NR_WRITEBACK counts over the course of several days, both the per-memcg stats as well as the system counter in e.g. /proc/meminfo. The conversion from full per-cpu stat counts to per-cpu cached atomic stat counts introduced an irq-unsafe RMW operation into the updates. Most stat updates come from process context, but one notable exception is the NR_WRITEBACK counter. While writebacks are issued from process context, they are retired from (soft)irq context. When writeback completions interrupt the RMW counter updates of new writebacks being issued, the decs from the completions are lost. Since the global updates are routed through the joint lruvec API, both the memcg counters as well as the system counters are affected. This patch makes the joint stat and event API irq safe. Link: http://lkml.kernel.org/r/20180203082353.17284-1-hannes@cmpxchg.org Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Debugged-by: Tejun Heo <tj@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21Kbuild: always define endianess in kconfig.hArnd Bergmann
Build testing with LTO found a couple of files that get compiled differently depending on whether asm/byteorder.h gets included early enough or not. In particular, include/asm-generic/qrwlock_types.h is affected by this, but there are probably others as well. The symptom is a series of LTO link time warnings, including these: net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch] int netlbl_unlhsh_add(struct net *net, net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch] ipv6_renew_options_kern(struct sock *sk, net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here struct net_device *dev_get_by_name_rcu(struct net *net, const char *name) net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch] i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch] ssize_t debugfs_attr_read(struct file *file, char __user *buf, fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch] void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock); kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch] int simple_attr_open(struct inode *inode, struct file *file, fs/libfs.c:795: note: 'simple_attr_open' was previously declared here All of the above are caused by include/asm-generic/qrwlock_types.h failing to include asm/byteorder.h after commit e0d02285f16e ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'") in linux-4.15. Similar bugs may or may not exist in older kernels as well, but there is no easy way to test those with link-time optimizations, and kernels before 4.14 are harder to fix because they don't have Babu's patch series We had similar issues with CONFIG_ symbols in the past and ended up always including the configuration headers though linux/kconfig.h. This works around the issue through that same file, defining either __BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN, which is now always set on all architectures since commit 4c97a0c8fee3 ("arch: define CPU_BIG_ENDIAN for all fixed big endian archs"). Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21include/linux/sched/mm.h: re-inline mmdrop()Andrew Morton
As Peter points out, Doing a CALL+RET for just the decrement is a bit silly. Fixes: d70f2a14b72a4bc ("include/linux/sched/mm.h: uninline mmdrop_async(), etc") Acked-by: Peter Zijlstra (Intel) <peterz@infraded.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-22Merge tag 'drm-misc-fixes-2018-02-21' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Fixes for 4.16. I contains fixes for deadlock on runtime suspend on few drivers, a memory leak on non-blocking commits, a crash on color-eviction. The is also meson and edid fixes, plus a fix for a doc warning. * tag 'drm-misc-fixes-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc: drm/tve200: fix kernel-doc documentation comment include drm/meson: fix vsync buffer update drm: Handle unexpected holes in color-eviction drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA drm/amdgpu: Fix deadlock on runtime suspend drm/radeon: Fix deadlock on runtime suspend drm/nouveau: Fix deadlock on runtime suspend drm: Allow determining if current task is output poll worker workqueue: Allow retrieval of current task's work struct drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits
2018-02-21extable: Make init_kernel_text() globalJosh Poimboeuf
Convert init_kernel_text() to a global function and use it in a few places instead of manually comparing _sinittext and _einittext. Note that kallsyms.h has a very similar function called is_kernel_inittext(), but its end check is inclusive. I'm not sure whether that's intentional behavior, so I didn't touch it. Suggested-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4335d02be8d45ca7d265d2f174251d0b7ee6c5fd.1519051220.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21jump_label: Explicitly disable jump labels in __init codeJosh Poimboeuf
After initmem has been freed, any jump labels in __init code are prevented from being written to by the kernel_text_address() check in __jump_label_update(). However, this check is quite broad. If kernel_text_address() were to return false for any other reason, the jump label write would fail silently with no warning. For jump labels in module init code, entry->code is set to zero to indicate that the entry is disabled. Do the same thing for core kernel init code. This makes the behavior more consistent, and will also make it more straightforward to detect non-init jump label write failures in the next patch. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21locking/mutex: Add comment to __mutex_owner() to deter usagePeter Zijlstra
Attempt to deter usage, this is not a public interface. It is entirely possible to implement a conformant mutex without having this owner field (in fact, we used to have that). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-20arm_pmu: acpi: request IRQs up-frontMark Rutland
We can't request IRQs in atomic context, so for ACPI systems we'll have to request them up-front, and later associate them with CPUs. This patch reorganises the arm_pmu code to do so. As we no longer have the arm_pmu structure at probe time, a number of prototypes need to be adjusted, requiring changes to the common arm_pmu code and arm_pmu platform code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: note IRQs and PMUs per-cpuMark Rutland
To support ACPI systems, we need to request IRQs before we know the associated PMU, and thus we need some percpu variable that the IRQ handler can find the PMU from. As we're going to request IRQs without the PMU, we can't rely on the arm_pmu::active_irqs mask, and similarly need to track requested IRQs with a percpu variable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [will: made armpmu_count_irq_users static] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: add armpmu_alloc_atomic()Mark Rutland
In ACPI systems, we don't know the makeup of CPUs until we hotplug them on, and thus have to allocate the PMU datastructures at hotplug time. Thus, we must use GFP_ATOMIC allocations. Let's add an armpmu_alloc_atomic() that we can use in this case. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: fold platform helpers into platform codeMark Rutland
The armpmu_{request,free}_irqs() helpers are only used by arm_pmu_platform.c, so let's fold them in and make them static. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: kill arm_pmu_platdataMark Rutland
Now that we have no platforms passing platform data to the arm_pmu code, we can get rid of the platdata and associated hooks, paving the way for rework of our IRQ handling. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20x86/retpoline: Support retpoline builds with ClangDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Prevent index integer overflow in ptr_ring, from Jason Wang. 2) Program mvpp2 multicast filter properly, from Mikulas Patocka. 3) The bridge brport attribute file is write only and doesn't have a ->show() method, don't blindly invoke it. From Xin Long. 4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil. 5) Fix multiple definition issue with if_ether.h UAPI header, from Hauke Mehrtens. 6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini Varadhan. 7) Revert XDP redirect support from thunderx driver, it is not implemented properly. From Jesper Dangaard Brouer. 8) Fix missing RTNL protection across some tipc operations, from Ying Xue. 9) Return the correct IV bytes in the TLS getsockopt code, from Boris Pismenny. 10) Take tclassid into consideration properly when doing FIB rule matching. From Stefano Brivio. 11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom. 12) TUN driver doesn't align frags properly, and we can end up doing unaligned atomics on misaligned metadata. From Eric Dumazet. 13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from Subash Abhinov Kasiviswanathan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits) tg3: APE heartbeat changes mlxsw: spectrum_router: Do not unconditionally clear route offload indication net: qualcomm: rmnet: Fix possible null dereference in command processing net: qualcomm: rmnet: Fix warning seen with 64 bit stats net: qualcomm: rmnet: Fix crash on real dev unregistration sctp: remove the left unnecessary check for chunk in sctp_renege_events rxrpc: Work around usercopy check tun: fix tun_napi_alloc_frags() frag allocator udplite: fix partial checksum initialization skbuff: Fix comment mis-spelling. dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock PCI/cxgb4: Extend T3 PCI quirk to T4+ devices cxgb4: fix trailing zero in CIM LA dump cxgb4: free up resources of pf 0-3 fib_semantics: Don't match route with mismatching tclassid NFC: llcp: Limit size of SDP URI tls: getsockopt return record sequence number tls: reset the crypto info if copy_from_user fails tls: retrun the correct IV in getsockopt docs: segmentation-offloads.txt: add SCTP info ...
2018-02-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fix from Thomas Gleixner: "A small fix which adds the missing for_each_cpu_wrap() stub for the UP case to avoid build failures" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpumask: Make for_each_cpu_wrap() available on UP as well
2018-02-17Merge tag 'for-linus-20180217' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request from Keith, with fixes all over the map for nvme. From various folks. - Classic polling fix, that avoids a latency issue where we still end up waiting for an interrupt in some cases. From Nitesh Shetty. - Comment typo fix from Minwoo Im. * tag 'for-linus-20180217' of git://git.kernel.dk/linux-block: block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS nvme-rdma: fix sysfs invoked reset_ctrl error flow nvmet: Change return code of discard command if not supported nvme-pci: Fix timeouts in connecting state nvme-pci: Remap CMB SQ entries on every controller reset nvme: fix the deadlock in nvme_update_formats blk: optimization for classic polling nvme: Don't use a stack buffer for keep-alive command nvme_fc: cleanup io completion nvme_fc: correct abort race condition on resets nvme: Fix discard buffer overrun nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
2018-02-17nospec: Include <asm/barrier.h> dependencyDan Williams
The nospec.h header expects the per-architecture header file <asm/barrier.h> to optionally define array_index_mask_nospec(). Include that dependency to prevent inadvertent fallback to the default array_index_mask_nospec() implementation. The default implementation may not provide a full mitigation on architectures that perform data value speculation. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881605404.17395.1341935530792574707.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-17nospec: Allow index argument to have const-qualified typeRasmus Villemoes
The last expression in a statement expression need not be a bare variable, quoting gcc docs The last thing in the compound statement should be an expression followed by a semicolon; the value of this subexpression serves as the value of the entire construct. and we already use that in e.g. the min/max macros which end with a ternary expression. This way, we can allow index to have const-qualified type, which will in some cases avoid the need for introducing a local copy of index of non-const qualified type. That, in turn, can prevent readers not familiar with the internals of array_index_nospec from wondering about the seemingly redundant extra variable, and I think that's worthwhile considering how confusing the whole _nospec business is. The expression _i&_mask has type unsigned long (since that is the type of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to that), so in order not to change the type of the whole expression, add a cast back to typeof(_i). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-17nospec: Kill array_index_nospec_mask_check()Dan Williams
There are multiple problems with the dynamic sanity checking in array_index_nospec_mask_check(): * It causes unnecessary overhead in the 32-bit case since integer sized @index values will no longer cause the check to be compiled away like in the 64-bit case. * In the 32-bit case it may trigger with user controllable input when the expectation is that should only trigger during development of new kernel enabling. * The macro reuses the input parameter in multiple locations which is broken if someone passes an expression like 'index++' to array_index_nospec(). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881604278.17395.6605847763178076520.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16workqueue: Allow retrieval of current task's work structLukas Wunner
Introduce a helper to retrieve the current task's work struct if it is a workqueue worker. This allows us to fix a long-standing deadlock in several DRM drivers wherein the ->runtime_suspend callback waits for a specific worker to finish and that worker in turn calls a function which waits for runtime suspend to finish. That function is invoked from multiple call sites and waiting for runtime suspend to finish is the correct thing to do except if it's executing in the context of the worker. Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/2d8f603074131eb87e588d2b803a71765bd3a2fd.1518338788.git.lukas@wunner.de
2018-02-16skbuff: Fix comment mis-spelling.David S. Miller
'peform' --> 'perform' Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16Merge tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: "A few dma-mapping fixes for the fallout from the changes in rc1" * tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping: powerpc/macio: set a proper dma_coherent_mask dma-mapping: fix a comment typo dma-direct: comment the dma_direct_free calling convention dma-direct: mark as is_phys ia64: fix build failure with CONFIG_SWIOTLB
2018-02-16cpumask: Make for_each_cpu_wrap() available on UP as wellMichael Kelley
for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Michael Kelley <mhkelley@outlook.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15Merge tag 'acpi-4.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a system resume regression from the 4.13 cycle, clean up device table handling in the ACPI core, update sysfs ABI documentation of a couple of drivers and add an expected switch fall-through marker to the SPCR table parsing code. Specifics: - Revert a problematic EC driver change from the 4.13 cycle that introduced a system resume regression on Thinkpad X240 (Rafael Wysocki). - Clean up device tables handling in the ACPI core and the related part of the device properties framework (Andy Shevchenko). - Update the sysfs ABI documentatio of the dock and the INT3407 special device drivers (Aishwarya Pant). - Add an expected switch fall-through marker to the SPCR table parsing code (Gustavo Silva)" * tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: dock: document sysfs interface ACPI / DPTF: Document dptf_power sysfs atttributes device property: Constify device_get_match_data() ACPI / bus: Rename acpi_get_match_data() to acpi_device_get_match_data() ACPI / bus: Remove checks in acpi_get_match_data() ACPI / bus: Do not traverse through non-existed device table ACPI: SPCR: Mark expected switch fall-through in acpi_parse_spcr ACPI / EC: Restore polling during noirq suspend/resume phases
2018-02-15Merge tag 'pm-4.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a recently introduced build issue related to cpuidle and two bugs in the PM core, update cpuidle documentation and clean up memory allocations in the operating performance points (OPP) framework. Specifics: - Fix a recently introduced build issue related to cpuidle by covering all of the relevant combinations of Kconfig options in its header (Rafael Wysocki). - Add missing invocation of pm_runtime_drop_link() to the !CONFIG_SRCU variant of __device_link_del() (Lukas Wunner). - Fix unbalanced IRQ enable in the wakeup interrupts framework (Tony Lindgren). - Update cpuidle sysfs ABI documentation (Aishwarya Pant). - Use GFP_KERNEL instead of GFP_ATOMIC for allocating memory in dev_pm_opp_init_cpufreq_table() (Jia-Ju Bai)" * tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: cpuidle: Fix cpuidle_poll_state_init() prototype PM / runtime: Update links_count also if !CONFIG_SRCU PM / wakeirq: Fix unbalanced IRQ enable for wakeirq Documentation/ABI: update cpuidle sysfs documentation opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "This contains two qspinlock fixes and three documentation and comment fixes" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/semaphore: Update the file path in documentation locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit() locking/qspinlock: Ensure node->count is updated before initialising node locking/qspinlock: Ensure node is initialised before updating prev->next Documentation/locking/mutex-design: Update to reflect latest changes
2018-02-15block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTSMinwoo Im
Update comment typo _consisitent_ to _consistent_ from following commit. commit 0206319fdfee ("blk-mq: Fix poll_stat for new size-based bucketing.") Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-15Merge branches 'pm-cpuidle' and 'pm-opp'Rafael J. Wysocki
* pm-cpuidle: PM: cpuidle: Fix cpuidle_poll_state_init() prototype Documentation/ABI: update cpuidle sysfs documentation * pm-opp: opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes all across the map: - /proc/kcore vsyscall related fixes - LTO fix - build warning fix - CPU hotplug fix - Kconfig NR_CPUS cleanups - cpu_has() cleanups/robustification - .gitignore fix - memory-failure unmapping fix - UV platform fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages x86/error_inject: Make just_return_func() globally visible x86/platform/UV: Fix GAM Range Table entries less than 1GB x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page x86/Kconfig: Further simplify the NR_CPUS config x86/Kconfig: Simplify NR_CPUS config x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()" x86/cpufeature: Update _static_cpu_has() to use all named variables x86/cpufeature: Reindent _static_cpu_has()
2018-02-14Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar: "Here's the latest set of Spectre and PTI related fixes and updates: Spectre: - Add entry code register clearing to reduce the Spectre attack surface - Update the Spectre microcode blacklist - Inline the KVM Spectre helpers to get close to v4.14 performance again. - Fix indirect_branch_prediction_barrier() - Fix/improve Spectre related kernel messages - Fix array_index_nospec_mask() asm constraint - KVM: fix two MSR handling bugs PTI: - Fix a paranoid entry PTI CR3 handling bug - Fix comments objtool: - Fix paranoid_entry() frame pointer warning - Annotate WARN()-related UD2 as reachable - Various fixes - Add Add Peter Zijlstra as objtool co-maintainer Misc: - Various x86 entry code self-test fixes - Improve/simplify entry code stack frame generation and handling after recent heavy-handed PTI and Spectre changes. (There's two more WIP improvements expected here.) - Type fix for cache entries There's also some low risk non-fix changes I've included in this branch to reduce backporting conflicts: - rename a confusing x86_cpu field name - de-obfuscate the naming of single-TLB flushing primitives" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/entry/64: Fix CR3 restore in paranoid_exit() x86/cpu: Change type of x86_cache_size variable to unsigned int x86/spectre: Fix an error message x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping selftests/x86/mpx: Fix incorrect bounds with old _sigfault x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]() x86/speculation: Add <asm/msr-index.h> dependency nospec: Move array_index_nospec() parameter checking into separate macro x86/speculation: Fix up array_index_nospec_mask() asm constraint x86/debug: Use UD2 for WARN() x86/debug, objtool: Annotate WARN()-related UD2 as reachable objtool: Fix segfault in ignore_unreachable_insn() selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory selftests/x86/pkeys: Remove unused functions selftests/x86: Clean up and document sscanf() usage selftests/x86: Fix vDSO selftest segfault for vsyscall=none x86/entry/64: Remove the unused 'icebp' macro ...
2018-02-15nospec: Move array_index_nospec() parameter checking into separate macroWill Deacon
For architectures providing their own implementation of array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to complain about out-of-range parameters using WARN_ON() results in a mess of mutually-dependent include files. Rather than unpick the dependencies, simply have the core code in nospec.h perform the checking for us. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>