summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-08Merge tag 'rpmsg-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc Pull rpmsg updates from Bjorn Andersson: "This replaces a zero-length array with flexible-array and fixes a typo in a comment in the rpmsg core" * tag 'rpmsg-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: rpmsg: Replace zero-length array with flexible-array rpmsg: fix a comment typo for rpmsg_device_match()
2020-06-08Merge tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "The highlights are: - OSD/MDS latency and caps cache metrics infrastructure for the filesytem (Xiubo Li). Currently available through debugfs and will be periodically sent to the MDS in the future. - support for replica reads (balanced and localized reads) for rbd and the filesystem (myself). The default remains to always read from primary, users can opt-in with the new crush_location and read_from_replica options. Note that reading from replica is safe for general use only since Octopus. - support for RADOS allocation hint flags (myself). Currently used by rbd to propagate the compressible/incompressible hint given with the new compression_hint map option and ready for passing on more advanced hints, e.g. based on fadvise() from the filesystem. - support for efficient cross-quota-realm renames (Luis Henriques) - assorted cap handling improvements and cleanups, particularly untangling some of the locking (Jeff Layton)" * tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client: (29 commits) rbd: compression_hint option libceph: support for alloc hint flags libceph: read_from_replica option libceph: support for balanced and localized reads libceph: crush_location infrastructure libceph: decode CRUSH device/bucket types and names libceph: add non-asserting rbtree insertion helper ceph: skip checking caps when session reconnecting and releasing reqs ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock ceph: don't return -ESTALE if there's still an open file libceph, rbd: replace zero-length array with flexible-array ceph: allow rename operation under different quota realms ceph: normalize 'delta' parameter usage in check_quota_exceeded ceph: ceph_kick_flushing_caps needs the s_mutex ceph: request expedited service on session's last cap flush ceph: convert mdsc->cap_dirty to a per-session list ceph: reset i_requested_max_size if file write is not wanted ceph: throw a warning if we destroy session with mutex still locked ceph: fix potential race in ceph_check_caps ceph: document what protects i_dirty_item and i_flushing_item ...
2020-06-08Merge tag 'gfs2-for-5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - An iopen glock locking scheme rework that speeds up deletes of inodes accessed from multiple nodes - Various bug fixes and debugging improvements - Convert gfs2-glocks.txt to ReST * tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: fix use-after-free on transaction ail lists gfs2: new slab for transactions gfs2: initialize transaction tr_ailX_lists earlier gfs2: Smarter iopen glock waiting gfs2: Wake up when setting GLF_DEMOTE gfs2: Check inode generation number in delete_work_func gfs2: Move inode generation number check into gfs2_inode_lookup gfs2: Minor gfs2_lookup_by_inum cleanup gfs2: Try harder to delete inodes locally gfs2: Give up the iopen glock on contention gfs2: Turn gl_delete into a delayed work gfs2: Keep track of deleted inode generations in LVBs gfs2: Allow ASPACE glocks to also have an lvb gfs2: instrumentation wrt log_flush stuck gfs2: introduce new gfs2_glock_assert_withdraw gfs2: print mapping->nrpages in glock dump for address space glocks gfs2: Only do glock put in gfs2_create_inode for free inodes gfs2: Allow lock_nolock mount to specify jid=X gfs2: Don't ignore inode write errors during inode_go_sync docs: filesystems: convert gfs2-glocks.txt to ReST
2020-06-08Merge tag 's390-5.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for multi-function devices in pci code. - Enable PF-VF linking for architectures using the pdev->no_vf_scan flag (currently just s390). - Add reipl from NVMe support. - Get rid of critical section cleanup in entry.S. - Refactor PNSO CHSC (perform network subchannel operation) in cio and qeth. - QDIO interrupts and error handling fixes and improvements, more refactoring changes. - Align ioremap() with generic code. - Accept requests without the prefetch bit set in vfio-ccw. - Enable path handling via two new regions in vfio-ccw. - Other small fixes and improvements all over the code. * tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) vfio-ccw: make vfio_ccw_regops variables declarations static vfio-ccw: Add trace for CRW event vfio-ccw: Wire up the CRW irq and CRW region vfio-ccw: Introduce a new CRW region vfio-ccw: Refactor IRQ handlers vfio-ccw: Introduce a new schib region vfio-ccw: Refactor the unregister of the async regions vfio-ccw: Register a chp_event callback for vfio-ccw vfio-ccw: Introduce new helper functions to free/destroy regions vfio-ccw: document possible errors vfio-ccw: Enable transparent CCW IPL from DASD s390/pci: Log new handle in clp_disable_fh() s390/cio, s390/qeth: cleanup PNSO CHSC s390/qdio: remove q->first_to_kick s390/qdio: fix up qdio_start_irq() kerneldoc s390: remove critical section cleanup from entry.S s390: add machine check SIGP s390/pci: ioremap() align with generic code s390/ap: introduce new ap function ap_get_qdev() Documentation/s390: Update / remove developerWorks web links ...
2020-06-08Merge tag 'iommu-updates-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "A big part of this is a change in how devices get connected to IOMMUs in the core code. It contains the change from the old add_device() / remove_device() to the new probe_device() / release_device() call-backs. As a result functionality that was previously in the IOMMU drivers has been moved to the IOMMU core code, including IOMMU group allocation for each device. The reason for this change was to get more robust allocation of default domains for the iommu groups. A couple of fixes were necessary after this was merged into the IOMMU tree, but there are no known bugs left. The last fix is applied on-top of the merge commit for the topic branches. Other than that change, we have: - Removal of the driver private domain handling in the Intel VT-d driver. This was fragile code and I am glad it is gone now. - More Intel VT-d updates from Lu Baolu: - Nested Shared Virtual Addressing (SVA) support to the Intel VT-d driver - Replacement of the Intel SVM interfaces to the common IOMMU SVA API - SVA Page Request draining support - ARM-SMMU Updates from Will: - Avoid mapping reserved MMIO space on SMMUv3, so that it can be claimed by the PMU driver - Use xarray to manage ASIDs on SMMUv3 - Reword confusing shutdown message - DT compatible string updates - Allow implementations to override the default domain type - A new IOMMU driver for the Allwinner Sun50i platform - Support for ATS gets disabled for untrusted devices (like Thunderbolt devices). This includes a PCI patch, acked by Bjorn. - Some cleanups to the AMD IOMMU driver to make more use of IOMMU core features. - Unification of some printk formats in the Intel and AMD IOMMU drivers and in the IOVA code. - Updates for DT bindings - A number of smaller fixes and cleanups. * tag 'iommu-updates-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (109 commits) iommu: Check for deferred attach in iommu_group_do_dma_attach() iommu/amd: Remove redundant devid checks iommu/amd: Store dev_data as device iommu private data iommu/amd: Merge private header files iommu/amd: Remove PD_DMA_OPS_MASK iommu/amd: Consolidate domain allocation/freeing iommu/amd: Free page-table in protection_domain_free() iommu/amd: Allocate page-table in protection_domain_init() iommu/amd: Let free_pagetable() not rely on domain->pt_root iommu/amd: Unexport get_dev_data() iommu/vt-d: Fix compile warning iommu/vt-d: Remove real DMA lookup in find_domain iommu/vt-d: Allocate domain info for real DMA sub-devices iommu/vt-d: Only clear real DMA device's context entries iommu: Remove iommu_sva_ops::mm_exit() uacce: Remove mm_exit() op iommu/sun50i: Constify sun50i_iommu_ops iommu/hyper-v: Constify hyperv_ir_domain_ops iommu/vt-d: Use pci_ats_supported() iommu/arm-smmu-v3: Use pci_ats_supported() ...
2020-06-08Merge tag 'drm-next-msm-5.8-2020-06-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm msm updates from Dave Airlie: "This tree has been in next for a couple of weeks, but Rob missed an arm32 build issue, so I was awaiting the tree with a patch reverted. - new gpu support: a405, a640, a650 - dpu: color processing support - mdp5: support for msm8x36 (the thing with a405) - some prep work for per-context pagetables (ie the part that does not depend on in-flight iommu patches) - last but not least, UABI update for submit ioctl to support syncobj (from Bas)" * tag 'drm-next-msm-5.8-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (30 commits) Revert "drm/msm/dpu: add support for clk and bw scaling for display" drm/msm/a6xx: skip HFI set freq if GMU is powered down drm/msm: Update the MMU helper function APIs drm/msm: Refactor address space initialization drm/msm: Attach the IOMMU device during initialization drm/msm/dpu: dpu_setup_dspp_pcc() can be static drm/msm/a6xx: a6xx_hfi_send_start() can be static drm/msm/a4xx: add a405_registers for a405 device drm/msm/a4xx: add adreno a405 support drm/msm/a6xx: update a6xx_hw_init for A640 and A650 drm/msm/a6xx: enable GMU log drm/msm/a6xx: update pdc/rscc GMU registers for A640/A650 drm/msm/a6xx: A640/A650 GMU firmware path drm/msm/a6xx: HFI v2 for A640 and A650 drm/msm/a6xx: add A640/A650 to gpulist drm/msm/a6xx: use msm_gem for GMU memory objects drm/msm: add internal MSM_BO_MAP_PRIV flag drm/msm: add msm_gem_get_and_pin_iova_range drm/msm: Check for powered down HW in the devfreq callbacks drm/msm/dpu: update bandwidth threshold check ...
2020-06-08Merge tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "These are the fixes from last week for the stuff merged in the merge window. It got a bunch of nouveau fixes for HDA audio on some new GPUs, some i915 and some amdpgu fixes. i915: - gvt: Fix one clang warning on debug only function - Use ARRAY_SIZE for coccicheck warning - Use after free fix for display global state. - Whitelisting context-local timestamp on Gen9 and two scheduler fixes with deps (Cc: stable) - Removal of write flag from sysfs files where ineffective nouveau: - HDMI/DP audio HDA fixes - display hang fix for Volta/Turing - GK20A regression fix. amdgpu: - Prevent hwmon accesses while GPU is in reset - CTF interrupt fix - Backlight fix for renoir - Fix for display sync groups - Display bandwidth validation workaround" * tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits) drm/nouveau/kms/nv50-: clear SW state of disabled windows harder drm/nouveau: gr/gk20a: Use firmware version 0 drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs drm/nouveau/disp/gp100: split SOR implementation from gm200 drm/nouveau/disp: modify OR allocation policy to account for HDA requirements drm/nouveau/disp: split part of OR allocation logic into a function drm/nouveau/disp: provide hint to OR allocation about HDA requirements drm/amd/display: Revalidate bandwidth before commiting DC updates drm/amdgpu/display: use blanked rather than plane state for sync groups drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions drm/i915/params: don't expose inject_probe_failure in debugfs drm/i915: Whitelist context-local timestamp in the gen9 cmdparser drm/i915: Fix global state use-after-frees with a refcount drm/i915: Check for awaits on still currently executing requests drm/i915/gt: Do not schedule normal requests immediately along virtual drm/i915: Reorder await_execution before await_request drm/nouveau/kms/gt215-: fix race with audio driver runpm drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection Revert "drm/amd/display: disable dcn20 abm feature for bring up" drm/amd/powerplay: ack the SMUToHost interrupt on receive V2 ...
2020-06-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge still more updates from Andrew Morton: "Various trees. Mainly those parts of MM whose linux-next dependents are now merged. I'm still sitting on ~160 patches which await merges from -next. Subsystems affected by this patch series: mm/proc, ipc, dynamic-debug, panic, lib, sysctl, mm/gup, mm/pagemap" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (52 commits) doc: cgroup: update note about conditions when oom killer is invoked module: move the set_fs hack for flush_icache_range to m68k nommu: use flush_icache_user_range in brk and mmap binfmt_flat: use flush_icache_user_range exec: use flush_icache_user_range in read_code exec: only build read_code when needed m68k: implement flush_icache_user_range arm: rename flush_cache_user_range to flush_icache_user_range xtensa: implement flush_icache_user_range sh: implement flush_icache_user_range asm-generic: add a flush_icache_user_range stub mm: rename flush_icache_user_range to flush_icache_user_page arm,sparc,unicore32: remove flush_icache_user_range riscv: use asm-generic/cacheflush.h powerpc: use asm-generic/cacheflush.h openrisc: use asm-generic/cacheflush.h m68knommu: use asm-generic/cacheflush.h microblaze: use asm-generic/cacheflush.h ia64: use asm-generic/cacheflush.h hexagon: use asm-generic/cacheflush.h ...
2020-06-08doc: cgroup: update note about conditions when oom killer is invokedKonstantin Khlebnikov
Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the charge path") cgroup oom killer is no longer invoked only from page faults. Now it implements the same semantics as global OOM killer: allocation context invokes OOM killer and keeps retrying until success. [akpm@linux-foundation.org: fixes per Randy] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <guro@fb.com> Cc: Randy Dunlap <rdunlap@infradead.org> Link: http://lkml.kernel.org/r/158894738928.208854.5244393925922074518.stgit@buzz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08module: move the set_fs hack for flush_icache_range to m68kChristoph Hellwig
flush_icache_range generally operates on kernel addresses, but for some reason m68k needed a set_fs override. Move that into the m68k code insted of keeping it in the module loader. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Jessica Yu <jeyu@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-30-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08nommu: use flush_icache_user_range in brk and mmapChristoph Hellwig
These obviously operate on user addresses. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-29-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08binfmt_flat: use flush_icache_user_rangeChristoph Hellwig
load_flat_file works on user addresses. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-28-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08exec: use flush_icache_user_range in read_codeChristoph Hellwig
read_code operates on user addresses. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-27-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08exec: only build read_code when neededChristoph Hellwig
Only build read_code when binary formats that use it are built into the kernel. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-26-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08m68k: implement flush_icache_user_rangeChristoph Hellwig
Rename the current flush_icache_range to flush_icache_user_range as per commit ae92ef8a4424 ("PATCH] flush icache in correct context") there seems to be an assumption that it operates on user addresses. Add a flush_icache_range around it that for now is a no-op. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-25-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08arm: rename flush_cache_user_range to flush_icache_user_rangeChristoph Hellwig
flush_icache_user_range will be the name for a generic primitive. Move the arm name so that arm already has an implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Russell King <linux@armlinux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-24-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08xtensa: implement flush_icache_user_rangeChristoph Hellwig
The Xtensa implementation of flush_icache_range seems to be able to cope with user addresses. Just define flush_icache_user_range to flush_icache_range. [jcmvbkbc@gmail.com: fix flush_icache_user_range in noMMU configs] Link: http://lkml.kernel.org/r/20200525221556.4270-1-jcmvbkbc@gmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-23-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08sh: implement flush_icache_user_rangeChristoph Hellwig
The SuperH implementation of flush_icache_range seems to be able to cope with user addresses. Just define flush_icache_user_range to flush_icache_range. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-22-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08asm-generic: add a flush_icache_user_range stubChristoph Hellwig
Define flush_icache_user_range to flush_icache_range unless the architecture provides its own implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-21-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm: rename flush_icache_user_range to flush_icache_user_pageChristoph Hellwig
The function currently known as flush_icache_user_range only operates on a single page. Rename it to flush_icache_user_page as we'll need the name flush_icache_user_range for something else soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-20-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08arm,sparc,unicore32: remove flush_icache_user_rangeChristoph Hellwig
flush_icache_user_range is only used by <asm-generic/cacheflush.h>, so remove it from the architectures that implement it, but don't use <asm-generic/cacheflush.h>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Russell King <linux@armlinux.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Link: http://lkml.kernel.org/r/20200515143646.3857579-19-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08riscv: use asm-generic/cacheflush.hChristoph Hellwig
RISC-V needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Also remove the pointless __KERNEL__ ifdef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Link: http://lkml.kernel.org/r/20200515143646.3857579-18-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08powerpc: use asm-generic/cacheflush.hChristoph Hellwig
Power needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Also remove the pointless __KERNEL__ ifdef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20200515143646.3857579-17-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08openrisc: use asm-generic/cacheflush.hChristoph Hellwig
OpenRISC needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-16-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08m68knommu: use asm-generic/cacheflush.hChristoph Hellwig
m68knommu needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-15-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08microblaze: use asm-generic/cacheflush.hChristoph Hellwig
Microblaze needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Simek <monstr@monstr.eu> Link: http://lkml.kernel.org/r/20200515143646.3857579-14-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08ia64: use asm-generic/cacheflush.hChristoph Hellwig
IA64 needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-13-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08hexagon: use asm-generic/cacheflush.hChristoph Hellwig
Hexagon needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Brian Cain <bcain@codeaurora.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-12-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08c6x: use asm-generic/cacheflush.hChristoph Hellwig
C6x needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-11-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08arm64: use asm-generic/cacheflush.hChristoph Hellwig
ARM64 needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-10-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08alpha: use asm-generic/cacheflush.hChristoph Hellwig
Alpha needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-9-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08asm-generic: improve the flush_dcache_page stubChristoph Hellwig
There is a magic ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE cpp symbol that guards non-stub availability of flush_dcache_pagge. Use that to check if flush_dcache_pagg is implemented. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-8-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08asm-generic: don't include <linux/mm.h> in cacheflush.hChristoph Hellwig
This seems to lead to some crazy include loops when using asm-generic/cacheflush.h on more architectures, so leave it to the arch header for now. [hch@lst.de: fix warning] Link: http://lkml.kernel.org/r/20200520173520.GA11199@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will@kernel.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08asm-generic: fix the inclusion guards for cacheflush.hChristoph Hellwig
cacheflush.h uses a somewhat to generic include guard name that clashes with various arch files. Use a more specific one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-6-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08unicore32: remove flush_cache_user_rangeChristoph Hellwig
flush_cache_user_range is an ARMism not used by any generic or unicore32 specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Link: http://lkml.kernel.org/r/20200515143646.3857579-5-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08powerpc: unexport flush_icache_user_rangeChristoph Hellwig
flush_icache_user_range is only used by copy_to_user_page, which is only used by core VM code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20200515143646.3857579-4-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08nds32: unexport flush_icache_pageChristoph Hellwig
flush_icache_page is only used by mm/memory.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-3-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08arm: fix the flush_icache_range arguments in set_fiq_handlerChristoph Hellwig
Patch series "sort out the flush_icache_range mess", v2. flush_icache_range is mostly used for kernel address, except for the following cases: - the nommu brk and mmap implementations - the read_code helper that is only used for binfmt_flat, binfmt_elf_fdpic, and binfmt_aout including the broken ia32 compat version - binfmt_flat itself none of which really are used by a typical MMU enabled kernel, as a.out can only be build for alpha and m68k to start with. But strangely enough commit ae92ef8a4424 ("PATCH] flush icache in correct context") added a "set_fs(KERNEL_DS)" around the flush_icache_range call in the module loader, because apparently m68k assumed user pointers. This series first cleans up the cacheflush implementations, largely by switching as much as possible to the asm-generic version after a few preparations, then moves the misnamed current flush_icache_user_range to a new name, to finally introduce a real flush_icache_user_range to be used for the above use cases to flush the instruction cache for a userspace address range. The last patch then drops the set_fs in the module code and moves it into the m68k implementation. This patch (of 29): The arguments passed look bogus, try to fix them to something that seems to make sense. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keith Busch <keith.busch@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <songliubraving@fb.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200515143646.3857579-1-hch@lst.de Link: http://lkml.kernel.org/r/20200515143646.3857579-2-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08vhost: convert get_user_pages() --> pin_user_pages()John Hubbard
This code was using get_user_pages*(), in approximately a "Case 5" scenario (accessing the data within a page), using the categorization from [1]. That means that it's time to convert the get_user_pages*() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200529234309.484480-3-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08docs: mm/gup: pin_user_pages.rst: add a "case 5"John Hubbard
Patch series "vhost, docs: convert to pin_user_pages(), new "case 5"" It recently became clear to me that there are some get_user_pages*() callers that don't fit neatly into any of the four cases that are so far listed in pin_user_pages.rst. vhost.c is one of those. Add a Case 5 to the documentation, and refer to that when converting vhost.c. Thanks to Jan Kara for helping me (again) in understanding the interaction between get_user_pages() and page writeback [1]. This is based on today's mmotm, which has a nearby patch to pin_user_pages.rst that rewords cases 3 and 4. Note that I have only compile-tested the vhost.c patch, although that does also include cross-compiling for a few other arches. Any run-time testing would be greatly appreciated. [1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz This patch (of 2): There are four cases listed in pin_user_pages.rst. These are intended to help developers figure out whether to use get_user_pages*(), or pin_user_pages*(). However, the four cases do not cover all the situations. For example, drivers/vhost/vhost.c has a "pin, write to page, set page dirty, unpin" case. Add a fifth case, to help explain that there is a general pattern that requires pin_user_pages*() API calls. [jhubbard@nvidia.com: v2] Link: http://lkml.kernel.org/r/20200601052633.853874-2-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/20200529234309.484480-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200529234309.484480-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm/gup: documentation fix for pin_user_pages*() APIsJohn Hubbard
All of the pin_user_pages*() API calls will cause pages to be dma-pinned. As such, they are all suitable for either DMA, RDMA, and/or Direct IO. The documentation should say so, but it was instead saying that three of the API calls were only suitable for Direct IO. This was discovered when a reviewer wondered why an API call that specifically recommended against Case 2 (DMA/RDMA) was being used in a DMA situation [1]. Fix this by simply deleting those claims. The gup.c comments already refer to the more extensive Documentation/core-api/pin_user_pages.rst, which does have the correct guidance. So let's just write it once, there. [1] https://lore.kernel.org/r/20200529074658.GM30374@kadam Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200529084515.46259-1-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm/gup: frame_vector: convert get_user_pages() --> pin_user_pages()John Hubbard
This code was using get_user_pages*(), and all of the callers so far were in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages*() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Link: http://lkml.kernel.org/r/20200527223243.884385-3-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm/gup: introduce pin_user_pages_locked()John Hubbard
Patch series "mm/gup: introduce pin_user_pages_locked(), use it in frame_vector.c", v2. This adds yet one more pin_user_pages*() variant, and uses that to convert mm/frame_vector.c. With this, along with maybe 20 or 30 other recent patches in various trees, we are close to having the relevant gup call sites converted--with the notable exception of the bio/block layer. This patch (of 2): Introduce pin_user_pages_locked(), which is nearly identical to get_user_pages_locked() except that it sets FOLL_PIN and rejects FOLL_GET. As with other pairs of get_user_pages*() and pin_user_pages() API calls, it's prudent to assert that FOLL_PIN is *not* set in the get_user_pages*() call, so add that as part of this. [jhubbard@nvidia.com: v2] Link: http://lkml.kernel.org/r/20200531234131.770697-2-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Link: http://lkml.kernel.org/r/20200531234131.770697-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200527223243.884385-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200527223243.884385-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers)John Hubbard
Update case 3 so that it covers the use of mmu notifiers, for hardware that does, or does not have replayable page faults. Also, elaborate case 4 slightly, as it was quite cryptic. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Jonathan Corbet <corbet@lwn.net> Link: http://lkml.kernel.org/r/20200527194953.11130-1-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm/gup.c: convert to use get_user_{page|pages}_fast_only()Souptick Joarder
API __get_user_pages_fast() renamed to get_user_pages_fast_only() to align with pin_user_pages_fast_only(). As part of this we will get rid of write parameter. Instead caller will pass FOLL_WRITE to get_user_pages_fast_only(). This will not change any existing functionality of the API. All the callers are changed to pass FOLL_WRITE. Also introduce get_user_page_fast_only(), and use it in a few places that hard-code nr_pages to 1. Updated the documentation of the API. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> [arch/powerpc/kvm] Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michal Suchanek <msuchanek@suse.de> Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08kernel/sysctl.c: ignore out-of-range taint bits introduced via kernel.taintedRafael Aquini
Users with SYS_ADMIN capability can add arbitrary taint flags to the running kernel by writing to /proc/sys/kernel/tainted or issuing the command 'sysctl -w kernel.tainted=...'. This interface, however, is open for any integer value and this might cause an invalid set of flags being committed to the tainted_mask bitset. This patch introduces a simple way for proc_taint() to ignore any eventual invalid bit coming from the user input before committing those bits to the kernel tainted_mask. Signed-off-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Link: http://lkml.kernel.org/r/20200512223946.888020-1-aquini@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08panic: add sysctl to dump all CPUs backtraces on oops eventGuilherme G. Piccoli
Usually when the kernel reaches an oops condition, it's a point of no return; in case not enough debug information is available in the kernel splat, one of the last resorts would be to collect a kernel crash dump and analyze it. The problem with this approach is that in order to collect the dump, a panic is required (to kexec-load the crash kernel). When in an environment of multiple virtual machines, users may prefer to try living with the oops, at least until being able to properly shutdown their VMs / finish their important tasks. This patch implements a way to collect a bit more debug details when an oops event is reached, by printing all the CPUs backtraces through the usage of NMIs (on architectures that support that). The sysctl added (and documented) here was called "oops_all_cpu_backtrace", and when set will (as the name suggests) dump all CPUs backtraces. Far from ideal, this may be the last option though for users that for some reason cannot panic on oops. Most of times oopses are clear enough to indicate the kernel portion that must be investigated, but in virtual environments it's possible to observe hypervisor/KVM issues that could lead to oopses shown in other guests CPUs (like virtual APIC crashes). This patch hence aims to help debug such complex issues without resorting to kdump. Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08kernel/hung_task.c: introduce sysctl to print all traces when a hung task is ↵Guilherme G. Piccoli
detected Commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks before panic") introduced a change in that we started to show all CPUs backtraces when a hung task is detected _and_ the sysctl/kernel parameter "hung_task_panic" is set. The idea is good, because usually when observing deadlocks (that may lead to hung tasks), the culprit is another task holding a lock and not necessarily the task detected as hung. The problem with this approach is that dumping backtraces is a slightly expensive task, specially printing that on console (and specially in many CPU machines, as servers commonly found nowadays). So, users that plan to collect a kdump to investigate the hung tasks and narrow down the deadlock definitely don't need the CPUs backtrace on dmesg/console, which will delay the panic and pollute the log (crash tool would easily grab all CPUs traces with 'bt -a' command). Also, there's the reciprocal scenario: some users may be interested in seeing the CPUs backtraces but not have the system panic when a hung task is detected. The current approach hence is almost as embedding a policy in the kernel, by forcing the CPUs backtraces' dump (only) on hung_task_panic. This patch decouples the panic event on hung task from the CPUs backtraces dump, by creating (and documenting) a new sysctl called "hung_task_all_cpu_backtrace", analog to the approach taken on soft/hard lockups, that have both a panic and an "all_cpu_backtrace" sysctl to allow individual control. The new mechanism for dumping the CPUs backtraces on hung task detection respects "hung_task_warnings" by not dumping the traces in case there's no warnings left. Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: http://lkml.kernel.org/r/20200327223646.20779-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliasesGuilherme G. Piccoli
After a recent change introduced by Vlastimil's series [0], kernel is able now to handle sysctl parameters on kernel command line; also, the series introduced a simple infrastructure to convert legacy boot parameters (that duplicate sysctls) into sysctl aliases. This patch converts the watchdog parameters softlockup_panic and {hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure. It fixes the documentation too, since the alias only accepts values 0 or 1, not the full range of integers. We also took the opportunity here to improve the documentation of the previously converted hung_task_panic (see the patch series [0]) and put the alias table in alphabetical order. [0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08lib/test_sysctl: support testing of sysctl. boot parameterVlastimil Babka
Testing is done by a new parameter debug.test_sysctl.boot_int which defaults to 0 and it's expected that the tester passes a boot parameter that sets it to 1. The test checks if it's set to 1. To distinguish true failure from parameter not being set, the test checks /proc/cmdline for the expected parameter, and whether test_sysctl is built-in and not a module. [vbabka@suse.cz: skip the new test if boot_int sysctl is not present] Link: http://lkml.kernel.org/r/305af605-1e60-cf84-fada-6ce1ca37c102@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200427180433.7029-6-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>