summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcDavid S. Miller
2020-06-07Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
2020-06-07fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()"Al Viro
lost npc in PTRACE_SETREGSET, breaking PTRACE_SETREGS as well Fixes: cf51e129b968 "sparc32: fix register window handling in genregs32_[gs]et()" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-06-02Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Some ptrace fixes from Al. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02Merge branch 'sparc32-SRMMU-fixes-for-SMP'David S. Miller
Will Deacon says: ==================== sparc32 SRMMU fixes for SMP Enabling SMP for sparc32 uncovered some issues in the SRMMU page-table allocation code. One of these was introduced by me, but the other two seem to have been there a while and are probably just exposed more easily by my recent changes. Tested on QEMU. I'm assuming these will go via David's tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02sparc32: mm: Only call ctor()/dtor() functions for first and last userWill Deacon
The SRMMU page-table allocator allocates multiple PTE tables per page, since they are only 1K in size. However, this means that calls to pgtable_pte_page_{ctor,dtor}() must be serialised and performed only by the first and last page-table allocation for the page respectively. Use the page reference count to track how many PTE tables we have allocated for a given page returned by the SRMMU allocator and only call the ctor()/dtor() functions for the first and last user respectively. Cc: David S. Miller <davem@davemloft.net> Fixes: 8c8f3156dd40 ("sparc32: mm: Reduce allocation size for PMD and PTE tables") Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02sparc32: mm: Disable SPLIT_PTLOCK_CPUSWill Deacon
The SRMMU page-table allocator is not compatible with SPLIT_PTLOCK_CPUS for two major reasons: 1. Pages are allocated via memblock, and therefore the ptl is not cleared by prep_new_page(), which is expected by ptlock_init() 2. Multiple PTE tables can exist in a single page, causing them to share the same ptl and deadlock when attempting to take the same lock twice (e.g. as part of copy_page_range()). Ensure that SPLIT_PTLOCK_CPUS is not selected for SPARC32. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02sparc32: mm: Don't try to free page-table pages if ctor() failsWill Deacon
The pages backing page-table allocations for SRMMU are allocated via memblock as part of the "nocache" region initialisation during srmmu_paging_init() and should not be freed even if a later call to pgtable_pte_page_ctor() fails. Remove the broken call to __free_page(). Cc: David S. Miller <davem@davemloft.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Fixes: 1ae9ae5f7df7 ("sparc: handle pgtable_page_ctor() fail") Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02sparc32: register memory occupied by kernel as memblock.memoryMike Rapoport
sparc32 never registered the memory occupied by the kernel image with memblock_add() and it only reserved this memory with meblock_reserve(). With openbios as system firmware, the memory occupied by the kernel is reserved in openbios and removed from mem.available. The prom setup code in the kernel uses mem.available to set up the memory banks and essentially there is a hole for the memory occupied by the kernel image. Later in bootmem_init() this memory is memblock_reserve()d. Up until recently, memmap initialization would call __init_single_page() for the pages in that hole, the free_low_memory_core_early() would mark them as reserved and everything would be Ok. After the change in memmap initialization introduced by the commit "mm: memmap_init: iterate over memblock regions rather that check each PFN", the hole is skipped and the page structs for it are not initialized. And when they are passed from memblock to page allocator as reserved, the latter gets confused. Simply registering the memory occupied by the kernel with memblock_add() resolves this issue. Tested on qemu-system-sparc with Debian Etch [1] userspace. [1] https://people.debian.org/~aurel32/qemu/sparc/debian_etch_sparc_small.qcow2 Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lkml.kernel.org/r/20200517000050.GA87467@roeck-us.nlllllet/ Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02sparc: remove unused header file nfs_fs.hAnupam Aggarwal
Remove unused header file linux/nfs_fs.h Signed-off-by: Anupam Aggarwal <anupam.al@samsung.com> Signed-off-by: Vivek Trivedi <t.vivek@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20sparc32: fix register window handling in genregs32_[gs]et()Al Viro
It needs access_process_vm() if the traced process does not share mm with the caller. Solution is similar to what sparc64 does. Note that genregs32_set() is only ever called with pos being 0 or 32 * sizeof(u32) (the latter - as part of PTRACE_SETREGS handling). Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-17sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()Al Viro
We do need access_process_vm() to access the target's reg_window. However, access to caller's memory (storing the result in genregs32_get(), fetching the new values in case of genregs32_set()) should be done by normal uaccess primitives. Fixes: ad4f95764040 ([SPARC64]: Fix user accesses in regset code.) Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-17oradax: convert get_user_pages() --> pin_user_pages()John Hubbard
This code was using get_user_pages_fast(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages_fast() + put_page() calls to pin_user_pages_fast() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13Merge branch 'sparc-scnprintf'David S. Miller
Chen Zhou says: ==================== sparc: use scnprintf() in show() methods snprintf() returns the number of bytes that would be written, which may be greater than the the actual length to be written. show() methods should return the number of bytes printed into the buffer. This is the return value of scnprintf(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc: use scnprintf() in show_pciobppath_attr() in vio.cChen Zhou
snprintf() returns the number of bytes that would be written, which may be greater than the the actual length to be written. show_pciobppath_attr() should return the number of bytes printed into the buffer. This is the return value of scnprintf(). Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc: use scnprintf() in show_pciobppath_attr() in pci.cChen Zhou
snprintf() returns the number of bytes that would be written, which may be greater than the the actual length to be written. show_pciobppath_attr() should return the number of bytes printed into the buffer. This is the return value of scnprintf(). Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13tty: vcc: Fix error return code in vcc_probe()Wei Yongjun
Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13Merge branch 'Rework-sparc32-page-table-layout'David S. Miller
Will Deacon says: ==================== Rework sparc32 page-table layout This is a reposting of the patch series I sent previously to rework the sparc32 page-table layout so that 'pmd_t' can be used safely with READ_ONCE(): https://lore.kernel.org/lkml/20200324104005.11279-1-will@kernel.org This is blocking the READ_ONCE() rework, which in turn allows us to bump the minimum GCC version for building the kernel up to 4.8. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc32: mm: Reduce allocation size for PMD and PTE tablesWill Deacon
Now that the page table allocator can free page table allocations smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations to avoid needlessly wasting memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *Will Deacon
Change the 'pgtable_t' type for sparc32 so that it represents the uncached virtual address of the PTE table, rather than the underlying 'struct page'. This allows us to free page table allocations smaller than a page. Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc32: mm: Restructure sparc32 MMU page-table layoutWill Deacon
The "SRMMU" supports 4k pages using a fixed three-level walk with a 256-entry PGD and 64-entry PMD/PTE levels. In order to fill a page with a 'pgtable_t', the SRMMU code allocates four native PTE tables into a single PTE allocation and similarly for the PMD level, leading to an array of 16 physical pointers in a 'pmd_t' This breaks the generic code which assumes READ_ONCE(*pmd) will be word sized. In a manner similar to ef22d8abd876 ("m68k: mm: Restructure Motorola MMU page-table layout"), this patch implements the native page-table setup directly. This significantly increases the page-table memory overhead, but will be addresses in a subsequent patch. Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13sparc32: mm: Fix argument checking in __srmmu_get_nocache()Will Deacon
The 'size' argument to __srmmu_get_nocache() is a number of bytes not a shift value, so fix up the sanity checking to treat it properly. Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12Merge tag 'trace-v5.7-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Fixes to previous fixes. Unfortunately, the last set of fixes introduced some minor bugs: - The bootconfig apply_xbc() leak fix caused the application to return a positive number on success, when it should have returned zero. - The preempt_irq_delay_thread fix to make the creation code wait for the kthread to finish to prevent it from executing after module unload, can now cause the kthread to exit before it even executes (preventing it to run its tests). - The fix to the bootconfig that fixed the initrd to remove the bootconfig from causing the kernel to panic, now prints a warning that the bootconfig is not found, even when bootconfig is not on the command line" * tag 'trace-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: bootconfig: Fix to prevent warning message if no bootconfig option tracing: Wait for preempt irq delay thread to execute tools/bootconfig: Fix apply_xbc() to return zero on success
2020-05-12Merge tag 'gpio-v5.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some GPIO fixes for v5.7, slightly overdue. Been learning MMUs and KASan that is why it's late. Bartosz helped me out, luckily! - Fix pin configuration in the PCA953x driver - Ruggedize the watch/unwatch ioctl() - Possible call to a sleeping function when holding a spinlock, avoid this - Fix UML builds with DT overlays - Mask Tegra GPIO IRQs during shutdown()" * tag 'gpio-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: tegra: mask GPIO IRQs during IRQ shutdown gpio: of: Build fails if CONFIG_OF_DYNAMIC enabled without CONFIG_OF_GPIO gpiolib: don't call sleeping functions with a spinlock taken gpiolib: improve the robustness of watch/unwatch ioctl() gpio: pca953x: Fix pca953x_gpio_set_config
2020-05-12Merge tag 'gfs2-v5.7-rc1.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: "Various gfs2 fixes. Fixes for bugs prior to v5.7: - Fix random block reads when reading fragmented journals (v5.2) - Fix a possible random memory access in gfs2_walk_metadata (v5.3) Fixes for v5.7: - Fix several overlooked gfs2_qa_get / gfs2_qa_put imbalances - Fix several bugs in the new filesystem withdraw logic" * tag 'gfs2-v5.7-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: Revert "gfs2: Don't demote a glock until its revokes are written" gfs2: If go_sync returns error, withdraw but skip invalidate gfs2: Grab glock reference sooner in gfs2_add_revoke gfs2: don't call quota_unhold if quotas are not locked gfs2: move privileged user check to gfs2_quota_lock_check gfs2: remove check for quotas on in gfs2_quota_check gfs2: Change BUG_ON to an assert_withdraw in gfs2_quota_change gfs2: Fix problems regarding gfs2_qa_get and _put gfs2: More gfs2_find_jhead fixes gfs2: Another gfs2_walk_metadata fix gfs2: Fix use-after-free in gfs2_logd after withdraw gfs2: Fix BUG during unmount after file system withdraw gfs2: Fix error exit in do_xmote gfs2: fix withdraw sequence deadlock
2020-05-12bootconfig: Fix to prevent warning message if no bootconfig optionMasami Hiramatsu
Commit de462e5f1071 ("bootconfig: Fix to remove bootconfig data from initrd while boot") causes a cosmetic regression on dmesg, which warns "no bootconfig data" message without bootconfig cmdline option. Fix setup_boot_config() by moving no bootconfig check after commandline option check. Link: http://lkml.kernel.org/r/9b1ba335-071d-c983-89a4-2677b522dcc8@molgen.mpg.de Link: http://lkml.kernel.org/r/158916116468.21787.14558782332170588206.stgit@devnote2 Fixes: de462e5f1071 ("bootconfig: Fix to remove bootconfig data from initrd while boot") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-11tracing: Wait for preempt irq delay thread to executeSteven Rostedt (VMware)
A bug report was posted that running the preempt irq delay module on a slow machine, and removing it quickly could lead to the thread created by the modlue to execute after the module is removed, and this could cause the kernel to crash. The fix for this was to call kthread_stop() after creating the thread to make sure it finishes before allowing the module to be removed. Now this caused the opposite problem on fast machines. What now happens is the kthread_stop() can cause the kthread never to execute and the test never to run. To fix this, add a completion and wait for the kthread to execute, then wait for it to end. This issue caused the ftracetest selftests to fail on the preemptirq tests. Link: https://lore.kernel.org/r/20200510114210.15d9e4af@oasis.local.home Cc: stable@vger.kernel.org Fixes: d16a8c31077e ("tracing: Wait for preempt irq delay thread to finish") Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-11tools/bootconfig: Fix apply_xbc() to return zero on successSteven Rostedt (VMware)
The return of apply_xbc() returns the result of the last write() call, which is not what is expected. It should only return zero on success. Link: https://lore.kernel.org/r/20200508093059.GF9365@kadam Fixes: 8842604446d1 ("tools/bootconfig: Fix resource leak in apply_xbc()") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-11Merge tag 'nfsd-5.7-rc-2' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds
Pull nfsd fixes from Chuck Lever: "Resolve a data integrity problem with NFSD that I inadvertently introduced last year. The change I made makes the NFS server's duplicate reply cache ineffective when krb5i or krb5p are in use, thus allowing the replay of non-idempotent NFS requests such as RENAME, SETATTR, or even WRITEs" * tag 'nfsd-5.7-rc-2' of git://git.linux-nfs.org/projects/cel/cel-2.6: SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()") SUNRPC: Fix GSS privacy computation of auth->au_ralign SUNRPC: Add "@len" parameter to gss_unwrap()
2020-05-11drm: fix trivial field description cut-and-paste errorLinus Torvalds
As reported by Amarnath Baliyase, the drm_mode_status enumeration documentation describes MODE_V_ILLEGAL as "mode has illegal horizontal timings". But that's just a cut-and-paste error from the previous line. The "V" stands for vertical, of course. I'm just fixing this directly rather than bothering with going through the proper channels. Less work for everybody. Reported-by: Amarnath Baliyase <baliyaseamarnath@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-10Linux 5.7-rc5Linus Torvalds
2020-05-10Merge tag 'x86-urgent-2020-05-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes for x86: - Ensure that direct mapping alias is always flushed when changing page attributes. The optimization for small ranges failed to do so when the virtual address was in the vmalloc or module space. - Unbreak the trace event registration for syscalls without arguments caused by the refactoring of the SYSCALL_DEFINE0() macro. - Move the printk in the TSC deadline timer code to a place where it is guaranteed to only be called once during boot and cannot be rearmed by clearing warn_once after boot. If it's invoked post boot then lockdep rightfully complains about a potential deadlock as the calling context is different. - A series of fixes for objtool and the ORC unwinder addressing variety of small issues: - Stack offset tracking for indirect CFAs in objtool ignored subsequent pushs and pops - Repair the unwind hints in the register clearing entry ASM code - Make the unwinding in the low level exit to usermode code stop after switching to the trampoline stack. The unwind hint is no longer valid and the ORC unwinder emits a warning as it can't find the registers anymore. - Fix unwind hints in switch_to_asm() and rewind_stack_do_exit() which caused objtool to generate bogus ORC data. - Prevent unwinder warnings when dumping the stack of a non-current task as there is no way to be sure about the validity because the dumped stack can be a moving target. - Make the ORC unwinder behave the same way as the frame pointer unwinder when dumping an inactive tasks stack and do not skip the first frame. - Prevent ORC unwinding before ORC data has been initialized - Immediately terminate unwinding when a unknown ORC entry type is found. - Prevent premature stop of the unwinder caused by IRET frames. - Fix another infinite loop in objtool caused by a negative offset which was not catched. - Address a few build warnings in the ORC unwinder and add missing static/ro_after_init annotations" * tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES x86/apic: Move TSC deadline timer debug printk ftrace/x86: Fix trace event registration for syscalls without arguments x86/mm/cpa: Flush direct map alias during cpa objtool: Fix infinite loop in for_offset_range() x86/unwind/orc: Fix premature unwind stoppage due to IRET frames x86/unwind/orc: Fix error path for bad ORC entry type x86/unwind/orc: Prevent unwinding before ORC initialization x86/unwind/orc: Don't skip the first frame for inactive tasks x86/unwind: Prevent false warnings for non-current tasks x86/unwind/orc: Convert global variables to static x86/entry/64: Fix unwind hints in rewind_stack_do_exit() x86/entry/64: Fix unwind hints in __switch_to_asm() x86/entry/64: Fix unwind hints in kernel exit path x86/entry/64: Fix unwind hints in register clearing code objtool: Fix stack offset tracking for indirect CFAs
2020-05-10Merge tag 'objtool-urgent-2020-05-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Thomas Gleixner: "A single fix for objtool to prevent an infinite loop in the jump table search which can be triggered when building the kernel with '-ffunction-sections'" * tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix infinite loop in find_jump_table()
2020-05-10Merge tag 'locking-urgent-2020-05-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Thomas Gleixner: "A single fix for the fallout of the recent futex uacess rework. With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser() correctly and emits a 'maybe unitialized' warning. While we usually ignore compiler stupidity the conditional store is pointless anyway because the correct case has to store. For the fault case the extra store does no harm" * tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ARM: futex: Address build warning
2020-05-10Merge tag 'iommu-fixes-v5.7-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Race condition fixes for the AMD IOMMU driver. These are five patches fixing two race conditions around increase_address_space(). The first race condition was around the non-atomic update of the domain page-table root pointer and the variable containing the page-table depth (called mode). This is fixed now be merging page-table root and mode into one 64-bit field which is read/written atomically. The second race condition was around updating the page-table root pointer and making it public before the hardware caches were flushed. This could cause addresses to be mapped and returned to drivers which are not reachable by IOMMU hardware yet, causing IO page-faults. This is fixed too by adding the necessary flushes before a new page-table root is published. Related to the race condition fixes these patches also add a missing domain_flush_complete() barrier to update_domain() and a fix to bail out of the loop which tries to increase the address space when the call to increase_address_space() fails. Qian was able to trigger the race conditions under high load and memory pressure within a few days of testing. He confirmed that he has seen no issues anymore with the fixes included here. - Fix for a list-handling bug in the VirtIO IOMMU driver. * tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/virtio: Reverse arguments to list_add iommu/amd: Do not flush Device Table in iommu_map_page() iommu/amd: Update Device Table in increase_address_space() iommu/amd: Call domain_flush_complete() in update_domain() iommu/amd: Do not loop forever when trying to increase address space iommu/amd: Fix race in increase_address_space()/fetch_pte()
2020-05-10Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - a small series fixing a use-after-free of bdi name (Christoph,Yufen) - NVMe fix for a regression with the smaller CQ update (Alexey) - NVMe fix for a hang at namespace scanning error recovery (Sagi) - fix race with blk-iocost iocg->abs_vdebt updates (Tejun) * tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block: nvme: fix possible hang when ns scanning fails during error recovery nvme-pci: fix "slimmer CQ head update" bdi: add a ->dev_name field to struct backing_dev_info bdi: use bdi_dev_name() to get device name bdi: move bdi_dev_name out of line vboxsf: don't use the source name in the bdi name iocost: protect iocg->abs_vdebt with iocg->waitq.lock
2020-05-09gcc-10: mark more functions __init to avoid section mismatch warningsLinus Torvalds
It seems that for whatever reason, gcc-10 ends up not inlining a couple of functions that used to be inlined before. Even if they only have one single callsite - it looks like gcc may have decided that the code was unlikely, and not worth inlining. The code generation difference is harmless, but caused a few new section mismatch errors, since the (now no longer inlined) function wasn't in the __init section, but called other init functions: Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap() So add the appropriate __init annotation to make modpost not complain. In both cases there were trivially just a single callsite from another __init function. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Merge tag 'riscv-for-linus-5.7-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "A smattering of fixes and cleanups: - Dead code removal. - Exporting riscv_cpuid_to_hartid_mask for modules. - Per-CPU tracking of ISA features. - Setting max_pfn correctly when probing memory. - Adding a note to the VDSO so glibc can check the kernel's version without a uname(). - A fix to force the bootloader to initialize the boot spin tables, which still get used as a fallback when SBI-0.1 is enabled" * tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Remove unused code from STRICT_KERNEL_RWX riscv: force __cpu_up_ variables to put in data section riscv: add Linux note to vdso riscv: set max_pfn to the PFN of the last page RISC-V: Remove N-extension related defines RISC-V: Add bitmap reprensenting ISA features common across CPUs RISC-V: Export riscv_cpuid_to_hartid_mask() API
2020-05-09gcc-10: avoid shadowing standard library 'free()' in cryptoLinus Torvalds
gcc-10 has started warning about conflicting types for a few new built-in functions, particularly 'free()'. This results in warnings like: crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch] because the crypto layer had its local freeing functions called 'free()'. Gcc-10 is in the wrong here, since that function is marked 'static', and thus there is no chance of confusion with any standard library function namespace. But the simplest thing to do is to just use a different name here, and avoid this gcc mis-feature. [ Side note: gcc knowing about 'free()' is in itself not the mis-feature: the semantics of 'free()' are special enough that a compiler can validly do special things when seeing it. So the mis-feature here is that gcc thinks that 'free()' is some restricted name, and you can't shadow it as a local static function. Making the special 'free()' semantics be a function attribute rather than tied to the name would be the much better model ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'restrict' warning for nowLinus Torvalds
gcc-10 now warns about passing aliasing pointers to functions that take restricted pointers. That's actually a great warning, and if we ever start using 'restrict' in the kernel, it might be quite useful. But right now we don't, and it turns out that the only thing this warns about is an idiom where we have declared a few functions to be "printf-like" (which seems to make gcc pick up the restricted pointer thing), and then we print to the same buffer that we also use as an input. And people do that as an odd concatenation pattern, with code like this: #define sysfs_show_gen_prop(buffer, fmt, ...) \ snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__) where we have 'buffer' as both the destination of the final result, and as the initial argument. Yes, it's a bit questionable. And outside of the kernel, people do have standard declarations like int snprintf( char *restrict buffer, size_t bufsz, const char *restrict format, ... ); where that output buffer is marked as a restrict pointer that cannot alias with any other arguments. But in the context of the kernel, that 'use snprintf() to concatenate to the end result' does work, and the pattern shows up in multiple places. And we have not marked our own version of snprintf() as taking restrict pointers, so the warning is incorrect for now, and gcc picks it up on its own. If we do start using 'restrict' in the kernel (and it might be a good idea if people find places where it matters), we'll need to figure out how to avoid this issue for snprintf and friends. But in the meantime, this warning is not useful. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'stringop-overflow' warning for nowLinus Torvalds
This is the final array bounds warning removal for gcc-10 for now. Again, the warning is good, and we should re-enable all these warnings when we have converted all the legacy array declaration cases to flexible arrays. But in the meantime, it's just noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09nvme: fix possible hang when ns scanning fails during error recoverySagi Grimberg
When the controller is reconnecting, the host fails I/O and admin commands as the host cannot reach the controller. ns scanning may revalidate namespaces during that period and it is wrong to remove namespaces due to these failures as we may hang (see 205da2434301). One command that may fail is nvme_identify_ns_descs. Since we return success due to having ns identify descriptor list optional, we continue to compare ns identifiers in nvme_revalidate_disk, obviously fail and return -ENODEV to nvme_validate_ns, which will remove the namespace. Exactly what we don't want to happen. Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional") Tested-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09nvme-pci: fix "slimmer CQ head update"Alexey Dobriyan
Pre-incrementing ->cq_head can't be done in memory because OOB value can be observed by another context. This devalues space savings compared to original code :-\ $ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-32 (-32) Function old new delta nvme_poll_irqdisable 464 456 -8 nvme_poll 455 447 -8 nvme_irq 388 380 -8 nvme_dev_disable 955 947 -8 But the code is minimal now: one read for head, one read for q_depth, one increment, one comparison, single instruction phase bit update and one write for new head. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: John Garry <john.garry@huawei.com> Tested-by: John Garry <john.garry@huawei.com> Fixes: e2a366a4b0feaeb ("nvme-pci: slimmer CQ head update") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09bdi: add a ->dev_name field to struct backing_dev_infoChristoph Hellwig
Cache a copy of the name for the life time of the backing_dev_info structure so that we can reference it even after unregistering. Fixes: 68f23b89067f ("memcg: fix a crash in wb_workfn when a device disappears") Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09bdi: use bdi_dev_name() to get device nameYufen Yu
Use the common interface bdi_dev_name() to get device name. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Add missing <linux/backing-dev.h> include BFQ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09gcc-10: disable 'array-bounds' warning for nowLinus Torvalds
This is another fine warning, related to the 'zero-length-bounds' one, but hitting the same historical code in the kernel. Because C didn't historically support flexible array members, we have code that instead uses a one-sized array, the same way we have cases of zero-sized arrays. The one-sized arrays come from either not wanting to use the gcc zero-sized array extension, or from a slight convenience-feature, where particularly for strings, the size of the structure now includes the allocation for the final NUL character. So with a "char name[1];" at the end of a structure, you can do things like v = my_malloc(sizeof(struct vendor) + strlen(name)); and avoid the "+1" for the terminator. Yes, the modern way to do that is with a flexible array, and using 'offsetof()' instead of 'sizeof()', and adding the "+1" by hand. That also technically gets the size "more correct" in that it avoids any alignment (and thus padding) issues, but this is another long-term cleanup thing that will not happen for 5.7. So disable the warning for now, even though it's potentially quite useful. Having a slew of warnings that then hide more urgent new issues is not an improvement. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09gcc-10: disable 'zero-length-bounds' warning for nowLinus Torvalds
This is a fine warning, but we still have a number of zero-length arrays in the kernel that come from the traditional gcc extension. Yes, they are getting converted to flexible arrays, but in the meantime the gcc-10 warning about zero-length bounds is very verbose, and is hiding other issues. I missed one actual build failure because it was hidden among hundreds of lines of warning. Thankfully I caught it on the second go before pushing things out, but it convinced me that I really need to disable the new warnings for now. We'll hopefully be all done with our conversion to flexible arrays in the not too distant future, and we can then re-enable this warning. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Stop the ad-hoc games with -Wno-maybe-initializedLinus Torvalds
We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't. For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES). And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did. At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings. So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123". Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler. That's currently not the world we live in, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09Merge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix finish_wait() balancing in file cancelation (Xiaoguang) - Ensure early cleanup of resources in ring map failure (Xiaoguang) - Ensure IORING_OP_SLICE does the right file mode checks (Pavel) - Remove file opening from openat/openat2/statx, it's not needed and messes with O_PATH * tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block: io_uring: don't use 'fd' for openat/openat2/statx splice: move f_mode checks to do_{splice,tee}() io_uring: handle -EFAULT properly in io_uring_setup() io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()
2020-05-08Revert "gfs2: Don't demote a glock until its revokes are written"Bob Peterson
This reverts commit df5db5f9ee112e76b5202fbc331f990a0fc316d6. This patch fixes a regression: patch df5db5f9ee112 allowed function run_queue() to bypass its call to do_xmote() if revokes were queued for the glock. That's wrong because its call to do_xmote() is what is responsible for calling the go_sync() glops functions to sync both the ail list and any revokes queued for it. By bypassing the call, gfs2 could get into a stand-off where the glock could not be demoted until its revokes are written back, but the revokes would not be written back because do_xmote() was never called. It "sort of" works, however, because there are other mechanisms like the log flush daemon (logd) that can sync the ail items and revokes, if it deems it necessary. The problem is: without file system pressure, it might never deem it necessary. Signed-off-by: Bob Peterson <rpeterso@redhat.com>