summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-23mm/mmu_notifier: add an interval tree notifierJason Gunthorpe
Of the 13 users of mmu_notifiers, 8 of them use only invalidate_range_start/end() and immediately intersect the mmu_notifier_range with some kind of internal list of VAs. 4 use an interval tree (i915_gem, radeon_mn, umem_odp, hfi1). 4 use a linked list of some kind (scif_dma, vhost, gntdev, hmm) And the remaining 5 either don't use invalidate_range_start() or do some special thing with it. It turns out that building a correct scheme with an interval tree is pretty complicated, particularly if the use case is synchronizing against another thread doing get_user_pages(). Many of these implementations have various subtle and difficult to fix races. This approach puts the interval tree as common code at the top of the mmu notifier call tree and implements a shareable locking scheme. It includes: - An interval tree tracking VA ranges, with per-range callbacks - A read/write locking scheme for the interval tree that avoids sleeping in the notifier path (for OOM killer) - A sequence counter based collision-retry locking scheme to tell device page fault that a VA range is being concurrently invalidated. This is based on various ideas: - hmm accumulates invalidated VA ranges and releases them when all invalidates are done, via active_invalidate_ranges count. This approach avoids having to intersect the interval tree twice (as umem_odp does) at the potential cost of a longer device page fault. - kvm/umem_odp use a sequence counter to drive the collision retry, via invalidate_seq - a deferred work todo list on unlock scheme like RTNL, via deferred_list. This makes adding/removing interval tree members more deterministic - seqlock, except this version makes the seqlock idea multi-holder on the write side by protecting it with active_invalidate_ranges and a spinlock To minimize MM overhead when only the interval tree is being used, the entire SRCU and hlist overheads are dropped using some simple branches. Similarly the interval tree overhead is dropped when in hlist mode. The overhead from the mandatory spinlock is broadly the same as most of existing users which already had a lock (or two) of some sort on the invalidation path. Link: https://lore.kernel.org/r/20191112202231.3856-3-jgg@ziepe.ca Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-12mm/mmu_notifier: define the header pre-processor parts even if disabledJason Gunthorpe
Now that we have KERNEL_HEADER_TEST all headers are generally compile tested, so relying on makefile tricks to avoid compiling code that depends on CONFIG_MMU_NOTIFIER is more annoying. Instead follow the usual pattern and provide most of the header with only the functions stubbed out when CONFIG_MMU_NOTIFIER is disabled. This ensures code compiles no matter what the config setting is. While here, struct mmu_notifier_mm is private to mmu_notifier.c, move it. Link: https://lore.kernel.org/r/20191112202231.3856-2-jgg@ziepe.ca Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-01Merge branch 'odp_rework' into hmm.gitJason Gunthorpe
This branch is shared with the rdma.git for dependencies in following patches: ==================== In order to hoist the interval tree code out of the drivers and into the mmu_notifiers it is necessary for the drivers to not use the interval tree for other things. This series replaces the interval tree with an xarray and along the way re-aligns all the locking to use a sensible SRCU model where the 'update' step is done by modifying an xarray. The result is overall much simpler and with less locking in the critical path. Many functions were reworked for clarity and small details like using 'imr' to refer to the implicit MR make the entire code flow here more readable. This also squashes at least two race bugs on its own, and quite possibily more that haven't been identified. ==================== * branch 'odp_rework': RDMA/odp: Remove broken debugging call to invalidate_range RDMA/mlx5: Do not race with mlx5_ib_invalidate_range during create and destroy RDMA/mlx5: Do not store implicit children in the odp_mkeys xarray RDMA/mlx5: Rework implicit ODP destroy RDMA/mlx5: Avoid double lookups on the pagefault path RDMA/mlx5: Reduce locking in implicit_mr_get_data() RDMA/mlx5: Use an xarray for the children of an implicit ODP RDMA/mlx5: Split implicit handling from pagefault_mr RDMA/mlx5: Set the HW IOVA of the child MRs to their place in the tree RDMA/mlx5: Lift implicit_mr_alloc() into the two routines that call it RDMA/mlx5: Rework implicit_mr_get_data RDMA/mlx5: Delete struct mlx5_priv->mkey_table RDMA/mlx5: Use a dedicated mkey xarray for ODP RDMA/mlx5: Split sig_err MR data into its own xarray RDMA/mlx5: Use SRCU properly in ODP prefetch
2019-10-29mm/hmm: allow snapshot of the special zero pageRalph Campbell
If a device driver like nouveau tries to use hmm_range_fault() to access the special shared zero page in system memory, hmm_range_fault() will return -EFAULT and kill the process. Allow hmm_range_fault() to return success (0) when the CPU pagetable entry points to the special shared zero page. page_to_pfn() and pfn_to_page() are defined on the zero page so just handle it like any other page. Link: https://lore.kernel.org/r/20191023195515.13168-3-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/odp: Remove broken debugging call to invalidate_rangeJason Gunthorpe
invalidate_range() also obtains the umem_mutex which is being held at this point, so if this path were was ever called it would deadlock. Thus conclude the debugging never triggers and rework it into a simple WARN_ON and leave things as they are. While here add a note to explain how we could possibly get inconsistent page pointers. Link: https://lore.kernel.org/r/20191009160934.3143-16-jgg@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Do not race with mlx5_ib_invalidate_range during create and destroyJason Gunthorpe
For creation, as soon as the umem_odp is created the notifier can be called, however the underlying MR may not have been setup yet. This would cause problems if mlx5_ib_invalidate_range() runs. There is some confusing/ulocked/racy code that might by trying to solve this, but without locks it isn't going to work right. Instead trivially solve the problem by short-circuiting the invalidation if there are not yet any DMA mapped pages. By definition there is nothing to invalidate in this case. The create code will have the umem fully setup before anything is DMA mapped, and npages is fully locked by the umem_mutex. For destroy, invalidate the entire MR at the HW to stop DMA then DMA unmap the pages before destroying the MR. This drives npages to zero and prevents similar racing with invalidate while the MR is undergoing destruction. Arguably it would be better if the umem was created after the MR and destroyed before, but that would require a big rework of the MR code. Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191009160934.3143-15-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Do not store implicit children in the odp_mkeys xarrayJason Gunthorpe
These mkeys are entirely internal and are never used by the HW for page fault. They should also never be used by userspace for prefetch. Simplify & optimize things by not including them in the xarray. Since the prefetch path can now never see a child mkey there is no need for the second synchronize_srcu() during imr destroy. Link: https://lore.kernel.org/r/20191009160934.3143-14-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Rework implicit ODP destroyJason Gunthorpe
Use SRCU in a sensible way by removing all MRs in the implicit tree from the two xarrays (the update operation), then a synchronize, followed by a normal single threaded teardown. This is only a little unusual from the normal pattern as there can still be some work pending in the unbound wq that may also require a workqueue flush. This is tracked with a single atomic, consolidating the redundant existing atomics and wait queue. For understand-ability the entire ODP implicit create/destroy flow now largely exists in a single pair of functions within odp.c, with a few support functions for tearing down an unused child. Link: https://lore.kernel.org/r/20191009160934.3143-13-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Avoid double lookups on the pagefault pathJason Gunthorpe
Now that the locking is simplified combine pagefault_implicit_mr() with implicit_mr_get_data() so that we sweep over the idx range only once, and do the single xlt update at the end, after the child umems are setup. This avoids double iteration/xa_loads plus the sketchy failure path if the xa_load() fails. Link: https://lore.kernel.org/r/20191009160934.3143-12-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Reduce locking in implicit_mr_get_data()Jason Gunthorpe
Now that the child MRs are stored in an xarray we can rely on the SRCU lock to protect the xa_load and use xa_cmpxchg on the slow allocation path to resolve races with concurrent page fault. This reduces the scope of the critical section of umem_mutex for implicit MRs to only cover mlx5_ib_update_xlt, and avoids taking a lock at all if the child MR is already in the xarray. This makes it consistent with the normal ODP MR critical section for umem_lock, and the locking approach used for destroying an unusued implicit child MR. The MLX5_IB_UPD_XLT_ATOMIC is no longer needed in implicit_get_child_mr() since it is no longer called with any locks. Link: https://lore.kernel.org/r/20191009160934.3143-11-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Use an xarray for the children of an implicit ODPJason Gunthorpe
Currently the child leaves are stored in the shared interval tree and every lookup for a child must be done under the interval tree rwsem. This is further complicated by dropping the rwsem during iteration (ie the odp_lookup(), odp_next() pattern), which requires a very tricky an difficult to understand locking scheme with SRCU. Instead reserve the interval tree for the exclusive use of the mmu notifier related code in umem_odp.c and give each implicit MR a xarray containing all the child MRs. Since the size of each child is 1GB of VA, a 1 level xarray will index 64G of VA, and a 2 level will index 2TB, making xarray a much better data structure choice than an interval tree. The locking properties of xarray will be used in the next patches to rework the implicit ODP locking scheme into something simpler. At this point, the xarray is locked by the implicit MR's umem_mutex, and read can also be locked by the odp_srcu. Link: https://lore.kernel.org/r/20191009160934.3143-10-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Split implicit handling from pagefault_mrJason Gunthorpe
The single routine has a very confusing scheme to advance to the next child MR when working on an implicit parent. This scheme can only be used when working with an implicit parent and must not be triggered when working on a normal MR. Re-arrange things by directly putting all the single-MR stuff into one function and calling it in a loop for the implicit case. Simplify some of the error handling in the new pagefault_real_mr() to remove unneeded gotos. Link: https://lore.kernel.org/r/20191009160934.3143-9-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Set the HW IOVA of the child MRs to their place in the treeJason Gunthorpe
Instead of rewriting all the IOVA's to 0 as things progress down the tree make the IOVA of the children equal to placement in the tree. This makes things easier to understand by keeping mmkey.iova == HW configuration. Link: https://lore.kernel.org/r/20191009160934.3143-8-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Lift implicit_mr_alloc() into the two routines that call itJason Gunthorpe
This makes the routines easier to understand, particularly with respect the locking requirements of the entire sequence. The implicit_mr_alloc() had a lot of ifs specializing it to each of the callers, and only a very small amount of code was actually shared. Following patches will cause the flow in the two functions to diverge further. Link: https://lore.kernel.org/r/20191009160934.3143-7-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Rework implicit_mr_get_dataJason Gunthorpe
This function is intended to loop across each MTT chunk in the implicit parent that intersects the range [io_virt, io_virt+bnct). But it is has a confusing construction, so: - Consistently use imr and odp_imr to refer to the implicit parent to avoid confusion with the normal mr and odp of the child - Directly compute the inclusive start/end indexes by shifting. This is clearer to understand the intent and avoids any errors from unaligned values of addr - Iterate directly over the range of MTT indexes, do not make a loop out of goto - Follow 'success oriented flow', with goto error unwind - Directly calculate the range of idx's that need update_xlt - Ensure that any leaf MR added to the interval tree always results in an update to the XLT Link: https://lore.kernel.org/r/20191009160934.3143-6-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Delete struct mlx5_priv->mkey_tableJason Gunthorpe
No users are left, delete it. Link: https://lore.kernel.org/r/20191009160934.3143-5-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Use a dedicated mkey xarray for ODPJason Gunthorpe
There is a per device xarray storing mkeys that is used to store every mkey in the system. However, this xarray is now only read by ODP for certain ODP designated MRs (ODP, implicit ODP, MW, DEVX_INDIRECT). Create an xarray only for use by ODP, that only contains ODP related MKeys. This xarray is protected by SRCU and all erases are protected by a synchronize. This improves performance: - All MRs in the odp_mkeys xarray are ODP MRs, so some tests for is_odp() can be deleted. The xarray will also consume fewer nodes. - normal MR's are never mixed with ODP MRs in a SRCU data structure so performance sucking synchronize_srcu() on every MR destruction is not needed. - No smp_load_acquire(live) and xa_load() double barrier on read Due to the SRCU locking scheme care must be taken with the placement of the xa_store(). Once it completes the MR is immediately visible to other threads and only through a xa_erase() & synchronize_srcu() cycle could it be destroyed. Link: https://lore.kernel.org/r/20191009160934.3143-4-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Split sig_err MR data into its own xarrayJason Gunthorpe
The locking model for signature is completely different than ODP, do not share the same xarray that relies on SRCU locking to support ODP. Simply store the active mlx5_core_sig_ctx's in an xarray when signature MRs are created and rely on trivial xarray locking to serialize everything. The overhead of storing only a handful of SIG related MRs is going to be much less than an xarray full of every mkey. Link: https://lore.kernel.org/r/20191009160934.3143-3-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Use SRCU properly in ODP prefetchJason Gunthorpe
When working with SRCU protected xarrays the xarray itself should be the SRCU 'update' point. Instead prefetch is using live as the SRCU update point and this prevents switching the locking design to use the xarray instead. To solve this the prefetch must only read from the xarray once, and hold on to the actual MR pointer for the duration of the async operation. Incrementing num_pending_prefetch delays destruction of the MR, so it is suitable. Prefetch calls directly to the pagefault_mr using the MR pointer and only does a single xarray lookup. All the testing if a MR is prefetchable or not is now done only in the prefetch code and removed from the pagefault critical path. Link: https://lore.kernel.org/r/20191009160934.3143-2-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-27Linux 5.4-rc5Linus Torvalds
2019-10-27Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Two fixes for the VMWare guest support: - Unbreak VMWare platform detection which got wreckaged by converting an integer constant to a string constant. - Fix the clang build of the VMWAre hypercall by explicitely specifying the ouput register for INL instead of using the short form" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/vmware: Fix platform detection VMWARE_PORT macro x86/cpu/vmware: Use the full form of INL in VMWARE_HYPERCALL, for clang/llvm
2019-10-27Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of fixes for time(keeping): - Add a missing include to prevent compiler warnings. - Make the VDSO implementation of clock_getres() POSIX compliant again. A recent change dropped the NULL pointer guard which is required as NULL is a valid pointer value for this function. - Fix two function documentation typos" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Fix two trivial comments timers/sched_clock: Include local timekeeping.h for missing declarations lib/vdso: Make clock_getres() POSIX compliant again
2019-10-27Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of perf fixes: kernel: - Unbreak the tracking of auxiliary buffer allocations which got imbalanced causing recource limit failures. - Fix the fallout of splitting of ToPA entries which missed to shift the base entry PA correctly. - Use the correct context to lookup the AUX event when unmapping the associated AUX buffer so the event can be stopped and the buffer reference dropped. tools: - Fix buildiid-cache mode setting in copyfile_mode_ns() when copying /proc/kcore - Fix freeing id arrays in the event list so the correct event is closed. - Sync sched.h anc kvm.h headers with the kernel sources. - Link jvmti against tools/lib/ctype.o to have weak strlcpy(). - Fix multiple memory and file descriptor leaks, found by coverity in perf annotate. - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found by a static analysis tool" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/aux: Fix AUX output stopping perf/aux: Fix tracking of auxiliary trace buffer allocation perf/x86/intel/pt: Fix base for single entry topa perf kmem: Fix memory leak in compact_gfp_flags() tools headers UAPI: Sync sched.h with the kernel tools headers kvm: Sync kvm.h headers with the kernel sources tools headers kvm: Sync kvm headers with the kernel sources tools headers kvm: Sync kvm headers with the kernel sources perf c2c: Fix memory leak in build_cl_output() perf tools: Fix mode setting in copyfile_mode_ns() perf annotate: Fix multiple memory and file descriptor leaks perf tools: Fix resource leak of closedir() on the error paths perf evlist: Fix fix for freed id arrays perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
2019-10-27Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two fixes for interrupt controller drivers: - Skip IRQ_M_EXT entries in the device tree when initializing the RISCV PLIC controller to avoid a double init attempt. - Use the correct ITS list when issuing the VMOVP synchronization command so the operation works only on the ITS instances which are associated to a VM" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Skip contexts except supervisor in plic_init() irqchip/gic-v3-its: Use the exact ITSList for VMOVP
2019-10-27Merge tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Seven cifs/smb3 fixes, including three for stable" * tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs CIFS: Fix use after free of file info structures CIFS: Fix retry mid list corruption on reconnects cifs: Fix missed free operations CIFS: avoid using MID 0xFFFF cifs: clarify comment about timestamp granularity for old servers cifs: Handle -EINPROGRESS only when noblockcnt is set
2019-10-27Merge tag 'riscv/for-v5.4-rc5-b' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Several minor fixes and cleanups for v5.4-rc5: - Three build fixes for various SPARSEMEM-related kernel configurations - Two cleanup patches for the kernel bug and breakpoint trap handler code" * tag 'riscv/for-v5.4-rc5-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: cleanup do_trap_break riscv: cleanup <asm/bug.h> riscv: Fix undefined reference to vmemmap_populate_basepages riscv: Fix implicit declaration of 'page_to_section' riscv: fix fs/proc/kcore.c compilation with sparsemem enabled
2019-10-26Merge tag 'mips_fixes_5.4_3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few MIPS fixes: - Fix VDSO time-related function behavior for systems where we need to fall back to syscalls, but were instead returning bogus results. - A fix to TLB exception handlers for Cavium Octeon systems where they would inadvertently clobber the $1/$at register. - A build fix for bcm63xx configurations. - Switch to using my @kernel.org email address" * tag 'mips_fixes_5.4_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: tlbex: Fix build_restore_pagemask KScratch restore MIPS: bmips: mark exception vectors as char arrays mips: vdso: Fix __arch_get_hw_counter() MAINTAINERS: Use @kernel.org address for Paul Burton
2019-10-26Merge tag 'tty-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fix from Greg KH: "Here is a single tty/serial driver fix for 5.4-rc5 that resolves a reported issue. It has been in linux-next for a while with no problems" * tag 'tty-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: 8250-men-mcb: fix error checking when get_num_ports returns -ENODEV
2019-10-26Merge tag 'staging-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fix from Greg KH: "Here is a single staging driver fix, for the wlan-ng driver, that resolves a reported issue. It is been in linux-next for a while with no reported issues" * tag 'staging-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS
2019-10-26Merge tag 'driver-core-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fix from Greg KH: "Here is a single sysfs fix for 5.4-rc5. It resolves an error if you actually try to use the __BIN_ATTR_WO() macro, seems I never tested it properly before :( This has been in linux-next for a while with no reported issues" * tag 'driver-core-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: Fixes __BIN_ATTR_WO() macro
2019-10-26Merge tag 'char-misc-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull binder fix from Greg KH: "This is a single binder fix to resolve a reported issue by Jann. It's been in linux-next for a while with no reported issues" * tag 'char-misc-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: binder: Don't modify VMA bounds in ->mmap handler
2019-10-26Merge tag 'usb-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB driver fixes for 5.4-rc5. More "fun" with some of the misc USB drivers as found by syzbot, and there are a number of other small bugfixes in here for reported issues. All have been in linux-next for a while with no reported issues" * tag 'usb-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: cdns3: Error out if USB_DR_MODE_UNKNOWN in cdns3_core_init_role() USB: ldusb: fix read info leaks USB: serial: ti_usb_3410_5052: clean up serial data access USB: serial: ti_usb_3410_5052: fix port-close races USB: usblp: fix use-after-free on disconnect usb: udc: lpc32xx: fix bad bit shift operation usb: cdns3: Fix dequeue implementation. USB: legousbtower: fix a signedness bug in tower_probe() USB: legousbtower: fix memleak on disconnect USB: ldusb: fix memleak on disconnect
2019-10-26Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A few driver fixes for the I2C subsystem" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: stm32f7: remove warning when compiling with W=1 i2c: stm32f7: fix a race in slave mode with arbitration loss irq i2c: stm32f7: fix first byte to send in slave mode i2c: mt65xx: fix NULL ptr dereference i2c: aspeed: fix master pending state handling
2019-10-26Merge tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block and io_uring fixes from Jens Axboe: "A bit bigger than usual at this point in time, mostly due to some good bug hunting work by Pavel that resulted in three io_uring fixes from him and two from me. Anyway, this pull request contains: - Revert of the submit-and-wait optimization for io_uring, it can't always be done safely. It depends on commands always making progress on their own, which isn't necessarily the case outside of strict file IO. (me) - Series of two patches from me and three from Pavel, fixing issues with shared data and sequencing for io_uring. - Lastly, two timeout sequence fixes for io_uring (zhangyi) - Two nbd patches fixing races (Josef) - libahci regulator_get_optional() fix (Mark)" * tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-block: nbd: verify socket is supported during setup ata: libahci_platform: Fix regulator_get_optional() misuse nbd: handle racing with error'ed out commands nbd: protect cmd->status with cmd->lock io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD io_uring: used cached copies of sq->dropped and cq->overflow io_uring: Fix race for sqes with userspace io_uring: Fix broken links with offloading io_uring: Fix corrupted user_data io_uring: correct timeout req sequence when inserting a new entry io_uring : correct timeout req sequence when waiting timeout io_uring: revert "io_uring: optimize submit_and_wait API"
2019-10-26Merge tag 's390-5.4-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Add R_390_GLOB_DAT relocation type support. This fixes boot problem on linux-next. - Fix memory leak in zcrypt * tag 's390-5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/kaslr: add support for R_390_GLOB_DAT relocation type s390/zcrypt: fix memleak at release
2019-10-26Merge tag 'for-linus-5.4-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixlet from Juergen Gross: "Just one patch for issuing a deprecation warning for 32-bit Xen pv guests" * tag 'for-linus-5.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: issue deprecation warning for 32-bit pv guest
2019-10-26Merge tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fix from Christoph Hellwig: "Fix a regression in the intel-iommu get_required_mask conversion (Arvind Sankar)" * tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mapping: iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMU
2019-10-26Merge tag 'dax-fix-5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax fix from Dan Williams: "Fix a performance regression that followed from a fix to the conversion of the fsdax implementation to the xarray. v5.3 users report that they stop seeing huge page mappings on an application + filesystem layout that was seeing huge pages previously on v5.2" * tag 'dax-fix-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: fs/dax: Fix pmd vs pte conflict detection
2019-10-25Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch, 53c710[x2], target) and one core change that tries to close a race between sysfs delete and module removal" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: lpfc: remove left-over BUILD_NVME defines scsi: core: try to get module before removing device scsi: hpsa: add missing hunks in reset-patch scsi: target: core: Do not overwrite CDB byte 1 scsi: ch: Make it possible to open a ch device multiple times again scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE scsi: sni_53c710: fix compilation error scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions scsi: qla2xxx: fix a potential NULL pointer dereference
2019-10-25riscv: cleanup do_trap_breakChristoph Hellwig
If we always compile the get_break_insn_length inline function we can remove the ifdefs and let dead code elimination take care of the warn branch that is now unreadable because the report_bug stub always returns BUG_TRAP_TYPE_BUG. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-10-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fix from Dmitry Torokhov: "A fix for st1232 driver to properly report coordinates for 2nd and subsequent fingers when more than one is on the surface" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: st1232 - fix reporting multitouch coordinates
2019-10-25nbd: verify socket is supported during setupMike Christie
nbd requires socket families to support the shutdown method so the nbd recv workqueue can be woken up from its sock_recvmsg call. If the socket does not support the callout we will leave recv works running or get hangs later when the device or module is removed. This adds a check during socket connection/reconnection to make sure the socket being passed in supports the needed callout. Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs") Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-25ata: libahci_platform: Fix regulator_get_optional() misuseMark Brown
This driver is using regulator_get_optional() to handle all the supplies that it handles, and only ever enables and disables all supplies en masse without ever doing any other configuration of the device to handle missing power. These are clear signs that the API is being misused - it should only be used for supplies that may be physically absent from the system and in these cases the hardware usually needs different configuration if the supply is missing. Instead use normal regualtor_get(), if the supply is not described in DT then the framework will substitute a dummy regulator in so no special handling is needed by the consumer driver. In the case of the PHY regulator the handling in the driver is a hack to deal with integrated PHYs; the supplies are only optional in the sense that that there's some confusion in the code about where they're bound to. From a code point of view they function exactly as normal supplies so can be treated as such. It'd probably be better to model this by instantiating a PHY object for integrated PHYs. Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-25nbd: handle racing with error'ed out commandsJosef Bacik
We hit the following warning in production print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700 ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60 Workqueue: knbd-recv recv_work [nbd] RIP: 0010:refcount_sub_and_test_checked+0x53/0x60 Call Trace: blk_mq_free_request+0xb7/0xf0 blk_mq_complete_request+0x62/0xf0 recv_work+0x29/0xa1 [nbd] process_one_work+0x1f5/0x3f0 worker_thread+0x2d/0x3d0 ? rescuer_thread+0x340/0x340 kthread+0x111/0x130 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x1f/0x30 ---[ end trace b079c3c67f98bb7c ]--- This was preceded by us timing out everything and shutting down the sockets for the device. The problem is we had a request in the queue at the same time, so we completed the request twice. This can actually happen in a lot of cases, we fail to get a ref on our config, we only have one connection and just error out the command, etc. Fix this by checking cmd->status in nbd_read_stat. We only change this under the cmd->lock, so we are safe to check this here and see if we've already error'ed this command out, which would indicate that we've completed it as well. Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-25nbd: protect cmd->status with cmd->lockJosef Bacik
We already do this for the most part, except in timeout and clear_req. For the timeout case we take the lock after we grab a ref on the config, but that isn't really necessary because we're safe to touch the cmd at this point, so just move the order around. For the clear_req cause this is initiated by the user, so again is safe. Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-25Merge tag 'modules-for-v5.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules fixes from Jessica Yu: - Revert __ksymtab_$namespace.$symbol naming scheme back to __ksymtab_$symbol, as it was causing issues with depmod. Instead, have modpost extract a symbol's namespace from __kstrtabns and __ksymtab_strings. - Fix `make nsdeps` for out of tree kernel builds (make O=...) caused by unescaped '/'. Use a different sed delimiter to avoid this problem. * tag 'modules-for-v5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: scripts/nsdeps: use alternative sed delimiter symbol namespaces: revert to previous __ksymtab name scheme modpost: make updating the symbol namespace explicit modpost: delegate updating namespaces to separate function
2019-10-25Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A slightly larger set of fixes have accrued in the last two weeks. Mostly a collection of the usual smaller fixes: - Marvell Armada: USB phy setup issues on Turris Mox - Broadcom: GPIO/pinmux DT mapping corrections for Stingray, MMC bus width fix for RPi Zero W, GPIO LED removal for RPI CM3. Also some maintainer updates. - OMAP: Fixlets for display config, interrupt settings for wifi, some clock/PM pieces. Also IOMMU regression fix and a ti-sysc no-watchdog regression fix. - i.MX: A few fixes around PM/settings, some devicetree fixlets and catching up with config option changes in DRM - Rockchip: RockRro64 misc DT fixups, Hugsun X99 USB-C, Kevin display panel settings ... and some smaller fixes for Davinci (backlight, McBSP DMA), Allwinner (phy regulators, PMU removal on A64, etc)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (42 commits) ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157 MAINTAINERS: Update the Spreadtrum SoC maintainer MAINTAINERS: Remove Gregory and Brian for ARCH_BRCMSTB ARM: dts: bcm2837-rpi-cm3: Avoid leds-gpio probing issue bus: ti-sysc: Fix watchdog quirk handling ARM: OMAP2+: Add pdata for OMAP3 ISP IOMMU ARM: OMAP2+: Plug in device_enable/idle ops for IOMMUs ARM: davinci_all_defconfig: enable GPIO backlight ARM: davinci: dm365: Fix McBSP dma_slave_map entry ARM: dts: bcm2835-rpi-zero-w: Fix bus-width of sdhci ARM: imx_v6_v7_defconfig: Enable CONFIG_DRM_MSM arm64: dts: imx8mn: Use correct clock for usdhc's ipg clk arm64: dts: imx8mm: Use correct clock for usdhc's ipg clk arm64: dts: imx8mq: Use correct clock for usdhc's ipg clk ARM: dts: imx7s: Correct GPT's ipg clock source ARM: dts: vf610-zii-scu4-aib: Specify 'i2c-mux-idle-disconnect' ARM: dts: imx6q-logicpd: Re-Enable SNVS power key arm64: dts: lx2160a: Correct CPU core idle state name mailmap: Add Simon Arlott (replacement for expired email address) arm64: dts: rockchip: Fix override mode for rk3399-kevin panel ...
2019-10-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Bugfixes for ARM, PPC and x86, plus selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: Don't leak L1 MMIO regions to L2 KVM: SVM: Fix potential wrong physical id in avic_handle_ldr_update kvm: clear kvmclock MSR on reset KVM: x86: fix bugon.cocci warnings KVM: VMX: Remove specialized handling of unexpected exit-reasons selftests: kvm: fix sync_regs_test with newer gccs selftests: kvm: vmx_dirty_log_test: skip the test when VMX is not supported selftests: kvm: consolidate VMX support checks selftests: kvm: vmx_set_nested_state_test: don't check for VMX support twice KVM: Don't shrink/grow vCPU halt_poll_ns if host side polling is disabled selftests: kvm: synchronize .gitignore to Makefile kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID KVM: arm64: pmu: Reset sample period on overflow handling KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems KVM: arm64: pmu: Fix cycle counter truncation KVM: PPC: Book3S HV: XIVE: Ensure VP isn't already in use
2019-10-25Merge tag 'drm-fixes-2019-10-25' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Quiet week this week, which I suspect means some people just didn't get around to sending me fixes pulls in time. This has 2 komeda and a bunch of amdgpu fixes in it: komeda: - typo fixes - flushing pipes fix amdgpu: - Fix suspend/resume issue related to multi-media engines - Fix memory leak in user ptr code related to hmm conversion - Fix possible VM faults when allocating page table memory - Fix error handling in bo list ioctl" * tag 'drm-fixes-2019-10-25' of git://anongit.freedesktop.org/drm/drm: drm/komeda: Fix typos in komeda_splitter_validate drm/komeda: Don't flush inactive pipes drm/amdgpu/vce: fix allocation size in enc ring test drm/amdgpu: fix error handling in amdgpu_bo_list_create drm/amdgpu: fix potential VM faults drm/amdgpu: user pages array memory leak fix drm/amdgpu/vcn: fix allocation size in enc ring test drm/amdgpu/uvd7: fix allocation size in enc ring test (v2) drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)
2019-10-25Merge tag 'mmc-v5.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC host fixes: - mxs: Fix flags passed to dmaengine_prep_slave_sg - cqhci: Add a missing memory barrier - sdhci-omap: Fix tuning procedure for temperatures < -20C" * tag 'mmc-v5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mxs: fix flags passed to dmaengine_prep_slave_sg mmc: cqhci: Commit descriptors before setting the doorbell mmc: sdhci-omap: Fix Tuning procedure for temperatures < -20C