summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-04-12SMB311: Improve checking of negotiate security contextsSteve French
SMB3.11 crypto and hash contexts were not being checked strictly enough. Add parsing and validity checking for the security contexts in the SMB3.11 negotiate response. Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-12SMB3: Fix length checking of SMB3.11 negotiate requestSteve French
The length checking for SMB3.11 negotiate request includes "negotiate contexts" which caused a buffer validation problem and a confusing warning message on SMB3.11 mount e.g.: SMB2 server sent bad RFC1001 len 236 not 170 Fix the length checking for SMB3.11 negotiate to account for the new negotiate context so that we don't log a warning on SMB3.11 mount by default but do log warnings if lengths returned by the server are incorrect. CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-12Merge tag 'xfs-4.17-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull more xfs updates from Darrick Wong: "Most of these are code cleanups, but there are a couple of notable use-after-free bug fixes. This series has been run through a full xfstests run over the week and through a quick xfstests run against this morning's master, with no major failures reported. - clean up unnecessary function call parameters - fix a use-after-free bug when aborting logging intents - refactor filestreams state data to avoid use-after-free bug - fix incorrect removal of cow extents when truncating extended attributes. - refactor open-coded __set_page_dirty in favor of using vfs function. - fix a deadlock when fstrim and fs shutdown race" * tag 'xfs-4.17-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: Force log to disk before reading the AGF during a fstrim Export __set_page_dirty xfs: only cancel cow blocks when truncating the data fork xfs: non-scrub - remove unused function parameters xfs: remove filestream item xfs_inode reference xfs: fix intent use-after-free on abort xfs: Remove "committed" argument of xfs_dir_ialloc
2018-04-12Merge tag 'gfs2-4.17.fixes2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull more gfs2 updates from Bob Peterson: "We decided to request the latest three patches to be merged into this merge window while it's still open. - The first patch adds a new function to lockref: lockref_put_not_zero - The second patch fixes GFS2's glock dump code so it uses the new lockref function. This fixes a problem whereby lock dumps could miss glocks. - I made a minor patch to update some comments and fix the lock ordering text in our gfs2-glocks.txt Documentation file" * tag 'gfs2-4.17.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: Minor improvements to comments and documentation gfs2: Stop using rhashtable_walk_peek lockref: Add lockref_put_not_zero
2018-04-12Merge tag 'nfs-for-4.17-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "Stable bugfixes: - xprtrdma: Fix corner cases when handling device removal # v4.12+ - xprtrdma: Fix latency regression on NUMA NFS/RDMA clients # v4.15+ Features: - New sunrpc tracepoint for RPC pings - Finer grained NFSv4 attribute checking - Don't unnecessarily return NFS v4 delegations Other bugfixes and cleanups: - Several other small NFSoRDMA cleanups - Improvements to the sunrpc RTT measurements - A few sunrpc tracepoint cleanups - Various fixes for NFS v4 lock notifications - Various sunrpc and NFS v4 XDR encoding cleanups - Switch to the ida_simple API - Fix NFSv4.1 exclusive create - Forget acl cache after setattr operation - Don't advance the nfs_entry readdir cookie if xdr decoding fails" * tag 'nfs-for-4.17-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (47 commits) NFS: advance nfs_entry cookie only after decoding completes successfully NFSv3/acl: forget acl cache after setattr NFSv4.1: Fix exclusive create NFSv4: Declare the size up to date after it was set. nfs: Use ida_simple API NFSv4: Fix the nfs_inode_set_delegation() arguments NFSv4: Clean up CB_GETATTR encoding NFSv4: Don't ask for attributes when ACCESS is protected by a delegation NFSv4: Add a helper to encode/decode struct timespec NFSv4: Clean up encode_attrs NFSv4; Clean up XDR encoding of type bitmap4 NFSv4: Allow GFP_NOIO sleeps in decode_attr_owner/decode_attr_group SUNRPC: Add a helper for encoding opaque data inline SUNRPC: Add helpers for decoding opaque and string types NFSv4: Ignore change attribute invalidations if we hold a delegation NFS: More fine grained attribute tracking NFS: Don't force unnecessary cache invalidation in nfs_update_inode() NFS: Don't redirty the attribute cache in nfs_wcc_update_inode() NFS: Don't force a revalidation of all attributes if change is missing NFS: Convert NFS_INO_INVALID flags to unsigned long ...
2018-04-12Merge branch 'work.thaw' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs thaw updates from Al Viro: "An ancient series that has fallen through the cracks in the previous cycle" * 'work.thaw' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: buffer.c: call thaw_super during emergency thaw vfs: factor sb iteration out of do_emergency_remount
2018-04-12Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)"Harry Wentland
This seems to cause flickering and lock-ups for a wide range of users. Revert until we've found a proper fix for the flickering and lock-ups. This reverts commit 36cc549d59864b7161f0e23d710c1c4d1b9cf022. Cc: Shirish S <shirish.s@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-12Merge branch 'afs-dh' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull AFS updates from Al Viro: "The AFS series posted by dhowells depended upon lookup_one_len() rework; now that prereq is in the mainline, that series had been rebased on top of it and got some exposure and testing..." * 'afs-dh' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: afs: Do better accretion of small writes on newly created content afs: Add stats for data transfer operations afs: Trace protocol errors afs: Locally edit directory data for mkdir/create/unlink/... afs: Adjust the directory XDR structures afs: Split the directory content defs into a header afs: Fix directory handling afs: Split the dynroot stuff out and give it its own ops tables afs: Keep track of invalid-before version for dentry coherency afs: Rearrange status mapping afs: Make it possible to get the data version in readpage afs: Init inode before accessing cache afs: Introduce a statistics proc file afs: Dump bad status record afs: Implement @cell substitution handling afs: Implement @sys substitution handling afs: Prospectively look up extra files when doing a single lookup afs: Don't over-increment the cell usage count when pinning it afs: Fix checker warnings vfs: Remove the const from dir_context::actor
2018-04-12Revert "drm/amd/display: fix dereferencing possible ERR_PTR()"Harry Wentland
This reverts commit cd2d6c92a8e39d7e50a5af9fcc67d07e6a89e91d. Cc: Shirish S <shirish.s@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-12drm/amd/display: Fix regamma not affecting full-intensity color valuesLeo (Sunpeng) Li
Hardware understands the regamma LUT as a piecewise linear function, with points spaced exponentially along the range. We previously programmed the LUT for range [2^-10, 2^0). This causes (normalized) color values of 1 (=2^0) to miss the programmed LUT, and fall onto the end region. For DCE, the end region is extrapolated using a single (base, slope) pair, using the max y-value from the last point in the curve as base. This presents a problem, since this value affects all three color channels. Scaling down the intensity of say - the blue regamma curve - will not affect it's end region. This is especially noticiable when using RedShift. It scales down the blue and green channels, but leaves full-intensity colors unshifted. Therefore, extend the range to cover [2^-10, 2^1) by programming another hardware segment, containing only one point. That way, we won't be hitting the end region. Note that things are a bit different for DCN, since the end region can be set per-channel. Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com> Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-12drm/amd/display: Fix FBC text console corruptionRoman Li
Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-12drm/amd/display: Only register backlight device if embedded panel connectedHarry Wentland
Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) In ip_gre tunnel, handle the conflict between TUNNEL_{SEQ,CSUM} and GSO/LLTX properly. From Sabrina Dubroca. 2) Stop properly on error in lan78xx_read_otp(), from Phil Elwell. 3) Don't uncompress in slip before rstate is initialized, from Tejaswi Tanikella. 4) When using 1.x firmware on aquantia, issue a deinit before we hardware reset the chip, otherwise we break dirty wake WOL. From Igor Russkikh. 5) Correct log check in vhost_vq_access_ok(), from Stefan Hajnoczi. 6) Fix ethtool -x crashes in bnxt_en, from Michael Chan. 7) Fix races in l2tp tunnel creation and duplicate tunnel detection, from Guillaume Nault. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits) l2tp: fix race in duplicate tunnel detection l2tp: fix races in tunnel creation tun: send netlink notification when the device is modified tun: set the flags before registering the netdevice lan78xx: Don't reset the interface on open bnxt_en: Fix NULL pointer dereference at bnxt_free_irq(). bnxt_en: Need to include RDMA rings in bnxt_check_rings(). bnxt_en: Support max-mtu with VF-reps bnxt_en: Ignore src port field in decap filter nodes bnxt_en: do not allow wildcard matches for L2 flows bnxt_en: Fix ethtool -x crash when device is down. vhost: return bool from *_access_ok() functions vhost: fix vhost_vq_access_ok() log check vhost: Fix vhost_copy_to_user() net: aquantia: oops when shutdown on already stopped device net: aquantia: Regression on reset with 1.x firmware cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN slip: Check if rstate is initialized before uncompressing lan78xx: Avoid spurious kevent 4 "error" lan78xx: Correctly indicate invalid OTP ...
2018-04-12Merge tag 'for-linus-4.17-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "A few fixes of Xen related core code and drivers" * tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen xen/acpi: off by one in read_acpi_id() xen/acpi: upload _PSD info for non Dom0 CPUs too x86/xen: Delay get_cpu_cap until stack canary is established xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_END xen: xenbus: Catch closing of non existent transactions xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
2018-04-12Merge tag 'dma-mapping-4.17-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fix from Christoph Hellwig: "Fix for one swiotlb regression in 2.16 from Takashi" * tag 'dma-mapping-4.17-2' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: fix unexpected swiotlb_alloc_coherent failures
2018-04-12Merge tag 'mmc-v4.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Prevent bus reference leak in mmc_blk_init() MMC host: - tmio: Fix error handling when issuing CMD23 - jz4740: Fix race condition in IRQ mask update" * tag 'mmc-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: tmio: Fix error handling when issuing CMD23 mmc: core: Prevent bus reference leak in mmc_blk_init() mmc: jz4740: Fix race condition in IRQ mask update
2018-04-12Merge tag 'for_linus-4.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull kdb updates from Jason Wessel: - fix 2032 time access issues and new compiler warnings - minor regression test cleanup - formatting fixes for end user use of kdb * tag 'for_linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: kdb: use memmove instead of overlapping memcpy kdb: use ktime_get_mono_fast_ns() instead of ktime_get_ts() kdb: bl: don't use tab character in output kdb: drop newline in unknown command output kdb: make "mdr" command repeat kdb: use __ktime_get_real_seconds instead of __current_kernel_time misc: kgdbts: Display progress of asynchronous tests
2018-04-12Merge tag 'microblaze-4.17-rc1' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds
Pull microblaze updates from Michal Simek: "Use generic pci_mmap_resource_range()" * tag 'microblaze-4.17-rc1' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Use generic pci_mmap_resource_range() microblaze: Provide pgprot_device/writecombine macros for nommu
2018-04-12GFS2: Minor improvements to comments and documentationBob Peterson
This patch simply fixes some comments and the gfs2-glocks.txt file: Places where i_rwsem was called i_mutex, and adding i_rw_mutex. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-04-12gfs2: Stop using rhashtable_walk_peekAndreas Gruenbacher
Function rhashtable_walk_peek is problematic because there is no guarantee that the glock previously returned still exists; when that key is deleted, rhashtable_walk_peek can end up returning a different key, which will cause an inconsistent glock dump. Fix this by keeping track of the current glock in the seq file iterator functions instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-04-12lockref: Add lockref_put_not_zeroAndreas Gruenbacher
Put a lockref unless the lockref is dead or its count would become zero. This is the same as lockref_put_or_lock except that the lock is never left held. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-04-12Merge tag 'asm-generic' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fixes from Arnd Bergmann: "I have one regression fix for a minor build problem after the architecture removal series, plus a rework of the barriers in the readl/writel functions, thanks to work by Sinan Kaya: This started from a discussion on the linuxpcc and rdma mailing lists[1]. To summarize, we decided that architectures are responsible to serialize readl() and writel() accesses on a device MMIO space relative to DMA performed by that device. This series provides a pessimistic implementation of that behavior for asm-generic/io.h, which is in turn used by a number of architectures (h8300, microblaze, nios2, openrisc, s390, sparc, um, unicore32, and xtensa). Some of those presumably need no extra barriers, or something weaker than rmb()/wmb(), and they are advised to override the new default for better performance. For inb()/outb(), the same barriers are used, but architectures might want to add another barrier to outb() here if that can guarantee non-posted behavior (some architectures can, others cannot do that). The readl_relaxed()/writel_relaxed() family of functions retains the existing behavior with no extra barriers" [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-March/170481.html * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: io: change writeX_relaxed() to remove barriers io: change readX_relaxed() to remove barriers dts: remove cris & metag dts hard link file io: change inX() to have their own IO barrier overrides io: change outX() to have their own IO barrier overrides io: define stronger ordering for the default writeX() implementation io: define stronger ordering for the default readX() implementation io: define several IO & PIO barrier types for the asm-generic version
2018-04-12nvme: expand nvmf_check_if_ready checksJames Smart
The nvmf_check_if_ready() checks that were added are very simplistic. As such, the routine allows a lot of cases to fail ios during windows of reset or re-connection. In cases where there are not multi-path options present, the error goes back to the callee - the filesystem or application. Not good. The common routine was rewritten and calling syntax slightly expanded so that per-transport is_ready routines don't need to be present. The transports now call the routine directly. The routine is now a fabrics routine rather than an inline function. The routine now looks at controller state to decide the action to take. Some states mandate io failure. Others define the condition where a command can be accepted. When the decision is unclear, a generic queue-or-reject check is made to look for failfast or multipath ios and only fails the io if it is so marked. Otherwise, the io will be queued and wait for the controller state to resolve. Admin commands issued via ioctl share a live admin queue with commands from the transport for controller init. The ioctls could be intermixed with the initialization commands. It's possible for the ioctl cmd to be issued prior to the controller being enabled. To block this, the ioctl admin commands need to be distinguished from admin commands used for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to reflect this division and set it on ioctls requests. As the nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(), ensure that commands allocated by the ioctl path (actually anything in core.c) preps the nvme_req(req) before starting the io. This will preserve the USERCMD flag during execution and/or retry. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: Use admin command effects for admin commandsKeith Busch
Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvmet: fix space padding in serial numberDaniel Verkamp
Commit 42de82a8b544 previously attempted to fix this, and it did correctly pad the MN and FR fields with spaces, but the SN field still contains 0 bytes. The current code fills out the first 16 bytes with hex2bin, leaving the last 4 bytes zeroed. Rather than adding a lot of error-prone math to avoid overwriting SN twice, just set the whole thing to spaces up front (it's only 20 bytes). Fixes: 42de82a8b544 ("nvmet: don't report 0-bytes in serial number") Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: check return value of init_srcu_struct functionMax Gurtovoy
Also add error flow in case srcu initialization function fails. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvmet: Fix nvmet_execute_write_zeroes sector countRodrigo R. Galvao
We have to increment the number of logical blocks to a 1's based value in the native format prior to converting to 512b units. Signed-off-by: Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com> [changelog] Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme-pci: Separate IO and admin queue IRQ vectorsKeith Busch
The admin and first IO queues shared the first irq vector, which has an affinity mask including cpu0. If a system allows cpu0 to be offlined, the admin queue may not be usable if no other CPUs in the affinity mask are online. This is a problem since unlike IO queues, there is only one admin queue that always needs to be usable. To fix, this patch allocates one pre_vector for the admin queue that is assigned all CPUs, so will always be accessible. The IO queues are assigned the remaining managed vectors. In case a controller has only one interrupt vector available, the admin and IO queues will share the pre_vector with all CPUs assigned. Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme-pci: Remove unused queue parameterKeith Busch
All the queue memory is allocated up front. We don't take the node into consideration when creating queues anymore, so removing the unused parameter. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme-pci: Skip queue deletion if there are no queuesKeith Busch
User reported controller always retains CSTS.RDY to 1, which fails controller disabling when resetting the controller. This is also before the admin queue is allocated, and trying to disable an unallocated queue results in a NULL dereference. Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: target: fix buffer overflowArnd Bergmann
nvmet_execute_get_disc_log_page() passes a fixed-length string into nvmet_format_discovery_entry(), which then does a longer memcpy() on it, as pointed out by gcc-8: In function 'nvmet_format_discovery_entry', inlined from 'nvmet_execute_get_disc_log_page' at drivers/nvme/target/discovery.c:126:4: drivers/nvme/target/discovery.c:62:2: error: 'memcpy' forming offset [38, 223] is out of the bounds [0, 37] [-Werror=array-bounds] memcpy(e->subnqn, subsys_nqn, NVMF_NQN_SIZE); Using strncpy() will make this well-defined, filling the rest of the buffer with zeroes, under the assumption that the input is either a NUL-terminated string, or a byte sequence containing no zeroes. If the input is a string that is longer than NVMF_NQN_SIZE, we continue to have no NUL-termination in the output. Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: don't send keep-alives to the discovery controllerJohannes Thumshirn
NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and Command Support" Figure 31 "Discovery Controller – Admin Commands" explicitly listst all commands but "Get Log Page" and "Identify" as reserved, but NetApp report the Linux host is sending Keep Alive commands to the discovery controller, which is a violation of the Spec. We're already checking for discovery controllers when configuring the keep alive timeout but when creating a discovery controller we're not hard wiring the keep alive timeout to 0 and thus remain on NVME_DEFAULT_KATO for the discovery controller. This can be easily remproduced when issuing a direct connect to the discovery susbsystem using: 'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery' Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 07bfcd09a288 ("nvme-fabrics: add a generic NVMe over Fabrics library") Reported-by: Martin George <marting@netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: unexport nvme_start_keep_aliveJohannes Thumshirn
nvme_start_keep_alive() isn't used outside core.c so unexport it and make it static. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme-loop: fix kernel oops in case of unhandled commandMing Lei
When nvmet_req_init() fails, __nvmet_req_complete() is called to handle the target request via .queue_response(), so nvme_loop_queue_response() shouldn't be called again for handling the failure. This patch fixes this case by the following way: - move blk_mq_start_request() before nvmet_req_init(), so nvme_loop_queue_response() may work well to complete this host request - don't call nvme_cleanup_cmd() which is done in nvme_loop_complete_rq() - don't call nvme_loop_queue_response() which is done via .queue_response() Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [trimmed changelog] Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12nvme: enforce 64bit offset for nvme_get_log_ext fnMatias Bjørling
Compiling on 32 bits system produces a warning for the shift width when shifting 32 bit integer with 64bit integer. Make sure that offset always is 64bit, and use macros for retrieving lower and upper bits of the offset. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-12btrfs: add SPDX header to KconfigDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12btrfs: replace GPL boilerplate by SPDX -- sourcesDavid Sterba
Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12btrfs: replace GPL boilerplate by SPDX -- headersDavid Sterba
Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Unify the include protection macros to match the file names. Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12powerpc/mm/radix: Fix checkstops caused by invalid tlbielMichael Ellerman
In tlbiel_radix_set_isa300() we use the PPC_TLBIEL() macro to construct tlbiel instructions. The instruction takes 5 fields, two of which are registers, and the others are constants. But because it's constructed with inline asm the compiler doesn't know that. We got the constraint wrong on the 'r' field, using "r" tells the compiler to put the value in a register. The value we then get in the macro is the *register number*, not the value of the field. That means when we mask the register number with 0x1 we get 0 or 1 depending on which register the compiler happens to put the constant in, eg: li r10,1 tlbiel r8,r9,2,0,0 li r7,1 tlbiel r10,r6,0,0,1 If we're unlucky we might generate an invalid instruction form, for example RIC=0, PRS=1 and R=0, tlbiel r8,r7,0,1,0, this has been observed to cause machine checks: Oops: Machine check, sig: 7 [#1] CPU: 24 PID: 0 Comm: swapper NIP: 00000000000385f4 LR: 000000000100ed00 CTR: 000000000000007f REGS: c00000000110bb40 TRAP: 0200 MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48002222 XER: 20040000 CFAR: 00000000000385d0 DAR: 0000000000001c00 DSISR: 00000200 SOFTE: 1 If the machine check happens early in boot while we have MSR_ME=0 it will escalate into a checkstop and kill the box entirely. To fix it we could change the inline asm constraint to "i" which tells the compiler the value is a constant. But a better fix is to just pass a literal 1 into the macro, which bypasses any problems with inline asm constraints. Fixes: d4748276ae14 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-04-12Btrfs: fix loss of prealloc extents past i_size after fsync log replayFilipe Manana
Currently if we allocate extents beyond an inode's i_size (through the fallocate system call) and then fsync the file, we log the extents but after a power failure we replay them and then immediately drop them. This behaviour happens since about 2009, commit c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log"), because it marks the inode as an orphan instead of dropping any extents beyond i_size before replaying logged extents, so after the log replay, and while the mount operation is still ongoing, we find the inode marked as an orphan and then perform a truncation (drop extents beyond the inode's i_size). Because the processing of orphan inodes is still done right after replaying the log and before the mount operation finishes, the intention of that commit does not make any sense (at least as of today). However reverting that behaviour is not enough, because we can not simply discard all extents beyond i_size and then replay logged extents, because we risk dropping extents beyond i_size created in past transactions, for example: add prealloc extent beyond i_size fsync - clears the flag BTRFS_INODE_NEEDS_FULL_SYNC from the inode transaction commit add another prealloc extent beyond i_size fsync - triggers the fast fsync path power failure In that scenario, we would drop the first extent and then replay the second one. To fix this just make sure that all prealloc extents beyond i_size are logged, and if we find too many (which is far from a common case), fallback to a full transaction commit (like we do when logging regular extents in the fast fsync path). Trivial reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0xab 0 256K" /mnt/foo $ sync $ xfs_io -c "falloc -k 256K 1M" /mnt/foo $ xfs_io -c "fsync" /mnt/foo <power failure> # mount to replay log $ mount /dev/sdb /mnt # at this point the file only has one extent, at offset 0, size 256K A test case for fstests follows soon, covering multiple scenarios that involve adding prealloc extents with previous shrinking truncates and without such truncates. Fixes: c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12Btrfs: clean up resources during umount after trans is abortedLiu Bo
Currently if some fatal errors occur, like all IO get -EIO, resources would be cleaned up when a) transaction is being committed or b) BTRFS_FS_STATE_ERROR is set However, in some rare cases, resources may be left alone after transaction gets aborted and umount may run into some ASSERT(), e.g. ASSERT(list_empty(&block_group->dirty_list)); For case a), in btrfs_commit_transaciton(), there're several places at the beginning where we just call btrfs_end_transaction() without cleaning up resources. For case b), it is possible that the trans handle doesn't have any dirty stuff, then only trans hanlde is marked as aborted while BTRFS_FS_STATE_ERROR is not set, so resources remain in memory. This makes btrfs also check BTRFS_FS_STATE_TRANS_ABORTED to make sure that all resources won't stay in memory after umount. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12ovl: update documentation w.r.t "xino" featureAmir Goldstein
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: add support for "xino" mount and config optionsAmir Goldstein
With mount option "xino=on", mounter declares that there are enough free high bits in underlying fs to hold the layer fsid. If overlayfs does encounter underlying inodes using the high xino bits reserved for layer fsid, a warning will be emitted and the original inode number will be used. The mount option name "xino" goes after a similar meaning mount option of aufs, but in overlayfs case, the mapping is stateless. An example for a use case of "xino=on" is when upper/lower is on an xfs filesystem. xfs uses 64bit inode numbers, but it currently never uses the upper 8bit for inode numbers exposed via stat(2) and that is not likely to change in the future without user opting-in for a new xfs feature. The actual number of unused upper bit is much larger and determined by the xfs filesystem geometry (64 - agno_log - agblklog - inopblog). That means that for all practical purpose, there are enough unused bits in xfs inode numbers for more than OVL_MAX_STACK unique fsid's. Another use case of "xino=on" is when upper/lower is on tmpfs. tmpfs inode numbers are allocated sequentially since boot, so they will practially never use the high inode number bits. For compatibility with applications that expect 32bit inodes, the feature can be disabled with "xino=off". The option "xino=auto" automatically detects underlying filesystem that use 32bit inodes and enables the feature. The Kconfig option OVERLAY_FS_XINO_AUTO and module parameter of the same name, determine if the default mode for overlayfs mount is "xino=auto" or "xino=off". Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: consistent d_ino for non-samefs with xinoAmir Goldstein
When overlay layers are not all on the same fs, but all inode numbers of underlying fs do not use the high 'xino' bits, overlay st_ino values are constant and persistent. In that case, relax non-samefs constraint for consistent d_ino and always iterate non-merge dir using ovl_fill_real() actor so we can remap lower inode numbers to unique lower fs range. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: consistent i_ino for non-samefs with xinoAmir Goldstein
When overlay layers are not all on the same fs, but all inode numbers of underlying fs do not use the high 'xino' bits, overlay st_ino values are constant and persistent. In that case, set i_ino value to the same value as st_ino for nfsd readdirplus validator. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: constant st_ino for non-samefs with xinoAmir Goldstein
On 64bit systems, when overlay layers are not all on the same fs, but all inode numbers of underlying fs are not using the high bits, use the high bits to partition the overlay st_ino address space. The high bits hold the fsid (upper fsid is 0). This way overlay inode numbers are unique and all inodes use overlay st_dev. Inode numbers are also persistent for a given layer configuration. Currently, our only indication for available high ino bits is from a filesystem that supports file handles and uses the default encode_fh() operation, which encodes a 32bit inode number. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: allocate anon bdev per unique lower fsAmir Goldstein
Instead of allocating an anonymous bdev per lower layer, allocate one anonymous bdev per every unique lower fs that is different than upper fs. Every unique lower fs is assigned an fsid > 0 and the number of unique lower fs are stored in ofs->numlowerfs. The assigned fsid is stored in the lower layer struct and will be used also for inode number multiplexing. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: factor out ovl_map_dev_ino() helperAmir Goldstein
A helper for ovl_getattr() to map the values of st_dev and st_ino according to constant st_ino rules. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: cleanup ovl_update_time()Miklos Szeredi
No need to mess with an alias, the upperdentry can be retrieved directly from the overlay inode. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: add WARN_ON() for non-dir redirect casesMiklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>