summaryrefslogtreecommitdiff
path: root/drivers/scsi/sd.c
AgeCommit message (Collapse)Author
2016-07-18sd: don't use the ALL_TG_PT bit for reservationsChristoph Hellwig
These only work if the we use the same initiator ID for all path, which might not be true if we use different protocols, or even just different HBAs. Instead dm-mpath will grow support to register all path manually later in this series. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-27block: convert to device_add_disk()Dan Williams
For block drivers that specify a parent device, convert them to use device_add_disk(). This conversion was done with the following semantic patch: @@ struct gendisk *disk; expression E; @@ - disk->driverfs_dev = E; ... - add_disk(disk); + device_add_disk(E, disk); @@ struct gendisk *disk; expression E1, E2; @@ - disk->driverfs_dev = E1; ... E2 = disk; ... - add_disk(E2); + device_add_disk(E1, E2); ...plus some manual fixups for a few missed conversions. Cc: Jens Axboe <axboe@fb.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-07block, drivers: add REQ_OP_FLUSH operationMike Christie
This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07block, fs, drivers: remove REQ_OP compat defs and related codeMike Christie
This patch drops the compat definition of req_op where it matches the rq_flag_bits definitions, and drops the related old and compat code that allowed users to set either the op or flags for the operation. We also then store the operation in the bi_rw/cmd_flags field similar to how we used to store the bio ioprio where it sat in the upper bits of the field. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07drivers: use req op accessorMike Christie
The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-01sd: Fix rw_max for devices that report an optimal xfer sizeMartin K. Petersen
For historic reasons, io_opt is in bytes and max_sectors in block layer sectors. This interface inconsistency is error prone and should be fixed. But for 4.4--4.7 let's make the unit difference explicit via a wrapper function. Fixes: d0eb20a863ba ("sd: Optimal I/O size is in bytes, not sectors") Cc: stable@vger.kernel.org # 4.4+ Reported-by: Fam Zheng <famz@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Andrew Patterson <andrew.patterson@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-05-22sd: get disk reference in sd_check_events()Hannes Reinecke
sd_check_events() is called asynchronously, and might race with device removal. So always take a disk reference when processing the event to avoid the device being removed while the event is processed. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-12sd: switch to using blk_queue_write_cache()Jens Axboe
Switch to the newer interface, instead of using blk_queue_flush() directly. Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-12block: add offset in blk_add_request_payload()Ming Lin
We could kmalloc() the payload, so need the offset in page. Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-04-09Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of eight fixes. Two are trivial gcc-6 updates (brace additions and unused variable removal). There's a couple of cxlflash regressions, a correction for sd being overly chatty on revalidation (causing excess log increases). A VPD issue which could crash USB devices because they seem very intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer overrun fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: Do not attach VPD to devices that don't support it sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes scsi_dh_alua: Fix a recently introduced deadlock scsi: Declare local symbols static cxlflash: Move to exponential back-off when cmd_room is not available cxlflash: Fix regression issue with re-ordering patch mpt3sas: Don't overreach ioc->reply_post[] during initialization aacraid: add missing curly braces
2016-04-05scsi: Do not attach VPD to devices that don't support itHannes Reinecke
The patch "scsi: rescan VPD attributes" introduced a regression in which devices that don't support VPD were being scanned for VPD attributes anyway. This could cause issues for some devices and should be avoided so the check for scsi_level has been moved out of scsi_add_lun and into scsi_attach_vpd so that all callers will not scan VPD for devices that don't support it. [mkp: Merge fix] Fixes: 09e2b0b14690 ("scsi: rescan VPD attributes") Cc: <stable@vger.kernel.org> #v4.5+ Suggested-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-31sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytesMartin K. Petersen
During revalidate we check whether device capacity has changed before we decide whether to output disk information or not. The check for old capacity failed to take into account that we scaled sdkp->capacity based on the reported logical block size. And therefore the capacity test would always fail for devices with sectors bigger than 512 bytes and we would print several copies of the same discovery information. Avoid scaling sdkp->capacity and instead adjust the value on the fly when setting the block device capacity and generating fake C/H/S geometry. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Reported-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-14sd: Fix discard granularity when LBPRZ=1Martin K. Petersen
Commit 397737223c59 ("sd: Make discard granularity match logical block size when LBPRZ=1") accidentally set the granularity to one byte instead of one logical block on devices that provide deterministic zeroes after UNMAP. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Fixes: 397737223c59e89dca7305feb6528caef8fbef84 Cc: <stable@vger.kernel.org> #v4.4+
2016-03-08sd: Fix discard granularity when LBPRZ=1Martin K. Petersen
Commit 397737223c59 ("sd: Make discard granularity match logical block size when LBPRZ=1") accidentally set the granularity to one byte instead of one logical block on devices that provide deterministic zeroes after UNMAP. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Fixes: 397737223c59e89dca7305feb6528caef8fbef84 Cc: <stable@vger.kernel.org> #v4.4+
2016-02-04Merge remote-tracking branch 'mkp-scsi/4.5/scsi-fixes' into fixesJames Bottomley
2016-02-04block/sd: Return -EREMOTEIO when WRITE SAME and DISCARD are disabledMartin K. Petersen
When a storage device rejects a WRITE SAME command we will disable write same functionality for the device and return -EREMOTEIO to the block layer. -EREMOTEIO will in turn prevent DM from retrying the I/O and/or failing the path. Yiwen Jiang discovered a small race where WRITE SAME requests issued simultaneously would cause -EIO to be returned. This happened because any requests being prepared after WRITE SAME had been disabled for the device caused us to return BLKPREP_KILL. The latter caused the block layer to return -EIO upon completion. To overcome this we introduce BLKPREP_INVALID which indicates that this is an invalid request for the device. blk_peek_request() is modified to return -EREMOTEIO in that case. Reported-by: Yiwen Jiang <jiangyiwen@huawei.com> Suggested-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-26Merge remote-tracking branch 'mkp-scsi/4.5/scsi-fixes' into fixesJames Bottomley
2016-01-26SCSI: fix crashes in sd and sr runtime PMAlan Stern
Runtime suspend during driver probe and removal can cause problems. The driver's runtime_suspend or runtime_resume callbacks may invoked before the driver has finished binding to the device or after the driver has unbound from the device. This problem shows up with the sd and sr drivers, and can cause disk or CD/DVD drives to become unusable as a result. The fix is simple. The drivers store a pointer to the scsi_disk or scsi_cd structure as their private device data when probing is finished, so we simply have to be sure to clear the private data during removal and test it during runtime suspend/resume. This fixes <https://bugs.debian.org/801925>. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Paul Menzel <paul.menzel@giantmonkey.de> Reported-by: Erich Schubert <erich@debian.org> Reported-by: Alexandre Rossi <alexandre.rossi@gmail.com> Tested-by: Paul Menzel <paul.menzel@giantmonkey.de> Tested-by: Erich Schubert <erich@debian.org> CC: <stable@vger.kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2016-01-20sd: Optimal I/O size is in bytes, not sectorsMartin K. Petersen
Commit ca369d51b3e1 ("block/sd: Fix device-imposed transfer length limits") accidentally switched optimal I/O size reporting from bytes to block layer sectors. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: ca369d51b3e1649be4a72addd6d6a168cfb3f537 Cc: stable@vger.kernel.org # 4.4+ Reviewed-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
2015-12-21sd: Reject optimal transfer length smaller than page sizeMartin K. Petersen
Eryu Guan reported that loading scsi_debug would fail. This turned out to be caused by scsi_debug reporting an optimal I/O size of 32KB which is smaller than the 64KB page size on the PowerPC system in question. Add a check to ensure that we only use the device-reported OPTIMAL TRANSFER LENGTH if it is bigger than or equal to the page cache size. Reported-by: Eryu Guan <guaneryu@gmail.com> Reported-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-12-03Merge branch 'mkp-fixes' into fixesJames Bottomley
2015-11-25block/sd: Fix device-imposed transfer length limitsMartin K. Petersen
Commit 4f258a46346c ("sd: Fix maximum I/O size for BLOCK_PC requests") had the unfortunate side-effect of removing an implicit clamp to BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer code. This caused problems for some SMR drives. Debugging this issue revealed a few problems with the existing infrastructure since the block layer didn't know how to deal with device-imposed limits, only limits set by the I/O controller. - Introduce a new queue limit, max_dev_sectors, which is used by the ULD to signal the maximum sectors for a REQ_TYPE_FS request. - Ensure that max_dev_sectors is correctly stacked and taken into account when overriding max_sectors through sysfs. - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer values for later processing. - In sd_revalidate() set the queue's max_dev_sectors based on the MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value is not reported, fall back to a cap based on the CDB TRANSFER LENGTH field size. - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits VPD--if reported and sane--to signal the preferred device transfer size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS. - blk_limits_max_hw_sectors() is no longer used and can be removed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581 Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: sweeneygj@gmx.com Tested-by: Arzeets <anatol.pomozov@gmail.com> Tested-by: David Eisner <david.eisner@oriel.oxon.org> Tested-by: Mario Kicherer <dev@kicherer.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-25sd: Make discard granularity match logical block size when LBPRZ=1Martin K. Petersen
A device may report an OPTIMAL UNMAP GRANULARITY and UNMAP GRANULARITY ALIGNMENT in the Block Limits VPD. These parameters describe the device's internal provisioning allocation units. By default the block layer will round and align any discard requests based on these limits. If a device reports LBPRZ=1 to guarantee zeroes after discard, however, it is imperative that the block layer does not leave out any parts of the requested block range. Otherwise the device can not do the required zeroing of any partial allocation units and this can lead to data corruption. Since the dm thinp personality relies on the block layer's current behavior and is unable to deal with partial discard blocks we work around the problem by setting the granularity to match the logical block size when LBPRZ is enabled. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-13Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull final round of SCSI updates from James Bottomley: "Sorry for the delay in this patch which was mostly caused by getting the merger of the mpt2/mpt3sas driver, which was seen as an essential item of maintenance work to do before the drivers diverge too much. Unfortunately, this caused a compile failure (detected by linux-next), which then had to be fixed up and incubated. In addition to the mpt2/3sas rework, there are updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus an assortment of changes including some year 2038 issues, a fix for a remove before detach issue in some drivers and a couple of other minor issues" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits) mpt3sas: fix inline markers on non inline function declarations sd: Clear PS bit before Mode Select. ibmvscsi: set max_lun to 32 ibmvscsi: display default value for max_id, max_lun and max_channel. mptfusion: don't allow negative bytes in kbuf_alloc_2_sgl() scsi: pmcraid: replace struct timeval with ktime_get_real_seconds() mvumi: 64bit value for seconds_since1970 be2iscsi: Fix bogus WARN_ON length check scsi_scan: don't dump trace when scsi_prep_async_scan() is called twice mpt3sas: Bump mpt3sas driver version to 09.102.00.00 mpt3sas: Single driver module which supports both SAS 2.0 & SAS 3.0 HBAs mpt2sas, mpt3sas: Update the driver versions mpt3sas: setpci reset kernel oops fix mpt3sas: Added OEM Gen2 PnP ID branding names mpt3sas: Refcount fw_events and fix unsafe list usage mpt3sas: Refcount sas_device objects and fix unsafe list usage mpt3sas: sysfs attribute to report Backup Rail Monitor Status mpt3sas: Ported WarpDrive product SSS6200 support mpt3sas: fix for driver fails EEH, recovery from injected pci bus error mpt3sas: Manage MSI-X vectors according to HBA device type ...
2015-11-11sd: Clear PS bit before Mode Select.Gabriel Krisman Bertazi
According to SPC-4, in a Mode Select, the PS bit in Mode Pages is reserved and must be set to 0 by the driver. In the sd implementation, function cache_type_store does a Mode Sense, which might set the PS bit on the read buffer, followed by a Mode Select, which receives the same buffer, without explicitly clearing the PS bit. So, in cases where target supports saving the Mode Page to a non-volatile location, we end up doing a Mode Select with the PS bit set, which could cause an illegal request error if the target is checking this. This was observed on a new firmware change, which was subsequently reverted, but this changes sd.c to be more compliant with SPC-4. This patch clears the PS bit in the buffer returned by Mode Select, right before it is used in the Mode Select command. Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-04Merge branch 'for-4.4/reservations' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block reservation support from Jens Axboe: "This adds support for persistent reservations, both at the core level, as well as for sd and NVMe" [ Background from the docs: "Persistent Reservations allow restricting access to block devices to specific initiators in a shared storage setup. All implementations are expected to ensure the reservations survive a power loss and cover all connections in a multi path environment" ] * 'for-4.4/reservations' of git://git.kernel.dk/linux-block: NVMe: Precedence error in nvme_pr_clear() nvme: add missing endianess annotations in nvme_pr_command NVMe: Add persistent reservation ops sd: implement the Persistent Reservation API block: add an API for Persistent Reservations block: cleanup blkdev_ioctl
2015-10-21sd: implement the Persistent Reservation APIChristoph Hellwig
This is a mostly trivial mapping to the PERSISTENT RESERVE IN/OUT commands. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-21md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdownDan Williams
Now that the integrity profile is statically allocated there is no work to do when shutting down an integrity enabled block device. Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: James Bottomley <JBottomley@Odin.com> Acked-by: NeilBrown <neilb@suse.com> Acked-by: Keith Busch <keith.busch@intel.com> Acked-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-02Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "This first core part of the block IO changes contains: - Cleanup of the bio IO error signaling from Christoph. We used to rely on the uptodate bit and passing around of an error, now we store the error in the bio itself. - Improvement of the above from myself, by shrinking the bio size down again to fit in two cachelines on x86-64. - Revert of the max_hw_sectors cap removal from a revision again, from Jeff Moyer. This caused performance regressions in various tests. Reinstate the limit, bump it to a more reasonable size instead. - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me. Most devices have huge trim limits, which can cause nasty latencies when deleting files. Enable the admin to configure the size down. We will look into having a more sane default instead of UINT_MAX sectors. - Improvement of the SGP gaps logic from Keith Busch. - Enable the block core to handle arbitrarily sized bios, which enables a nice simplification of bio_add_page() (which is an IO hot path). From Kent. - Improvements to the partition io stats accounting, making it faster. From Ming Lei. - Also from Ming Lei, a basic fixup for overflow of the sysfs pending file in blk-mq, as well as a fix for a blk-mq timeout race condition. - Ming Lin has been carrying Kents above mentioned patches forward for a while, and testing them. Ming also did a few fixes around that. - Sasha Levin found and fixed a use-after-free problem introduced by the bio->bi_error changes from Christoph. - Small blk cgroup cleanup from Viresh Kumar" * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) blk: Fix bio_io_vec index when checking bvec gaps block: Replace SG_GAPS with new queue limits mask block: bump BLK_DEF_MAX_SECTORS to 2560 Revert "block: remove artifical max_hw_sectors cap" blk-mq: fix race between timeout and freeing request blk-mq: fix buffer overflow when reading sysfs file of 'pending' Documentation: update notes in biovecs about arbitrarily sized bios block: remove bio_get_nr_vecs() fs: use helper bio_add_page() instead of open coding on bi_io_vec block: kill merge_bvec_fn() completely md/raid5: get rid of bio_fits_rdev() md/raid5: split bio for chunk_aligned_read block: remove split code in blkdev_issue_{discard,write_same} btrfs: remove bio splitting and merge_bvec_fn() calls bcache: remove driver private bio splitting code block: simplify bio_add_page() block: make generic_make_request handle arbitrarily sized bios blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) block: don't access bio->bi_error after bio_put() block: shrink struct bio down to 2 cache lines again ...
2015-08-12sd: Fix maximum I/O size for BLOCK_PC requestsMartin K. Petersen
Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK LIMITS VPD. This had the unfortunate effect of also limiting the maximum size of non-filesystem requests sent to the device through sg/bsg. Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue limit directly. Also update the comment in blk_limits_max_hw_sectors() to clarify that max_hw_sectors defines the limit for the I/O controller only. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Brian King <brking@linux.vnet.ibm.com> Tested-by: Brian King <brking@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 3.17+ Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-07-17block: have drivers use blk_queue_max_discard_sectors()Jens Axboe
Some drivers use it now, others just set the limits field manually. But in preparation for splitting this into a hard and soft limit, ensure that they all call the proper function for setting the hw limit for discards. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-25sd: fix an error return in probe()Dan Carpenter
If device_add() fails then it should return the error code but instead the current code returns success. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-05-18sd: Disable support for 256 byte/sector disksMark Hounschell
256 bytes per sector support has been broken since 2.6.X, and no-one stepped up to fix this. So disable support for it. Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-04-16sd: Unregister integrity profileMartin K. Petersen
The new integrity code did not correctly unregister the profile for SD disks. Call blk_integrity_unregister() when we release a disk. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Tested-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-04-10sd, mmc, virtio_blk, string_helpers: fix block size unitsJames Bottomley
The current string_get_size() overflows when the device size goes over 2^64 bytes because the string helper routine computes the suffix from the size in bytes. However, the entirety of Linux thinks in terms of blocks, not bytes, so this will artificially induce an overflow on very large devices. Fix this by making the function string_get_size() take blocks and the block size instead of bytes. This should allow us to keep working until the current SCSI standard overflows. Also fix virtio_blk and mmc (both of which were also artificially multiplying by the block size to pass a byte side to string_get_size()). The mathematics of this is pretty simple: we're taking a product of size in blocks (S) and block size (B) and trying to re-express this in exponential form: S*B = R*N^E (where N, the exponent is either 1000 or 1024) and R < N. Mathematically, S = RS*N^ES and B=RB*N^EB, so if RS*RB < N it's easy to see that S*B = RS*RB*N^(ES+EB). However, if RS*BS > N, we can see that this can be re-expressed as RS*BS = R*N (where R = RS*BS/N < N) so the whole exponent becomes R*N^(ES+EB+1) [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>] Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-03-19sd: don't grab a device references from driver methodsChristoph Hellwig
The device model already takes care of races between ->remove and ->shutdown vs its other methods, and we now take care about locking them out for ->rescan as well. This is a partial revert of commit 39b7f1 ("[SCSI] sd: Fix refcounting"). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2015-02-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull first round of SCSI updates from James Bottomley: "This is the usual grab bag of driver updates (hpsa, storvsc, mp2sas, megaraid_sas, ses) plus an assortment of minor updates. There's also an update to ufs which adds new phy drivers and finally a new logging infrastructure for SCSI" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (114 commits) scsi_logging: return void for dev_printk() functions scsi: print single-character strings with seq_putc scsi: merge consecutive seq_puts calls scsi: replace seq_printf with seq_puts aha152x: replace seq_printf with seq_puts advansys: replace seq_printf with seq_puts scsi: remove SPRINTF macro sg: remove an unused variable hpsa: Use local workqueues instead of system workqueues hpsa: add in P840ar controller model name hpsa: add in gen9 controller model names hpsa: detect and report failures changing controller transport modes hpsa: shorten the wait for the CISS doorbell mode change ack hpsa: refactor duplicated scan completion code into a new routine hpsa: move SG descriptor set-up out of hpsa_scatter_gather() hpsa: do not use function pointers in fast path command submission hpsa: print CDBs instead of kernel virtual addresses for uncommon errors hpsa: do not use a void pointer for scsi_cmd field of struct CommandList hpsa: return failed from device reset/abort handlers hpsa: check for ctlr lockup after command allocation in main io path ...
2015-02-02sd: Fix max transfer length for 4k disksBrian King
The following patch fixes an issue observed with 4k sector disks where the max_hw_sectors attribute was getting set too large in sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units of SCSI logical blocks and queue_max_hw_sectors is in units of 512 byte blocks, on a 4k sector disk, every time we went through sd_revalidate_disk, we were taking the current value of queue_max_hw_sectors and increasing it by a factor of 8. Fix this by only shifting sdkp->max_xfer_blocks. Cc: stable@vger.kernel.org Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-01-09scsi: use per-cpu buffer for formatting senseHannes Reinecke
Convert sense buffer logging to use the per-cpu buffer to avoid line breakup. Tested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-12-30sd: tweak discard heuristics to work around QEMU SCSI issueMartin K. Petersen
7985090aa020 changed the discard heuristics to give preference to the WRITE SAME commands that (unlike UNMAP) guarantee deterministic results. Ming Lei discovered that QEMU SCSI's WRITE SAME implementation internally relied on limits that were only communicated for the UNMAP case. And therefore discard commands backed by WRITE SAME would fail. Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only prefer the WRITE SAME variants if the device has the LBPRZ flag set. Reported-by: Ming Lei <ming.lei@canonical.com> Tested-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-24scsi: rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16Hannes Reinecke
SPC-3 defines SERVICE ACTION IN(12) and SERVICE ACTION IN(16). So rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16 to be consistent with SPC and to allow for better distinction. Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-24scsi: remove scsi_driver owner fieldChristoph Hellwig
The driver core driver structure has grown an owner field and now requires it to be set for all modular drivers. Set it up for all scsi_driver instances and get rid of the now superflous scsi_driver owner field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Shane M Seymour <shane.seymour@hp.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24scsi: stop passing a gfp_mask argument down the command setup pathChristoph Hellwig
There is no reason for ULDs to pass in a flag on how to allocate the S/G lists. While we don't need GFP_ATOMIC for the blk-mq case because we don't hold locks, that decision can be made way down the chain without having to pass a pointless gfp_mask argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12sd: disable discard_zeroes_data for UNMAPMartin K. Petersen
The T10 SBC UNMAP command does not provide any hard guarantees that blocks will return zeroes on a subsequent READ. This is due to the fact that the device server is free to silently ignore all or parts of the request. The only way to ensure that a block consistently returns zeroes after being unmapped is to use WRITE SAME with the UNMAP bit set. Should the device be unable to unmap one or more blocks described by the command it is required to manually write zeroes to them. Until now we have preferred UNMAP over the WRITE SAME variants to accommodate thinly provisioned devices that predated the final SBC-3 spec. This patch changes the heuristic so that we favor WRITE SAME(16) or (10) over UNMAP if these commands are marked as supported in the Logical Block Provisioning VPD page. The patch also disables discard_zeroes_data for devices operating in UNMAP mode. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12sd: fix up ->compat_ioctlChristoph Hellwig
No need to verify the passthrough ioctls, the real handler will take care of that. Also make sure not to block for resets on O_NONBLOCK fds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12scsi: split scsi_nonblockable_ioctlChristoph Hellwig
The calling conventions for this function are bad as it could return -ENODEV both for a device not currently online and a not recognized ioctl. Add a new scsi_ioctl_block_when_processing_errors function that wraps scsi_block_when_processing_errors with the a special case for the SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself in scsi_ioctl. All callers of scsi_ioctl now must call the above helper to check for the EH state, so that the ioctl handler itself doesn't have to. Reported-by: Robert Elliott <Elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12scsi: remove scsi_show_result()Hannes Reinecke
Open-code scsi_print_result in sd.c, and cleanup logging to not print duplicate informations. Also remove the call to scsi_show_result() in ufshcd.c to be consistent with other callers of scsi_execute(). With that we can remove scsi_show_result in constants.c Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12scsi: use sdev as argument for sense code printingHannes Reinecke
We should be using the standard dev_printk() variants for sense code printing. [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen] [hch: folded bracing fix from Dan Carpenter] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12sd: remove scsi_print_sense() in sd_done()Hannes Reinecke
sd_done() was calling scsi_print_sense() for a sense code of 'NO_SENSE'. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>