summaryrefslogtreecommitdiff
path: root/drivers/scsi/scsi_error.c
AgeCommit message (Collapse)Author
2017-08-25scsi: Use blk_mq_rq_to_pdu() to convert a request to a SCSI command pointerBart Van Assche
Since commit e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct request") struct request and struct scsi_cmnd are adjacent. This means that there is now an alternative to reading req->special to convert a pointer to a prepared request into a SCSI command pointer, namely by using blk_mq_rq_to_pdu(). Make this change where appropriate. Although this patch does not change any functionality, it slightly improves performance and slightly improves readability. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: Suppress gcc 7 fall-through warnings reported with W=1Bart Van Assche
The conclusion of a recent discussion about the new warnings reported by gcc 7 is that the new warnings reported when building with W=1 should be suppressed. However, gcc 7 still warns about fall-through in switch statements when building with W=1. Suppress these warnings by annotating the SCSI core properly. See also Linus Torvalds, Lots of new warnings with gcc-7.1.1, 11 July 2017 (https://www.mail-archive.com/linux-media@vger.kernel.org/msg115428.html). References: commit bd664f6b3e37 ("disable new gcc-7.1.1 warnings for now") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-06Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc, qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with a host of minor and miscellaneous changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (276 commits) qla2xxx: Fix NVMe entry_type for iocb packet on BE system scsi: qla2xxx: avoid unused-function warning scsi: snic: fix a couple of spelling mistakes/typos scsi: qla2xxx: fix a bunch of typos and spelling mistakes scsi: lpfc: don't double count abort errors scsi: lpfc: spin_lock_irq() is not nestable scsi: hisi_sas: optimise DMA slot memory scsi: ibmvfc: constify dev_pm_ops structures. scsi: ibmvscsi: constify dev_pm_ops structures. scsi: cxlflash: Update debug prints in reset handlers scsi: cxlflash: Update send_tmf() parameters scsi: cxlflash: Avoid double free of character device scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails. scsi: ufs: flush eh_work when eh_work scheduled. scsi: qla2xxx: Protect access to qpair members with qpair->qp_lock scsi: sun_esp: fix device reference leaks scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init scsi: fnic: correct speed display and add support for 25,40 and 100G scsi: fnic: added timestamp reporting in fnic debug stats ...
2017-06-20block: Make most scsi_req_init() calls implicitBart Van Assche
Instead of explicitly calling scsi_req_init() after blk_get_request(), call that function from inside blk_get_request(). Add an .initialize_rq_fn() callback function to the block drivers that need it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn() because it is too small to keep it as a separate function. Keep the scsi_req_init() call in ide_prep_sense() because it follows a blk_rq_init() call. References: commit 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12scsi: Protect SCSI device state changes with a mutexBart Van Assche
Serializing SCSI device state changes avoids that two state changes can occur concurrently, e.g. the state changes in scsi_target_block() and __scsi_remove_device(). This serialization is essential to make patch "Make __scsi_remove_device go straight from BLOCKED to DEL" work reliably. Enable this mechanism for all scsi_target_*block() callers but not for the scsi_internal_device_unblock() calls from the mpt3sas driver because that driver can call scsi_internal_device_unblock() from atomic context. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-09block: introduce new block status code typeChristoph Hellwig
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-04Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...
2017-04-25scsi: Improve scsi_get_sense_info_fldDamien Le Moal
Use get_unaligned_be32 and get_unaligned_be64 to obtain values from the sense buffer instead of open coding the operations. Also change the function return value to a bool and fix the function signature declaration to remove spaces triggering checkpatch warnings. No functional change is introduced by this patch. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make asynchronous aborts mandatoryHannes Reinecke
There hasn't been any reports for HBAs where asynchronous abort would not work, so we should make it mandatory and remove the fallback. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make scsi_eh_scmd_add() always succeedHannes Reinecke
scsi_eh_scmd_add() currently only will fail if no error handler thread is started (which will never be the case) or if the state machine encounters an illegal transition. But if we're encountering an invalid state transition chances is we cannot fixup things with the error handler. So better add a WARN_ON for illegal host states and make scsi_dh_scmd_add() a void function. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make eh_eflags persistentHannes Reinecke
If a failed command is retried and fails again we need to enter SCSI EH, otherwise we will never be able to recover the command. To detect this situation we must not clear scmd->eh_eflags when EH finishes but rather make it persistent throughout the lifetime of the command. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: always send command abortsHannes Reinecke
When a command has timed out we always should be sending an abort; with the previous code a failed abort might signal SCSI EH to start, and all other timed out commands will never be aborted, even though they might belong to a different ITL nexus. Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: scsi_error: count medium access timeout only once per EH runHannes Reinecke
The current medium access timeout counter will be increased for each command, so if there are enough failed commands we'll hit the medium access timeout for even a single device failure and the following kernel message is displayed: sd H:C:T:L: [sdXY] Medium access timeout failure. Offlining disk! Fix this by making the timeout per EH run, ie the counter will only be increased once per device and EH run. Fixes: 18a4d0a ("[SCSI] Handle disk devices which can not process medium access commands") Cc: Ewan Milne <emilne@redhat.com> Cc: Lawrence Obermann <loberman@redhat.com> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Cc: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-05block, scsi: move the retries field to struct scsi_requestChristoph Hellwig
Instead of bloating the generic struct request with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-21Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (ncr5380, ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid, megaraid_sas, ...). There's also an assortment of minor fixes and the major update of switching a bunch of drivers to pci_alloc_irq_vectors from Christoph" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (188 commits) scsi: megaraid_sas: handle dma_addr_t right on 32-bit scsi: megaraid_sas: array overflow in megasas_dump_frame() scsi: snic: switch to pci_irq_alloc_vectors scsi: megaraid_sas: driver version upgrade scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set value to 2 scsi: megaraid_sas: Indentation and smatch warning fixes scsi: megaraid_sas: Cleanup VD_EXT_DEBUG and SPAN_DEBUG related debug prints scsi: megaraid_sas: Increase internal command pool scsi: megaraid_sas: Use synchronize_irq to wait for IRQs to complete scsi: megaraid_sas: Bail out the driver load if ld_list_query fails scsi: megaraid_sas: Change build_mpt_mfi_pass_thru to return void scsi: megaraid_sas: During OCR, if get_ctrl_info fails do not continue with OCR scsi: megaraid_sas: Do not set fp_possible if TM capable for non-RW syspdIO, change fp_possible to bool scsi: megaraid_sas: Remove unused pd_index from megasas_build_ld_nonrw_fusion scsi: megaraid_sas: megasas_return_cmd does not memset IO frame to zero scsi: megaraid_sas: max_fw_cmds are decremented twice, remove duplicate scsi: megaraid_sas: update can_queue only if the new value is less scsi: megaraid_sas: Change max_cmd from u32 to u16 in all functions scsi: megaraid_sas: set pd_after_lb from MR_BuildRaidContext and initialize pDevHandle to MR_DEVHANDLE_INVALID scsi: megaraid_sas: latest controller OCR capability from FW before sending shutdown DCMD ...
2017-02-06scsi: remove eh_timed_out methods in the transport templateChristoph Hellwig
Instead define the timeout behavior purely based on the host_template eh_timed_out method and wire up the existing transport implementations in the host templates. This also clears up the confusion that the transport template method overrides the host template one, so some drivers have to re-override the transport template one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-31block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig
Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: introduce blk_rq_is_passthroughChristoph Hellwig
This can be used to check for fs vs non-fs requests and basically removes all knowledge of BLOCK_PC specific from the block layer, as well as preparing for removing the cmd_type field in struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27block: split scsi_request out of struct requestChristoph Hellwig
And require all drivers that want to support BLOCK_PC to allocate it as the first thing of their private data. To support this the legacy IDE and BSG code is switched to set cmd_size on their queues to let the block layer allocate the additional space. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27scsi: allocate scsi_cmnd structures as part of struct requestChristoph Hellwig
Rely on the new block layer functionality to allocate additional driver specific data behind struct request instead of implementing it in SCSI itѕelf. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-28block: split out request-only flags into a new namespaceChristoph Hellwig
A lot of the REQ_* flags are only used on struct requests, and only of use to the block layer and a few drivers that dig into struct request internals. This patch adds a new req_flags_t rq_flags field to struct request for them, and thus dramatically shrinks the number of common requests. It also removes the unfortunate situation where we have to fit the fields from the same enum into 32 bits for struct bio and 64 bits for struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-18Merge remote-tracking branch 'mkp-scsi/4.7/scsi-fixes' into fixesJames Bottomley
2016-06-08scsi: fix race between simultaneous decrements of ->host_failedWei Fang
sas_ata_strategy_handler() adds the works of the ata error handler to system_unbound_wq. This workqueue asynchronously runs work items, so the ata error handler will be performed concurrently on different CPUs. In this case, ->host_failed will be decreased simultaneously in scsi_eh_finish_cmd() on different CPUs, and become abnormal. It will lead to permanently inequality between ->host_failed and ->host_busy, and scsi error handler thread won't start running. IO errors after that won't be handled. Since all scmds must have been handled in the strategy handler, just remove the decrement in scsi_eh_finish_cmd() and zero ->host_busy after the strategy handler to fix this race. Fixes: 50824d6c5657 ("[SCSI] libsas: async ata-eh") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <fangwei1@huawei.com> Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-04libata: evaluate SCSI sense codeHannes Reinecke
Whenever a sense code is set it would need to be evaluated to update the error mask. tj: Cosmetic formatting updates. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2015-11-06mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIMMel Gorman
__GFP_WAIT was used to signal that the caller was in atomic context and could not sleep. Now it is possible to distinguish between true atomic context and callers that are not willing to sleep. The latter should clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing __GFP_WAIT behaves differently, there is a risk that people will clear the wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly indicate what it does -- setting it allows all reclaim activity, clearing them prevents it. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Here are the outstanding target-pending updates for v4.3-rc1. Mostly bug-fixes and minor changes this round. The fallout from the big v4.2-rc1 RCU conversion have (thus far) been minimal. The highlights this round include: - Move sense handling routines into scsi_common code (Sagi) - Return ABORTED_COMMAND sense key for PI errors (Sagi) - Add tpg_enabled_sendtargets attribute for disabled iscsi-target discovery (David) - Shrink target struct se_cmd by rearranging fields (Roland) - Drop iSCSI use of mutex around max_cmd_sn increment (Roland) - Replace iSCSI __kernel_sockaddr_storage with sockaddr_storage (Andy + Chris) - Honor fabric max_data_sg_nents I/O transfer limit (Arun + Himanshu + nab) - Fix EXTENDED_COPY >= v4.1 regression OOPsen (Alex + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (37 commits) target: use stringify.h instead of own definition target/user: Fix UFLAG_UNKNOWN_OP handling target: Remove no-op conditional target/user: Remove unused variable target: Fix max_cmd_sn increment w/o cmdsn mutex regressions target: Attach EXTENDED_COPY local I/O descriptors to xcopy_pt_sess target/qla2xxx: Honor max_data_sg_nents I/O transfer limit target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storage target/iscsi: Replace conn->login_ip with login_sockaddr target/iscsi: Keep local_ip as the actual sockaddr target/iscsi: Fix np_ip bracket issue by removing np_ip target: Drop iSCSI use of mutex around max_cmd_sn increment qla2xxx: Update tcm_qla2xxx module description to 24xx+ iscsi-target: Add tpg_enabled_sendtargets for disabled discovery drivers: target: Drop unlikely before IS_ERR(_OR_NULL) target: check DPO/FUA usage for COMPARE AND WRITE target: Shrink struct se_cmd by rearranging fields target: Remove cmd->se_ordered_id (unused except debug log lines) target: add support for START_STOP_UNIT SCSI opcode target: improve unsupported opcode message ...
2015-09-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull second round of SCSI updates from James Bottomley: "There's one late arriving patch here (added today), fixing a build issue which the scsi_dh patch set in here uncovered. Other than that, everything has been incubated in -next and the checkers for a week. The major pieces of this patch are a set patches facilitating better integration between scsi and scsi_dh (the device handling layer used by multi-path; all the dm parts are acked by Mike Snitzer). This also includes driver updates for mp3sas, scsi_debug and an assortment of bug fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (50 commits) scsi_dh: fix randconfig build error scsi: fix scsi_error_handler vs. scsi_host_dev_release race fcoe: Convert use of __constant_htons to htons mpt2sas: setpci reset kernel oops fix pm80xx: Don't override ts->stat on IO_OPEN_CNX_ERROR_HW_RESOURCE_BUSY lpfc: Fix possible use-after-free and double free in lpfc_mbx_cmpl_rdp_page_a2() bfa: Fix incorrect de-reference of pointer bfa: Fix indentation scsi_transport_sas: Remove check for SAS expander when querying bay/enclosure IDs. scsi_debug: resp_request: remove unused variable scsi_debug: fix REPORT LUNS Well Known LU scsi_debug: schedule_resp fix input variable check scsi_debug: make dump_sector static scsi_debug: vfree is null safe so drop the check scsi_debug: use SCSI_W_LUN_REPORT_LUNS instead of SAM2_WLUN_REPORT_LUNS; scsi_debug: define pr_fmt() for consistent logging mpt2sas: Refcount fw_events and fix unsafe list usage mpt2sas: Refcount sas_device objects and fix unsafe list usage scsi_dh: return SCSI_DH_NOTCONN in scsi_dh_activate() scsi_dh: don't allow to detach device handlers at runtime ...
2015-09-06scsi: fix scsi_error_handler vs. scsi_host_dev_release raceMichal Hocko
b9d5c6b7ef57 ("[SCSI] cleanup setting task state in scsi_error_handler()") has introduced a race between scsi_error_handler and scsi_host_dev_release resulting in the hang when the device goes away because scsi_error_handler might miss a wake up: CPU0 CPU1 scsi_error_handler scsi_host_dev_release kthread_stop() kthread_should_stop() test_bit(KTHREAD_SHOULD_STOP) set_bit(KTHREAD_SHOULD_STOP) wake_up_process() wait_for_completion() set_current_state(TASK_INTERRUPTIBLE) schedule() The most straightforward solution seems to be to invert the ordering of the set_current_state and kthread_should_stop. The issue has been noticed during reboot test on a 3.0 based kernel but the current code seems to be affected in the same way. [jejb: additional comment added] Cc: <stable@vger.kernel.org> # 3.6+ Reported-and-debugged-by: Mike Mayer <Mike.Meyer@teradata.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-09-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull first round of SCSI updates from James Bottomley: "This includes one new driver: cxlflash plus the usual grab bag of updates for the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop, plus a few assorted fixes. There's another tranch coming, but I want to incubate it another few days in the checkers, plus it includes a mpt2sas separated lifetime fix, which Avago won't get done testing until Friday" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (85 commits) aic94xx: set an error code on failure storvsc: Set the error code correctly in failure conditions storvsc: Allow write_same when host is windows 10 storvsc: use storage protocol version to determine storage capabilities storvsc: use correct defaults for values determined by protocol negotiation storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation. storvsc: Use a single value to track protocol versions storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges. cxlflash: Remove unused variable from queuecommand cxlflash: shift wrapping bug in afu_link_reset() cxlflash: off by one bug in cxlflash_show_port_status() cxlflash: Virtual LUN support cxlflash: Superpipe support cxlflash: Base error recovery support qla2xxx: Update driver version to 8.07.00.26-k qla2xxx: Add pci device id 0x2261. qla2xxx: Fix missing device login retries. qla2xxx: do not clear slot in outstanding cmd array qla2xxx: Remove decrement of sp reference count in abort handler. qla2xxx: Add support to show MPI and PEP FW version for ISP27xx. ...
2015-08-28scsi_dh: kill struct scsi_dh_dataChristoph Hellwig
Add a ->handler and a ->handler_data field to struct scsi_device and kill this indirection. Also move struct scsi_device_handler to scsi_dh.h so that changes to it don't require rebuilding every SCSI LLDD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-26scsi_error: should not get sense for timeout IO in scsi error handlerjiang.biao2@zte.com.cn
scsi_error: should not get sense for timeout IO in scsi error handler When an IO timeout occurs, the IO will be aborted in scsi_abort_command() and SCSI_EH_ABORT_SCHEDULED will be set. Because of that, the SCSI_EH_CANCEL_CMD will be clear in scsi_eh_scmd_add(). So when scsi error handler starts, it will get sense for this timeout IO and the scmd of the IO request will be reused. In that case, the scmd may be double released when racing with io_done(), which will result in crash. SO SCSI_EH_ABORT_SCHEDULED should also be checked when getting sense. The bug maybe reproduced when the link between host and disk is unstable. Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Long Chun <long.chun@zte.com.cn> Reviewed-by: Tan Hu <tan.hu@zte.com.cn> Reviewed-by: Chen Donghai <chen.donghai@zte.com.cn> Reviewed-by: Cai Qu <cai.qu@zte.com.cn> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-26scsi: Add ALUA state change UA handlingHannes Reinecke
Log the ALUA state change unit attention correctly with the message log and emit an event to allow user-space tools to react to it. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-17Merge branch 'for-4.2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Three minor device-specific fixes and revert of NCQ autosense added during this -rc1. It turned out that NCQ autosense as currently implemented interferes with the usual error handling behavior. It will be revisited in the near future" * 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata: ahci_brcmstb: Fix misuse of IS_ENABLED sata_sx4: Check return code from pdc20621_i2c_read() Revert "libata: Implement NCQ autosense" Revert "libata: Implement support for sense data reporting" Revert "libata-eh: Set 'information' field for autosense" ata: ahci_brcmstb: Fix warnings with CONFIG_PM_SLEEP=n
2015-08-03Revert "libata-eh: Set 'information' field for autosense"Tejun Heo
This reverts commit a1524f226a02aa6edebd90ae0752e97cfd78b159. As implemented, ACS-4 sense reporting for ATA devices bypasses error diagnosis and handling in libata degrading EH behavior significantly. Revert the related changes for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org #v4.1+
2015-07-30scsi: fix memory leak with scsi-mqTony Battersby
Fix a memory leak with scsi-mq triggered by commands with large data transfer length. __sg_alloc_table() sets both table->nents and table->orig_nents to the same value. When the scatterlist is DMA-mapped, table->nents is overwritten with the (possibly smaller) size of the DMA-mapped scatterlist, while table->orig_nents retains the original size of the allocated scatterlist. scsi_free_sgtable() should therefore check orig_nents instead of nents, and all code that initializes sdb->table without calling __sg_alloc_table() should set both nents and orig_nents. Fixes: d285203cf647 ("scsi: add support for a blk-mq based I/O path.") Cc: <stable@vger.kernel.org> # 3.17+ Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-07-23scsi: Move sense handling routines to scsi_commonSagi Grimberg
Sense data handling is also done in the target stack. Hence, move sense handling routines to scsi_common so the target will be able to use them as well. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-06-01Move code that is used both by initiator and target driversBart Van Assche
Move the functions that are used by both the initiator and target subsystems into scsi_common.c/.h. This change will allow to remove the initiator SCSI header include directives from most SCSI target source files in a later patch. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-03-27libata-eh: Set 'information' field for autosenseHannes Reinecke
If NCQ autosense or the sense data reporting feature is enabled the LBA of the offending command should be stored in the sense data 'information' field. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2015-01-09scsi: do not display kernel pointer in message logsHannes Reinecke
It is not good practice to display the kernel pointer in any message logs, and it doesn't display any additional information. And as we know have block-layer assigned tags we can use them to differentiate the messages. So remove any pointer references from the displayed messages. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-01-09scsi: fix scsi_error.c kernel-doc warningRandy Dunlap
Fix kernel-doc warning in scsi_error.c: Warning(..//drivers/scsi/scsi_error.c:887): No description found for parameter 'hostt' Fixes: 883a030f989a17b81167f3a181cf93d741fa98b4 (scsi: document scsi_try_to_abort_cmd) Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-12-30SCSI: fix regression in scsi_send_eh_cmnd()Alan Stern
Commit ac61d1955934 (scsi: set correct completion code in scsi_send_eh_cmnd()) introduced a bug. It changed the stored return value from a queuecommand call, but it didn't take into account that the return value was used again later on. This patch fixes the bug by changing the later usage. There is a big comment in the middle of scsi_send_eh_cmnd() which does a good job of explaining how the routine works. But it mentions a "rtn = FAILURE" value that doesn't exist in the code. This patch adjusts the code to match the comment (I assume the comment is right and the code is wrong). This fixes Bugzilla #88341. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Андрей Аладьев <aladjev.andrew@gmail.com> Tested-by: Андрей Аладьев <aladjev.andrew@gmail.com> Fixes: ac61d19559349e205dad7b5122b281419aa74a82 Acked-by: Hannes Reinecke <hare@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-12-08Merge remote-tracking branch 'scsi-queue/drivers-for-3.19' into for-linusJames Bottomley
Conflicts: drivers/scsi/scsi_debug.c Agreed and tested resolution to a merge problem between a fix in scsi_debug and a driver update Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-11-24scsi: don't use scsi_next_command in scsi_reset_providerChristoph Hellwig
scsi_reset_provider already manually runs all queues for the given host, so it doesn't need the scsi_run_queues call from it, and it doesn't need a reference on the device because it's synchronous. So let's just call scsi_put_command directly and avoid the device reference dance to simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24scsi: drop reason argument from ->change_queue_depthChristoph Hellwig
Drop the now unused reason argument from the ->change_queue_depth method. Also add a return value to scsi_adjust_queue_depth, and rename it to scsi_change_queue_depth now that it can be used as the default ->change_queue_depth implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24scsi: avoid ->change_queue_depth indirection for queue full trackingChristoph Hellwig
All drivers use the implementation for ramping the queue up and down, so instead of overloading the change_queue_depth method call the implementation diretly if the driver opts into it by setting the track_queue_depth flag in the host template. Note that a few drivers validated the new queue depth in their change_queue_depth method, but as we never go over the queue depth set during slave_configure or the sysfs file this isn't nessecary and can safely be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
2014-11-12scsi: refactor scsi_reset_provider handlingChristoph Hellwig
Pull the common code from the two callers into the function, and rename it to scsi_ioctl_reset. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12scsi: document scsi_try_to_abort_cmdHannes Reinecke
scsi_try_to_abort_cmd() should only return SUCCESS, FAILED, or FAST_IO_FAIL. So document that in the function description and simplify the logging message. Signed-off-by: Hannes Reinecke <hare@suse.de> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12scsi: use shost argument in scsi_eh_prt_fail_statsHannes Reinecke
The EH statistics are per host, so we should be using shost_printk() here. Signed-off-by: Hannes Reinecke <hare@suse.de> Suggested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12scsi: fixup logging messages in scsi_error.cHannes Reinecke
Use the matching scope for logging messages to allow for better command tracing. Signed-off-by: Hannes Reinecke <hare@suse.de> Suggested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12scsi: use 'bool' as return value for scsi_normalize_sense()Hannes Reinecke
Convert scsi_normalize_sense() and friends to return 'bool' instead of an integer. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Reviewed-by: Yoshihiro Yunomae <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Christoph Hellwig <hch@lst.de>