summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-08-24scsi: hisi_sas: remove phy_down_v3_hw() res variableJohn Garry
Variable res only holds value 0, so remove it. This cleans up a coccicheck warning. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: hisi_sas: add phy_set_linkrate_v3_hw()Xiang Chen
Add function to set linkrate for v3 hw. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: hisi_sas: update some v3 register init settingsXiang Chen
This patch updates some register setting according to recommendation from HW designer and experiment. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: hisi_sas: add reset handler for v3 hwXiang Chen
Use ACPI "_RST" method to reset the controller, since FLR is not supported. Function hisi_sas_stop_phys() is introduced to remove some code duplication. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: kill tasklet when destroying irq in v3 hwXiang Chen
This patch adds calls to kill CQ takslets v3 hw during probe failure. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: fix v3 hw channel interrupt processingXiang Chen
The channel interrupt is to process all the interrupts except PHY UP/DOWN and broadcast interrupt. So we need to clear all the interrupts except those 3 interrupts after processing channel interrupts. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: Modify v3 hw STP_LINK_TIMER settingXiang Chen
Modify STP link timer from 10ms to 500ms. Also add the register address. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: add status and command buffer for internal abortXiang Chen
For v3 hw, internal abort function required status and command buffer to be set, so add necessary code for this. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: support zone management commandsXiaofei Tan
Add two ATA commands, ATA_CMD_ZAC_MGMT_IN and ATA_CMD_ZAC_MGMT_OUT in hisi_sas_get_ata_protocol(), to support SATA SMR disk. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: service interrupt ITCT_CLR interrupt in v2 hwXiang Chen
This patch is a fix related to freeing a device in v2 hw driver. Before, we polled to ITCT CLR interrupt to check if a device is free. This was error prone, as if the interrupt doesn't occur in 10us, we miss processing it. To avoid this situation, service this interrupt and sync the event with a completion. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: add irq and tasklet cleanup in v2 hwXiang Chen
This patch adds support to clean-up allocated IRQs and kill tasklets when probe fails and for driver removal. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: remove repeated device config in v2 hwXiang Chen
This patch removes some repeated configurations: (1) The device id of the device is already set in the alloc function, so we don't need to modify in free device function. (2) Field dev_type and dev_status are configured in hisi_sas_dev_gone(), so there is no need for repeated config in free_device_v3_hw. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: use array for v2 hw ECC errorsJohn Garry
The code to print ECC errors in v2 hw driver is very repetitive. This patch condensed the code by looping an array of errors. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: add v2 hw DFX featureXiaofei Tan
Add DFX feature for v2 hw. We are adding support for the following errors: - loss_of_dword_sync_count - invalid_dword_count - phy_reset_problem_count - running_disparity_error_count Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: fix v2 hw underflow residual valueXiang Chen
The value dw0 is the residual bytes when UNDERFLOW error happens, but we filled the residual with the value of dw3 before. So change the residual from dw3 to dw0. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: avoid potential v2 hw interrupt issueXiang Chen
When some interrupts happen together, we need to process every interrupt one-by-one, and should not return immediately when one interrupt process is finished being processed. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: hisi_sas: fix reset and port ID refresh issuesXiaofei Tan
This patch provides fixes for the following issues: 1. Fix issue of controller reset required to send commands. For reset process, it may be required to send commands to the controller, but not during soft reset. So add HISI_SAS_NOT_ACCEPT_CMD_BIT to prevent executing a task during this period. 2. Send a broadcast event in rescan topology to detect any topology changes during reset. 3. Previously it was not ensured that libsas has processed the PHY up and down events after reset. Potentially this could cause an issue that we still process the PHY event after reset. So resolve this by flushing shot workqueue in LLDD reset. 4. Port ID requires refresh after reset. The port ID generated after reset is not guaranteed to be the same as before reset, so it needs to be refreshed for each device's ITCT. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: change driver version to 1.1.2-125Kevin Barnett
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: add in new controller idsKevin Barnett
Update the driver’s PCI IDs to match the latest Microsemi controllers Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: update kexec and power down supportKevin Barnett
Add PQI reset to driver shutdown callback to work around controller bug. During an 1.) OS shutdown or 2.) kexec outside of a kdump, the Linux kernel will clear BME on our controller. If BME is cleared during a controller/host PCIe transfer, the controller will lock up. So we perform a PQI reset in the driver's shutdown callback function to eliminate the possibility of a controller/host PCIe transfer being active when the kernel clears BME immediately after calling the driver's shutdown callback. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: cleanup doorbell register usage.Kevin Barnett
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: update pqi passthru ioctlKevin Barnett
- make pass-thru requests bi-directional Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: enhance BMIC cache flushKevin Barnett
- distinguish between shutdown and non-shutdown. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: smartpqi: add pqi reset quiesce supportKevin Barnett
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: dpt_i2o: remove redundant null check on array deviceColin Ian King
The null check on pHba->channel[chan].device is redundant because device is an array and hence can never be null. Remove the check. Detected by CoverityScan, CID#115362 ("Array compared against 0") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: qla2xxx: use dma_mapping_error to check map errorsPan Bian
The return value of dma_map_single() should be checked by dma_mapping_error(). However, in function qla26xx_dport_diagnostics(), its return value is checked against NULL, which could result in failures. Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: mvsas: replace kfree with scsi_host_putPan Bian
The return value of scsi_host_alloc() should be released by scsi_host_put(). However, in function mvs_pci_init(), kfree() is used. This patch replaces kfree() with scsi_host_put() to avoid possible memory leaks. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: pm8001: fix double free in pm8001_pci_probePan Bian
In function pm8001_pci_probe(), on errors that the control flow jumps to label err_out_ha_free, function pm8001_free() is called. In pm8001_free(), scsi_host_put() is called to release shost, which keeps the return value of scsi_host_alloc(). After pm8001_free() returns, kfree() is called to free shost again, resulting in a double free bug. This patch removes scsi_host_put() from pm8001_free() and explicitly calls scsi_host_put() to release Scsi_Host in need. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: megaraid_sas: fix allocate instance->pd_info twiceweiping
fix allocate instance->pd_info twice which was introduced by 96188a89cc6d. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: esp_scsi: Always clear msg_out_len after MESSAGE OUT phaseFinn Thain
After sending a message, always clear esp->msg_out_len. Otherwise, eh_abort_handler may subsequently fail to send an ABORT TASK SET message. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: esp_scsi: Avoid sending ABORT TASK SET messagesFinn Thain
If an LLD aborts a task set, it should complete the affected commands with the appropriate result code. In a couple of cases esp_scsi doesn't do so. When the initiator receives an unhandled message, just respond by sending a MESSAGE REJECT instead of ABORT TASK SET, and thus avoid the issue. OTOH, a MESSAGE REJECT sent by a target can be taken as an indication that the initiator messed up somehow. It isn't always possible to abort correctly, so just fall back on a SCSI bus reset, which will complete the affected commands with the appropriate result code. For example, certain Apple (Sony) CD-ROM drives, when the non-existent LUN 1 is scanned, can't handle the INQUIRY command. The problem is not detected until the initiator gets a MESSAGE REJECT. Whenever esp_scsi sees that message, it raises ATN and sends ABORT TASK SET -- but neglects to complete the failed scmd. The target then goes into DATA OUT phase (probably bogus), while the ESP device goes into disconnected mode (surprising, given the bus phase). The next Transfer Information command from esp_scsi then causes an Invalid Command interrupt because that command is not valid when in disconnected mode: mac_esp: using PDMA for controller 0 mac_esp mac_esp.0: esp0: regs[50f10000:(null)] irq[19] mac_esp mac_esp.0: esp0: is a ESP236, 16 MHz (ccf=4), SCSI ID 7 scsi host0: esp scsi 0:0:0:0: Direct-Access SEAGATE ST318416N 0010 PQ: 0 ANSI: 3 scsi target0:0:0: Beginning Domain Validation scsi target0:0:0: asynchronous scsi target0:0:0: Domain Validation skipping write tests scsi target0:0:0: Ending Domain Validation scsi 0:0:3:0: CD-ROM SONY CD-ROM CDU-8003A 1.9a PQ: 0 ANSI: 2 CCS scsi target0:0:3: Beginning Domain Validation scsi target0:0:3: FAST-5 SCSI 2.0 MB/s ST (500 ns, offset 15) scsi target0:0:3: Domain Validation skipping write tests scsi target0:0:3: Ending Domain Validation scsi host0: unexpected IREG 40 scsi host0: Dumping command log scsi host0: ent[2] CMD val[c2] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c] scsi host0: ent[3] CMD val[00] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0c] scsi host0: ent[4] EVENT val[0d] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0c] scsi host0: ent[5] EVENT val[03] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0d] scsi host0: ent[6] CMD val[90] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[03] scsi host0: ent[7] EVENT val[05] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[03] scsi host0: ent[8] EVENT val[0d] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[05] scsi host0: ent[9] CMD val[01] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d] scsi host0: ent[10] CMD val[11] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d] scsi host0: ent[11] EVENT val[0b] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d] scsi host0: ent[12] CMD val[12] sreg[97] seqreg[cc] sreg2[00] ireg[08] ss[00] event[0b] scsi host0: ent[13] EVENT val[0c] sreg[97] seqreg[cc] sreg2[00] ireg[08] ss[00] event[0b] scsi host0: ent[14] CMD val[44] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[00] event[0c] scsi host0: ent[15] CMD val[01] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c] scsi host0: ent[16] CMD val[c2] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c] scsi host0: ent[17] CMD val[00] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0c] scsi host0: ent[18] EVENT val[0d] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0c] scsi host0: ent[19] EVENT val[06] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0d] scsi host0: ent[20] CMD val[01] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[06] scsi host0: ent[21] CMD val[10] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[06] scsi host0: ent[22] CMD val[1a] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06] scsi host0: ent[23] CMD val[12] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06] scsi host0: ent[24] EVENT val[0d] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06] scsi host0: ent[25] EVENT val[09] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[0d] scsi host0: ent[26] CMD val[01] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09] scsi host0: ent[27] CMD val[10] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09] scsi host0: ent[28] EVENT val[0a] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09] scsi host0: ent[29] EVENT val[0d] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[0a] scsi host0: ent[30] EVENT val[04] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[0d] scsi host0: ent[31] CMD val[01] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04] scsi host0: ent[0] CMD val[90] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04] scsi host0: ent[1] EVENT val[05] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04] scsi target0:0:3: FAST-5 SCSI 2.0 MB/s ST (500 ns, offset 15) scsi target0:0:0: asynchronous sr 0:0:3:0: [sr0] scsi-1 drive cdrom: Uniform CD-ROM driver Revision: 3.20 sd 0:0:0:0: Attached scsi generic sg0 type 0 sr 0:0:3:0: Attached scsi generic sg1 type 5 This patch resolves this issue because the bus reset causes the INQUIRY command to fail earlier, and return the appropriate result code. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: esp_scsi: Clean up control flow and dead codeFinn Thain
This patch improves readability. There are no functional changes. Since this touches on a questionable ESP_INTR_DC conditional, add some commentary to help others who may (as I did) find themselves chasing an "Invalid Command" error after the device flags this condition. This cleanup also eliminates a warning from "make W=1": drivers/scsi/esp_scsi.c: In function 'esp_finish_select': drivers/scsi/esp_scsi.c:1233:5: warning: variable 'orig_select_state' set but not used [-Wunused-but-set-variable] u8 orig_select_state; Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: mac_esp: Fix PIO transfers for MESSAGE IN phaseFinn Thain
When in MESSAGE IN phase, the ESP device does not automatically acknowledge each byte that is transferred by PIO. The mac_esp driver neglects to explicitly ack them, which causes a timeout during messages larger than one byte (e.g. tag bytes during reconnect). Fix this with an ESP_CMD_MOK command after each byte. The MESSAGE IN phase is also different in that each byte transferred raises ESP_INTR_FDONE. So don't exit the transfer loop for this interrupt, for this phase. That resolves the "Reconnect IRQ2 timeout" error on those Macs which use PIO transfers instead of PDMA. This patch also improves on the weak tests for unexpected interrupts and phase changes during PIO transfers. Tested-by: Stan Johnson <userm57@yahoo.com> Fixes: 02507a80b35e ("[PATCH] [SCSI] mac_esp: fix PIO mode, take 2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: mac_esp: Avoid type warning from sparseFinn Thain
Avoid the following warning from "make C=1": CHECK drivers/scsi/mac_esp.c drivers/scsi/mac_esp.c:357:30: warning: incorrect type in initializer (different address spaces) drivers/scsi/mac_esp.c:357:30: expected unsigned char [usertype] *fifo drivers/scsi/mac_esp.c:357:30: got void [noderef] <asn:2>* Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10arcmsr: add const to bin_attribute structuresBhumika Goyal
Add const to bin_attribute structures as they are only passed to the functions system_{remove/create}_bin_file. The arguments passed are of type const, so declare the structures to be const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: early returns for traces disabled via levelMartin Peschke
This patch adds early checks to avoid burning CPU cycles on the assembly of trace entries which would be skipped anyway. Introduce a static const variable to keep the trace level to check with debug_level_enabled() in sync with the actual trace emit with debug_event(). In order not to refactor the SAN tracing too much, simply use a define instead. This change is only for the non / semi hot paths, while the actual (I/O) hot path was already improved earlier: zfcp_dbf_scsi() is already guarded by its only caller _zfcp_dbf_scsi() since commit dcd20e2316cd ("[SCSI] zfcp: Only collect SCSI debug data for matching trace levels"). zfcp_dbf_hba_fsf_res() is already guarded by its only caller zfcp_dbf_hba_fsf_response() since commit 2e261af84cdb ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels"). Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> [maier@linux.vnet.ibm.com: rebase, reword, default level 3 branch prediction] Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: clean up unnecessary module_param_named() with no_auto_port_rescanMartin Peschke
Improves commit 43f60cbd56c4 ("[SCSI] zfcp: No automatic port_rescan on events") Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> [maier@linux.vnet.ibm.com: reword, underscore in description to match sysfs] Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: clean up a member of struct zfcp_qdio that was assigned but ↵Martin Peschke
never used v2.6.38 commit a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") dropped trace information previously introduced with v2.6.27 commit c3baa9a26c5a ("[SCSI] zfcp: Add information about interrupt to trace.") but kept and needlessly assigned a now no longer used struct field. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> [maier@linux.vnet.ibm.com: reword, added git history] Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: clean up no longer existent prototype from zfcp API headerSteffen Maier
Commit a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") refactored zfcp_dbf_hba_berr into zfcp_dbf_hba_bit_err but added the prototype for the latter without removing it for the former. Suggested-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: clean up redundant code with fall through in link down SRB ↵Martin Peschke
switch case Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> [maier@linux.vnet.ibm.com: re-worded short description for more details] Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: fix kernel doc comment typos for struct zfcp_dbf_scsiSteffen Maier
Improves commit 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.") Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: use endianness conversions with common FC(P) struct fieldsSteffen Maier
Just to silence sparse. Since zfcp only exists for s390 and s390 is big endian, this has been working correctly without conversions and all the new conversions are NOPs so no performance impact. Nonetheless, use the conversion on the constant expression where possible. NB: N_Port-IDs have always been handled with hton24 or ntoh24 conversions because they also convert to / from character array. Affected common code structs and .fields are: HOT I/O PATH: fcp_cmnd .fc_dl FCP command: regular SCSI I/O, including DIX case SEMI-HOT I/O PATH: fcp_cmnd .fc_dl recovery FCP command: task management function (LUN / target reset) fcp_resp_ext FCP response having FCP_SNS_LEN_VAL with .fr_rsp_len .fr_sns_len FCP response having FCP_RESID_UNDER with .fr_resid RECOVERY / DISCOVERY PATHS: fc_ct_hdr .ct_cmd .ct_mr_size zfcp auto port scan [GPN_FT] with fc_gpn_ft_resp.fp_wwpn, recovery for returned port [GID_PN] with fc_ns_gid_pn.fn_wwpn, get symbolic port name [GSPN], register symbolic port name [RSPN] (NPIV only). fc_els_rscn .rscn_plen incoming ELS (RSCN). fc_els_flogi .fl_wwpn .fl_wwnn incoming ELS (PLOGI), port open response with .fl_csp.sp_bb_data .fl_cssp[0..3].cp_class, FCP channel physical port, point-to-point peer (P2P only). fc_els_logo .fl_n_port_wwn incoming ELS (LOGO). fc_els_adisc .adisc_wwnn .adisc_wwpn path test after RSCN for gone target port. Since v4.10 commit 05de97003c77 ("linux/types.h: enable endian checks for all sparse builds"), below sparse endianness reports appear by default. Previously, one needed to pass argument CF="-D__CHECK_ENDIAN__" to make as in: $ make C=1 CF="-D__CHECK_ENDIAN__" M=drivers/s390/scsi. Silenced sparse warnings and one error: $ make C=1 M=drivers/s390/scsi ... CHECK drivers/s390/scsi/zfcp_dbf.c drivers/s390/scsi/zfcp_dbf.c:463:22: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_dbf.c:476:28: warning: restricted __be16 degrades to integer CC drivers/s390/scsi/zfcp_dbf.o ... CHECK drivers/s390/scsi/zfcp_fc.c drivers/s390/scsi/zfcp_fc.c:263:26: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:299:41: warning: incorrect type in argument 2 (different base types) drivers/s390/scsi/zfcp_fc.c:299:41: expected unsigned long long [unsigned] [usertype] wwpn drivers/s390/scsi/zfcp_fc.c:299:41: got restricted __be64 [usertype] fl_wwpn drivers/s390/scsi/zfcp_fc.c:309:40: warning: incorrect type in argument 2 (different base types) drivers/s390/scsi/zfcp_fc.c:309:40: expected unsigned long long [unsigned] [usertype] wwpn drivers/s390/scsi/zfcp_fc.c:309:40: got restricted __be64 [usertype] fl_n_port_wwn drivers/s390/scsi/zfcp_fc.c:338:31: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:355:24: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:355:24: expected restricted __be16 [usertype] ct_cmd drivers/s390/scsi/zfcp_fc.c:355:24: got unsigned short [unsigned] [usertype] cmd drivers/s390/scsi/zfcp_fc.c:356:28: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:356:28: expected restricted __be16 [usertype] ct_mr_size drivers/s390/scsi/zfcp_fc.c:356:28: got int drivers/s390/scsi/zfcp_fc.c:379:36: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:379:36: expected restricted __be64 [usertype] fn_wwpn drivers/s390/scsi/zfcp_fc.c:379:36: got unsigned long long [unsigned] [usertype] wwpn drivers/s390/scsi/zfcp_fc.c:463:18: warning: restricted __be64 degrades to integer drivers/s390/scsi/zfcp_fc.c:465:17: warning: cast from restricted __be64 drivers/s390/scsi/zfcp_fc.c:473:20: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:473:20: expected unsigned long long [unsigned] [usertype] wwnn drivers/s390/scsi/zfcp_fc.c:473:20: got restricted __be64 [usertype] fl_wwnn drivers/s390/scsi/zfcp_fc.c:474:29: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:474:29: expected unsigned int [unsigned] [usertype] maxframe_size drivers/s390/scsi/zfcp_fc.c:474:29: got restricted __be16 [usertype] sp_bb_data drivers/s390/scsi/zfcp_fc.c:476:30: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:478:30: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:480:30: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:482:30: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:500:28: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:500:28: expected unsigned long long [unsigned] [usertype] wwnn drivers/s390/scsi/zfcp_fc.c:500:28: got restricted __be64 [usertype] adisc_wwnn drivers/s390/scsi/zfcp_fc.c:502:38: warning: restricted __be64 degrades to integer drivers/s390/scsi/zfcp_fc.c:541:40: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:541:40: expected restricted __be64 [usertype] adisc_wwpn drivers/s390/scsi/zfcp_fc.c:541:40: got unsigned long long [unsigned] [usertype] port_name drivers/s390/scsi/zfcp_fc.c:542:40: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fc.c:542:40: expected restricted __be64 [usertype] adisc_wwnn drivers/s390/scsi/zfcp_fc.c:542:40: got unsigned long long [unsigned] [usertype] node_name drivers/s390/scsi/zfcp_fc.c:669:16: warning: restricted __be16 degrades to integer drivers/s390/scsi/zfcp_fc.c:696:24: warning: restricted __be64 degrades to integer drivers/s390/scsi/zfcp_fc.c:699:54: warning: incorrect type in argument 2 (different base types) drivers/s390/scsi/zfcp_fc.c:699:54: expected unsigned long long [unsigned] [usertype] <noident> drivers/s390/scsi/zfcp_fc.c:699:54: got restricted __be64 [usertype] fp_wwpn CC drivers/s390/scsi/zfcp_fc.o CHECK drivers/s390/scsi/zfcp_fsf.c drivers/s390/scsi/zfcp_fsf.c:479:34: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fsf.c:479:34: expected unsigned long long [unsigned] [usertype] port_name drivers/s390/scsi/zfcp_fsf.c:479:34: got restricted __be64 [usertype] fl_wwpn drivers/s390/scsi/zfcp_fsf.c:480:34: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fsf.c:480:34: expected unsigned long long [unsigned] [usertype] node_name drivers/s390/scsi/zfcp_fsf.c:480:34: got restricted __be64 [usertype] fl_wwnn drivers/s390/scsi/zfcp_fsf.c:506:36: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fsf.c:506:36: expected unsigned long long [unsigned] [usertype] peer_wwpn drivers/s390/scsi/zfcp_fsf.c:506:36: got restricted __be64 [usertype] fl_wwpn drivers/s390/scsi/zfcp_fsf.c:507:36: warning: incorrect type in assignment (different base types) drivers/s390/scsi/zfcp_fsf.c:507:36: expected unsigned long long [unsigned] [usertype] peer_wwnn drivers/s390/scsi/zfcp_fsf.c:507:36: got restricted __be64 [usertype] fl_wwnn drivers/s390/scsi/zfcp_fc.h:269:46: warning: restricted __be32 degrades to integer drivers/s390/scsi/zfcp_fc.h:270:29: error: incompatible types in comparison expression (different base types) Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: use common code fcp_cmnd and fcp_resp with union in ↵Steffen Maier
fsf_qtcb_bottom_io This eases crash dump analysis by automatically dissecting these protocol headers at least somewhat rather than getting a string interpretation of large unstructured character array buffer fields. Also, we can get rid of some unnecessary and error-prone type casts. This change is possible since v2.6.33 commit 4318e08c84e4 ("[SCSI] zfcp: Update FCP protocol related code"). Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: clarify that we don't need "link" test on failed open portSteffen Maier
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: more fitting constant for fc_ct_hdr.ct_reason on port scan responseSteffen Maier
v2.6.33 commit dbf5dfe9dbce ("[SCSI] zfcp: Use common code definitions for FC CT structs") replaced own definitions with common code definitions. While FC_BA_RJT_UNABLE happens to be defined with the same value 9 as FC_FS_RJT_UNABL and thus also works, here we should use the latter from fc_gs.h. See also its use in libfc's fc_disc_gpn_ft_resp(). Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: trace high part of "new" 64 bit SCSI LUNSteffen Maier
Complements debugging aspects of the otherwise functionally complete v3.17 commit 9cb78c16f5da ("scsi: use 64-bit LUNs"). While I don't have access to a target exporting 3 or 4 level LUNs, I did test it by explicitly attaching a non-existent fake 4 level LUN by means of zfcp sysfs attribute "unit_add". In order to see corresponding trace records of otherwise successful events, we had to increase the trace level of area SCSI and HBA to 6. $ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_scsi/level $ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_hba/level $ echo 0x4011402240334044 > \ /sys/bus/ccw/drivers/zfcp/0.0.1880/0x50050763031bd327/unit_add Example output formatted by an updated zfcpdbf from the s390-tools package interspersed with kernel messages at scsi_logging_level=4605: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : scsla_1 LUN : 0x4011402240334044 WWPN : 0x50050763031bd327 D_ID : 0x00...... Adapter status : 0x5400050b Port status : 0x54000001 LUN status : 0x41000000 Ready count : 0x00000001 Running count : 0x00000000 ERP want : 0x01 ERP need : 0x01 scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 1 length 36 scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0 Timestamp : ... Area : HBA Subarea : 00 Level : 6 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : fs_norm Request ID : 0x<inquiry2-req-id> Request status : 0x00000010 FSF cmnd : 0x00000001 FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000000 FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000001 Prot stat qual : ........ ........ 00000000 00000000 Port handle : 0x... LUN handle : 0x... | Timestamp : ... Area : SCSI Subarea : 00 Level : 6 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : rsl_nor Request ID : 0x<inquiry2-req-id> SCSI ID : 0x00000000 SCSI LUN : 0x40224011 SCSI LUN high : 0x40444033 <======================= SCSI result : 0x00000000 SCSI retries : 0x00 SCSI allowed : 0x03 SCSI scribble : 0x<inquiry2-req-id> SCSI opcode : 12000000 a4000000 00000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000000 00000000 00000000 00000000 scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 2 length 164 scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0 scsi 2:0:0:4630896905707208721: scsi scan: peripheral device type of 31, \ no device added Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 9cb78c16f5da ("scsi: use 64-bit LUNs") Cc: <stable@vger.kernel.org> #3.17+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late ↵Steffen Maier
response At the default trace level, we only trace unsuccessful events including FSF responses. zfcp_dbf_hba_fsf_response() only used protocol status and FSF status to decide on an unsuccessful response. However, this is only one of multiple possible sources determining a failed struct zfcp_fsf_req. An FSF request can also "fail" if its response runs into an ERP timeout or if it gets dismissed because a higher level recovery was triggered [trace tags "erscf_1" or "erscf_2" in zfcp_erp_strategy_check_fsfreq()]. FSF requests with ERP timeout are: FSF_QTCB_EXCHANGE_CONFIG_DATA, FSF_QTCB_EXCHANGE_PORT_DATA, FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT or FSF_QTCB_CLOSE_PHYSICAL_PORT for target ports, FSF_QTCB_OPEN_LUN, FSF_QTCB_CLOSE_LUN. One example is slow queue processing which can cause follow-on errors, e.g. FSF_PORT_ALREADY_OPEN after FSF_QTCB_OPEN_PORT_WITH_DID timed out. In order to see the root cause, we need to see late responses even if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD. Example trace records formatted with zfcpdbf from the s390-tools package: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : fcegpf1 LUN : 0xffffffffffffffff WWPN : 0x<WWPN> D_ID : 0x00<D_ID> Adapter status : 0x5400050b Port status : 0x41200000 LUN status : 0x00000000 Ready count : 0x00000001 Running count : 0x... ERP want : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT ERP need : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT | Timestamp : ... 30 seconds later Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : ... Record ID : 2 Tag : erscf_2 LUN : 0xffffffffffffffff WWPN : 0x<WWPN> D_ID : 0x00<D_ID> Adapter status : 0x5400050b Port status : 0x41200000 LUN status : 0x00000000 Request ID : 0x<request_ID> ERP status : 0x10000000 ZFCP_STATUS_ERP_TIMEDOUT ERP step : 0x0800 ZFCP_ERP_STEP_PORT_OPENING ERP action : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT ERP count : 0x00 | Timestamp : ... later than previous record Area : HBA Subarea : 00 Level : 5 > default level => 3 <= default level Exception : - CPU ID : 00 Caller : ... Record ID : 1 Tag : fs_qtcb => fs_rerr Request ID : 0x<request_ID> Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED | ZFCP_STATUS_FSFREQ_CLEANUP FSF cmnd : 0x00000005 FSF sequence no: 0x... FSF issued : ... > 30 seconds ago FSF stat : 0x00000000 FSF_GOOD FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000001 FSF_PROT_GOOD Prot stat qual : 00000000 00000000 00000000 00000000 Port handle : 0x... LUN handle : 0x00000000 QTCB log length: ... QTCB log info : ... In case of problems detecting that new responses are waiting on the input queue, we sooner or later trigger adapter recovery due to an FSF request timeout (trace tag "fsrth_1"). FSF requests with FSF request timeout are: typically FSF_QTCB_ABORT_FCP_CMND; but theoretically also FSF_QTCB_EXCHANGE_CONFIG_DATA or FSF_QTCB_EXCHANGE_PORT_DATA via sysfs, FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT for WKA ports, FSF_QTCB_FCP_CMND for task management function (LUN / target reset). One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD because the channel filled in the response via DMA into the request's QTCB. In a theroretical case, inject code can create an erroneous FSF request on purpose. If data router is enabled, it uses deferred error reporting. A READ SCSI command can succeed with FSF_PROT_GOOD, FSF_GOOD, and SAM_STAT_GOOD. But on writing the read data to host memory via DMA, it can still fail, e.g. if an intentionally wrong scatter list does not provide enough space. Rather than getting an unsuccessful response, we get a QDIO activate check which in turn triggers adapter recovery. One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD because the channel filled in the response via DMA into the request's QTCB. Example trace records formatted with zfcpdbf from the s390-tools package: Timestamp : ... Area : HBA Subarea : 00 Level : 6 > default level => 3 <= default level Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : fs_norm => fs_rerr Request ID : 0x<request_ID2> Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED | ZFCP_STATUS_FSFREQ_CLEANUP FSF cmnd : 0x00000001 FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000000 FSF_GOOD FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000001 FSF_PROT_GOOD Prot stat qual : ........ ........ 00000000 00000000 Port handle : 0x... LUN handle : 0x... | Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : rsl_err Request ID : 0x<request_ID2> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x000e0000 DID_TRANSPORT_DISRUPTED SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_ID2> SCSI opcode : 28... Read(10) FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000000 00000000 ^^ SAM_STAT_GOOD 00000000 00000000 Only with luck in both above cases, we could see a follow-on trace record of an unsuccesful event following a successful but late FSF response with FSF_PROT_GOOD and FSF_GOOD. Typically this was the case for I/O requests resulting in a SCSI trace record "rsl_err" with DID_TRANSPORT_DISRUPTED [On ZFCP_STATUS_FSFREQ_DISMISSED, zfcp_fsf_protstatus_eval() sets ZFCP_STATUS_FSFREQ_ERROR seen by the request handler functions as failure]. However, the reason for this follow-on trace was invisible because the corresponding HBA trace record was missing at the default trace level (by default hidden records with tags "fs_norm", "fs_qtcb", or "fs_open"). On adapter recovery, after we had shut down the QDIO queues, we perform unsuccessful pseudo completions with flag ZFCP_STATUS_FSFREQ_DISMISSED for each pending FSF request in zfcp_fsf_req_dismiss_all(). In order to find the root cause, we need to see all pseudo responses even if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD. Therefore, check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED or ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr". It does not matter that there are numerous places which set ZFCP_STATUS_FSFREQ_ERROR after the location where we trace an FSF response early. These cases are based on protocol status != FSF_PROT_GOOD or == FSF_PROT_FSF_STATUS_PRESENTED and are thus already traced by default as trace tag "fs_perr" or "fs_ferr" respectively. NB: The trace record with tag "fssrh_1" for status read buffers on dismiss all remains. zfcp_fsf_req_complete() handles this and returns early. All other FSF request types are handled separately and as described above. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features") Fixes: 2e261af84cdb ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace recordsSteffen Maier
If the FCP_RSP UI has optional parts (FCP_SNS_INFO or FCP_RSP_INFO) and thus does not fit into the fsp_rsp field built into a SCSI trace record, trace the full FCP_RSP UI with all optional parts as payload record instead of just FCP_SNS_INFO as payload and a 1 byte RSP_INFO_CODE part of FCP_RSP_INFO built into the SCSI record. That way we would also get the full FCP_SNS_INFO in case a target would ever send more than min(SCSI_SENSE_BUFFERSIZE==96, ZFCP_DBF_PAY_MAX_REC==256)==96. The mandatory part of FCP_RSP IU is only 24 bytes. PAYload costs at least one full PAY record of 256 bytes anyway. We cap to the hardware response size which is only FSF_FCP_RSP_SIZE==128. So we can just put the whole FCP_RSP IU with any optional parts into PAYload similarly as we do for SAN PAY since v4.9 commit aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)"). This does not cause any additional trace records wasting memory. Decoded trace records were confusing because they showed a hard-coded sense data length of 96 even if the FCP_RSP_IU field FCP_SNS_LEN showed actually less. Since the same commit, we set pl_len for SAN traces to the full length of a request/response even if we cap the corresponding trace. In contrast, here for SCSI traces we set pl_len to the pre-computed length of FCP_RSP IU considering SNS_LEN or RSP_LEN if valid. Nonetheless we trace a hardcoded payload of length FSF_FCP_RSP_SIZE==128 if there were optional parts. This makes it easier for the zfcpdbf tool to format only the relevant part of the long FCP_RSP UI buffer. And any trailing information is still available in the payload trace record just in case. Rename the payload record tag from "fcp_sns" to "fcp_riu" to make the new content explicit to zfcpdbf which can then pick a suitable field name such as "FCP rsp IU all:" instead of "Sense info :" Also, the same zfcpdbf can still be backwards compatible with "fcp_sns". Old example trace record before this fix, formatted with the tool zfcpdbf from s390-tools: Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU id : .. Caller : 0x... Record id : 1 Tag : rsl_err Request id : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000002 SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_id> SCSI opcode : 00000000 00000000 00000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000202 00000000 ^^==FCP_SNS_LEN_VALID 00000020 00000000 ^^^^^^^^==FCP_SNS_LEN==32 Sense len : 96 <==min(SCSI_SENSE_BUFFERSIZE,ZFCP_DBF_PAY_MAX_REC) Sense info : 70000600 00000018 00000000 29000000 00000400 00000000 00000000 00000000 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous New example trace records with this fix: Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : rsl_err Request ID : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000002 SCSI retries : 0x00 SCSI allowed : 0x03 SCSI scribble : 0x<request_id> SCSI opcode : a30c0112 00000000 02000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000a02 00000200 00000020 00000000 FCP rsp IU len : 56 FCP rsp IU all : 00000000 00000000 00000a02 00000200 ^^=FCP_RESID_UNDER|FCP_SNS_LEN_VALID 00000020 00000000 70000500 00000018 ^^^^^^^^==FCP_SNS_LEN ^^^^^^^^^^^^^^^^^ 00000000 240000cb 00011100 00000000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 00000000 00000000 ^^^^^^^^^^^^^^^^^==FCP_SNS_INFO Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : lr_okay Request ID : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000000 SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_id> SCSI opcode : <CDB of unrelated SCSI command passed to eh handler> FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000100 00000000 00000000 00000008 FCP rsp IU len : 32 FCP rsp IU all : 00000000 00000000 00000100 00000000 ^^==FCP_RSP_LEN_VALID 00000000 00000008 00000000 00000000 ^^^^^^^^==FCP_RSP_LEN ^^^^^^^^^^^^^^^^^==FCP_RSP_INFO Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: fix missing trace records for early returns in TMF eh handlersSteffen Maier
For problem determination we need to see that we were in scsi_eh as well as whether and why we were successful or not. The following commits introduced new early returns without adding a trace record: v2.6.35 commit a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") on fc_block_scsi_eh() returning != 0 which is FAST_IO_FAIL, v2.6.30 commit 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp") on not having gotten an FSF request after the maximum number of retry attempts and thus could not issue a TMF and has to return FAILED. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-10scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBASteffen Maier
Without this fix we get SCSI trace records on task management functions which cannot be correlated to HBA trace records because all fields related to the FSF request are empty (zero). Also, the FCP_RSP_IU is missing as well as any sense data if available. This was caused by v2.6.14 commit 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features") introducing trace records for TMFs but hard coding NULL for a possibly existing TMF FSF request. The scsi_cmnd scribble is also zero or unrelated for the TMF request so it also could not lookup a suitable FSF request from there. A broken example trace record formatted with zfcpdbf from the s390-tools package: Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : lr_fail Request ID : 0x0000000000000000 ^^^^^^^^^^^^^^^^ no correlation to HBA record SCSI ID : 0x<scsitarget> SCSI LUN : 0x<scsilun> SCSI result : 0x000e0000 SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x0000000000000000 SCSI opcode : 2a000017 3bb80000 08000000 00000000 FCP rsp inf cod: 0x00 ^^ no TMF response FCP rsp IU : 00000000 00000000 00000000 00000000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 00000000 00000000 ^^^^^^^^^^^^^^^^^ no interesting FCP_RSP_IU Sense len : ... ^^^^^^^^^^^^^^^^^^^^ no sense data length Sense info : ... ^^^^^^^^^^^^^^^^^^^^ no sense data content, even if present There are some true cases where we really do not have an FSF request: "rsl_fai" from zfcp_dbf_scsi_fail_send() called for early returns / completions in zfcp_scsi_queuecommand(), "abrt_or", "abrt_bl", "abrt_ru", "abrt_ar" from zfcp_scsi_eh_abort_handler() where we did not get as far, "lr_nres", "tr_nres" from zfcp_task_mgmt_function() where we're successful and do not need to do anything because adapter stopped. For these cases it's correct to pass NULL for fsf_req to _zfcp_dbf_scsi(). Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>