summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-03-04scsi: lpfc: Fix dropped FLOGI during pt2pt discovery recoveryJames Smart
When connected in pt2pt mode, there is a scenario where the remote port significantly delays sending a response to our FLOGI, but acts on the FLOGI it sent us and proceeds to PLOGI/PRLI. The FLOGI ends up timing out and kicks off recovery logic. End result is a lot of unnecessary state changes and lots of discovery messages being logged. Fix by terminating the FLOGI and noop'ing its completion if we have already accepted the remote ports FLOGI and are now processing PLOGI. Link: https://lore.kernel.org/r/20210301171821.3427-13-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix status returned in lpfc_els_retry() error exit pathJames Smart
An unlikely error exit path from lpfc_els_retry() returns incorrect status to a caller, erroneously indicating that a retry has been successfully issued or scheduled. Change error exit path to indicate no retry. Link: https://lore.kernel.org/r/20210301171821.3427-12-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix use after free in lpfc_els_free_iocbJames Smart
There are several code paths where the following sequence occurs: - An ndlp pointer is assigned to an iocb via a nlp_get() - An attempt is made to issue the iocb, but it fails - The failure case does a put on the ndlp then calls lpfc_els_free_iocb() The put may free the ndlp structure, but the els_free_iocb may reference the now-stale ndlp pointer and cause a crash. Fix by ensuring that the lpfc_els_free_iocb() occurs before the lpfc_nlp_put(). While fixing, refactor the code to better ensure this calling sequence. Link: https://lore.kernel.org/r/20210301171821.3427-11-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix null pointer dereference in lpfc_prep_els_iocb()James Smart
It is possible to call lpfc_issue_els_plogi() passing a did for which no matching ndlp is found. A call is then made to lpfc_prep_els_iocb() with a null pointer to a lpfc_nodelist structure resulting in a null pointer dereference. Fix by returning an error status if no valid ndlp is found. Fix up comments regarding ndlp reference counting. Link: https://lore.kernel.org/r/20210301171821.3427-10-jsmart2021@gmail.com Fixes: 4430f7fd09ec ("scsi: lpfc: Rework locations of ndlp reference taking") Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix unnecessary null check in lpfc_release_scsi_bufJames Smart
lpfc_fcp_io_cmd_wqe_cmpl() is intended to mirror lpfc_nvme_io_cmd_wqe_cmpl() for sli4 fcp completions. When the routine was added, lpfc_fcp_io_cmd_wqe_cmpl() included a null pointer check for phba. However, phba is definitely valid, being dereferenced by the calling routine and used later in the routine itself. Remove the unnecessary null check. Link: https://lore.kernel.org/r/20210301171821.3427-9-jsmart2021@gmail.com Fixes: 96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers") Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix pt2pt connection does not recover after LOGOJames Smart
On a pt2pt setup, between 2 initiators, if one side issues a a LOGO, there is no relogin attempt. The FC specs are grey in this area on which port (higher wwn or not) is to re-login. As there is no spec guidance, unconditionally re-PLOGI after the logout to ensure a login is re-established. Link: https://lore.kernel.org/r/20210301171821.3427-8-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix lpfc_els_retry() possible null pointer dereferenceJames Smart
Driver crashed in lpfc_debugfs_disc_trc() due to null ndlp pointer. In some calling cases, the ndlp is null and the did is looked up. Fix by using the local did variable that is set appropriately based on ndlp value. Link: https://lore.kernel.org/r/20210301171821.3427-7-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix FLOGI failure due to accessing a freed nodeJames Smart
After an initial successful FLOGI into the switch, if a subsequent FLOGI fails the driver crashed accessing a node struct. On FLOGI error, the flogi completion logic triggers the final dereference on the node structure without checking if it is registered with a backend. The devloss logic is triggered after node is freed leading to the access of freed node. Fix by adjusting the error path to not take the final dereferece if there is an outstanding transport registration. Let the transport devloss call remove the final reference. Link: https://lore.kernel.org/r/20210301171821.3427-6-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix stale node accesses on stale RRQ requestJames Smart
Whenever an RRQ needs to be triggered, the DID from the node structure and node pointer are stored in the RRQ data structure and the RRQ is scheduled for later transmission. However, at the point in time that the timer triggers, there's no validation on the node pointer. Reference counters may have freed the structure. Additionally the DID in the node may no longer be valid. Fix by not tracking the node pointer in the RRQ, only the DID. At the time of the timer expiration, look up the node with the did and if present, send the RRQ. If no node exists, no need to send the RRQ. Link: https://lore.kernel.org/r/20210301171821.3427-5-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix reftag generation sizing errorsJames Smart
An LBA is 8 bytes. The driver generates a reftag from the LBA but the reftag is 4 bytes. Thus scsi_get_lba() could return a value that exceeds our reftag size. Fix by converting all the code to calling the common routine t10_pi_ref_tag() which returns a u32, thus ensuring a consistent 4byte value. Also correct a few code lines that access LBA directly and ensure 64-bit data types are used. Link: https://lore.kernel.org/r/20210301171821.3427-4-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix vport indices in lpfc_find_vport_by_vpid()James Smart
Calls to lpfc_find_vport_by_vpid() for the highest indexed vport fails with error, "2936 Could not find Vport mapped to vpi XXX". Our vport indices in the loop and if-clauses were off by one. Correct the vpid range used for vpi lookup to include the highest possible vpid. Link: https://lore.kernel.org/r/20210301171821.3427-3-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix incorrect dbde assignment when building target abts wqeJames Smart
The wqe_dbde field indicates whether a Data BDE is present in Words 0:2 and should therefore should be clear in the abts request wqe. By setting the bit we can be misleading fw into error cases. Clear the wqe_dbde field. Link: https://lore.kernel.org/r/20210301171821.3427-2-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: scsi_debug: Fix cmd duration calculationDouglas Gilbert
In some cases, sdebug_defer::cmpl_ts (completion timestamp) wasn't being properly set when REQ_HIPRI was given. Fix that and improve code to only call ktime_get_boottime_ns() for commands with REQ_HIPRI set as cmpl_ts is only used in that case. Link: https://lore.kernel.org/r/20210304014107.307625-1-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Set shost as hctx driver_dataKashyap Desai
hctx->driver_data is not set for SCSI currently. Set hctx->driver_data = shost. Link: https://lore.kernel.org/r/20210215074048.19424-6-kashyap.desai@broadcom.com Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: scsi_debug: Add new defer type for mq_pollDouglas Gilbert
Add a new sdeb_defer_type enumeration: SDEB_DEFER_POLL for requests that have REQ_HIPRI set in cmd_flags field. It is expected that these requests will be polled via the mq_poll entry point which is driven by calls to blk_poll() in the block layer. Therefore timer events are not 'wired up' in the normal fashion. There are still cases with short delays (e.g. < 10 microseconds) where by the time the command response processing occurs, the delay is already exceeded in which case the code calls scsi_done() directly. In such cases there is no window for mq_poll() to be called. Add 'mq_polls' counter that increments on each scsi_done() called via the mq_poll entry point. Can be used to show (with 'cat /proc/scsi/scsi_debug/<host_id>') that blk_poll() is causing completions rather than some other mechanism. Link: https://lore.kernel.org/r/20210215074048.19424-5-kashyap.desai@broadcom.com Tested-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: scsi_debug: mq_poll supportKashyap Desai
Add support of the mq_poll interface to scsi_debug. This feature requires shared host tag support in kernel and driver. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210215074048.19424-4-kashyap.desai@broadcom.com Cc: dgilbert@interlog.com Cc: linux-block@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: megaraid_sas: mq_poll supportKashyap Desai
Implement mq_poll interface support in megaraid_sas. This feature requires shared host tag support in kernel and driver. The driver can work in non-IRQ mode which means there will not be any MSI-x vector associated for poll_queues. The MegaRAID hardware has a single submission queue and multiple reply queues. However, using the shared host tagset support will enable the driver to simulate multiple hardware queues. Change driver to allocate some extra reply queues which will be marked as poll_queues. These poll_queues will not have associated MSI-x vectors. All I/O completions on these queues will be done through the IOPOLL interface. megaraid_sas with 8 poll_queues and using the io_uring hiprio=1 setting can reach 3.2M IOPS with zero interrupts generated by the hardware. The IOPOLL feature can be enabled using module parameter poll_queues. Link: https://lore.kernel.org/r/20210215074048.19424-3-kashyap.desai@broadcom.com Cc: sumit.saxena@broadcom.com Cc: chandrakanth.patil@broadcom.com Cc: linux-block@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Add mq_poll support to SCSI layerKashyap Desai
Currently IOPOLL support is only available in block layer. This patch adds mq_poll support to the SCSI layer. Link: https://lore.kernel.org/r/20210215074048.19424-2-kashyap.desai@broadcom.com Cc: sumit.saxena@broadcom.com Cc: chandrakanth.patil@broadcom.com Cc: linux-block@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Make completion affinity configurableMike Christie
It may not always be best to complete the IO on same CPU as it was submitted on. This commit allows userspace to configure it. This has been useful for vhost-scsi where we have a single thread for submissions and completions. If we force the completion on the submission CPU we may be adding conflicts with what the user has setup in the lower levels with settings like the block layer rq_affinity or the driver's IRQ or softirq (the network's rps_cpus value) settings. We may also want to set it up where the vhost thread runs on CPU N and does its submissions/completions there, and then have LIO do its completion booking on CPU M, but can't configure the lower levels due to issues like using dm-multipath with lots of paths (the path selector can throw commands all over the system because it's only taking into account latency/throughput at its level). The new setting is in: /sys/kernel/config/target/$fabric/$target/param/cmd_completion_affinity Writing: -1 -> Gives the current default behavior of completing on the submission CPU. -2 -> Completes the cmd on the CPU the lower layers sent it to us from. > 0 -> Completes on the CPU userspace has specified. Link: https://lore.kernel.org/r/20210227170006.5077-26-michael.christie@oracle.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Flush submission work during TMR processingMike Christie
If a cmd is on the submission workqueue then the TMR code will miss it, and end up returning task not found or success for LUN resets. The fabric driver might then tell the initiator that the running cmds have been handled when they are about to run. This adds a flush when we are processing TMRs to make sure queued cmds do not run after returning the TMR response. Link: https://lore.kernel.org/r/20210227170006.5077-25-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: tcmu: Add backend plug/unplug calloutsMike Christie
This patch adds plug/unplug callouts for tcmu, so we can avoid the number of times we switch to userspace. Using this driver with tcm_loop is a common config, and dependng on the nr_hw_queues (nr_hw_queues=1 performs much better) and fio jobs (lower num jobs around 4) this patch can increase IOPS by only around 5-10% because we hit other issues like the big per tcmu device mutex. Link: https://lore.kernel.org/r/20210227170006.5077-24-michael.christie@oracle.com Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: iblock: Add backend plug/unplug calloutsMike Christie
This patch adds plug/unplug callouts for iblock. For an initiator driver like iSCSI which wants to pass multiple cmds to its xmit thread instead of one cmd at a time, this increases IOPS by around 10% with vhost-scsi (combined with the last patches we can see a total 40-50% increase). For driver combos like tcm_loop and faster drivers like the iSER initiator, we can still see IOPS increase by 20-30% when tcm_loop's nr_hw_queues setting is also increased. Link: https://lore.kernel.org/r/20210227170006.5077-23-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Fix backend pluggingMike Christie
target_core_iblock is plugging and unplugging on every command and this is causing perf issues for drivers that prefer batched cmds. With recent patches we can now take multiple cmds from a fabric driver queue and then pass them down the backend drivers in a batch. This patch adds this support by adding 2 callouts to the backend for plugging and unplugging the device. Subsequent commits will add support for iblock and tcmu device plugging. Link: https://lore.kernel.org/r/20210227170006.5077-22-michael.christie@oracle.com Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Cleanup cmd flag bitsMike Christie
We have a couple holes in the cmd flags definitions. This cleans up the definitions to fix that and make it easier to read. Link: https://lore.kernel.org/r/20210227170006.5077-21-michael.christie@oracle.com Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: tcm_loop: Use LIO wq cmd submission helperMike Christie
Convert loop to use the LIO wq cmd submission helper. Link: https://lore.kernel.org/r/20210227170006.5077-20-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: tcm_loop: Use block cmd allocator for se_cmdsMike Christie
Make tcm_loop use the block layer cmd allocator for se_cmds instead of using the tcm_loop_cmd_cache. In the future when we can use the host tags for internal requests like TMFs we can completely kill the tcm_loop_cmd_cache. Link: https://lore.kernel.org/r/20210227170006.5077-19-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: vhost-scsi: Use LIO wq cmd submission helperMike Christie
Convert vhost-scsi to use the LIO wq cmd submission helper. Link: https://lore.kernel.org/r/20210227170006.5077-18-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Add workqueue based cmd submissionMike Christie
loop and vhost/scsi do their target cmd submission from driver workqueues. This allows them to avoid an issue where the backend may block waiting for resources like tags/requests, mem/locks, etc and that ends up blocking their entire submission path and for the case of vhost-scsi both the submission and completion path. This patch adds a helper drivers can use to submit from a LIO workqueue. This code will then be extended in the next patches to fix the plugging of backend devices. We are only converting vhost/loop initially, but the workqueue based submission will work for other drivers and have similar benefits where the main target loops will not end up blocking one some backend resource. Link: https://lore.kernel.org/r/20210227170006.5077-17-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Add gfp_t arg to target_cmd_init_cdb()Mike Christie
tcm_loop could be used like a normal block device, so we can't use GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to target_cmd_init_cdb() and converts the users. For every driver but loop GFP_KERNEL is kept. This will also be useful in subsequent patches where loop needs to do target_submit_prep() from interrupt context to get a ref to the se_device, and so it will need to use GFP_ATOMIC. Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Remove target_submit_cmd_map_sgls()Mike Christie
Convert target_submit_cmd() to do its own calls and then remove target_submit_cmd_map_sgls() since no one uses it. Link: https://lore.kernel.org/r/20210227170006.5077-15-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: tcm_fc: Convert to new submission APIMike Christie
target_submit_cmd() is now only for simple drivers that do their own sync during shutdown and do not use target_stop_session(). tcm_fc uses target_stop_session() to sync session shutdown with LIO core, so we use target_init_cmd(), target_submit_prep(), target_submit(), because target_init_cmd() will now detect the target_stop_session() call and return an error. Link: https://lore.kernel.org/r/20210227170006.5077-14-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: xen-scsiback: Convert to new submission APIMike Christie
target_submit_cmd_map_sgls() is being removed, so convert xen to the new submission API. This has it use target_init_cmd(), target_submit_prep(), or target_submit() because we need to have LIO core map sgls which is now done in target_submit_prep(). target_init_cmd() will never fail for xen because it does its own sync during session shutdown, so we can remove that code. Note: xen never calls target_stop_session() so target_submit_cmd_map_sgls() never failed (in the new API target_init_cmd() handles target_stop_session() being called when cmds are being submitted). If it were to have used target_stop_session() and got an error, we would have hit a refcount bug like xen and usb, because it does: if (rc < 0) { transport_send_check_condition_and_sense(se_cmd, TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0); transport_generic_free_cmd(se_cmd, 0); } transport_send_check_condition_and_sense() calls queue_status which calls scsiback_cmd_done->target_put_sess_cmd. We do an extra transport_generic_free_cmd() call above which would have dropped the refcount to -1 and the refcount code would spit out errors. Link: https://lore.kernel.org/r/20210227170006.5077-13-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: vhost-scsi: Convert to new submission APIMike Christie
target_submit_cmd_map_sgls() is being removed, so convert vhost-scsi to the new submission API. This has it use target_init_cmd(), target_submit_prep(), target_submit() because we need to have LIO core map sgls which is now done in target_submit_prep(), and in the next patches we will do the target_submit step from the LIO workqueue. Note: vhost-scsi never calls target_stop_session() so target_submit_cmd_map_sgls() never failed (in the new API target_init_cmd() handles target_stop_session() being called when cmds are being submitted). If it were to have used target_stop_session() and got an error, we would have hit a refcount bug like xen and usb, because it does: if (rc < 0) { transport_send_check_condition_and_sense(se_cmd, TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0); transport_generic_free_cmd(se_cmd, 0); } transport_send_check_condition_and_sense() calls queue_status which does transport_generic_free_cmd(), and then we do an extra transport_generic_free_cmd() call above which would have dropped the refcount to -1 and the refcount code would spit out errors. Link: https://lore.kernel.org/r/20210227170006.5077-12-michael.christie@oracle.com Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: usb: gadget: Convert to new submission APIMike Christie
target_submit_cmd() is now only for simple drivers that do their own sync during shutdown and do not use target_stop_session(). It will never return a failure, so we can remove that code from the driver. Note: Before these patches target_submit_cmd() would never return an error for usb since it does not use target_stop_session(). If it did then we would have hit a refcount error here: transport_send_check_condition_and_sense(se_cmd, TCM_UNSUPPORTED_SCSI_OPCODE, 1); transport_generic_free_cmd(&cmd->se_cmd, 0); transport_send_check_condition_and_sense() calls queue_status and the driver can sometimes do transport_generic_free_cmd() from there via uasp_status_data_cmpl(). In that case, the above transport_generic_free_cmd() would then hit a refcount error. So that other use of the above error path in the driver is also probably wrong, but someone with the hardware needs to fix that. Link: https://lore.kernel.org/r/20210227170006.5077-11-michael.christie@oracle.com Cc: Felipe Balbi <balbi@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: sbp_target: Convert to new submission APIMike Christie
target_submit_cmd() is now only for simple drivers that do their own sync during shutdown and do not use target_stop_session(). It will never return a failure, so we can remove that code from the driver. Link: https://lore.kernel.org/r/20210227170006.5077-10-michael.christie@oracle.com Cc: Chris Boot <bootc@bootc.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: tcm_loop: Convert to new submission APIMike Christie
target_submit_cmd_map_sgls() is being removed, so convert loop to the new submission API. Even though loop does its own shutdown sync, this has loop use target_init_cmd()/target_submit_prep()/target_submit() since it needed to map sgls and in the next patches it will use the API to use LIO's workqueue. Link: https://lore.kernel.org/r/20210227170006.5077-9-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: qla2xxx: Convert to new submission APIMike Christie
target_submit_cmd() is now only for simple drivers that do their own sync during shutdown and do not use target_stop_session(). tcm_qla2xxx uses target_stop_session() to sync session shutdown with LIO core, so we use target_init_cmd()/target_submit_prep()/target_submit(), because target_init_cmd() will detect the target_stop_session() call and return an error. Link: https://lore.kernel.org/r/20210227170006.5077-8-michael.christie@oracle.com Cc: Nilesh Javali <njavali@marvell.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: ibmvscsi_tgt: Convert to new submission APIMike Christie
target_submit_cmd() is now only for simple drivers that do their own sync during shutdown and do not use target_stop_session(). It will never return a failure, so we can remove that code from the driver. Link: https://lore.kernel.org/r/20210227170006.5077-7-michael.christie@oracle.com Cc: Michael Cyr <mikecyr@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: srpt: Convert to new submission APIMike Christie
target_submit_cmd_map_sgls() is being removed, so convert srpt to the new submission API. srpt uses target_stop_session() to sync session shutdown with LIO core, so we use target_init_cmd()/target_submit_prep()/target_submit(), because target_init_cmd() will detect the target_stop_session() call and return an error. Link: https://lore.kernel.org/r/20210227170006.5077-6-michael.christie@oracle.com Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Break up target_submit_cmd_map_sgls()Mike Christie
This breaks up target_submit_cmd_map_sgls() into 3 helpers: - target_init_cmd(): Do the basic general setup and get a refcount to the session to make sure the caller can execute the cmd. - target_submit_prep(): Do the mapping, cdb processing and get a ref to the LUN. - target_submit(): Pass the cmd to LIO core for execution. The above functions must be used by drivers that either: 1. Rely on LIO for session shutdown synchronization by calling target_stop_session(). 2. Need to map sgls. When the next patches are applied then simple drivers that do not need the extra functionality above can use target_submit_cmd() and not worry about failures being returned and how to handle them, since many drivers were getting this wrong and would have hit refcount bugs. Also, by breaking target_submit_cmd_map_sgls() up into these 3 helper functions, we can allow the later patches to do the init/prep from interrupt context and then do the submission from a workqueue. Link: https://lore.kernel.org/r/20210227170006.5077-5-michael.christie@oracle.com Cc: Bart Van Assche <bvanassche@acm.org> Cc: Juergen Gross <jgross@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Nilesh Javali <njavali@marvell.com> Cc: Michael Cyr <mikecyr@linux.ibm.com> Cc: Chris Boot <bootc@bootc.net> Cc: Felipe Balbi <balbi@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Rename transport_init_se_cmd()Mike Christie
Rename transport_init_se_cmd() to __target_init_cmd() to reflect that it is more of an internal function that drivers should normally not use and because we are going to add a new init function in the next patches. Link: https://lore.kernel.org/r/20210227170006.5077-4-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Drop kref_get_unless_zero() in target_get_sess_cmd()Mike Christie
The kref_get_unless_zero() use in target_get_sess_cmd() was added in: commit 1b4c59b7a1d0 ("target: fix potential race window in target_sess_cmd_list_waiting()")' but it does not seem to do anything. The original patch might have thought we could have added the cmd to the sess_wait_list and then target_wait_for_sess_cmds could do a put before target_get_sess_cmd did its get. That wouldn't happen because we do the get first then grab the sess lock and put it on the list. It is also not needed now, because the sess_cmd_list does not exist anymore and we instead wait on the session cmd_count. The other problem with the commit is that several target_submit_cmd_map_sgls()/target_submit_cmd() callers do not handle the error case properly if it were to ever happen. These drivers think they have their normal refcount on the cmd and in many cases do a transport_generic_free_cmd() plus target_put_sess_cmd() so they would have fired off the refcount WARN/BUGs. This patch just changes the kref_get_unless_zero() to kref_get(). Link: https://lore.kernel.org/r/20210227170006.5077-3-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: target: core: Move t_task_cdb initializationMike Christie
Prepare to split target_submit_cmd_map_sgls() so the initialization and submission part can be called at different times. If the init part fails we can reference the t_task_cdb early in some of the logging and tracing code. Move it to transport_init_se_cmd() so we don't hit NULL pointer crashes. Link: https://lore.kernel.org/r/20210227170006.5077-2-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Replace sdev->device_busy with sbitmapMing Lei
SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Make sure sdev->queue_depth is <= max(shost->can_queue, 1024)Ming Lei
Limit SCSI device's queue depth to max(host->can_queue, 1024) in scsi_change_queue_depth(). 1024 is big enough for saturating current fast SCSI LUN(SSD or RAID volume on multiple SSDs). Also single hardware queue depth is usually enough for saturating single LUN because per-core performance is often considered in storage design. This patch is needed for replacing sdev->device_busy with sbitmap which has to be pre-allocated with reasonable max depth. Link: https://lore.kernel.org/r/20210122023317.687987-13-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Add scsi_device_busy() wrapperMing Lei
Add scsi_device_busy() helper to prepare drivers for tracking device queue depth via sbitmap_queue. Link: https://lore.kernel.org/r/20210122023317.687987-12-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: megaraid_sas: Replace sdev_busy with local counterKashyap Desai
Use local tracking of per-sdev outstanding command since sdev_busy in SCSI mid layer is improved for performance reason using sbitmap (earlier it was atomic variable). Link: https://lore.kernel.org/r/20210122023317.687987-11-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: core: Put hot fields of scsi_host_template in one cachelineMing Lei
The following three fields of scsi_host_template are referenced in the SCSI I/O submission hot path. Put them together in one cacheline: - cmd_size - queuecommand - commit_rqs Link: https://lore.kernel.org/r/20210122023317.687987-10-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: blk-mq: Return budget token from .get_budget callbackMing Lei
SCSI uses a global atomic variable to track queue depth for each LUN/request queue. This doesn't scale well when there are lots of CPU cores and the disk is very fast. It has been observed that IOPS is affected a lot by tracking queue depth via sdev->device_busy in the I/O path. Return budget token from .get_budget callback. The budget token can be passed to driver so that we can replace the atomic variable with sbitmap_queue and alleviate the scaling problems that way. Link: https://lore.kernel.org/r/20210122023317.687987-9-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: blk-mq: Add callbacks for storing & retrieving budget tokenMing Lei
Since SCSI is the only driver which requires dispatch budget move the token from struct request to struct scsi_cmnd. Link: https://lore.kernel.org/r/20210122023317.687987-8-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>