summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2019-02-15Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fairly small fixes: the qla one is a panic inducing use after free and the entropy fix may seem minor but it has had huge userspace impact thanks to an unrelated change in openssl that causes sshd to refuse logins until it has enough entropy for the session keys, which causes tens of minutes delay before the affected systems allow logins after reboot" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd scsi: sd: fix entropy gathering for most rotational disks
2019-02-12scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmdBill Kuzeja
In qla2x00_async_tm_cmd, we reference off sp after it has been freed. This caused a panic on a system running a slub debug kernel. Since fcport is passed in anyways, just use that instead. Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Giridhar Malavali <gmalavali@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12scsi: sd: fix entropy gathering for most rotational disksJames Bottomley
The problem is that the default for MQ is not to gather entropy, whereas the default for the legacy queue was always to gather it. The original attempt to fix entropy gathering for rotational disks under MQ added an else branch in sd_read_block_characteristics(). Unfortunately, the entire check isn't reached if the device has no characteristics VPD page. Since this page was only introduced in SBC-3 and its optional anyway, most less expensive rotational disks don't have one, meaning they all stopped gathering entropy when we made MQ the default. In a wholly unrelated change, openssl and openssh won't function until the random number generator is initialised, meaning lots of people have been seeing large delays before they could log into systems with default MQ kernels due to this lack of entropy, because it now can take tens of minutes to initialise the kernel random number generator. The fix is to set the non-rotational and add-randomness flags unconditionally early on in the disk initialization path, so they can be reset only if the device actually reports being non-rotational via the VPD page. Reported-by: Mikael Pettersson <mikpelinux@gmail.com> Fixes: 83e32a591077 ("scsi: sd: Contribute to randomness when running rotational device") Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Xuewei Zhang <xueweiz@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of five minor fixes (although, tecnhincally, the aicxxx fix is for a major problem in that the driver won't load without it, but I think the fact it's taken us since 4.10 to discover this indicates that the user base for these things has declined)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: cxlflash: Prevent deadlock when adapter probe fails Revert "scsi: libfc: Add WARN_ON() when deleting rports" scsi: sd_zbc: Fix zone information messages scsi: target: make the pi_prot_format ConfigFS path readable scsi: aic94xx: fix module loading
2019-02-04scsi: cxlflash: Prevent deadlock when adapter probe failsVaibhav Jain
Presently when an error is encountered during probe of the cxlflash adapter, a deadlock is seen with cpu thread stuck inside cxlflash_remove(). Below is the trace of the deadlock as logged by khungtaskd: cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16 INFO: task kworker/80:1:890 blocked for more than 120 seconds. Not tainted 5.0.0-rc4-capi2-kexec+ #2 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/80:1 D 0 890 2 0x00000808 Workqueue: events work_for_cpu_fn Call Trace: 0x4d72136320 (unreliable) __switch_to+0x2cc/0x460 __schedule+0x2bc/0xac0 schedule+0x40/0xb0 cxlflash_remove+0xec/0x640 [cxlflash] cxlflash_probe+0x370/0x8f0 [cxlflash] local_pci_probe+0x6c/0x140 work_for_cpu_fn+0x38/0x60 process_one_work+0x260/0x530 worker_thread+0x280/0x5d0 kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x80 INFO: task systemd-udevd:5160 blocked for more than 120 seconds. The deadlock occurs as cxlflash_remove() is called from cxlflash_probe() without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never successfully probed the 'cxlflash_cfg->state' never changes from STATE_PROBING hence the deadlock occurs. We fix this deadlock by setting the variable 'cxlflash_cfg->state' to STATE_PROBED in case an error occurs during cxlflash_probe() and just before calling cxlflash_remove(). Cc: stable@vger.kernel.org Fixes: c21e0bbfc485("cxlflash: Base support for IBM CXL Flash Adapter") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-04Revert "scsi: libfc: Add WARN_ON() when deleting rports"Ross Lagerwall
This reverts commit bbc0f8bd88abefb0f27998f40a073634a3a2db89. It added a warning whose intent was to check whether the rport was still linked into the peer list. It doesn't work as intended and gives false positive warnings for two reasons: 1) If the rport is never linked into the peer list it will not be considered empty since the list_head is never initialized. 2) If the rport is deleted from the peer list using list_del_rcu(), then the list_head is in an undefined state and it is not considered empty. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-04scsi: sd_zbc: Fix zone information messagesDamien Le Moal
Commit bf5054569653 ("block: Introduce blk_revalidate_disk_zones()") inadvertently broke the message output of sd_zbc_print_zones() because the zone information initialization of the scsi disk structure was moved to the second scan run while sd_zbc_print_zones() is called on the first scan. This leads to the following incorrect message to be printed for any ZBC or ZAC zoned disks. "...[sdX] 4294967295 zones of 0 logical blocks + 1 runt zone" Fix this by initializing sdkp zone size and number of zones early on the first scan. This does not impact the execution of blk_revalidate_zones(). This functions is still called only once the block device capacity is set on the second revalidate run on boot, or if the disk zone configuration changed (i.e. the disk changed). Fixes: bf5054569653 ("block: Introduce blk_revalidate_disk_zones()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-02Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five minor bug fixes. The libfc one is a tiny memory leak, the zfcp one is an incorrect user visible parameter and the rest are on error legs or obscure features" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: 53c700: pass correct "dev" to dma_alloc_attrs() scsi: bnx2fc: Fix error handling in probe() scsi: scsi_debug: fix write_same with virtual_gb problem scsi: libfc: free skb when receiving invalid flogi resp scsi: zfcp: fix sysfs block queue limit output for max_segment_size
2019-02-01scsi: aic94xx: fix module loadingJames Bottomley
The aic94xx driver is currently failing to load with errors like sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision' Because the PCI code had recently added a file named 'revision' to every PCI device. Fix this by renaming the aic94xx revision file to aic_revision. This is safe to do for us because as far as I can tell, there's nothing in userspace relying on the current aic94xx revision file so it can be renamed without breaking anything. Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs) Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29scsi: 53c700: pass correct "dev" to dma_alloc_attrs()Dan Carpenter
The "hostdata->dev" pointer is NULL here. We set "hostdata->dev = dev;" later in the function and we also use "hostdata->dev" when we call dma_free_attrs() in NCR_700_release(). This bug predates git version control. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29scsi: bnx2fc: Fix error handling in probe()Dan Carpenter
There are two issues here. First if cmgr->hba is not set early enough then it leads to a NULL dereference. Second if we don't completely initialize cmgr->io_bdt_pool[] then we end up dereferencing uninitialized pointers. Fixes: 853e2bd2103a ("[SCSI] bnx2fc: Broadcom FCoE offload driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29scsi: scsi_debug: fix write_same with virtual_gb problemDouglas Gilbert
The WRITE SAME(10) and (16) implementations didn't take account of the buffer wrap required when the virtual_gb parameter is greater than 0. Fix that and rename the fake_store() function to lba2fake_store() to lessen confusion with the global fake_storep pointer. Bump version date. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reported-by: Bart Van Assche <bvanassche@acm.org> Tested by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29scsi: libfc: free skb when receiving invalid flogi respMing Lu
The issue to be fixed in this commit is when libfc found it received a invalid FLOGI response from FC switch, it would return without freeing the fc frame, which is just the skb data. This would cause memory leak if FC switch keeps sending invalid FLOGI responses. This fix is just to make it execute `fc_frame_free(fp)` before returning from function `fc_lport_flogi_resp`. Signed-off-by: Ming Lu <ming.lu@citrix.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-26Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six fixes, all of which appear to have user visible consequences. The DMA one is a regression fix from the merge window and of the others, four are driver specific and one specific to the target code" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: Use explicit access size in ufshcd_dump_regs scsi: tcmu: fix use after free scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport scsi: communicate max segment size to the DMA mapping code
2019-01-22scsi: ufs: Use explicit access size in ufshcd_dump_regsMarc Gonzalez
memcpy_fromio() doesn't provide any control over access size. For example, on arm64, it is implemented using readb and readq. This may trigger a synchronous external abort: [ 3.729943] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP [ 3.737000] Modules linked in: [ 3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S 4.20.0-rc4 #16 [ 3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT) [ 3.755295] pstate: 00000005 (nzcv daif -PAN -UAO) [ 3.761978] pc : __memcpy_fromio+0x68/0x80 [ 3.766718] lr : ufshcd_dump_regs+0x50/0xb0 [ 3.770767] sp : ffff00000807ba00 [ 3.774830] x29: ffff00000807ba00 x28: 00000000fffffffb [ 3.778344] x27: ffff0000089db068 x26: ffff8000f6e58000 [ 3.783728] x25: 000000000000000e x24: 0000000000000800 [ 3.789023] x23: ffff8000f6e587c8 x22: 0000000000000800 [ 3.794319] x21: ffff000008908368 x20: ffff8000f6e1ab80 [ 3.799615] x19: 000000000000006c x18: ffffffffffffffff [ 3.804910] x17: 0000000000000000 x16: 0000000000000000 [ 3.810206] x15: ffff000009199648 x14: ffff000089244187 [ 3.815502] x13: ffff000009244195 x12: ffff0000091ab000 [ 3.820797] x11: 0000000005f5e0ff x10: ffff0000091998a0 [ 3.826093] x9 : 0000000000000000 x8 : ffff8000f6e1ac00 [ 3.831389] x7 : 0000000000000000 x6 : 0000000000000068 [ 3.836676] x5 : ffff8000f6e1abe8 x4 : 0000000000000000 [ 3.841971] x3 : ffff00000928c868 x2 : ffff8000f6e1abec [ 3.847267] x1 : ffff00000928c868 x0 : ffff8000f6e1abe8 [ 3.852567] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) [ 3.857900] Call trace: [ 3.864473] __memcpy_fromio+0x68/0x80 [ 3.866683] ufs_qcom_dump_dbg_regs+0x1c0/0x370 [ 3.870522] ufshcd_print_host_regs+0x168/0x190 [ 3.874946] ufshcd_init+0xd4c/0xde0 [ 3.879459] ufshcd_pltfrm_init+0x3c8/0x550 [ 3.883264] ufs_qcom_probe+0x24/0x60 [ 3.887188] platform_drv_probe+0x50/0xa0 Assuming aligned 32-bit registers, let's use readl, after making sure that 'offset' and 'len' are indeed multiples of 4. Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use memcpy_fromio") Cc: <stable@vger.kernel.org> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Acked-by: Tomas Winkler <tomas.winkler@intel.com> Reviewed-by: Jeffrey Hugo <jhugo@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Tested-by: Evan Green <evgreen@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-22scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()Varun Prakash
Assign fc_vport to ln->fc_vport before calling csio_fcoe_alloc_vnp() to avoid a NULL pointer dereference in csio_vport_set_state(). ln->fc_vport is dereferenced in csio_vport_set_state(). Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-22scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetportEwan D. Milne
We cannot wait on a completion object in the lpfc_nvme_targetport structure in the _destroy_targetport() code path because the NVMe/fc transport will free that structure immediately after the .targetport_delete() callback. This results in a use-after-free, and a hang if slub_debug=FZPU is enabled. Fix this by putting the completion on the stack. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-22scsi: lpfc: nvme: avoid hang / use-after-free when destroying localportEwan D. Milne
We cannot wait on a completion object in the lpfc_nvme_lport structure in the _destroy_localport() code path because the NVMe/fc transport will free that structure immediately after the .localport_delete() callback. This results in a use-after-free, and a hang if slub_debug=FZPU is enabled. Fix this by putting the completion on the stack. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-22scsi: communicate max segment size to the DMA mapping codeChristoph Hellwig
When a host driver sets a maximum segment size we should not only propagate that setting to the block layer, which can merge segments, but also to the DMA mapping layer which can merge segments as well. Fixes: 50c2e9107f ("scsi: introduce a max_segment_size host_template parameters") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-20Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "A set of 17 fixes. Most of these are minor or trivial. The one fix that may be serious is the isci one: the bug can cause hba parameters to be set from uninitialized memory. I don't think it's exploitable, but you never know" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: cxgb4i: add wait_for_completion() scsi: qla1280: set 64bit coherent mask scsi: ufs: Fix geometry descriptor size scsi: megaraid_sas: Retry reads of outbound_intr_status reg scsi: qedi: Add ep_state for login completion on un-reachable targets scsi: ufs: Fix system suspend status scsi: qla2xxx: Use correct number of vectors for online CPUs scsi: hisi_sas: Set protection parameters prior to adding SCSI host scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes scsi: isci: initialize shost fully before calling scsi_add_host() scsi: lpfc: lpfc_sli: Mark expected switch fall-throughs scsi: smartpqi_init: fix boolean expression in pqi_device_remove_start scsi: core: Synchronize request queue PM status only on successful resume scsi: pm80xx: reduce indentation scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param scsi: megaraid_sas: correct an info message scsi: target/iscsi: fix error msg typo when create lio_qr_cache failed scsi: sd: Fix cache_type_store()
2019-01-11scsi: cxgb4i: add wait_for_completion()Varun Prakash
In case of ->set_param() and ->bind_conn() cxgb4i driver does not wait for cmd completion, this can create race conditions, to avoid this add wait_for_completion(). Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: qla1280: set 64bit coherent maskThomas Bogendoerfer
After Commit 54aed4dd3526 ("MIPS: IP27: use dma_direct_ops") qla1280 driver failed on SGI IP27 machines with qla1280: QLA1040 found on PCI bus 0, dev 0 qla1280 0000:00:00.0: enabling device (0006 -> 0007) qla1280: Failed to get request memory qla1280: probe of 0000:00:00.0 failed with error -12 Reason is that SGI IP27 always generates 64bit DMA addresses and has no fallback mode for 32bit DMA addresses implemented. QLA1280 supports 64bit addressing for all DMA accesses so setting coherent mask to 64bit fixes the issue. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: ufs: Fix geometry descriptor sizeAvri Altman
Albeit we no longer rely on those hard-coded descriptor sizes, we still use them as our defaults, so better get it right. While adding its sysfs entries, we forgot to update the geometry descriptor size. It is 0x48 according to UFS2.1, and wasn't changed in UFS3.0. [mkp: typo] Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor) Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: megaraid_sas: Retry reads of outbound_intr_status regShivasharan S
commit 272652fcbf1a ("scsi: megaraid_sas: add retry logic in megasas_readl") missed changing readl to megasas_readl in megasas_clear_intr_fusion(). For Aero controllers, reads of outbound_intr_status register needs to be retried. Reported-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: qedi: Add ep_state for login completion on un-reachable targetsManish Rangankar
When the driver finds invalid destination MAC for the first un-reachable target, and before completes the PATH_REQ operation, set new ep_state to OFFLDCONN_NONE so that as part of driver ep_poll mechanism, the upper open-iscsi layer is notified to complete the login process on the first un-reachable target and thus proceed login to other reachable targets. Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: ufs: Fix system suspend statusStanley Chu
hba->is_sys_suspended is set after successful system suspend but not clear after successful system resume. According to current behavior, hba->is_sys_suspended will not be set if host is runtime-suspended but not system-suspended. Thus we shall aligh the same policy: clear this flag even if host remains runtime-suspended after ufshcd_system_resume is successfully returned. Simply fix this flag to correct host status logs. Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: qla2xxx: Use correct number of vectors for online CPUsMing Lei
When SCSI-MQ is enabled, in some case system would present nr_possible_cpus() which is greater than requested vectors by the driver. This results into driver being able to get larger number of MSI-X vectors than actual online CPUs. Driver then uses pci_alloc_irq_vectors_affinity() to assign 1:1 mapping and affinity for each MSI-x vector to CPUs. When the command is submitted using MSI-x vector, assigned to offline CPU, it results in an ABTS and system hang. This hang is result of a driver not being able to process interrupt on a vector assigned to an Off-line CPUs This patch fixes this issue by setting irq_offset value for the blk_mq_pci_map_queues() to use only those CPUs which has CPU mask affinity assigned and are online. By using the irq_offset value, driver will allow online cpumask to decide which vectors are used in blk_mq_pci_map_queues(). Fixes: 5601236b6f794 ("scsi: qla2xxx: Add Block Multi Queue functionality.") Cc: <stable@vger.kernel.org> #4.19 Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Tested-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-11scsi: hisi_sas: Set protection parameters prior to adding SCSI hostJohn Garry
Currently we set the protection parameters after calling scsi_add_host() for v3 hw. They should be set beforehand, so make this change. Appearantly this fixes our DIX issue (not mainline yet) also, but more testing required. Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: isci: initialize shost fully before calling scsi_add_host()Logan Gunthorpe
scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates the command size to allocate based on the prot_capabilities. In the isci driver, scsi_host_set_prot() is called after scsi_add_host() so the command size gets calculated to be smaller than it needs to be. Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command assuming it was sized correctly and a buffer overrun may occur. However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line size, the mistake can go unnoticed. The bug was noticed after the struct request size was reduced by commit 9d037ad707ed ("block: remove req->timeout_list") Which likely reduced the allocated space for the request by an entire cache line, enough that the overflow could be hit and it caused a panic, on boot, at: RIP: 0010:t10_pi_complete+0x77/0x1c0 Call Trace: <IRQ> sd_done+0xf5/0x340 scsi_finish_command+0xc3/0x120 blk_done_softirq+0x83/0xb0 __do_softirq+0xa1/0x2e6 irq_exit+0xbc/0xd0 call_function_single_interrupt+0xf/0x20 </IRQ> sd_done() would call scsi_prot_sg_count() which reads the number of entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of the allocated space it reads a garbage number and erroneously calls t10_pi_complete(). To prevent this, the calls to scsi_host_set_prot() are moved into isci_host_alloc() before the call to scsi_add_host(). Out of caution, also move the similar call to scsi_host_set_guard(). Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support") Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Intel SCU Linux support <intel-linux-scu@intel.com> Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: lpfc: lpfc_sli: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that, in this particular case, I replaced "Drop thru" and "Fall Thru" with "fall through" annotations, which is what GCC is expecting to find. Also, in some cases a dash is added as a token in order to separate the "fall through" annotation from the rest of the comment on the same line, which is what GCC is expecting to find. Addresses-Coverity-ID: 114979 ("Missing break in switch") Addresses-Coverity-ID: 114980 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: smartpqi_init: fix boolean expression in pqi_device_remove_startGustavo A. R. Silva
Fix boolean expression by using logical AND operator '&&' instead of bitwise operator '&'. This issue was detected with the help of Coccinelle. Fixes: 1e46731efd9c ("scsi: smartpqi: check for null device pointers") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: core: Synchronize request queue PM status only on successful resumeStanley Chu
The commit 356fd2663cff ("scsi: Set request queue runtime PM status back to active on resume") fixed up the inconsistent RPM status between request queue and device. However changing request queue RPM status shall be done only on successful resume, otherwise status may be still inconsistent as below, Request queue: RPM_ACTIVE Device: RPM_SUSPENDED This ends up soft lockup because requests can be submitted to underlying devices but those devices and their required resource are not resumed. For example, After above inconsistent status happens, IO request can be submitted to UFS device driver but required resource (like clock) is not resumed yet thus lead to warning as below call stack, WARN_ON(hba->clk_gating.state != CLKS_ON); ufshcd_queuecommand scsi_dispatch_cmd scsi_request_fn __blk_run_queue cfq_insert_request __elv_add_request blk_flush_plug_list blk_finish_plug jbd2_journal_commit_transaction kjournald2 We may see all behind IO requests hang because of no response from storage host or device and then soft lockup happens in system. In the end, system may crash in many ways. Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume) Cc: stable@vger.kernel.org Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: pm80xx: reduce indentationJulia Lawall
Delete tab aligning a statement with the right hand side of a preceding assignment rather than the left hand side. Found with the help of Coccinelle. [mkp: added space] Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_paramYueHaibing
The return code should be check while qla4xxx_copy_from_fwddb_param fails. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: megaraid_sas: correct an info messageTomas Henzl
This was apparently forgotten in 894169db1 ("scsi: megaraid_sas: Use 63-bit DMA addressing"). Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08scsi: sd: Fix cache_type_store()Ivan Mironov
Changing of caching mode via /sys/devices/.../scsi_disk/.../cache_type may fail if device responds to MODE SENSE command with DPOFUA flag set, and then checks this flag to be not set on MODE SELECT command. In this scenario, when trying to change cache_type, write always fails: # echo "none" >cache_type bash: echo: write error: Invalid argument And following appears in dmesg: [13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current] [13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC PARAMETER field in the mode parameter header: ... The write protect (WP) bit for mode data sent with a MODE SELECT command shall be ignored by the device server. ... The DPOFUA bit is reserved for mode data sent with a MODE SELECT command. ... The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved and shall be set to zero. [mkp: shuffled commentary to commit description] Cc: stable@vger.kernel.org Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-03Remove 'type' argument from access_ok() functionLinus Torvalds
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits) scsi: isci: request: mark expected switch fall-through scsi: isci: remote_node_context: mark expected switch fall-throughs scsi: isci: remote_device: Mark expected switch fall-throughs scsi: isci: phy: Mark expected switch fall-through scsi: iscsi: Capture iscsi debug messages using tracepoints scsi: myrb: Mark expected switch fall-throughs scsi: megaraid: fix out-of-bound array accesses scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through scsi: fcoe: remove set but not used variable 'port' scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: smartpqi: fix build warnings scsi: smartpqi: update driver version scsi: smartpqi: add ofa support scsi: smartpqi: increase fw status register read timeout scsi: smartpqi: bump driver version scsi: smartpqi: add smp_utils support scsi: smartpqi: correct lun reset issues scsi: smartpqi: correct volume status scsi: smartpqi: do not offline disks for transient did no connect conditions scsi: smartpqi: allow for larger raid maps ...
2018-12-28Merge tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: "This is the main pull request for block/storage for 4.21. Larger than usual, it was a busy round with lots of goodies queued up. Most notable is the removal of the old IO stack, which has been a long time coming. No new features for a while, everything coming in this week has all been fixes for things that were previously merged. This contains: - Use atomic counters instead of semaphores for mtip32xx (Arnd) - Cleanup of the mtip32xx request setup (Christoph) - Fix for circular locking dependency in loop (Jan, Tetsuo) - bcache (Coly, Guoju, Shenghui) * Optimizations for writeback caching * Various fixes and improvements - nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith) * host and target support for NVMe over TCP * Error log page support * Support for separate read/write/poll queues * Much improved polling * discard OOM fallback * Tracepoint improvements - lightnvm (Hans, Hua, Igor, Matias, Javier) * Igor added packed metadata to pblk. Now drives without metadata per LBA can be used as well. * Fix from Geert on uninitialized value on chunk metadata reads. * Fixes from Hans and Javier to pblk recovery and write path. * Fix from Hua Su to fix a race condition in the pblk recovery code. * Scan optimization added to pblk recovery from Zhoujie. * Small geometry cleanup from me. - Conversion of the last few drivers that used the legacy path to blk-mq (me) - Removal of legacy IO path in SCSI (me, Christoph) - Removal of legacy IO stack and schedulers (me) - Support for much better polling, now without interrupts at all. blk-mq adds support for multiple queue maps, which enables us to have a map per type. This in turn enables nvme to have separate completion queues for polling, which can then be interrupt-less. Also means we're ready for async polled IO, which is hopefully coming in the next release. - Killing of (now) unused block exports (Christoph) - Unification of the blk-rq-qos and blk-wbt wait handling (Josef) - Support for zoned testing with null_blk (Masato) - sx8 conversion to per-host tag sets (Christoph) - IO priority improvements (Damien) - mq-deadline zoned fix (Damien) - Ref count blkcg series (Dennis) - Lots of blk-mq improvements and speedups (me) - sbitmap scalability improvements (me) - Make core inflight IO accounting per-cpu (Mikulas) - Export timeout setting in sysfs (Weiping) - Cleanup the direct issue path (Jianchao) - Export blk-wbt internals in block debugfs for easier debugging (Ming) - Lots of other fixes and improvements" * tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits) kyber: use sbitmap add_wait_queue/list_del wait helpers sbitmap: add helpers for add/del wait queue handling block: save irq state in blkg_lookup_create() dm: don't reuse bio for flushes nvme-pci: trace SQ status on completions nvme-rdma: implement polling queue map nvme-fabrics: allow user to pass in nr_poll_queues nvme-fabrics: allow nvmf_connect_io_queue to poll nvme-core: optionally poll sync commands block: make request_to_qc_t public nvme-tcp: fix spelling mistake "attepmpt" -> "attempt" nvme-tcp: fix endianess annotations nvmet-tcp: fix endianess annotations nvme-pci: refactor nvme_poll_irqdisable to make sparse happy nvme-pci: only set nr_maps to 2 if poll queues are supported nvmet: use a macro for default error location nvmet: fix comparison of a u16 with -1 blk-mq: enable IO poll if .nr_queues of type poll > 0 blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight() blk-mq: skip zero-queue maps in blk_mq_map_swqueue ...
2018-12-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) New ipset extensions for matching on destination MAC addresses, from Stefano Brivio. 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to nfp driver. From Stefano Brivio. 3) Implement GRO for plain UDP sockets, from Paolo Abeni. 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT bit so that we could support the entire vlan_tci value. 5) Rework the IPSEC policy lookups to better optimize more usecases, from Florian Westphal. 6) Infrastructure changes eliminating direct manipulation of SKB lists wherever possible, and to always use the appropriate SKB list helpers. This work is still ongoing... 7) Lots of PHY driver and state machine improvements and simplifications, from Heiner Kallweit. 8) Various TSO deferral refinements, from Eric Dumazet. 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov. 10) Batch dropping of XDP packets in tuntap, from Jason Wang. 11) Lots of cleanups and improvements to the r8169 driver from Heiner Kallweit, including support for ->xmit_more. This driver has been getting some much needed love since he started working on it. 12) Lots of new forwarding selftests from Petr Machata. 13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel. 14) Packed ring support for virtio, from Tiwei Bie. 15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov. 16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu. 17) Implement coalescing on TCP backlog queue, from Eric Dumazet. 18) Implement carrier change in tun driver, from Nicolas Dichtel. 19) Support msg_zerocopy in UDP, from Willem de Bruijn. 20) Significantly improve garbage collection of neighbor objects when the table has many PERMANENT entries, from David Ahern. 21) Remove egdev usage from nfp and mlx5, and remove the facility completely from the tree as it no longer has any users. From Oz Shlomo and others. 22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and therefore abort the operation before the commit phase (which is the NETDEV_CHANGEADDR event). From Petr Machata. 23) Add indirect call wrappers to avoid retpoline overhead, and use them in the GRO code paths. From Paolo Abeni. 24) Add support for netlink FDB get operations, from Roopa Prabhu. 25) Support bloom filter in mlxsw driver, from Nir Dotan. 26) Add SKB extension infrastructure. This consolidates the handling of the auxiliary SKB data used by IPSEC and bridge netfilter, and is designed to support the needs to MPTCP which could be integrated in the future. 27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits) net: dccp: fix kernel crash on module load drivers/net: appletalk/cops: remove redundant if statement and mask bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw net/net_namespace: Check the return value of register_pernet_subsys() net/netlink_compat: Fix a missing check of nla_parse_nested ieee802154: lowpan_header_create check must check daddr net/mlx4_core: drop useless LIST_HEAD mlxsw: spectrum: drop useless LIST_HEAD net/mlx5e: drop useless LIST_HEAD iptunnel: Set tun_flags in the iptunnel_metadata_reply from src net/mlx5e: fix semicolon.cocci warnings staging: octeon: fix build failure with XFRM enabled net: Revert recent Spectre-v1 patches. can: af_can: Fix Spectre v1 vulnerability packet: validate address length if non-zero nfc: af_nfc: Fix Spectre v1 vulnerability phonet: af_phonet: Fix Spectre v1 vulnerability net: core: Fix Spectre v1 vulnerability net: minor cleanup in skb_ext_add() net: drop the unused helper skb_ext_get() ...
2018-12-22Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is two simple target fixes and one discard related I/O starvation problem in sd. The discard problem occurs because the discard page doesn't have a mempool backing so if the allocation fails due to memory pressure, we then lose the forward progress we require if the writeout is on the same device. The fix is to back it with a mempool" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: use mempool for discard special page scsi: target: iscsi: cxgbit: add missing spin_lock_init() scsi: target: iscsi: cxgbit: fix csk leak
2018-12-20scsi: isci: request: mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that, in this particular case, a dash is added as a token in order to separate the "Fall through" annotation from the rest of the comment on the same line, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: isci: remote_node_context: mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that, in this particular case, a dash is added as a token in order to separate the "Fall through" annotations from the rest of the comment on the same line, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: isci: remote_device: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that, in this particular case, a dash is added as a token in order to separate the "fall through" annotations from the rest of the comment on the same line, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: isci: phy: Mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 703127 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: iscsi: Capture iscsi debug messages using tracepointsFred Herard
This commit enhances iscsi initiator modules to capture iscsi debug messages using linux kernel tracepoint facility: https://www.kernel.org/doc/Documentation/trace/tracepoints.txt The following tracepoint events have been created under the iscsi tracepoint event group: iscsi_dbg_conn - to capture connection debug messages (libiscsi module) iscsi_dbg_session - to capture session debug messages (libiscsi module) iscsi_dbg_eh - to capture error handling debug messages (libiscsi module) iscsi_dbg_tcp - to capture iscsi tcp debug messages (libiscsi_tcp module) iscsi_dbg_sw_tcp - to capture iscsi sw tcp debug messages (iscsi_tcp module) iscsi_dbg_trans_session - to cpature iscsi transsport sess debug messages (scsi_transport_iscsi module) iscsi_dbg_trans_conn - to capture iscsi transport conn debug messages (scsi_transport_iscsi module) [mkp: typos] Signed-off-by: Fred Herard <fred.herard@oracle.com> Reviewed-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: myrb: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1465234 ("Missing break in switch") Addresses-Coverity-ID: 1465238 ("Missing break in switch") Addresses-Coverity-ID: 1465242 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: megaraid: fix out-of-bound array accessesQian Cai
UBSAN reported those with MegaRAID SAS-3 3108, [ 77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32 [ 77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]' [ 77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1 [ 77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [ 77.495791] Workqueue: events work_for_cpu_fn [ 77.500154] Call trace: [ 77.502610] dump_backtrace+0x0/0x2c8 [ 77.506279] show_stack+0x24/0x30 [ 77.509604] dump_stack+0x118/0x19c [ 77.513098] ubsan_epilogue+0x14/0x60 [ 77.516765] __ubsan_handle_out_of_bounds+0xfc/0x13c [ 77.521767] mr_update_load_balance_params+0x150/0x158 [megaraid_sas] [ 77.528230] MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas] [ 77.533825] megasas_get_map_info+0x244/0x2f0 [megaraid_sas] [ 77.539505] megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas] [ 77.545794] megasas_init_fw+0x1ab4/0x3518 [megaraid_sas] [ 77.551212] megasas_probe_one+0x2c4/0xbe0 [megaraid_sas] [ 77.556614] local_pci_probe+0x7c/0xf0 [ 77.560365] work_for_cpu_fn+0x34/0x50 [ 77.564118] process_one_work+0x61c/0xf08 [ 77.568129] worker_thread+0x534/0xa70 [ 77.571882] kthread+0x1c8/0x1d0 [ 77.575114] ret_from_fork+0x10/0x1c [ 89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32 [ 89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]' [ 89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1 [ 89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [ 89.268903] Workqueue: events_unbound async_run_entry_fn [ 89.274222] Call trace: [ 89.276680] dump_backtrace+0x0/0x2c8 [ 89.280348] show_stack+0x24/0x30 [ 89.283671] dump_stack+0x118/0x19c [ 89.287167] ubsan_epilogue+0x14/0x60 [ 89.290835] __ubsan_handle_out_of_bounds+0xfc/0x13c [ 89.295828] MR_LdRaidGet+0x50/0x58 [megaraid_sas] [ 89.300638] megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas] [ 89.306576] megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas] [ 89.313468] megasas_queue_command+0x398/0x3d0 [megaraid_sas] [ 89.319222] scsi_dispatch_cmd+0x1dc/0x8a8 [ 89.323321] scsi_request_fn+0x8e8/0xdd0 [ 89.327249] __blk_run_queue+0xc4/0x158 [ 89.331090] blk_execute_rq_nowait+0xf4/0x158 [ 89.335449] blk_execute_rq+0xdc/0x158 [ 89.339202] __scsi_execute+0x130/0x258 [ 89.343041] scsi_probe_and_add_lun+0x2fc/0x1488 [ 89.347661] __scsi_scan_target+0x1cc/0x8c8 [ 89.351848] scsi_scan_channel.part.3+0x8c/0xc0 [ 89.356382] scsi_scan_host_selected+0x130/0x1f0 [ 89.361002] do_scsi_scan_host+0xd8/0xf0 [ 89.364927] do_scan_async+0x9c/0x320 [ 89.368594] async_run_entry_fn+0x138/0x420 [ 89.372780] process_one_work+0x61c/0xf08 [ 89.376793] worker_thread+0x13c/0xa70 [ 89.380546] kthread+0x1c8/0x1d0 [ 89.383778] ret_from_fork+0x10/0x1c This is because when populating Driver Map using firmware raid map, all non-existing VDs set their ldTgtIdToLd to 0xff, so it can be skipped later. From drivers/scsi/megaraid/megaraid_sas_base.c , memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS); From drivers/scsi/megaraid/megaraid_sas_fp.c , /* For non existing VDs, iterate to next VD*/ if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1)) continue; However, there are a few places that failed to skip those non-existing VDs due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff, map) and triggered the out-of-bound accesses. Fixes: 51087a8617fe ("megaraid_sas : Extended VD support") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-20scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1475400 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>