summaryrefslogtreecommitdiff
path: root/drivers/scsi/megaraid/megaraid_sas_fusion.h
AgeCommit message (Collapse)Author
2021-03-04scsi: megaraid_sas: mq_poll supportKashyap Desai
Implement mq_poll interface support in megaraid_sas. This feature requires shared host tag support in kernel and driver. The driver can work in non-IRQ mode which means there will not be any MSI-x vector associated for poll_queues. The MegaRAID hardware has a single submission queue and multiple reply queues. However, using the shared host tagset support will enable the driver to simulate multiple hardware queues. Change driver to allocate some extra reply queues which will be marked as poll_queues. These poll_queues will not have associated MSI-x vectors. All I/O completions on these queues will be done through the IOPOLL interface. megaraid_sas with 8 poll_queues and using the io_uring hiprio=1 setting can reach 3.2M IOPS with zero interrupts generated by the hardware. The IOPOLL feature can be enabled using module parameter poll_queues. Link: https://lore.kernel.org/r/20210215074048.19424-3-kashyap.desai@broadcom.com Cc: sumit.saxena@broadcom.com Cc: chandrakanth.patil@broadcom.com Cc: linux-block@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-11scsi: megaraid_sas: Replace undefined MFI_BIG_ENDIAN macro with ↵Shivasharan S
__BIG_ENDIAN_BITFIELD macro MFI_BIG_ENDIAN macro used in drivers structure bitfield to check the CPU big endianness is undefined which would break the code on big endian machine. __BIG_ENDIAN_BITFIELD kernel macro should be used in places of MFI_BIG_ENDIAN macro. Link: https://lore.kernel.org/r/20200508085130.23339-1-chandrakanth.patil@broadcom.com Fixes: a7faf81d7858 ("scsi: megaraid_sas: Set no_write_same only for Virtual Disk") Cc: <stable@vger.kernel.org> # v5.6+ Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15scsi: megaraid_sas: Do not initiate OCR if controller is not in ready stateAnand Lodnoor
Driver initiates OCR if a DCMD command times out. But there is a deadlock if the driver attempts to invoke another OCR before the mutex lock (reset_mutex) is released from the previous session of OCR. This patch takes care of the above scenario using new flag MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15scsi: megaraid_sas: Set no_write_same only for Virtual DiskAnand Lodnoor
Disable WRITE_SAME (no_write_same) for Virtual Disks only. For System PDs and EPDs (Enhanced PDs), WRITE_SAME need not be disabled by default. Link: https://lore.kernel.org/r/1579000882-20246-3-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. The big merge conflict this time around is the SPDX licence tags. Following discussion on linux-next, we believe our version to be more accurate than the one in the tree, so the resolution is to take our version for all the SPDX conflicts" Note on the SPDX license tag conversion conflicts: the SCSI tree had done its own SPDX conversion, which in some cases conflicted with the treewide ones done by Thomas & co. In almost all cases, the conflicts were purely syntactic: the SCSI tree used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the treewide conversion had used the new-style ones ("GPL-2.0-only" and "GPL-2.0-or-later"). In these cases I picked the new-style one. In a few cases, the SPDX conversion was actually different, though. As explained by James above, and in more detail in a pre-pull-request thread: "The other problem is actually substantive: In the libsas code Luben Tuikov originally specified gpl 2.0 only by dint of stating: * This file is licensed under GPLv2. In all the libsas files, but then muddied the water by quoting GPLv2 verbatim (which includes the or later than language). So for these files Christoph did the conversion to v2 only SPDX tags and Thomas converted to v2 or later tags" So in those cases, where the spdx tag substantially mattered, I took the SCSI tree conversion of it, but then also took the opportunity to turn the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag. Similarly, when there were whitespace differences or other differences to the comments around the copyright notices, I took the version from the SCSI tree as being the more specific conversion. Finally, in the spdx conversions that had no conflicts (because the treewide ones hadn't been done for those files), I just took the SCSI tree version as-is, even if it was old-style. The old-style conversions are perfectly valid, even if the "-only" and "-or-later" versions are perhaps more descriptive. * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits) scsi: qla2xxx: move IO flush to the front of NVME rport unregistration scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition scsi: qla2xxx: on session delete, return nvme cmd scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1 scsi: megaraid_sas: Introduce various Aero performance modes scsi: megaraid_sas: Use high IOPS queues based on IO workload scsi: megaraid_sas: Set affinity for high IOPS reply queues scsi: megaraid_sas: Enable coalescing for high IOPS queues scsi: megaraid_sas: Add support for High IOPS queues scsi: megaraid_sas: Add support for MPI toolbox commands scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD scsi: megaraid_sas: Handle sequence JBOD map failure at driver level scsi: megaraid_sas: Don't send FPIO to RL Bypass queue scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout scsi: megaraid_sas: Call disable_irq from process IRQ poll scsi: megaraid_sas: Remove few debug counters from IO path ...
2019-06-27scsi: megaraid_sas: Use high IOPS queues based on IO workloadChandrakanth Patil
The driver will use round-robin method for IO submission in batches within the high IOPS queues when the number of in-flight ios on the target device is larger than 8. Otherwise the driver will use low latency reply queues. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driverChandrakanth Patil
For RAID5/RAID6 volumes configured behind Aero, driver will be doing 64bit division operations on behalf of firmware as controller's ARM CPU is very slow in this division. Later, driver calculates Q-ARM, P-ARM and Log-ARM and passes those values to firmware by writing these values to RAID_CONTEXT. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for ↵Chandrakanth Patil
only Ventura RAID1 PCI bandwidth limit algorithm is not applicable to Aero as it's PCIe Gen4 adapter. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: megaraid_sas: Export RAID map through debugfsShivasharan S
Create a debugfs interface for megaraid_sas driver. Provide interface to dump driver RAID map in debugfs. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: megaraid_sas: IRQ poll to avoid CPU hard lockupsShivasharan S
Issue Description: We have seen cpu lock up issues from field if system has a large (more than 96) logical cpu count. SAS3.0 controller (Invader series) supports max 96 MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors. This may be a generic issue (if PCI device support completion on multiple reply queues). Let me explain it w.r.t megaraid_sas supported h/w just to simplify the problem and possible changes to handle such issues. MegaRAID controller supports multiple reply queues in completion path. Driver creates MSI-X vectors for controller as "minimum of (FW supported Reply queues, Logical CPUs)". If submitter is not interrupted via completion on same CPU, there is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO timeout, system sluggish etc. Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU (e.g. CPU B) is busy with processing the corresponding IO's reply descriptors from reply descriptor queue upon receiving the interrupts from HBA. If CPU A is continuously pumping the IOs then always CPU B (which is executing the ISR) will see the valid reply descriptors in the reply descriptor queue and it will be continuously processing those reply descriptor in a loop without quitting the ISR handler. megaraid_sas driver will exit ISR handler if it finds unused reply descriptor in the reply descriptor queue. Since CPU A will be continuously sending the IOs, CPU B may always see a valid reply descriptor (posted by HBA Firmware after processing the IO) in the reply descriptor queue. In worst case, driver will not quit from this loop in the ISR handler. Eventually, CPU lockup will be detected by watchdog. Above mentioned behavior is not common if "rq_affinity" set to 2 or affinity_hint is honored by irqbalancer as "exact". If rq_affinity is set to 2, submitter will be always interrupted via completion on same CPU. If irqbalancer is using "exact" policy, interrupt will be delivered to submitter CPU. Problem statement: If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not 1:1, we still have exposure of issue explained above and for that we don't have any solution. Exposure of soft/hard lockup is seen if CPU count is more than MSI-X supported by device. If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if CPU counts to MSI-X vector count ratio is something like X:1, where X > 1) then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU hard/soft lockups. There won't be any one to one mapping between CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue) is shared with group/set of CPUs and there is a possibility of having a loop in the IO path within that CPU group and may observe lockups. For example: Consider a system having two NUMA nodes and each node having four logical CPUs and also consider that number of MSI-X vectors enabled on the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1. e.g. MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1. numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 --> MSI-X 0 node 0 size: 65536 MB node 0 free: 63176 MB node 1 cpus: 4 5 6 7 --> MSI-X 1 node 1 size: 65536 MB node 1 free: 63176 MB Assume that user started an application which uses all the CPUs of NUMA node 0 for issuing the IOs. Only one CPU from affinity list (it can be any cpu since this behavior depends upon irqbalance) CPU0 will receive the interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission percentage will be decreasing and ISR processing percentage will be increasing as it is more busy with processing the interrupts. Gradually IO submission percentage on CPU 0 will be zero and it's ISR processing percentage will be 100% as IO loop has already formed within the NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with submitting the heavy IOs and only CPU 0 is busy in the ISR path as it always find the valid reply descriptor in the reply descriptor queue. Eventually, we will observe the hard lockup here. Chances of occurring of hard/soft lockups are directly proportional to value of X. If value of X is high, then chances of observing CPU lockups is high. Solution: Use IRQ poll interface defined in "irq_poll.c". megaraid_sas driver will execute ISR routine in softirq context and it will always quit the loop based on budget provided in IRQ poll interface. Driver will switch to IRQ poll only when more than a threshold number of reply descriptors are handled in one ISR. Currently threshold is set as 1/4th of HBA queue depth. In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is X:1 (where X > 1)), IRQ poll interface will avoid CPU hard lockups due to voluntary exit from the reply queue processing based on budget. Note - Only one MSI-X vector is busy doing processing. Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13Thomas Gleixner
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details [based] [from] [clk] [highbank] [c] you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 355 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-04scsi: megaraid_sas: Add support for DEVICE_LIST DCMD in driverShivasharan S
This patch adds support for the new DEVICE_LIST DCMD. Driver currently sends two separate DCMDs for getting the list of PDs and LDs that are exposed to host. The new DCMD provides a single interface to get a list of both PDs and LDs that are exposed to the host. Based on the list of target IDs that are returned by this DCMD, driver will add the devices (PD/LD) to SML. Driver will check for FW support for this new DCMD and based on the support will either send the new DCMD or will fall back to the earlier method of sending two separate DCMDs for PD and LD list. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06scsi: megaraid_sas: Update copyright informationShivasharan S
Change copyright to Broadcom Inc. Also update any references to Avago with Broadcom. Update copyright duration wherever required. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06scsi: megaraid_sas: Add support for FW snap dumpShivasharan S
Latest firmware adds a mechanism to save firmware logs just before controller reset on pre-allocated internal controller DRAM. This feature is called snapdump which will help debugging firmware issues. This feature requires extra time and firmware reports these values through new driver interface. Before initiating an OCR, driver needs to inform FW to save a snapdump and then wait for a specified time for the snapdump to complete. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: megaraid_sas: re-work DCMD refire codeShivasharan S
No functional changes. This patch is a re-work of DCMD refire code to better manage all the different cases to decide whether to REFIRE or SKIP or COMPLETE certain DCMD. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: Add support for 64bit consistent DMAShivasharan S
The latest MegaRAID Firmware (for Invader series) has support for 64bit DMA for both streaming and consistent DMA buffers. All Ventura series controller FW always support 64 bit consistent DMA. Also, on a few architectures 32bit DMA is not supported. Current driver always prefers 32bit for consistent DMA and 64bit for streaming DMA. This behavior was unintentional and carried forwarded from legacy controller FW. Need to enhance the driver to support 64bit consistent DMA buffers based on the firmware capability. Below is the DMA setting strategy in driver with this patch. For Ventura series, always try to set 64bit DMA mask. If it fails fall back to 32bit DMA mask. For Invader series and earlier generation controllers, first try to set to 32bit consistent DMA mask irrespective of FW capability. This is needed to ensure firmware downgrades do not break. If 32bit DMA setting fails, check FW capability and try seting to 64bit DMA mask. There are certain restrictions in the hardware for having all sense buffers and all reply descriptors to be in the same 4GB memory region. This limitation is h/w dependent and can not be changed in firmware. This limitation needs to be taken care in driver while allocating the buffers. There was a discussion regarding this - find details at below link. https://www.spinics.net/lists/linux-scsi/msg108251.html Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: Retry with reduced queue depth when alloc fails for ↵Shivasharan S
higher QD In certain cases, the host memory is limited and with FW supporting higher queue depths there are increasing chances of IO request frame allocation failures that we are seeing. In case of request frame allocation failures, retry allocation with reduced queue depth (in steps of 64) to continue to configure the controller with a reduced performance rather than failing load. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: Resize MFA frame used for IOC INIT to 4kShivasharan S
Older firmware version unconditionally pulls 4k frame for IOC INIT MFA frame. But driver allocates 1k or 4k max_chain_frame_sz based on FW capability. During boot time, this results in DMA read errors. Workaround fix in driver by allocating separate ioc_init frame of 4k size to support older firmware. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: Pre-allocate frequently used DMA buffersShivasharan S
Pre-allocate few of the frequently used DMA buffers during load time. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: reduce size of fusion_context and use kmalloc for allocationShivasharan S
fusion_context structure is very large around 180kB and most of the size is contributed by log_to_span array. Move log_to_span out of fusion context and have separate allocation for log_to_span. And use kmalloc to allocate fusion_context. Currently kmemleak reports 1000s of false positives for fusion->cmd_list[]. kmemleak does not track page allocation for fusion_context. This change will also fix the false positives reported by kmemleak. Ref: https://marc.info/?l=linux-scsi&m=150545293900917 Reported-by: Shu Wang <shuwang@redhat.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-25scsi: megaraid_sas: use adapter_type for all gen controllersShivasharan S
No functional change. Refactor adapter_type to set for all generation controllers, not just for fusion controllers. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set ↵Shivasharan S
value to 2 For RAID1 FastPath writes, driver needs to allocate extra commands internally to accommodate for the extra peer command being sent. Currently driver is allocating 2 extra commands for each but only one extra command is necessary. Set RAID_1_10_RMW_CMDS to 2 and also change macro name to RAID_1_PEER_CMDS. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: Indentation and smatch warning fixesShivasharan S
Fix indentation issues and smatch warning reported by Dan Carpenter for previous series as discussed below. http://www.spinics.net/lists/linux-scsi/msg103635.html http://www.spinics.net/lists/linux-scsi/msg103603.html Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: big endian support changesShivasharan S
Fix endiannes fixes for Ventura specific. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: reduce size of fusion_context and use vmalloc if kmalloc ↵Shivasharan S
fails Currently fusion context has fixed array load_balance_info. Use dynamic allocation. In few places, driver do not want physically contigious memory. Attempt to use vmalloc if physical contiguous memory is not available. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: NVME fast path io supportShivasharan S
This patch provide true fast path IO support. Driver creates PRP for NVME drives and send Fast Path for performance. Certain h/w requirement needs to be taken care in driver. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: NVME interface target prop addedShivasharan S
This patch fetch true values of NVME property from FW using New DCMD interface MR_DCMD_DEV_GET_TARGET_PROP Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: NVME Interface detection and prop settingsShivasharan S
Adding detection logic for NVME device attached behind Ventura controller. Driver set HostPageSize in IOC_INIT frame to inform about page size for NVME devices. Firmware reports NVME page size to the driver. PD INFO DCMD provide new interface type NVME_PD. Driver set property of NVME device. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13scsi: megaraid_sas: raid 1 fast path code optimizeShivasharan S
No functional change. Code refactor. Remove function megasas_fpio_to_ldio as we never require to convert fpio to ldio because of frame unavailability. Grab extra frame of raid 1 write fast path before it creates first frame as Fast Path. Removed is_raid_1_fp_write flag as raid 1 write fast path command is decided using r1_alt_dev_handle only. Move resetting megasas_cmd_fusion fields at common function megasas_return_cmd_fusion. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-13Revert "scsi: megaraid_sas: Enable or Disable Fast path based on the PCI ↵Shivasharan S
Threshold Bandwidth" This reverts commit "3e5eadb1a881" ("scsi: megaraid_sas: Enable or Disable Fast path based on the PCI Threshold Bandwidth") This patch was aimed to increase performance of R1 Write operation for large IO size. Since this method used timer approach, it turn on/off fast path did not work as expected. Patch 0013 describes new algorithm and performance number. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: Implement the PD Map support for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran
Controllers Update Linux driver to use new pdTargetId field for JBOD target ID Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: Enable or Disable Fast path based on the PCI Threshold ↵Sasikumar Chandrasekaran
Bandwidth Large SEQ IO workload should sent as non fast path commands Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: Add the Support for SAS3.5 Generic Megaraid Controllers ↵Sasikumar Chandrasekaran
Capabilities The Megaraid driver has to support the SAS3.5 Generic Megaraid Controllers Firmware functionality. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: Dynamic Raid Map Changes for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran
Controllers SAS3.5 Generic Megaraid Controllers FW will support new dynamic RaidMap to have different sizes for different number of supported VDs. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: SAS3.5 Generic Megaraid Controllers Fast Path for RAID ↵Sasikumar Chandrasekaran
1/10 Writes To improve RAID 1/10 Write performance, OS drivers need to issue the required Write IOs as Fast Path IOs (after the appropriate checks allowing Fast Path to be used) to the appropriate physical drives (translated from the OS logical IO) and wait for all Write IOs to complete. Design: A write IO on RAID volume will be examined if it can be sent in Fast Path based on IO size and starting LBA and ending LBA falling on to a Physical Drive boundary. If the underlying RAID volume is a RAID 1/10, driver issues two fast path write IOs one for each corresponding physical drive after computing the corresponding start LBA for each physical drive. Both write IOs will have the same payload and are posted to HW such that replies land in the same reply queue. If there are no resources available for sending two IOs, driver will send the original IO from SCSI layer to RAID volume through the Firmware. Based on PCI bandwidth and write payload, every second this feature is enabled/disabled. When both IOs are completed by HW, the resources will be released and SCSI IO completion handler will be called. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: SAS3.5 Generic Megaraid Controllers Stream Detection and ↵Sasikumar Chandrasekaran
IO Coalescing Detect sequential Write IOs and pass the hint that it is part of sequential stream to help HBA Firmware do the Full Stripe Writes. For read IOs on certain RAID volumes like Read Ahead volumes,this will help driver to send it to Firmware even if the IOs can potentially be sent to hardware directly (called fast path) bypassing firmware. Design: 8 streams are maintained per RAID volume as per the combined firmware/driver design. When there is no stream detected the LRU stream is used for next potential stream and LRU/MRU map is updated to make this as MRU stream. Every time a stream is detected the MRU map is updated to make the current stream as MRU stream. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-10scsi: megaraid_sas: EEDP Escape Mode Support for SAS3.5 Generic Megaraid ↵Sasikumar Chandrasekaran
Controllers An UNMAP command on a PI formatted device will leave the Logical Block Application Tag and Logical Block Reference Tag as all F's (for those LBAs that are unmapped). To avoid IO errors if those LBAs are subsequently read before they are written with valid tag fields, the MPI SCSI IO requests need to set the EEDPFlags element EEDP Escape Mode field, Bits [7:6] appropriately. A value of 2 should be set to disable all PI checks if the Logical Block Application Tag is 0xFFFF for PI types 1 and 2. A value of 3 should be set to disable all PI checks if the Logical Block Application Tag is 0xFFFF and the Logical Block Reference Tag is 0xFFFFFFFF for PI type 3. Signed-off-by: Sasikumar Chandrasekaran <sasikumar.pc@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-09-19scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c upBaoyou Xie
We get a few warnings when building kernel with W=1: drivers/scsi/megaraid/megaraid_sas_fusion.c:281:1: warning: no previous prototype for 'megasas_free_cmds_fusion' [-Wmissing-prototypes] drivers/scsi/megaraid/megaraid_sas_fusion.c:714:1: warning: no previous prototype for 'megasas_ioc_init_fusion' [-Wmissing-prototypes] .... In fact, these functions are declared in drivers/scsi/megaraid/megaraid_sas_base.c, but should be declared in a header file, thus can be recognized in other file. So this patch adds the declarations into drivers/scsi/megaraid/megaraid_sas_fusion.h. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23megaraid_sas: Reply Descriptor Post Queue (RDPQ) supportSumit Saxena
This patch will create a reply queue pool for each MSI-X index and will provide an array of base addresses instead of the single address of legacy mode. Using this new interface the driver can support higher queue depths through scattered DMA pools. If array mode is not supported driver will fall back to the legacy method of reply pool allocation. This limits controller queue depth to 1K max. To enable a queue depth of more than 1K driver requires firmware to support array mode and scratch_pad3 will provide the new queue depth value. When RDPQ is used, downgrading to an older firmware release should not be permitted. This may cause firmware fault and is not supported. Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23megaraid_sas: Fastpath region lock bypassSumit Saxena
Firmware will fill out per-LD data to tell driver whether a particular LD supports region lock bypass. If yes, then driver will send non-FP LDIO to region lock bypass FIFO. With this change in driver, firmware will optimize certain code to improve performance. Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23megaraid_sas: Task management supportSumit Saxena
This patch adds task management for SCSI commands. Added functions are task abort and target reset. 1. Currently, megaraid_sas driver performs controller reset when any IO times out. With task management support added, task abort and target reset will be tried to recover timed out IO. If task management fails, then controller reset will be performaned. If the task management request times out, fail the request and escalate to the next level (controller reset). 2. mr_device_priv_data will be allocated for all generations of controller, but is_tm_capable flag will never be set for controllers (prior to Invader series) as firmware support is not available for task management. 3. Task management capable firmware will set is_tm_capable flag in firmware API. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23megaraid_sas: Syncing request flags macro names with firmwareSumit Saxena
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-10-29megaraid_sas: Remove PCI id checkssumit.saxena@avagotech.com
Remove PCI id based checks and use instance->ctrl_context to decide whether controller is MFI-based or a Fusion adapter. Additionally, Fusion adapters are divided into two categories: Thunderbolt and Invader. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-10-29megaraid_sas: Support for max_io_size 1MBsumit.saxena@avagotech.com
Driver will expose max sge = 256 (earlier it was 64) if firmware supports extended IO size (1M). Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-10-29megaraid_sas: JBOD sequence number supportsumit.saxena@avagotech.com
Implemented JBOD map which will provide quick access for JBOD path and also provide sequence number. This will help hardware to fail command to the FW in case of any sequence mismatch. Fast Path I/O for JBOD will refer JBOD map (which has sequence number per JBOD device) instead of RAID map. Previously, the driver used RAID map to get device handle for fast path I/O and this not have sequence number information. Now, driver will use JBOD map instead. As part of error handling, if JBOD map is failed/not supported by firmware, driver will continue using legacy behavior. Now there will be three IO paths for JBOD (syspd): - JBOD map with sequence number (Fast Path) - RAID map without sequence number (Fast Path) - FW path via h/w exception queue deliberately setup devhandle 0xFFFF (FW path). Relevant data structures: - Driver send new DCMD MR_DCMD_SYSTEM_PD_MAP_GET_INFO for this purpose. - struct MR_PD_CFG_SEQ- This structure represent map of single physical device. - struct MR_PD_CFG_SEQ_NUM_SYNC- This structure represent whole JBOD map in general(size, count of sysPDs configured, struct MR_PD_CFG_SEQ of syspD with 0 index). - JBOD sequence map size is: sizeof(struct MR_PD_CFG_SEQ_NUM_SYNC) + (sizeof(struct MR_PD_CFG_SEQ) * (MAX_PHYSICAL_DEVICES - 1)) which is allocated while setting up JBOD map at driver load time. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-10-29megaraid_sas: Synchronize driver headers with firmware APIssumit.saxena@avagotech.com
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-05-25megaraid_sas : add endianness annotationsChristoph Hellwig
This adds endianness annotations to all data structures, and a few variables directly referencing them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-05-25megaraid_sas : Use Block layer tag support for internal command indexingSumit.Saxena@avagotech.com
megaraid_sas driver will use block layer provided tag for indexing internal MPT frames to get any unique MPT frame tied with tag. Each IO request submitted from SCSI mid layer will get associated MPT frame from MPT framepool (retrieved and return back using spinlock inside megaraid_sas driver's submission/completion call back). Getting MPT frame from MPT Frame pool is very expensive operation because of associated spin lock operation (spinlock overhead increase on multi NUMA node). This type of locking in driver is very expensive call considering each IO request need - Acquire and Release of the same lock. With this support, in IO path driver will directly provide the unique command index(which is based on block layer tag) and will get the MPT frame tied to the tag and this way driver can get rid off lock, which synchronizes the access to MPT frame pool while fetching and returning MPT frame from the pool. This support in driver provides siginificant performance improvement(on multi NUMA node system)on latest upstream with SCSI.MQ as well as on existing linux distributions. Here is the data for test executed at Avago- - IO Tool- FIO - 4 Socket SMC server. (4 NUMA node server) - 12 SSDs in JBOD mode . - 4K Rand READ, QD=32 - SCSI MQ x86_64 (Latest Upstream kernel) - upto 300% Performance Improvement. If IOs are running on single Node, perfromance gain is less, but as soon as increase number of nodes, performance improvement is significant. IOs running on all 4 NUMA nodes, with this patch applied IOPs observed was 1170K vs 344K IOPs seen without this patch. Logically, there are two parts of this patch- 1) Block layer tag support 2) changes in calling convention of return_cmd. part 2 will revert the changes done by patch- 90dc9d9 megaraid_sas : MFI MPT linked list corruption fix because changes done in part 1 has fixed the problem of MFI MPT linked list corruption. part 2 is very much dependent on part 1, so we decided to have single patch for these two logical changes. [jejb: remove chatty printk pointed out by hch] Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-01-09megaraid_sas: endianness related bug fixes and code optimizationSumit.Saxena@avagotech.com
This patch addresses below issues: 1) Few endianness bug fixes. 2) Break the iteration after (MAX_LOGICAL_DRIVES_EXT - 1)), instead of MAX_LOGICAL_DRIVES_EXT. 3) Optimization in MFI INIT frame before firing. 4) MFI IO frame should be 256bytes aligned. Code is optimized to reduce the size of frame for fusion adapters and make the MFI frame size calculation a bit transparent and readable. Cc: <stable@vger.kernel.org> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Chaitra Basappa <chaitra.basappa@avagotech.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-24megaraid_sas: online Firmware upgrade support for Extended VD featureSumit.Saxena@avagotech.com
In OCR (Online Controller Reset) path, driver sets adapter state to MEGASAS_HBA_OPERATIONAL before getting new RAID map. There will be a small window where IO will come from OS with old RAID map. This patch will update adapter state to MEGASAS_HBA_OPERATIONAL, only after driver has new RAID map to avoid any IOs getting build using old RAID map. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>