summaryrefslogtreecommitdiff
path: root/drivers/scsi/atari_NCR5380.c
AgeCommit message (Collapse)Author
2016-03-01ncr5380: Call scsi_eh_prep_cmnd() and scsi_eh_restore_cmnd() as and when ↵Finn Thain
appropriate This bug causes the wrong command to have its sense pointer overwritten, which sometimes leads to a NULL pointer deref. Fix this by checking which command is being requeued before restoring the scsi_eh_save data. It turns out that some targets will disconnect a REQUEST SENSE command. The autosense algorithm doesn't anticipate this. Hence multiple commands can end up undergoing autosense simultaneously, and they will all try to use the same scsi_eh_save struct, which won't work. Defer autosense when the scsi_eh_save storage is in use by another command. Fixes: f27db8eb98a1 ("ncr5380: Fix autosense bugs") Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-01ncr5380: Fix NCR5380_select() EH checks and result handlingFinn Thain
Add missing checks for EH abort during arbitration and selection. Rework the handling of NCR5380_select() result to improve clarity. Fixes: 707d62b37fbb ("ncr5380: Fix EH during arbitration and selection") Tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-01ncr5380: Forget aborted commandsFinn Thain
The list structures and related logic used in the NCR5380 driver mean that a command cannot be queued twice (i.e. can't appear on more than one queue and can't appear on the same queue more than once). The abort handler must forget the command so that the mid-layer can re-use it. E.g. the ML may send it back to the LLD via via scsi_eh_get_sense(). Fix this and also fix two error paths, so that commands get forgotten iff completed. Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler") Tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-01ncr5380: Dont re-enter NCR5380_select()Finn Thain
Calling NCR5380_select() from the abort handler causes various problems. Firstly, it means potentially re-entering NCR5380_select(). Secondly, it means that the lock is released, which permits the EH handlers to be re-entered. The combination results in crashes. Don't do it. Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler") Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-01ncr5380: Dont release lock for PIO transferFinn Thain
The calls to NCR5380_transfer_pio() for DATA IN and DATA OUT phases will modify cmd->SCp.this_residual, cmd->SCp.ptr and cmd->SCp.buffer. That works as long as EH does not intervene, which became possible in atari_NCR5380.c when I changed the locking to bring it closer to NCR5380.c. If error recovery aborts the command, the scsi_cmnd in question and its buffer will be returned to the mid-layer. So the transfer has to cease, but it can't be stopped by the initiator because the target controls the bus phase. The problem does not arise if the lock is not released. That was fine for atari_scsi, because it implements DMA. For the other drivers, we have to release the lock and re-enable interrupts for long PIO data transfers. The solution is to split the transfer into small chunks. In between chunks the main loop releases the lock and re-enables interrupts. Thus interrupts can be serviced and eh_bus_reset_handler can intervene if need be. This fixes an oops in NCR5380_transfer_pio() that can happen when the EH abort handler is invoked during DATA IN or DATA OUT phase. Fixes: 11d2f63b9cf5 ("ncr5380: Change instance->host_lock to hostdata->lock") Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-01ncr5380: Correctly clear command pointers and lists after bus resetFinn Thain
Commands subject to exception handling are to be returned to the scsi mid-layer. Make sure that the various command pointers and command lists in the low-level driver are correctly cleansed of affected commands. This fixes some bugs that I accidentally introduced in v4.5-rc1 including the removal of INIT_LIST_HEAD for the 'autosense' and 'disconnected' command lists, and the possible NULL pointer dereference in NCR5380_bus_reset() that was reported by Dan Carpenter. hostdata->sensing may also point to an affected command so this pointer also has to be cleared. The abort handler calls complete_cmd() to take care of this; let's have the bus reset handler do the same. The issue queue may also contain an affected command. If so, remove it. This also follows the abort handler logic. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 62717f537e1b ("ncr5380: Implement new eh_bus_reset_handler") Tested-by: Michael Schmitz <schmitzmic@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Cleanup whitespace and parenthesesFinn Thain
Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Merge changes from NCR5380.cFinn Thain
In the past, atari_NCR5380.c was overlooked by those working on NCR5380.c and this caused needless divergence. All of the changes in this patch were taken from NCR5380.c. This removes some unimportant discrepancies between the two core driver forks so that 'diff' can be used to reveal the important ones, to facilitate reunification. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix whitespace in comments using regexpFinn Thain
Hanging indentation was a poor choice for the text inside comments. It has been used in the wrong places and done badly elsewhere. There is little consistency within any file. One fork of the core driver uses tabs for this indentation while the other uses spaces. Better to use flush-left alignment throughout. This patch is the result of the following substitution. It replaces tabs and spaces at the start of a comment line with a single space. perl -i -pe 's,^(\t*[/ ]\*)[ \t]+,$1 ,' drivers/scsi/{atari_,}NCR5380.c This removes some unimportant discrepancies between the two core driver forks so that the important ones become obvious, to facilitate reunification. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Cleanup commentsFinn Thain
The CVS revision log is not nearly as useful as the history/history.git repo, so remove it. Roman's commentary at the top of his driver repeats the same information elsewhere in the file so remove it. Also remove some other redundant or obsolete comments. Both the driver and the datasheets confusingly refer to a DMA access for a SCSI WRITE command as a "DMA write". Similarly a SCSI READ command is called a "DMA read". This is the opposite of the usual convention. Thankfully, the chip documentation and driver code also use "DMA send" and "DMA receive", so adopt this terminology. This removes some unimportant discrepancies between the two core driver forks so that 'diff' can be used to reveal the important ones, to facilitate reunification. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix soft lockupsFinn Thain
Because of the rudimentary design of the chip, it is necessary to poll the SCSI bus signals during PIO and this tends to hog the CPU. The driver will accept new commands while others execute, and this causes a soft lockup because the workqueue item will not terminate until the issue queue is emptied. When exercising dmx3191d using sequential IO from dd, the driver is sent 512 KiB WRITE commands and 128 KiB READs. For a PIO transfer, the rate is is only about 300 KiB/s, so these are long-running commands. And although PDMA may run at several MiB/s, interrupts are disabled for the duration of the transfer. Fix the unresponsiveness and soft lockup issues by calling cond_resched() after each command is completed and by limiting max_sectors for drivers that don't implement real DMA. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_scsi, sun3_scsi: Remove global Scsi_Host pointerFinn Thain
This refactoring removes two global Scsi_Host pointers. This improves consistency with other ncr5380 drivers. Adopting the same conventions as the other drivers makes them easier to read. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Eliminate HOSTNO macroFinn Thain
Keep the two core driver forks in sync. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Remove HOSTNO macro from printk() and seq_printf() callsFinn Thain
Remove the HOSTNO macro that is peculiar to atari_NCR5380.c and contributes to the problem of divergence of the NCR5380 core drivers. Keep NCR5380.c in sync. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Implement new eh_bus_reset_handlerFinn Thain
NCR5380.c lacks a sane eh_bus_reset_handler. The atari_NCR5380.c code is much better but it should not throw out the issue queue (that would be a host reset) and it neglects to set the result code for commands that it throws out. Fix these bugs and keep the two core drivers in sync. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix EH during arbitration and selectionFinn Thain
During arbitration and selection, the relevant command is invisible to exception handlers and can be found only in a pointer on the stack of a different thread. When eh_abort_handler can't find a given command, it can't decide whether that command was completed already or is still in arbitration or selection phase. But it must return either SUCCESS (e.g. command completed earlier) or FAILED (could not abort the nexus, try bus reset). The solution is to make sure all commands belonging to the LLD are always visible to exception handlers. Add another scsi_cmnd pointer to the hostdata struct to track the command in arbitration or selection phase. Replace 'retain_dma_irq' with the new 'selecting' pointer, to bring atari_NCR5380.c into line with NCR5380.c. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Implement new eh_abort_handlerFinn Thain
Introduce a new eh_abort_handler implementation. This one attempts to follow all of the rules relating to EH handlers. There is still a known bug: during selection, a command becomes invisible to the EH handlers because it only appears in a pointer on the stack of a different thread. This bug is addressed in a subsequent patch. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix autosense bugsFinn Thain
NCR5380_information_transfer() may re-queue a command for autosense, after calling scsi_eh_prep_cmnd(). This creates several possibilities: 1. Reselection may intervene before the re-queued command gets processed. If the reconnected command then undergoes autosense, this causes the scsi_eh_save data from the previous command to be overwritten. 2. After NCR5380_information_transfer() calls scsi_eh_prep_cmnd(), a new REQUEST SENSE command may arrive. This would be queued ahead of any command already undergoing autosense, which means the scsi_eh_save data might be restored to the wrong command. 3. After NCR5380_information_transfer() calls scsi_eh_prep_cmnd(), eh_abort_handler() may abort the command. But the scsi_eh_save data is not discarded, which means the scsi_eh_save data might be incorrectly restored to the next REQUEST SENSE command issued. This patch adds a new autosense list so that commands that are re-queued because of a CHECK CONDITION result can be kept apart from the REQUEST SENSE commands that arrive via queuecommand. This patch also adds a function dedicated to dequeueing and preparing the next command for processing. By refactoring the main loop in this way, scsi_eh_save takes place when an autosense command is dequeued rather than when re-queued. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Refactor command completionFinn Thain
Implement a 'complete_cmd' function to complete commands. This is needed by the following patch; the new function provides a site for the logic needed to correctly handle REQUEST SENSE commands. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Use standard list data structureFinn Thain
The NCR5380 drivers have a home-spun linked list implementation for scsi_cmnd structs that uses cmd->host_scribble as a 'next' pointer. Adopt the standard list_head data structure and list operations instead. Remove the eh_abort_handler rather than convert it. Doing the conversion would only be churn because the existing EH handlers don't work and get replaced in a subsequent patch. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove redundant volatile qualifiersFinn Thain
The hostdata struct is now protected by a spin lock so the volatile qualifiers are redundant. Remove them. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove LIST and REMOVE macrosFinn Thain
Printing command pointers can be useful when debugging queues. Other than that, the LIST and REMOVE macros are just clutter. These macros are redundant now that NDEBUG_QUEUES causes pointers to be printed, so remove them. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Use dsprintk() for queue debuggingFinn Thain
Print the command pointers in the log messages for debugging queue data structures. The LIST and REMOVE macros can then be removed. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Use shost_priv helperFinn Thain
Make use of the shost_priv() helper. Remove HOSTDATA and SETUP_HOSTDATA macros because they harm readability. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove H_NO macro and introduce dsprintkFinn Thain
Replace all H_NO and some HOSTNO macros (both peculiar to atari_NCR5380.c) with a new dsprintk macro that's more useful and more consistent. The new macro avoids a lot of boilerplate in new code in subsequent patches. Keep NCR5380.c in sync. Remaining HOSTNO macros are removed as side-effects of subsequent patches. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove command list debug codeFinn Thain
Some NCR5380 hosts offer a .show_info method to access the contents of the various command list data structures from a procfs file. When NDEBUG is set, the same information is sent to the console during EH. The two core drivers, atari_NCR5380.c and NCR5380.c differ here. Because it is just for debugging, the easiest way to fix the discrepancy is simply remove this code. The only remaining users of NCR5380_show_info() and NCR5380_write_info() are drivers that define PSEUDO_DMA. The others have no use for the .show_info method, so don't initialize it. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Change instance->host_lock to hostdata->lockFinn Thain
NCR5380.c presently uses the instance->host_lock spin lock. Convert this to a new spin lock that protects the NCR5380_hostdata struct. atari_NCR5380.c previously used local_irq_save/restore() rather than a spin lock. Convert this to hostdata->lock in irq mode. For SMP platforms, the interrupt handler now also acquires the spin lock. This brings all locking in the two core drivers into agreement. Adding this locking also means that a bunch of volatile qualifiers can be removed from the members of the NCR5380_hostdata struct. This is done in a subsequent patch. Proper locking will allow the abort handler to locate a command being aborted. This is presently impossible if the abort handler is invoked when the command has been moved from a queue to a pointer on the stack. (If eh_abort_handler can't determine whether a command has been completed or is still being processed then it can't decide whether to return success or failure.) The hostdata spin lock is now held when calling NCR5380_select() and NCR5380_information_transfer(). Where possible, the lock is dropped for polling and PIO transfers. Clean up the now-redundant SELECT_ENABLE_REG writes, that used to provide limited mutual exclusion between information_transfer() and reselect(). Accessing hostdata->connected without data races means taking the lock; cleanup these accesses. The new spin lock falls away for m68k and other UP builds, so this should have little impact there. In the SMP case the new lock should be uncontested even when the SCSI bus is contested. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove redundant ICR_ARBITRATION_LOST test and eliminate FLAG_DTC3181EFinn Thain
Remove FLAG_DTC3181E. It was used to suppress a final Arbitration Lost (SEL asserted) test that isn't actually needed. The test was suppressed because it causes problems for DTC436 and DTC536 chips. It takes place after the host wins arbitration, so SEL has been asserted. These chips can't seem to tell whether it was the host or another bus device that did so. This questionable final test appears in a flow chart in an early NCR5380 datasheet. It was removed from later documents like the DP5380 datasheet. By the time this final test takes place, the driver has already tested the Arbitration Lost bit several times. The first test happens 3 us after BUS FREE (or longer due to register access delays). The protocol requires that a device stop signalling within 1.8 us after BUS FREE unless it won arbitration, in which case it must assert SEL, which is detected 1.2 us later by the first Arbitration Lost test. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Fix queue_size limitFinn Thain
When a target reports a QUEUE_FULL condition it causes the driver to update the 'queue_size' limit with the number of currently allocated tags. At least, that's what's supposed to happen, according to the comments. Unfortunately the terms in the assignment are swapped. Fix this and cleanup some obsolete comments. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Cleanup #include directivesFinn Thain
Remove unused includes (stat.h, signal.h, proc_fs.h) and move includes needed by the core drivers into the common header (delay.h etc). Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix off-by-one bug in extended_msg[] bounds checkFinn Thain
Fix the array bounds check when transferring an extended message from the target. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Standardize reselection handlingFinn Thain
Bring the two NCR5380_reselect() implementations into agreement. Replace infinite loops in atari_NCR5380.c with timeouts, as per NCR5380.c. Remove 'abort' flag in NCR5380.c as per atari_NCR5380.c -- if reselection fails, there may be no MESSAGE IN phase so don't attempt data transfer. During selection, don't interfere with the chip registers after a reselection interrupt intervenes. Clean up some trivial issues with code style, comments and printk. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Introduce NCR5380_poll_politely2Finn Thain
SCSI bus protocol sometimes requires monitoring two related conditions simultaneously. Enhance NCR5380_poll_politely() for this purpose, and put it to use in the arbitration algorithm. It will also find use in pseudo DMA. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Standardize interrupt handlingFinn Thain
Because interrupt handling is crucial to the core driver(s), all wrapper drivers need to agree on this code. This patch removes discrepancies. NCR5380_intr() in NCR5380.c has the following pointless loop that differs from the code in atari_NCR5380.c. done = 1; do { /* ... */ } while (!done); The 'done' flag gets cleared when a reconnected command is to be processed from the work queue. But in NCR5380.c, the flag is also used to cause the interrupt conditions to be re-examined. Perhaps this was because NCR5380_reselect() was expected to cause another interrupt, or perhaps the remaining present interrupt conditions need to be handled after the NCR5380_reselect() call? Actually, both possibilities are bogus, as is the loop itself. It seems have been overlooked in the hit-and-miss removal of scsi host instance list iteration many years ago; see history/history.git commit 491447e1fcff ("[PATCH] next NCR5380 updates") and commit 69e1a9482e57 ("[PATCH] fix up NCR5380 private data"). See also my earlier patch, "Always retry arbitration and selection". The datasheet says, "IRQ can be reset simply by reading the Reset Parity/Interrupt Register". So don't treat the chip IRQ like a level-triggered interrupt. Of the conditions that set the IRQ flag, some are level-triggered and some are edge-triggered, which means IRQ itself must be edge-triggered. Some interrupt conditions are latched and some are not. Before clearing the chip IRQ flag, clear all state that may cause it to be raised. That means clearing the DMA Mode and Busy Monitor bits in the Mode Register and clearing the host ID in the Select Enable register. Also clean up some printk's and some comments. Keep atari_NCR5380.c and NCR5380.c in agreement. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Standardize work queueing algorithmFinn Thain
The complex main_running/queue_main mechanism is peculiar to atari_NCR5380.c. It isn't SMP safe and offers little value given that the work queue already offers concurrency management. Remove this complexity to bring atari_NCR5380.c closer to NCR5380.c. It is not a good idea to call the information transfer state machine from queuecommand because, according to Documentation/scsi/scsi_mid_low_api.txt that could happen in soft irq context. Fix this. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Dont wait for BUS FREE after disconnectFinn Thain
When there is a queued command and no connected command, NCR5380_select() is called and arbitration begins. The chip waits for BUS FREE once the MR_ARBITRATE bit in the mode register is enabled. That means there is no need to wait for BUS FREE after disconnecting. There is presently no polling for BUS FREE after sending an ABORT or other message that might lead to disconnection. It only happens after COMMAND COMPLETE or DISCONNECT messages, which seems inconsistent. Remove the polling for !BSY in the COMMAND COMPLETE and DISCONNECT cases. BTW, the comments say "avoid nasty timeouts" and perhaps BUS FREE polling was somehow helpful back in Linux v0.99.14u, when it was introduced. The relevant timeout is presently 1 second (for bus arbitration). Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Use arbitration timeoutFinn Thain
Allow target selection to fail with a timeout instead of waiting in infinite loops. This gets rid of the unused NCR_TIMEOUT macro, it is more defensive and has proved helpful in debugging. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06atari_NCR5380: Set do_abort() timeoutsFinn Thain
Use timeouts in do_abort() in atari_NCR5380.c instead of infinite loops. Also fix the kernel-doc comment. Keep the two core driver forks in sync. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Remove references to linked commandsHannes Reinecke
Some old drivers partially implemented support for linked commands using a "proposed" next_link pointer in struct scsi_cmnd that never actually existed. Remove this code. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Drop DEF_SCSI_QCMD macroFinn Thain
Remove the DEF_SCSI_QCMD macro (already removed from atari_NCR5380.c). The lock provided by DEF_SCSI_QCMD is only needed for queue data structures. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Fix NCR5380_transfer_pio() resultFinn Thain
According to the SCSI-2 draft revision 10L, atari_NCR5380.c is correct when it says that the phase lines are valid up until ACK is negated following the transmission of the last byte in MESSAGE IN phase. This is true for all information transfer phases, from target to initiator. Sample the phase bits in STATUS_REG so that NCR5380_transfer_pio() can return the correct result. The return value is presently unused (perhaps because of bugs like this) but this change at least fixes the caller's phase variable, which is passed by reference. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Rework disconnect versus poll logicFinn Thain
The atari_NCR5380.c and NCR5380.c core drivers differ in their handling of target disconnection. This is partly because atari_NCR5380.c had all of the polling and sleeping removed to become entirely interrupt-driven, and it is partly because of damage done to NCR5380.c after atari_NCR5380.c was forked. See commit 37cd23b44929 ("Linux 2.1.105") in history/history.git. The polling changes that were made in v2.1.105 are questionable at best: if REQ is not already asserted when NCR5380_transfer_pio() is invoked, and if the expected phase is DATA IN or DATA OUT, the function will schedule main() to execute after USLEEP_SLEEP jiffies and then return. The problems here are the expected REQ timing and the sleep interval*. Avoid this issue by using NCR5380_poll_politely() instead of scheduling main(). The atari_NCR5380.c core driver requires the use of the chip interrupt and always permits target disconnection. It sets the cmd->device->disconnect flag when a device disconnects, but never tests this flag. The NCR5380.c core driver permits disconnection only when instance->irq != NO_IRQ. It sets the cmd->device->disconnect flag when a device disconnects and it tests this flag in a couple of places: 1. During NCR5380_information_transfer(), following COMMAND OUT phase, if !cmd->device->disconnect, the initiator will take a guess as to whether or not the target will then choose to go to MESSAGE IN phase and disconnect. If the driver guesses "yes", it will schedule main() to execute after USLEEP_SLEEP jiffies and then return there. Unfortunately the driver may guess "yes" even after it has denied the target the disconnection privilege. When the target does not disconnect, the sleep can be beneficial, assuming the sleep interval is appropriate (mostly it is not*). And even if the driver guesses "yes" correctly, and the target would then disconnect, the driver still has to go through the MESSAGE IN phase in order to get to BUS FREE phase. The main loop can do nothing useful until BUS FREE, and sleeping just delays the phase transition. 2. If !cmd->device->disconnect and REQ is not already asserted when NCR5380_information_transfer() is invoked, the function polls for REQ for USLEEP_POLL jiffies. If REQ is not asserted, it then schedules main() to execute after USLEEP_SLEEP jiffies and returns. The idea is apparently to yeild the CPU while waiting for REQ. This is conditional upon !cmd->device->disconnect, but there seems to be no rhyme or reason for that. For example, the flag may be unset because disconnection privilege was denied because the driver has no IRQ. Or the flag may be unset because the device has never needed to disconnect before. Or if the flag is set, disconnection may have no relevance to the present bus phase. Another deficiency of the existing algorithm is as follows. When the driver has no IRQ, it prevents disconnection, and generally polls and sleeps more than it would normally. Now, if the driver is going to poll anyway, why not allow the target to disconnect? That way the driver can do something useful with the bus instead of polling unproductively! Avoid this pointless latency, complexity and guesswork by using NCR5380_poll_politely() instead of scheduling main(). * For g_NCR5380, the time intervals for USLEEP_SLEEP and USLEEP_POLL are 200 ms and 10 ms, respectively. They are 20 ms and 200 ms respectively for the other NCR5380 drivers. There doesn't seem to be any reason for this discrepancy. The timing seems to have no relation to the type of adapter. Bizarrely, the timing in g_NCR5380 seems to relate only to one particular type of target device. This patch attempts to solve the problem for all NCR5380 drivers and all target devices. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Implement NCR5380_dma_xfer_len and remove LIMIT_TRANSFERSIZE macroFinn Thain
Follow the example of the atari_NCR5380.c core driver and adopt the NCR5380_dma_xfer_len() hook. Implement NCR5380_dma_xfer_len() for dtc.c and g_NCR5380.c to take care of the limitations of these cards. Keep the default for drivers using PSEUDO_DMA. Eliminate the unused macro LIMIT_TRANSFERSIZE. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Always retry arbitration and selectionFinn Thain
If NCR5380_select() returns -1, it means arbitration was lost or selection failed and should be retried. If the main loop simply terminates when there are still commands on the issue queue, they will remain queued until they expire. Fix this by clearing the 'done' flag after selection failure or lost arbitration. The "else break" clause in NCR5380_main() that gets removed here appears to be a vestige of a long-gone loop that iterated over host instances. See commit 491447e1fcff ("[PATCH] next NCR5380 updates") in history/history.git. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Eliminate selecting stateFinn Thain
Linux v2.1.105 changed the algorithm for polling for the BSY signal in NCR5380_select() and NCR5380_main(). Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1) meant reschedule main() after sleeping for 10 ms. Repeated 25 times this provided the recommended 250 ms selection time-out delay. This got broken when HZ became configurable. We could fix this but there's no need to reschedule the main loop. This BSY polling presently happens when the NCR5380_main() work queue item calls NCR5380_select(), which in turn schedules NCR5380_main(), which calls NCR5380_select() again, and so on. This algorithm is a deviation from the simpler one in atari_NCR5380.c. The extra complexity and state is pointless. There's no reason to stop selection half-way and return to to the main loop when the main loop can do nothing useful until selection completes. So just poll for BSY. We can sleep while polling now that we have a suitable workqueue. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Sleep when polling, if possibleFinn Thain
When in process context, sleep during polling if doing so won't add significant latency. In interrupt context or if the lock is held, poll briefly then give up. Keep both core drivers in sync. Calibrate busy-wait iterations to allow for variation in chip register access times between different 5380 hardware implementations. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Introduce unbound workqueueFinn Thain
Allocate a work queue that will permit busy waiting and sleeping. This means NCR5380_init() can potentially fail, so add this error path. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Keep BSY asserted when entering SELECTION phaseFinn Thain
NCR5380.c is not compliant with the SCSI-2 standard (at least, not with the draft revision 10L that I have to refer to). The selection algorithm in atari_NCR5380.c is correct, so use that. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Proceed with next command after NCR5380_select() calls scsi_doneFinn Thain
If a target disappears from the SCSI bus, NCR5380_select() may subsequently fail with a time-out. In this situation, scsi_done is called and NCR5380_select() returns 0. Both hostdata->connected and hostdata->selecting are NULL and the main loop should proceed with the next command in the issue queue. Clarify this logic. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-01-06ncr5380: Always escalate bad target time-out in NCR5380_select()Finn Thain
Remove the restart_select and targets_present variables introduced in Linux v1.1.38. The former was used only for a questionable debug printk and the latter "so we can call a select failure a retryable condition". Well, retrying select failure in general is a different problem to a target that doesn't assert BSY. We need to handle these two cases differently; the latter case can be left to the SCSI ML. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>