path: root/arch/powerpc/platforms/pseries/eeh.c
Age | Commit message | Author
2007-06-14 | [POWERPC] Tweak EEH copyright info | Linas Vepstas
Twiddle the copyright notices. Per current guidelines, the use of the (C) or (c) in source code is deprecated. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/eeh.c | 6 +++++- arch/powerpc/platforms/pseries/eeh_cache.c | 3 ++- arch/powerpc/platforms/pseries/eeh_driver.c | 6 +++--- 3 files changed, 10 insertions(+), 5 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 | [POWERPC] Remove dead EEH code | Linas Vepstas
Remove some dead code. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/eeh.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 | [POWERPC] Show EEH per-device false positives | Linas Vepstas
Track and report the number of times we read an all-1s value (0xff, 0xffff or 0xffffffff) from each device which is valid data, not indicating EEH isolation. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/eeh.c | 5 +++++ arch/powerpc/platforms/pseries/eeh_sysfs.c | 3 +++ include/asm-powerpc/pci-bridge.h | 1 + 3 files changed, 9 insertions(+) Signed-off-by: Paul Mackerras <paulus@samba.org>
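As a rough illustration of the idea in this commit (not the kernel's actual code), the sketch below counts reads that return an all-1s value for their access width while the slot turns out not to be isolated. The names `dev_stats`, `is_all_ones`, and `account_read`, and the `slot_frozen` flag standing in for the firmware's answer, are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device statistics; the field name is illustrative. */
struct dev_stats {
    unsigned int false_positives;
};

/* Return 1 if 'val' is all 1s for an access of 'size' bytes (1, 2 or 4). */
static int is_all_ones(uint32_t val, size_t size)
{
    switch (size) {
    case 1:  return (val & 0xffu) == 0xffu;
    case 2:  return (val & 0xffffu) == 0xffffu;
    case 4:  return val == 0xffffffffu;
    default: return 0;
    }
}

/*
 * Account for one read. 'slot_frozen' stands in for asking firmware
 * whether the slot is really isolated (an assumption of this sketch).
 * Returns 1 for a real isolation event, 0 otherwise.
 */
static int account_read(struct dev_stats *st, uint32_t val, size_t size,
                        int slot_frozen)
{
    assert(st != NULL);
    if (!is_all_ones(val, size))
        return 0;               /* ordinary data, nothing to check */
    if (slot_frozen)
        return 1;               /* genuine EEH isolation */
    st->false_positives++;      /* all 1s, but it was valid data */
    return 0;
}
```

A per-device counter like this is the sort of value that could then be exported through sysfs for inspection.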
2007-06-14 | [POWERPC] Add EEH sysfs blinkenlights | Linas Vepstas
Add sysfs blinkenlights for EEH statistics. Shuffle the eeh_add_device_tree() call so that it appears in the correct sequence. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/Makefile | 2 arch/powerpc/platforms/pseries/eeh.c | 4 + arch/powerpc/platforms/pseries/eeh_cache.c | 2 arch/powerpc/platforms/pseries/eeh_sysfs.c | 84 +++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/pci_dlpar.c | 7 +- include/asm-powerpc/ppc-pci.h | 3 + 6 files changed, 98 insertions(+), 4 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-10 | [POWERPC] Assorted janitorial EEH cleanups | Linas Vepstas
Assorted minor cleanups to EEH code; -- use literals, use kerneldoc format. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/eeh.c | 13 ++++++++++--- arch/powerpc/platforms/pseries/eeh_driver.c | 7 ++++--- include/asm-powerpc/ppc-pci.h | 18 +++++++++++++++--- 3 files changed, 29 insertions(+), 9 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 | [POWERPC] EEH: log all PCI-X and PCI-E AER registers | Linas Vepstas
When an EEH event is detected, and after the device driver has been notified, but before the device is reset, enable MMIO to the adapter, and grab the contents of the PCI status and command registers, the PCI-X status and command, and the PCI-E capability 10 and AER registers. Pass these up to the RTAS error log, and also printk them. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 | [POWERPC] EEH: capture and log pci state on error | Linas Vepstas
If an EEH event is observed, capture PCI config space info about the device, wrap it up and pass it to the event logger. This patch just slots in the basic logging function. A later patch will provide for more thorough data gathering. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 | [POWERPC] Add powerpc PCI-E reset API implementation | Brian King
Adds the pSeries platform implementation for a new PCI API which can be used to issue various types of PCI-E reset, including PCI-E warm reset and PCI-E hot reset. This is needed for an ipr PCI-E adapter which does not properly implement BIST. Running BIST on this adapter results in PCI-E errors. The only reliable reset mechanism that exists on this hardware is PCI Fundamental reset (warm reset). Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13 | [POWERPC] Rename get_property to of_get_property: arch/powerpc | Stephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: restructure multi-function support | Linas Vepstas
Rework how multi-function PCI devices are identified and traversed. This fixes a bug with multi-function recovery on Power4 that was introduced by a recent Power4 EEH patch. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: verify state change | Linas Vepstas
After requesting a state change, verify that the state change actually occurred, and the system ends up in the expected state. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: rm un-needed data | Linas Vepstas
The EEH event notification system passes around data that is not needed or at least, not used properly. Stop passing this data; get it in a more reliable fashion. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: wait for slot status | Linas Vepstas
Modify routine that returns PCI slot status to wait for slot status to become available. This is needed, as slots that are in some remote card cage may go offline for extended periods of time. New users for this routine in following patches. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: handle reset state high | Linas Vepstas
Some firmware versions will return a slot reset state of "1" when a slot is EEH frozen. Recognize this as a state that can be handled. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: support ibm,get-config-addr-info2 RTAS call | Linas Vepstas
Provide support for the new ibm,get-config-addr-info2 RTAS token, whenever it is actually available. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: Tolerate high mmio | Linas Vepstas
Some drivers will attempt to perform a lot of mmio even after an EEH event was detected. This is especially the case for fast cpu's and PCI-E slots. Be a bit more lenient in allowing this. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22 | [POWERPC] EEH: modify order of EEH state checking | Linas Vepstas
Change the order in which pci error state is examined; the "capabilities" is not valid if "reset state" is 5. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-12 | [PATCH] mark struct file_operations const 2 | Arjan van de Ven
Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. [akpm@osdl.org: sparc64 fix] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-07 | [POWERPC] pSeries: EEH improperly enabled for some Power4 systems | Linas Vepstas
It appears that EEH is improperly enabled for some Power4 systems. On these systems, the ibm,set-eeh-option returns a value of success even when EEH is not supported on the given node. Thus, an explicit check for support is required. During boot, on Power4, without this patch, one sees messages similar to: EEH: event on unsupported device, rc=0 dn=/pci@400000000110/IBM,sp@1 EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2 EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2,2 etc. The patch makes these go away. Without this patch, EEH recovery does seem to work correctly for at least some devices (I tested ethernet e1000), but fails to recover others (the Emulex LightPulse LPFC, most notably). Off the top of my head, I don't remember why some devices are affected, but not others. The PAPR indicates that the correct way to test for EEH is as done in this patch; it's not clear to me if this was in the PAPR all along, or recently added; if it was there all along, it's not clear to me why this hadn't been fixed long ago. I suspect only certain firmware levels are affected. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-12-08 | [POWERPC] EEH recovery tweaks | Linas Vepstas
If one attempts to create a device driver recovery sequence that does not depend on a hard reset of the device, but simply just attempts to resume processing, then one discovers that the recovery sequence implemented on powerpc is not quite right. This patch fixes this up. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-09-26 | [POWERPC] EEH failure to mark pci slot as frozen. | Linas Vepstas
Bug fix: when marking a slot as frozen, we forgot to mark the pci device itself as frozen. (We did manage to mark the pci children, but forgot the parent itself.) This is needed so that some device drivers can check the pci status in critical sections (e.g. in spin loops with interrupts disabled). Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
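The shape of this bug can be sketched with a toy device tree. This is only an illustration under assumed names (`struct node`, `MODE_FROZEN`, `mark_children`, `mark_slot`), not the kernel's data structures: the walk over the children was already present, and the fix amounts to also setting the flag on the node the walk starts from.

```c
#include <assert.h>
#include <stddef.h>

#define MODE_FROZEN 0x1          /* hypothetical state flag */

/* Toy tree node: first-child / next-sibling layout. */
struct node {
    int mode;
    struct node *child;
    struct node *sibling;
};

/* Mark every descendant of 'parent' (but not 'parent' itself). */
static void mark_children(struct node *parent, int flag)
{
    for (struct node *n = parent->child; n != NULL; n = n->sibling) {
        n->mode |= flag;
        mark_children(n, flag);
    }
}

/* Mark the slot's own node first, then everything below it. */
static void mark_slot(struct node *slot, int flag)
{
    assert(slot != NULL);
    slot->mode |= flag;          /* the step the bug fix adds */
    mark_children(slot, flag);
}
```

With the parent flagged as well, a driver polling its own device's state inside a critical section sees the frozen flag without having to inspect the children.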
2006-09-22 | [POWERPC] EEH: Power4 systems sometimes need multiple resets. | Linas Vepstas
On detection of an EEH error, some Power4 systems seem to occasionally want to be reset twice before they report themselves as fully recovered. This patch re-arranges the code to attempt additional resets if the first one doesn't take. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
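The retry pattern described here can be sketched as a bounded loop. Everything below is a stand-in invented for illustration: `fake_slot` simulates hardware that only recovers after a given number of resets, and `fake_reset` replaces the real firmware reset and state check. The attempt limit is likewise an assumption; the commit message does not state one.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated slot that recovers only after 'resets_needed' resets. */
struct fake_slot {
    int resets_needed;
    int resets_done;
};

/* Stand-in for a firmware reset plus state check; 0 means recovered. */
static int fake_reset(struct fake_slot *slot)
{
    slot->resets_done++;
    return slot->resets_done >= slot->resets_needed ? 0 : -1;
}

/* Try up to 'max_attempts' resets; 0 on success, -1 after giving up. */
static int reset_with_retries(struct fake_slot *slot, int max_attempts)
{
    assert(slot != NULL);
    for (int i = 0; i < max_attempts; i++) {
        if (fake_reset(slot) == 0)
            return 0;            /* slot reports itself recovered */
    }
    return -1;                   /* still not recovered: treat as failed */
}
```

Bounding the loop matters: a slot that never recovers should end up reported as failed rather than being reset forever.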
2006-09-21 | [POWERPC] EEH: enable MMIO/DMA on frozen slot | Linas Vepstas
Add wrapper around the rtas call to enable MMIO or DMA on a frozen pci slot. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-09-21 | [POWERPC] EEH: code comment cleanup | Linas Vepstas
Clean up subroutine documentation; mostly formatting changes, with some new content. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-07-31 | [POWERPC] pseries: Constify & voidify get_property() | Jeremy Kerr
Now that get_property() returns a void *, there's no need to cast its return value. Also, treat the return value as const, so we can constify get_property later. pseries platform changes. Built for pseries_defconfig Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-13 | [PATCH] powerpc/pseries: bugfix: balance calls to pci_device_put | Linas Vepstas
Repeated calls to eeh_remove_device() can result in multiple (and thus unbalanced) calls to pci_dev_put(). Make sure that pci_dev_put() is called only once (since there was only one call to the matching pci_dev_get()). Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
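One common way to keep a get/put pair balanced when the removal path can run more than once is to clear the stored pointer after the first put. The sketch below uses invented stand-ins (`fake_dev`, `fake_node`, `dev_get`, `dev_put`) rather than the kernel's PCI reference-counting API, and is not necessarily how the actual patch does it.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev  { int refcount; };
struct fake_node { struct fake_dev *dev; };   /* holds one reference */

static void dev_get(struct fake_dev *d) { d->refcount++; }
static void dev_put(struct fake_dev *d) { d->refcount--; }

/* Take exactly one reference when the device is attached to the node. */
static void node_add_device(struct fake_node *n, struct fake_dev *d)
{
    dev_get(d);
    n->dev = d;
}

/* Drop that reference once; later calls see NULL and do nothing. */
static void node_remove_device(struct fake_node *n)
{
    assert(n != NULL);
    if (n->dev == NULL)
        return;                  /* already removed */
    dev_put(n->dev);
    n->dev = NULL;
}
```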
2006-04-01 | [PATCH] powerpc/pseries: EEH Cleanup | Nathan Fontenot
This patch removes unnecessary exports, marks functions as static when possible, and simplifies some list-related code. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28 | [PATCH] powerpc: Kill _machine and hard-coded platform numbers | Benjamin Herrenschmidt
This removes statically assigned platform numbers and reworks the powerpc platform probe code to use a better mechanism. With this, board support files can simply declare a new machine type with a macro, and implement a probe() function that uses the flattened device-tree to detect if they apply for a given machine. We now have a machine_is() macro that replaces the comparisons of _machine with the various PLATFORM_* constants. This commit also changes various drivers to use the new macro instead of looking at _machine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-28 | [PATCH] powerpc: fix dynamic PCI probe regression | John Rose
Some hotplug driver functions were migrated to the kernel for use by EEH in commit 2bf6a8fa21570f37fd1789610da30f70a05ac5e3. Previously, the PCI Hotplug module had been changed to use the new OFDT-based PCI probe when appropriate: 5fa80fcdca9d20d30c9ecec30d4dbff4ed93a5c6 When rpaphp_pci_config_slot() was moved from the rpaphp driver to the new kernel function pcibios_add_pci_devices(), the OFDT-based probe stuff was dropped. This patch restores it. Signed-off-by: John Rose <johnrose@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-12 | [PATCH] powerpc: remove warning in EEH code | Olof Johansson
Remove warning in eeh code about mixed variables and code. Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-10 | [PATCH] powerpc/pseries: dlpar-add crash on null pointer deref | linas
This fixes a crash on null-pointer deref during dlpar slot addition. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 1c87c0f84943fbbc91826967ff4fea1b059a526f commit)
2006-01-10 | [PATCH] powerpc: get rid of per_cpu EEH counters | Linas Vepstas
242-eeh-no-percpu-counters.patch Remove per-cpu counters from the EEH code. These statistics counters are incremented at a very low frequency, and the performance gains of per-cpu variables are negligible. By contrast, the counters weren't safe against cpu off/online operations, and it's not worth the effort to make them so (other than to turn them into plain globals). Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from be3b5d1be053ccb41e91fa5a6f43ef5db301357d commit)
2006-01-10 | [PATCH] powerpc: Save device BARs much earlier in the boot sequence | Linas Vepstas
241-eeh-save-bars-earlier.patch Save the PCI device bars *before* any PCI probing is done. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 76c902b919098860f3d4e125f847abcc4cb1782a commit)
2006-01-10 | [PATCH] powerpc: handle multifunction PCI devices properly | Linas Vepstas
239-eeh-multifunction-consolidate.patch New-style firmware will often place multiple different functions under a non-EEH-aware parent. However, these devices might share a common PE ("partitionable endpoint") and config address, and thus any EEH events will affect all of the devices in common. This patch makes the effort to find all of these common devices and handle them together. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 216810296bb97d39da8e176822e9de78d2f00187 commit)
2006-01-10 | [PATCH] powerpc: Don't continue with PCI Error recovery if slot reset failed. | Linas Vepstas
238-eeh-stop-if-reset_failed.patch If the firmware is unable to reset the PCI slot for some reason, then don't attempt any further recovery steps after that point. Instead, mark the device as permanently failed. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from e06b942521eb2cdaf232726f45a820d5837acb12 commit)
2006-01-10 | [PATCH] powerpc: set up the RTAS token just like the rest of them. | Linas Vepstas
237-eeh-bridge-token.patch Minor: the rtas-bridge token should be set up the same way that all the other rtas tokens are set up. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 78379b6c5fc17b6666c40b05988e6708e98479c0 commit)
2006-01-10 | [PATCH] powerpc: Use PE configuration address consistently | Linas Vepstas
236-eeh-config-addr.patch The PE configuration address wasn't being consistently used in all locations where a config address is called for. This patch adds it to the places it should have appeared in. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from c2bc904a28095aca0b04a37854b63b78622a032e commit)
2006-01-10 | [PATCH] powerpc: Remove duplicate code | Linas Vepstas
234-eeh-find-pe.patch The find_device_pe() routine is duplicated in two files. Remove one of the two copies, declare the other extern. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 48408e708282d4d0269136ff27ea5acbd9410b5a commit)
2006-01-10 | [PATCH] powerpc: remove bogus printk | Linas Vepstas
233-eeh-buid-fix.patch Remove un-desired warning print from EEH code. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 241239e6aff69788a177d97c5d06fe9995c74cca commit)
2006-01-10 | [PATCH] powerpc: Add "partitionable endpoint" support | Linas Vepstas
26-eeh-partition-endpoint.patch New versions of firmware introduce a new method by which the "partitionable endpoint" (the point at which the pci bus is cut) should be located. This code adds the support for this (mandatory) new feature. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 9fcfb5d35b5294659f9299aa9cae6fd16325c07e commit)
2006-01-10 | [PATCH] powerpc: Split out PCI address cache to its own file | Linas Vepstas
25-pci-address-cache.patch The core EEH file is rather large. This patch splits out a self-contained chunk of it into its own file. This is the chunk that performs the caching and lookup of pci devices based on the i/o addresses of their resources. This code is almost architecture-independent and could be used by any system that wanted to find a pci device based only on the i/o address used by the device. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from b0b291d59906d4a9a89ed9e34d9fd684c7188924 commit)
2006-01-10 | [PATCH] powerpc: PCI Error Recovery: PPC64 core recovery routines | Linas Vepstas
Various PCI bus errors can be signaled by newer PCI controllers. The core error recovery routines are architecture dependent. This patch adds a recovery infrastructure for the PPC64 pSeries systems. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from e8ca11b460c4c9c7fa6b529be221529ebd770e38 commit)
2006-01-09 | [PATCH] powerpc: PCI hotplug common code elimination | Linas Vepstas
20-rpaphp-eeh-cleanup.patch This patch moves some code from the rpaphp directory to the powerpc directory, where it should have been all along. (Among other things, I need it in the powerpc directory for the PCI error recovery.) Please note that this patch affects TWO maintainers: Paul, after applying the powerpc part, please ask that GregKH apply the PCI part. It is safe to have the powerpc part go in first. It would be bad to have the PCI part go in first. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-17 | [PATCH] Avoid use of uninitialised spinlock in EEH. | David Woodhouse
If the kernel supports both G5 and pSeries, and CONFIG_EEH is enabled, eeh_init() is (quite reasonably) never called when we boot on a G5. Yet eeh_check_failure() still gets called. We should avoid doing that if !eeh_subsystem_enabled. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
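The guard this commit describes can be illustrated with a small model. The names below (`subsystem_enabled`, `lock_initialised`, `fake_init`, `check_failure`) are placeholders for this sketch, not the kernel's symbols, and the return convention is simplified: the point is only that the check returns early, before touching any state that the init routine would have set up.

```c
#include <assert.h>
#include <stddef.h>

static int subsystem_enabled;    /* set only by the init routine */
static int lock_initialised;     /* models the spinlock being set up */

/* Stand-in for the init routine that some platforms never call. */
static void fake_init(void)
{
    lock_initialised = 1;
    subsystem_enabled = 1;
}

/* Returns 1 if 'val' looks like a failure worth investigating. */
static int check_failure(unsigned int val)
{
    if (!subsystem_enabled)
        return 0;                /* bail out before using the lock */
    assert(lock_initialised);    /* safe: init has run by this point */
    return val == 0xffffffffu;
}
```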
2005-11-10 | [PATCH] ppc64: mark failed devices | Linas Vepstas
17-eeh-slot-marking-bug.patch A device that experiences a PCI outage may be just one device out of many that were affected. In order to avoid repeated reports of a failure, the entire tree of affected devices should be marked as failed. This patch marks up the entire tree. Signed-off-by: Linas Vepstas <linas@linas.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 | powerpc: Fix compile error in EEH code with gcc4 | Paul Mackerras
Gcc 4 doesn't like being told to inline a recursive function... Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 | [PATCH] powerpc: merge code values for identifying platforms | Paul Mackerras
This patch merges platform codes. systemcfg->platform is no longer used, systemcfg use in general is deprecated as much as possible (and renamed _systemcfg before it gets completely moved elsewhere in a future patch), _machine is now used on ppc64 as well as ppc32. Platform codes aren't gone yet but we are getting a step closer. A bunch of asm code in head[_64].S is also turned into C code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 | [PATCH] ppc64: Save & restore of PCI device BARS | Linas Vepstas
14-eeh-device-bar-save.patch After a PCI device has been reset, the device BARs and other config space info must be restored to the same state as they were in when the firmware first handed us this device. This will allow the PCI device driver, when restarted, to correctly recognize and set up the device. This patch saves the device config space as early as reasonable after the firmware has handed over the device. The state restore function is intended for use by the EEH recovery routines. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 | [PATCH] ppc64: PCI reset support routines | Linas Vepstas
13-eeh-recovery-support-routines.patch EEH Recovery support routines This patch adds routines required to help drive the recovery of EEH-frozen slots. The main function is to drive the PCI #RST signal line high for a quarter of a second, and then allow for a second & a half of settle time. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10 | [PATCH] ppc64: PCI error event dispatcher | Linas Vepstas
12-eeh-event-dispatcher.patch ppc64: EEH Recovery dispatcher thread This patch adds a mechanism to create recovery threads when an EEH event is received. Since an EEH freeze state may be detected within an interrupt context, we need to get out of the interrupt context before starting recovery. This dispatcher does this in two steps: first, it uses a workqueue to get out, and then launches a kernel thread, so that the recovery routine can sleep for extended periods without upsetting the keventd. A kernel thread is created with each EEH event, rather than having one long-running daemon started at boot time. This is because it is anticipated that EEH events will be very rare (very very rare, ideally) and so it's pointless to clutter the process tables with a daemon that will almost never run. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>