summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-09-18powerpc/eeh: Introduce EEH_PE_INVALID type PEGavin Shan
When EEH error happens on the PE whose PCI devices don't have attached drivers. In function eeh_handle_event(), the default value PCI_ERS_RESULT_NONE will be returned after iterating all drivers of those PCI devices belonging to the PE. Actually, we don't have installed drivers for the PCI devices. Under the circumstance, we will remove the corresponding PCI bus of the PE, including the associated EEH devices and PE instance. However, we still need the information stored in the PE instance to do PE reset after that. So it's unsafe to free the PE instance. The patch introduces EEH_PE_INVALID type PE to address the issue. When the PCI bus and the corresponding attached EEH devices are removed, we will mark the PE as EEH_PE_INVALID. At later point, the PE will be changed to EEH_PE_DEVICE or EEH_PE_BUS when the corresponding EEH devices are attached again. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-18powerpc: Add an xmon command to dump one or all pacasMichael Ellerman
This was originally motivated by a desire to see the mapping between logical and hardware cpu numbers. But it seemed that it made more sense to just add a command to dump (most of) the paca. With no arguments "dp" will dump the paca for the current cpu. It also takes an argument, eg. "dp 3" which is the logical cpu number in hex. This form does not check if the cpu is possible, but displays the paca regardless, as well as the cpu's state in the possible, present and online masks. Thirdly, "dpa" will display the paca for all possible cpus. If there are no possible cpus, like early in boot, it will tell you that. Sample output, number in brackets is the offset into the struct: 2:mon> dp 3 paca for cpu 0x3 @ c00000000ff20a80: possible = yes present = yes online = yes lock_token = 0x8000 (0x8) paca_index = 0x3 (0xa) kernel_toc = 0xc00000000144f990 (0x10) kernelbase = 0xc000000000000000 (0x18) kernel_msr = 0xb000000000001032 (0x20) stab_real = 0x0 (0x28) stab_addr = 0x0 (0x30) emergency_sp = 0xc00000003ffe4000 (0x38) data_offset = 0xa40000 (0x40) hw_cpu_id = 0x9 (0x50) cpu_start = 0x1 (0x52) kexec_state = 0x0 (0x53) __current = 0xc00000007e568680 (0x218) kstack = 0xc00000007e5a3e30 (0x220) stab_rr = 0x1a (0x228) saved_r1 = 0xc00000007e7cb450 (0x230) trap_save = 0x0 (0x240) soft_enabled = 0x0 (0x242) irq_happened = 0x0 (0x243) io_sync = 0x0 (0x244) irq_work_pending = 0x0 (0x245) nap_state_lost = 0x0 (0x246) Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Remove unused functionsGavin Shan
We don't need them anymore. The patch removes those functions. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Using PCI core to do resource assignmentGavin Shan
Currently, the PCI probe flags "PCI_PROBE_ONLY | PCI_REASSIGN_ALL_RSRC" used on powernv platform. That means the platform has to do the PCI resource assignment by itself. The patch changes the PCI probe flag to "PCI_REASSIGN_ALL_RSRC" so that the PCI core will do the resource assignment. Also, the I/O and MMIO minimal alignment for P2P bridges have been configured while doing fixup for the PHBs. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Fix overrunning segment tracing arrayGavin Shan
There're 2 arrays introduced to trace which PE has occupied the corresponding resource (I/O or MMIO) segment. However, we didn't allocate enough memory for them and that possiblly leads to PE descriptor corruption. The patch fixes that by allocating enough memory for those 2 arrays. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Skip check on PE if necessaryGavin Shan
While the device driver or PCI core tries to enable PCI device, the platform dependent callback "ppc_md.pcibios_enable_device_hook" will be called to check if there has one associated PE for the PCI device. If we don't have the associated PE for the PCI device, it's not allowed to enable the PCI device. Unfortunately, there might have some cases we have to enable the PCI device (e.g. P2P bridge), but the PEs have not been created yet. The patch handles the unfortunate cases. Each PHB (struct pnv_phb) has one field "initialized" to trace if the PEs have been created and configured or not. When the PEs are not available, we won't check the associated PE for the PCI device to be enabled. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Initialize DMA for PEsGavin Shan
The patch introduces additional wrapper function to call the original implementation so that the DMA can be configured for all existing PEs. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: I/O and MMIO resource assignment for PEsGavin Shan
There're 2 types of PCI bus sensitive PEs: (A) The PE includes single PCI bus. (B) The PE includes the PCI bus and all the subordinate PCI buses, and the patch tries to assign I/O and MMIO resources based on created PEs. Fortunately, we figured out unified scheme to do resource assignment for all types of PCI bus based PEs according to Ben's idea: - Resource assignment based on PE from top to bottom. - The soureces, either I/O or MMIO, of the PE are figured out from the assigned PCI bus. - The occupied resource by parent PE could possibilly be overrided by children PEs. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: PE list based on creation orderGavin Shan
The resource (I/O and MMIO) will be assigned on basis of PE from top to bottom so that we can implement the trick here: the resource that has been assigned to parent PE could be taken by child PE if necessary. The current implementation already has PE list per PHB basis, but the list doesn't meet our requirment: tracing PE based on their cration time from top to bottom. So the patch does rename for the DMA based PE list and introduces the list to trace the PEs sequentially based on their creation time. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/powernv: Create bus sensitive PEsGavin Shan
Basically, there're 2 types of PCI bus sensitive PEs: (A) The PE includes single PCI bus. (B) The PE includes the PCI bus and all the subordinate PCI buses. At present, we'd like to put PCI bus originated by PCI-e link to form PE that contains single PCI bus, and the PCIe-to-PCI bridge will form the 2nd type of PE. We don't figure out to detect PLX bridge yet. Once we can detect PLX bridge some day, we have to put PCI buses originated from the downstream port of PLX bridge to the 2nd type of PE. The patch changes the original implementation for a little bit to support 2 types of PCI bus sensitive PEs described as above. Also, the function used to retrieve the corresponding PE according to the given PCI device has been changed based on that because each PCI device should trace the directly associated PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Richard Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/trace: Fix interrupt tracepoints vs. RCULi Zhong
There are a few tracepoints in the interrupt code path, which is before irq_enter(), or after irq_exit(), like trace_irq_entry()/trace_irq_exit() in do_IRQ(), trace_timer_interrupt_entry()/trace_timer_interrupt_exit() in timer_interrupt(). If the interrupt is from idle(), and because tracepoint contains RCU read-side critical section, we could see following suspicious RCU usage reported: [ 145.127743] =============================== [ 145.127747] [ INFO: suspicious RCU usage. ] [ 145.127752] 3.6.0-rc3+ #1 Not tainted [ 145.127755] ------------------------------- [ 145.127759] /root/.workdir/linux/arch/powerpc/include/asm/trace.h:33 suspicious rcu_dereference_check() usage! [ 145.127765] [ 145.127765] other info that might help us debug this: [ 145.127765] [ 145.127771] [ 145.127771] RCU used illegally from idle CPU! [ 145.127771] rcu_scheduler_active = 1, debug_locks = 0 [ 145.127777] RCU used illegally from extended quiescent state! [ 145.127781] no locks held by swapper/0/0. [ 145.127785] [ 145.127785] stack backtrace: [ 145.127789] Call Trace: [ 145.127796] [c00000000108b530] [c000000000013c40] .show_stack +0x70/0x1c0 (unreliable) [ 145.127806] [c00000000108b5e0] [c0000000000f59d8] .lockdep_rcu_suspicious+0x118/0x150 [ 145.127813] [c00000000108b680] [c00000000000fc58] .do_IRQ+0x498/0x500 [ 145.127820] [c00000000108b750] [c000000000003950] hardware_interrupt_common+0x150/0x180 [ 145.127828] --- Exception: 501 at .plpar_hcall_norets+0x84/0xd4 [ 145.127828] LR = .check_and_cede_processor+0x38/0x70 [ 145.127836] [c00000000108bab0] [c0000000000665dc] .shared_cede_loop +0x5c/0x100 [ 145.127844] [c00000000108bb70] [c000000000588ab0] .cpuidle_enter +0x30/0x50 [ 145.127850] [c00000000108bbe0] [c000000000588b0c] .cpuidle_enter_state+0x3c/0xb0 [ 145.127857] [c00000000108bc60] [c000000000589730] .cpuidle_idle_call +0x150/0x6c0 [ 145.127863] [c00000000108bd30] [c000000000058440] .pSeries_idle +0x10/0x40 [ 145.127870] [c00000000108bda0] [c00000000001683c] .cpu_idle +0x18c/0x2d0 [ 145.127876] [c00000000108be60] [c00000000000b434] .rest_init +0x124/0x1b0 [ 145.127884] [c00000000108bef0] [c0000000009d0d28] .start_kernel +0x568/0x588 [ 145.127890] [c00000000108bf90] [c000000000009660] .start_here_common +0x20/0x40 This is because the RCU usage in interrupt context should be used in area marked by rcu_irq_enter()/rcu_irq_exit(), called in irq_enter()/irq_exit() respectively. Move them into the irq_enter()/irq_exit() area to avoid the reporting. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Make some of the PGTABLE_RANGE dependency explicitAneesh Kumar K.V
slice array size and slice mask size depend on PGTABLE_RANGE. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Update VSID allocation documentationAneesh Kumar K.V
This update the proto-VSID and VSID scramble related information to be more generic by using names instead of current values. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Add 64TB supportAneesh Kumar K.V
Increase max addressable range to 64TB. This is not tested on real hardware yet. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Use 32bit array for slb cacheAneesh Kumar K.V
With larger vsid we need to track more bits of ESID in slb cache for slb invalidate. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Use the required number of VSID bits in slbmteAneesh Kumar K.V
ASM_VSID_SCRAMBLE can leave non-zero bits in the high 28 bits of the result for 256MB segment (40 bits for 1T segment). Properly mask them before using the values in slbmte Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Increase the slice range to 64TBAneesh Kumar K.V
This patch makes the high psizes mask as an unsigned char array so that we can have more than 16TB. Currently we support upto 64TB Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Make KERN_VIRT_SIZE not dependend on PGTABLE_RANGEAneesh Kumar K.V
As we keep increasing PGTABLE_RANGE we need not increase the virual map area for kernel. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Convert virtual address to vpnAneesh Kumar K.V
This patch convert different functions to take virtual page number instead of virtual address. Virtual page number is virtual address shifted right by VPN_SHIFT (12) bits. This enable us to have an address range of upto 76 bits. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Simplify hpte_decodeAneesh Kumar K.V
This patch simplify hpte_decode for easy switching of virtual address to virtual page number in the later patch Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Use hpt_va to compute virtual addressAneesh Kumar K.V
Don't open code the same Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Replace open coded CONTEXT_BITS valueAneesh Kumar K.V
To clarify the meaning for future readers, replace the open coded 19 with CONTEXT_BITS Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Fix typo in PTRS_PER_PUDScott Wood
PTRS_PER_PUD should be based on PUD_INDEX_SIZE, not PMD_INDEX_SIZE. We got away with it because PUD and PMD had the same index size, but this is no longer true with Aneesh's patchset to support a 46-bit user effective address space. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc: Add denormalisation exception handling for POWER6/7Michael Neuling
On POWER6 and POWER7 if the input operand to an instruction is a denormalised single precision binary floating point value we can take a denormalisation exception where it's expected that the hypervisor (HV=1) will fix up the inputs before the instruction is run. This adds code to handle this denormalisation exception for POWER6 and POWER7. It also add a CONFIG_PPC_DENORMALISATION option and sets it in pseries/ppc64_defconfig. This is useful on bare metal systems only. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17Merge remote-tracking branch 'pci/pci/gavin-window-alignment' into nextBenjamin Herrenschmidt
Merge Gavin patches from the PCI tree as subsequent powerpc patches are going to depend on them Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-13Merge commit 'v3.6-rc5' into pci/gavin-window-alignmentBjorn Helgaas
* commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c
2012-09-11powerpc/powernv: I/O and memory alignment for P2P bridgesGavin Shan
The patch implements ppc_md.pcibios_window_alignment for powernv platform so that the resource reassignment in PCI core will be done according to the I/O and memory alignment returned from powernv platform. The alignments returned from powernv platform is closely depending on the scheme for PE segmenting. Besides, the patch isn't useful for now, but the subsequent patches will be working based on it. [bhelgaas: use pci_pcie_type() since pci_dev.pcie_type was removed] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-11powerpc/PCI: Override pcibios_window_alignment()Gavin Shan
This patch implements pcibios_window_alignment() so powerpc platforms can force P2P bridge windows to be at larger alignments than the PCI spec requires. [bhelgaas: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-11PCI: Refactor pbus_size_mem()Gavin Shan
The original idea comes from Ram Pai. This patch puts the chunk of code for calculating the minimal alignment of memory window into a separate inline function. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-11PCI: Align P2P windows using pcibios_window_alignment()Gavin Shan
This patch changes pbus_size_io() and pbus_size_mem() to do window (I/O, memory and prefetchable memory) reassignment based on the minimal alignments for the P2P bridge, which was retrieved by window_alignment(). [bhelgaas: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-11PCI: Add weak pcibios_window_alignment() interfaceGavin Shan
This patch implements a weak function to return the default I/O or memory window alignment for a P2P bridge. By default, I/O windows are aligned to 4KiB or 1KiB and memory windows are aligned to 4MiB. Some platforms, e.g., powernv, have special alignment requirements and can override pcibios_window_alignment(). [bhelgaas: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-10powerpc/mm: Match variable types to APIJoe MacDonald
sys_subpage_prot() takes an unsigned long for 'addr' then does some stuff with it and the result is stored in a signed int, i, which is eventually used as the size parameter in a copy_from_user call. Update 'i' to be an unsigned long as well and since 'nw' is used in a size_t context which, depending on whether this is 32- or 64-bit may be unsigned int or unsigned long, switch that to a size_t and always be right. Finally, since we're in the neighbourhood, make the same changes to subpage_prot_clear(). Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Joe MacDonald <joe.macdonald@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/iommu: Add ppc_md.tce_get() callback for use by VFIOAlexey Kardashevskiy
The upcoming VFIO support requires a way to know which entry in the TCE map is not empty in order to do cleanup at QEMU exit/crash. This patch adds such functionality to POWERNV platform code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc: cleanup old DABRX #definesMichael Neuling
These are no longer used so get rid of them Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc: Dynamically calculate the dabrx based on kernel/user/hypervisorMichael Neuling
Currently we mark the DABRX to interrupt on all matches (hypervisor/kernel/user and then filter in software. We can be a lot smarter now that we can set the DABRX dynamically. This sets the DABRX based on the flags passed by the user. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc: Rework set_dabr so it can take a DABRX value as wellMichael Neuling
Rework set_dabr to take a DABRX value as well. Both the pseries and PS3 hypervisors do some checks on the DABRX values that are passed in the hcall. This patch stops bogus values from being passed to hypervisor. Also, in the case where we are clearing the breakpoint, where DABR and DABRX are zero, we modify the DABRX value to make it valid so that the hcall won't fail. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Cleanup on EEH PCI address cacheGavin Shan
The patch does cleanup on EEH PCI address cache based on the fact EEH core is the only user of the component. * Cleanup on function names so that they all have prefix "eeh" and looks more short. * Function printk() has been replaced with pr_debug() or pr_warning() accordingly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Trace eeh device from I/O cacheGavin Shan
The idea comes from Benjamin Herrenschmidt. The eeh cache helps fetching the pci device according to the given I/O address. Since the eeh cache is serving for eeh, it's reasonable for eeh cache to trace eeh device except pci device. The patch make eeh cache to trace eeh device. Also, the major eeh entry function eeh_dn_check_failure has been renamed to eeh_dev_check_failure since it will take eeh device as input parameter. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Probe mode supportGavin Shan
While EEH module is installed, PCI devices is checked one by one to see if it supports eeh. On different platforms, the PCI devices are referred through different ways when the EEH module is loaded. For example, on pSeries platform, that is done by OF node. However, we would do that by real PCI devices (struct pci_dev) on PowerNV platform in future. So we needs some mechanism to differentiate those cases by classifying them to probe modes, either from OF nodes or real PCI devices. The patch implements the support to eeh probe mode. Also, the EEH on pSeries has set it into EEH_PROBE_MODE_DEVTREE. That means the probe will be done based on OF nodes on pSeries platform. In addition, On pSeries platform, it's done by OF nodes. The patch moves the the probe function from EEH core to platform dependent backend and some cleanup applied. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Move stats to PEGavin Shan
The patch removes the eeh related statistics for eeh device since they have been maintained by the corresponding eeh PE. Also, the flags used to trace the state of eeh device and PE have been reworked for a little bit. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Handle EEH error based on PEGavin Shan
The patch reworks the current implementation so that the eeh errors will be handled basing on PE instead of eeh device. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Make EEH handler PE sensitiveGavin Shan
Once eeh error is found, eeh event will be created and put it into the global linked list. At the mean while, kernel thread will be started to process it. The handler for the kernel thread originally was eeh device sensitive. The patch reworks the handler of the kernel thread so that it's PE sensitive. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Do reset based on PEGavin Shan
The patch implements reset based on PE instead of eeh device. Also, The functions used to retrieve the reset type, either hot or fundamental reset, have been reworked for a little bit. More specificly, it's implemented based the the eeh device traverse function. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: I/O enable and log retrival based on PEGavin Shan
The patch refactors the original implementation in order to enable I/O and retrieve EEH log based on PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Device bars restore based on PEGavin Shan
The patch introduces the function to traverse the devices of the specified PE and its child PEs. Also, the restore on device bars is implemented based on the traverse function. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Make EEH operations based on PEGavin Shan
Originally, all the EEH operations were implemented based on OF node. Actually, it explicitly breaks the rules that the operation target is PE instead of device. Therefore, the patch makes all the operations based on PE instead of device. Unfortunately, the backend for config space has to be kept as original because it doesn't depend on PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Trace error based on PE from beginningGavin Shan
There're 2 conditions to trigger EEH error detection: invalid value returned from reading I/O or config space. On each case, the function eeh_dn_check_failure will be called to initialize EEH event and put it into the poll for further processing. The patch changes the function for a little bit so that the EEH error will be traced based on PE instead of EEH device any more. Also, the function eeh_find_device_pe() has been removed since the eeh device is tracing the PE by struct eeh_dev::pe. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Trace EEH state based on PEGavin Shan
Since we've introduced dedicated struct to trace individual PEs, it's reasonable to trace its state through the dedicated struct instead of using "eeh_dev" any more. The patches implements the state tracing based on PE. It's notable that the PE state will be applied to the specified PE as well as its child PEs. That complies with the rule that problematic parent PE will prevent those child PEs from working properly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Build EEH event based on PEGavin Shan
The original implementation builds EEH event based on EEH device. We already had dedicated struct to depict PE. It's reasonable to build EEH event based on PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10powerpc/eeh: Remove PE at appropriate timeGavin Shan
During PCI hotplug and EEH recovery, the PE hierarchy tree might be changed due to the PCI topology changes. At later point when the PCI device is added, the PE will be created dynamically again. The patch introduces new function to remove EEH devices from the associated PE. That also can cause that the parent PE is removed from the PE tree if the parent PE doesn't include valid EEH devices and child PEs. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>