summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2013-08-15Merge tag 'v3.11-rc5' into perf/coreIngo Molnar
Merge Linux 3.11-rc5, to sync up with the latest upstream fixes since -rc1. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-09powerpc/tm: Fix context switching TAR, PPR and DSCR SPRsMichael Neuling
If a transaction is rolled back, the Target Address Register (TAR), Processor Priority Register (PPR) and Data Stream Control Register (DSCR) should be restored to the checkpointed values before the transaction began. Any changes to these SPRs inside the transaction should not be visible in the abort handler. Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR. If we preempt a processes inside a transaction which has modified any of these, on process restore, that same transaction may be aborted we but we won't see the checkpointed versions of these SPRs. This adds checkpointed versions of these SPRs to the thread_struct and adds the save/restore of these three SPRs to the treclaim/trechkpt code. Without this if any of these SPRs are modified during a transaction, users may incorrectly see a speculated SPR value even if the transaction is aborted. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc: Save the TAR register earlierMichael Neuling
This moves us to save the Target Address Register (TAR) a earlier in __switch_to. It introduces a new function save_tar() to do this. We need to save the TAR earlier as we will overwrite it in the transactional memory reclaim/recheckpoint path. We are going to do this in a subsequent patch which will fix saving the TAR register when it's modified inside a transaction. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc: Fix context switch DSCR on POWER8Michael Neuling
POWER8 allows the DSCR to be accessed directly from userspace via a new SPR number 0x3 (Rather than 0x11. DSCR SPR number 0x11 is still used on POWER8 but like POWER7, is only accessible in HV and OS modes). Currently, we allow this by setting H/FSCR DSCR bit on boot. Unfortunately this doesn't work, as the kernel needs to see the DSCR change so that it knows to no longer restore the system wide version of DSCR on context switch (ie. to set thread.dscr_inherit). This clears the H/FSCR DSCR bit initially. If a process then accesses the DSCR (via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in facility_unavailable_exception(). We also change _switch() so that we set or clear the H/FSCR DSCR bit based on the thread.dscr_inherit. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc: Rework setting up H/FSCR bit definitionsMichael Neuling
This reworks the Facility Status and Control Regsiter (FSCR) config bit definitions so that we can access the bit numbers. This is needed for a subsequent patch to fix the userspace DSCR handling. HFSCR and FSCR bit definitions are the same, so reuse them. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc: Fix hypervisor facility unavaliable vector numberMichael Neuling
Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we mark it as an OS facility unavaliable (0xf60) as the two share the same code path. The becomes a problem in facility_unavailable_exception() as we aren't able to see the hypervisor facility unavailable exceptions. Below fixes this by duplication the required macros. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc/kvm/book3s_pr: Return appropriate error when allocation failsThadeu Lima de Souza Cascardo
err was overwritten by a previous function call, and checked to be 0. If the following page allocation fails, 0 is going to be returned instead of -ENOMEM. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc/kvm: Add signed type cast for comparationChen Gang
'rmls' is 'unsigned long', lpcr_rmls() will return negative number when failure occurs, so it need a type cast for comparing. 'lpid' is 'unsigned long', kvmppc_alloc_lpid() return negative number when failure occurs, so it need a type cast for comparing. Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc/eeh: Add missing procfs entry for PowerNVMike Qiu
The procfs entry for global statistics has been missed on PowerNV platform and the patch is going to add that. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc/pseries: Add backward compatibilty to read old kernel oops-logAruna Balakrishnaiah
Older kernels has just length information in their header. Handle it while reading old kernel oops log from pstore. Applies on top of powerpc/pseries: Fix buffer overflow when reading from pstore Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc/pseries: Fix buffer overflow when reading from pstoreAruna Balakrishnaiah
When reading from pstore there is a buffer overflow during decompression due to the header added in unzip_oops. Remove unzip_oops and call pstore_decompress directly in nvram_pstore_read. Allocate buffer of size report_length of the oops header as header will not be deallocated in pstore. Since we have 'openssl' command line tool to decompress the compressed data, dump the compressed data in case decompression fails instead of not dumping anything. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09powerpc: On POWERNV enable PPC_DENORMALISATION by defaultAnton Blanchard
We want PPC_DENORMALISATION enabled when POWERNV is enabled, so update the Kconfig. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org>
2013-08-02Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here is not quite a handful of powerpc fixes for rc3. The windfarm fix is a regression fix (though not a new one), the PMU interrupt rename is not a fix per-se but has been submitted a long time ago and I kept forgetting to put it in (it puts us back in sync with x86), the other perf bit is just about putting an API/ABI bit definition in the right place for userspace to consume, and finally, we have a fix for the VPHN (Virtual Partition Home Node) feature (notification that the hypervisor is moving nodes around) which could cause lockups so we may as well fix it now" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/windfarm: Fix noisy slots-fan on Xserve (rm31) powerpc: VPHN topology change updates all siblings powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspace powerpc: Rename PMU interrupts from CNT to PMI
2013-08-02Merge tag 'pci-v3.11-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "Yinghai fixed a couple regressions: one resource assignment problem introduced in v3.10 that showed up with SR-IOV on powerpc, and another SR-IOV hot-remove issue related to refcounting changes we merged for v3.11. Yinghai is still working on another SR-IOV-related fix or two, which will be simpler if pciehp is non-modular, so I included the Kconfig changes now to get them in earlier. Finally, a minor fix for the ARM Marvell EBU host bridge driver that was merged for v3.11 Hotplug: PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device PCI: hotplug: Convert to be builtin only, not modular PCI: pciehp: Convert pciehp to be builtin only, not modular Resource allocation: PCI: Retry allocation of only the resource type that failed ARM: PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge" * tag 'pci-v3.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge PCI: Retry allocation of only the resource type that failed PCI: pciehp: Convert pciehp to be builtin only, not modular PCI: hotplug: Convert to be builtin only, not modular PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
2013-08-01powerpc: VPHN topology change updates all siblingsRobert Jennings
When an associativity level change is found for one thread, the siblings threads need to be updated as well. This is done today for PRRN in stage_topology_update() but is missing for VPHN in update_cpu_associativity_changes_mask(). This patch will correctly update all thread siblings during a topology change. Without this patch a topology update can result in a CPU in init_sched_groups_power() getting stuck indefinitely in a loop. This loop is built in build_sched_groups(). As a result of the thread moving to a node separate from its siblings the struct sched_group will have its next pointer set to point to itself rather than the sched_group struct of the next thread. This happens because we have a domain without the SD_OVERLAP flag, which is correct, and a topology that doesn't conform with reality (threads on the same core assigned to different numa nodes). When this list is traversed by init_sched_groups_power() it will reach the thread's sched_group structure and loop indefinitely; the cpu will be stuck at this point. The bug was exposed when VPHN was enabled in commit b7abef0 (v3.9). Cc: <stable@vger.kernel.org> [v3.9+] Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-01powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspaceMichael Ellerman
We use bit 63 of the event code for userspace to request that the event be counted using EBB (Event Based Branches). Export this value, making it part of the API - though only on processors that support EBB. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-01powerpc: Rename PMU interrupts from CNT to PMIMichael Ellerman
Back in commit 89713ed "Add timer, performance monitor and machine check counts to /proc/interrupts" we added a count of PMU interrupts to the output of /proc/interrupts. At the time we named them "CNT" to match x86. However in commit 89ccf46 "Rename 'performance counter interrupt'", the x86 guys renamed theirs from "CNT" to "PMI". Arguably changing the name could break someone's script, but I think the chance of that is minimal, and it's preferable to have a name that 1) is somewhat meaningful, and 2) matches x86. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-25PCI: hotplug: Convert to be builtin only, not modularBjorn Helgaas
Convert CONFIG_HOTPLUG_PCI from tristate to bool. This only affects the hotplug core; several of the hotplug drivers can still be modules. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-07-24powerpc/perf: BHRB filter configuration should follow the taskAnshuman Khandual
When the task moves around the system, the corresponding cpuhw per cpu strcuture should be popullated with the BHRB filter request value so that PMU could be configured appropriately with that during the next call into power_pmu_enable(). Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/perf: Ignore separate BHRB privilege state filter requestAnshuman Khandual
Completely ignore BHRB privilege state filter request as we are already configuring that with privilege state filtering attribute for the accompanying PMU event. This would help achieve cleaner user space interaction for BHRB. This patch fixes a situation like this Before patch:- ------------ ./perf record -j any -e branch-misses:k ls Error: The sys_perf_event_open() syscall returned with 95 (Operation not supported) for event (branch-misses:k). /bin/dmesg may provide additional information. No CONFIG_PERF_EVENTS=y kernel support configured? Here 'perf record' actually copies over ':k' filter request into BHRB privilege state filter config and our previous check in kernel would fail that. After patch:- ------------- ./perf record -j any -e branch-misses:k ls perf perf.data perf.data.old test-mmap-ring [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB perf.data (~102 samples)] Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/powernv: Mark pnv_pci_init_ioda2_phb() as __initBjorn Helgaas
Mark pnv_pci_init_ioda2_phb() as __init. It is called only from an init function (pnv_pci_init()), and it calls an init function (pnv_pci_init_ioda_phb()): pnv_pci_init # init pnv_pci_init_ioda2_phb # non-init pnv_pci_init_ioda_phb # init This should fix a section mismatch warning. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/mm: Use the correct SLB(LLP) encoding in tlbie instructionAneesh Kumar K.V
The sllp value is stored in mmu_psize_defs in such a way that we can easily OR the value to get the operand for slbmte instruction. ie, the L and LP bits are not contiguous. Decode the bits and use them correctly in tlbie. regression is introduced by 1f6aaaccb1b3af8613fe45781c1aefee2ae8c6b3 "powerpc: Update tlbie/tlbiel as per ISA doc" Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/mm: Fix fallthrough bug in hpte_decodeAneesh Kumar K.V
We should not fallthrough different case statements in hpte_decode. Add break statement to break out of the switch. The regression is introduced by dcda287a9b26309ae43a091d0ecde16f8f61b4c0 "powerpc/mm: Simplify hpte_decode" Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/pseries: Fix a typo in pSeries_lpar_hpte_insert()Denis Kirjanov
Commit 801eb73f45371accc78ca9d6d22d647eeb722c11 introduced a bug while checking PTE flags. We have to drop the _PAGE_COHERENT flag when __PAGE_NO_CACHE is set and the cache update policy is not write-through (i.e. _PAGE_WRITETHRU is not set) Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> CC: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Introdce flag to protect sysfsGavin Shan
The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs entries for the corresponding EEH device (then PCI device) has been added or removed, in order to avoid race condition. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Fix unbalanced enable for IRQGavin Shan
The patch fixes following issue: Unbalanced enable for IRQ 23 ------------[ cut here ]------------ WARNING: at kernel/irq/manage.c:437 : NIP [c00000000016de8c] .__enable_irq+0x11c/0x140 LR [c00000000016de88] .__enable_irq+0x118/0x140 Call Trace: [c000003ea1f23880] [c00000000016de88] .__enable_irq+0x118/0x140 (unreliable) [c000003ea1f23910] [c00000000016df08] .enable_irq+0x58/0xa0 [c000003ea1f239a0] [c0000000000388b4] .eeh_enable_irq+0xc4/0xe0 [c000003ea1f23a30] [c000000000038a28] .eeh_report_reset+0x78/0x130 [c000003ea1f23ac0] [c000000000037508] .eeh_pe_dev_traverse+0x98/0x170 [c000003ea1f23b60] [c0000000000391ac] .eeh_handle_normal_event+0x2fc/0x3d0 [c000003ea1f23bf0] [c000000000039538] .eeh_handle_event+0x2b8/0x2c0 [c000003ea1f23c90] [c000000000039600] .eeh_event_handler+0xc0/0x170 [c000003ea1f23d30] [c0000000000da9a0] .kthread+0xf0/0x100 [c000003ea1f23e30] [c00000000000a1dc] .ret_from_kernel_thread+0x5c/0x80 Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Don't use pci_dev during BAR restoreGavin Shan
While restoring BARs for one specific PCI device, the pci_dev instance should have been released. So it's not reliable to use the pci_dev instance on restoring BARs. However, we still need some information (e.g. PCIe capability position, header type) from the pci_dev instance. So we have to store those information to EEH device in advance. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Use partial hotplug for EEH unaware driversGavin Shan
When EEH error happens to one specific PE, some devices with drivers supporting EEH won't except hotplug on the device. However, there might have other deivces without driver, or with driver without EEH support. For the case, we need do partial hotplug in order to make sure that the PE becomes absolutely quite during reset. Otherise, the PE reset might fail and leads to failure of error recovery. The current code doesn't handle that 'mixed' case properly, it either uses the error callbacks to the drivers, or tries hotplug, but doesn't handle a PE (EEH domain) composed of a combination of the two. The patch intends to support so-called "partial" hotplug for EEH: Before we do reset, we stop and remove those PCI devices without EEH sensitive driver. The corresponding EEH devices are not detached from its PE, but with special flag. After the reset is done, those EEH devices with the special flag will be scanned one by one. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/pci: Partial tree hotplug supportGavin Shan
When EEH error happens to one specific PE, the device drivers of its attached EEH devices (PCI devices) are checked to see the further action: reset with complete hotplug, or reset without hotplug. However, that's not enough for those PCI devices whose drivers can't support EEH, or those PCI devices without driver. So we need do so-called "partial hotplug" on basis of PCI devices. In the situation, part of PCI devices of the specific PE are unplugged and plugged again after PE reset. The patch changes pcibios_add_pci_devices() so that it can support full hotplug and so-called "partial" hotplug based on device-tree or real hardware. It's notable that pci_of_scan.c has been changed for a bit in order to support the "partial" hotplug based on dev-tree. Most of the generic code already supports that, we just need to plumb it properly on our side. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Use safe list traversal when walking EEH devicesGavin Shan
Currently, we're trasversing the EEH devices list using list_for_each_entry(). That's not safe enough because the EEH devices might be removed from its parent PE while doing iteration. The patch replaces that with list_for_each_entry_safe(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Keep PE during hotplugGavin Shan
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be kept around. However, we need to keep it if the hotplug an artifial one caused by EEH errors recovery. Since we remove EEH device through the PCI hook pcibios_release_device(), the flag "purge_pe" passed to various functions is meaningless. So the patch removes the meaningless flag and introduce new flag "EEH_PE_KEEP" to save the PE while doing hotplug during EEH error recovery. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/pci: Override pcibios_release_device()Gavin Shan
The patch overrides pcibios_release_device() to release EEH resources (EEH cache, unbinding EEH device) for the indicated PCI device. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Export functions for hotplugGavin Shan
Make some functions public in order to support hotplug on either specific PCI bus or PCI device in future. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Remove reference to PCI deviceGavin Shan
We will rely on pcibios_release_device() to remove the EEH cache and unbind EEH device for the specific PCI device. So we shouldn't hold the reference to the PCI device from EEH cache and EEH device. Otherwise, pcibios_release_device() won't be called as we expected. The patch removes the reference to the PCI device in EEH core. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc: Fix the corrupt r3 error during MCE handling.Mahesh Salgaonkar
During Machine Check interrupt on pseries platform, R3 generally points to memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS delivers the machine check exception it passes the address inside FWNMI area with the top most bit set. This patch fixes this issue by masking top two bit in machine check exception handler. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/perf: Set PPC_FEATURE2_EBB when we register the power8 PMUMichael Ellerman
The presence or absence of EBB is advertised to userspace via the presence or absence of PPC_FEATURE2_EBB in cpu_user_features2. Because the kernel can be built without PMU support, we should only add PPC_FEATURE2_EBB to cpu_user_features2 when we successfully register the power8 PMU support. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/pseries: Drop "select HOTPLUG"Paul Bolle
The Kconfig symbol HOTPLUG was removed with commit 40b313608a ("Finally eradicate CONFIG_HOTPLUG"). But there's still one select statement for that symbol. It seems that select statement was added after the patch to remove CONFIG_HOTPLUG was submitted. Anyhow, it is useless and can be dropped. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc: Access local paca after hard irq disabledTiejun Chen
In hard_irq_disable(), we accessed the PACA before we hard disabled the interrupts, potentially causing a warning as get_paca() will us debug_smp_processor_id(). Move that to after the disabling, and also use local_paca directly rather than get_paca() to avoid several redundant and useless checks. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/modules: Module CRC relocation fix causes perf issuesAnton Blanchard
Module CRCs are implemented as absolute symbols that get resolved by a linker script. We build an intermediate .o that contains an unresolved symbol for each CRC. genksysms parses this .o, calculates the CRCs and writes a linker script that "resolves" the symbols to the calculated CRC. Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols that need relocating and relocates them at boot. Commit d4703aef (module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y) added a hook to reverse the bogus relocations. Part of this patch created a symbol at 0x0: # head -2 /proc/kallsyms 0000000000000000 T reloc_start c000000000000000 T .__start This reloc_start symbol is causing lots of confusion to perf. It thinks reloc_start is a massive function that stretches from 0x0 to 0xc000000000000000 and we get various cryptic errors out of perf, including: problem incrementing symbol count, skipping event This patch removes the reloc_start linker script label and instead defines it as PHYSICAL_START. We also need to wrap it with CONFIG_PPC64 because the ppc32 kernel can set a non zero PHYSICAL_START at compile time and we wouldn't want to subtract it from the CRCs in that case. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc: Add second POWER8 PVR entryMichael Neuling
POWER8 comes with two different PVRs. This patch enables the additional PVR in the cputable. The existing entry (PVR=0x4b) is renamed to POWER8E and the new entry (PVR=0x4d) is given POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-19Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Add missing 'finished_round' event forwarding in 'perf inject', from Adrian Hunter. * Assorted tidy ups, from Adrian Hunter. * Fall back to sysfs event names when parsing fails, from Andi Kleen. * List pmu events in perf list, from Andi Kleen. * Cleanup some memory allocation/freeing uses, from David Ahern. * Add option to collapse undesired parts of call graph, from Greg Price. * Prep work for multi perf data file storage, from Jiri Olsa. * Add support for more than two files comparision in 'perf diff', from Jiri Olsa * A few more 'perf test' improvements, from Jiri Olsa * libtraceevent cleanups, from Namhyung Kim. * Remove odd build stall in 'perf sched' by moving a large struct initialization from a local variable to a global one, from Namhyung Kim. * Add support for callchains in the gtk UI, from Namhyung Kim. * Do not apply symfs for an absolute vmlinux path, fix from Namhyung Kim. * Use default include path notation for libtraceevent, from Robert Richter. * Fix 'make tools/perf', from Robert Richter. * Make Power7 events available, from Runzhen Wang. * Add --objdump option to 'perf top', from Sukadev Bhattiprolu. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Just a bunch of small fixes and tidy ups: 1) Finish the "busy_poll" renames, from Eliezer Tamir. 2) Fix RCU stalls in IFB driver, from Ding Tianhong. 3) Linearize buffers properly in tun/macvtap zerocopy code. 4) Don't crash on rmmod in vxlan, from Pravin B Shelar. 5) Spinlock used before init in alx driver, from Maarten Lankhorst. 6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry Kravkov. 7) Dummy and ifb driver load failure paths can oops, fixes from Tan Xiaojun and Ding Tianhong. 8) Correct MTU calculations in IP tunnels, from Alexander Duyck. 9) Account all TCP retransmits in SNMP stats properly, from Yuchung Cheng. 10) atl1e and via-rhine do not handle DMA mapping failures properly, from Neil Horman. 11) Various equal-cost multipath route fixes in ipv6 from Hannes Frederic Sowa" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) ipv6: only static routes qualify for equal cost multipathing via-rhine: fix dma mapping errors atl1e: fix dma mapping warnings tcp: account all retransmit failures usb/net/r815x: fix cast to restricted __le32 usb/net/r8152: fix integer overflow in expression net: access page->private by using page_private net: strict_strtoul is obsolete, use kstrtoul instead drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe net/usb: add relative mii functions for r815x net/tipc: use %*phC to dump small buffers in hex form qlcnic: Adding Maintainers. gre: Fix MTU sizing check for gretap tunnels pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts pkt_sched: sch_qfq: improve efficiency of make_eligible gso: Update tunnel segmentation to support Tx checksum offload inet: fix spacing in assignment ifb: fix oops when loading the ifb failed ...
2013-07-13Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: - fix for do_div() abuse on x86 - locking fix in perf core - a pile of (build) fixes and cleanups in perf tools * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) perf/x86: Fix incorrect use of do_div() in NMI warning perf: Fix perf_lock_task_context() vs RCU perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario perf: Clone child context from parent context pmu perf script: Fix broken include in Context.xs perf tools: Fix -ldw/-lelf link test when static linking perf tools: Revert regression in configuration of Python support perf tools: Fix perf version generation perf stat: Fix per-socket output bug for uncore events perf symbols: Fix vdso list searching perf evsel: Fix missing increment in sample parsing perf tools: Update symbol_conf.nr_events when processing attribute events perf tools: Fix new_term() missing free on error path perf tools: Fix parse_events_terms() segfault on error path perf evsel: Fix count parameter to read call in event_format__new perf tools: fix a typo of a Power7 event name perf tools: Fix -x/--exclude-other option for report command perf evlist: Enhance perf_evlist__start_workload() perf record: Remove -f/--force option perf record: Remove -A/--append option ...
2013-07-12perf tools: Make Power7 events available for perfRunzhen Wang
Power7 supports over 530 different perf events but only a small subset of these can be specified by name, for the remaining events, we must specify them by their raw code: perf stat -e r2003c <application> This patch makes all the POWER7 events available in sysfs. So we can instead specify these as: perf stat -e 'cpu/PM_CMPLU_STALL_DFU/' <application> where PM_CMPLU_STALL_DFU is the r2003c in previous example. Before this patch is applied, the size of power7-pmu.o is: $ size arch/powerpc/perf/power7-pmu.o text data bss dec hex filename 3073 2720 0 5793 16a1 arch/powerpc/perf/power7-pmu.o and after the patch is applied, it is: $ size arch/powerpc/perf/power7-pmu.o text data bss dec hex filename 15950 31112 0 47062 b7d6 arch/powerpc/perf/power7-pmu.o For the run time overhead, I use two scripts, one is "event_name.sh", which contains 50 event names, it looks like: # ./perf record -e 'cpu/PM_CMPLU_STALL_DFU/' -e ..... /bin/sleep 1 the other one is named "event_code.sh" which use corresponding events raw code instead of events names, it looks like: # ./perf record -e r2003c -e ...... /bin/sleep 1 below is the result. Using events name: [root@localhost perf]# time ./event_name.sh [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB perf.data (~102 samples) ] real 0m1.192s user 0m0.028s sys 0m0.106s Using events raw code: [root@localhost perf]# time ./event_code.sh [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.003 MB perf.data (~112 samples) ] real 0m1.198s user 0m0.028s sys 0m0.105s Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-3-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-10mm: remove free_area_cacheMichel Lespinasse
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-10net: rename busy poll socket op and globalsEliezer Tamir
Rename LL_SO to BUSY_POLL_SO Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll} Fix up users of these variables. Fix documentation for sysctl. a patch for the socket.7 man page will follow separately, because of limitations of my mail setup. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit 0a4db187a999 ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...
2013-07-09ptrace/powerpc: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"Oleg Nesterov
This reverts commit 07fa7a0a8a58 ("hw_breakpoints: Fix racy access to ptrace breakpoints") and removes ptrace_get/put_breakpoints() added by other commits. The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michael Neuling <mikey@neuling.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-08perf tools: fix a typo of a Power7 event nameRunzhen Wang
In the Power7 PMU guide: https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis/ PM_BRU_MPRED is referred to as PM_BR_MPRED. It fixed the typo by changing the name of the event in kernel and documentation accordingly. This patch changes the ABI, there are some reasons I think it's ok: - It is relatively new interface, specific to the Power7 platform. - No tools that we know of actually use this interface at this point (none are listed near the interface). - Users of this interface (eg oprofile users migrating to perf) would be more used to the "PM_BR_MPRED" rather than "PM_BRU_MPRED". - These are in the ABI/testing at this point rather than ABI/stable, so hoping we have some wiggle room. Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-2-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-06Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull irqdomain refactoring from Grant Likely: "This is the long awaited simplification of irqdomain. It gets rid of the different types of irq domains and instead both linear and tree mappings can be supported in a single domain. Doing this removes a lot of special case code and makes irq domains simpler to understand overall" * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux: irq: fix checkpatch error irqdomain: Include hwirq number in /proc/interrupts irqdomain: make irq_linear_revmap() a fast path again irqdomain: remove irq_domain_generate_simple() irqdomain: Refactor irq_domain_associate_many() irqdomain: Beef up debugfs output irqdomain: Clean up aftermath of irq_domain refactoring irqdomain: Eliminate revmap type irqdomain: merge linear and tree reverse mappings. irqdomain: Add a name field irqdomain: Replace LEGACY mapping with LINEAR irqdomain: Relax failure path on setting up mappings