summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2020-01-23powerpc/vdso32: Don't read cache line size from the datapage on PPC32.Christophe Leroy
On PPC32, the cache lines have a fixed size known at build time. Don't read it from the datapage. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dfa7b35e27e01964fcda84bf1ed8b2b31cf93826.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: inline __get_datapage()Christophe Leroy
__get_datapage() is only a few instructions to retrieve the address of the page where the kernel stores data to the VDSO. By inlining this function into its users, a bl/blr pair and a mflr/mtlr pair is avoided, plus a few reg moves. The improvement is noticeable (about 55 nsec/call on an 8xx) vdsotest before the patch: gettimeofday: vdso: 731 nsec/call clock-gettime-realtime-coarse: vdso: 668 nsec/call clock-gettime-monotonic-coarse: vdso: 745 nsec/call vdsotest after the patch: gettimeofday: vdso: 677 nsec/call clock-gettime-realtime-coarse: vdso: 613 nsec/call clock-gettime-monotonic-coarse: vdso: 690 nsec/call Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c39ef7f3dfa25356b01e211d539671f279086c09.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSEChristophe Leroy
This is copied and adapted from commit 5c929885f1bb ("powerpc/vdso64: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE") from Santosh Sivaraj <santosh@fossix.org> Benchmark from vdsotest-all: clock-gettime-realtime: syscall: 3601 nsec/call clock-gettime-realtime: libc: 1072 nsec/call clock-gettime-realtime: vdso: 931 nsec/call clock-gettime-monotonic: syscall: 4034 nsec/call clock-gettime-monotonic: libc: 1213 nsec/call clock-gettime-monotonic: vdso: 1076 nsec/call clock-gettime-realtime-coarse: syscall: 2722 nsec/call clock-gettime-realtime-coarse: libc: 805 nsec/call clock-gettime-realtime-coarse: vdso: 668 nsec/call clock-gettime-monotonic-coarse: syscall: 2949 nsec/call clock-gettime-monotonic-coarse: libc: 882 nsec/call clock-gettime-monotonic-coarse: vdso: 745 nsec/call Additional test passed with: vdsotest -d 30 clock-gettime-monotonic-coarse verify Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/linuxppc/issues/issues/41 Link: https://lore.kernel.org/r/d1d24a376e396540194eeb85a2efe481e92ade24.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/32: Add VDSO version of getcpu on non SMPChristophe Leroy
Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added getcpu() for PPC64 only, by making use of a user readable general purpose SPR. PPC32 doesn't have any such SPR. For non SMP, just return CPU id 0 from the VDSO directly. PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0. Before the patch, vdsotest reported: getcpu: syscall: 1572 nsec/call getcpu: libc: 1787 nsec/call getcpu: vdso: not tested Now, vdsotest reports: getcpu: syscall: 1582 nsec/call getcpu: libc: 502 nsec/call getcpu: vdso: 187 nsec/call Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eaac4b6494ecff1811220fccc895bf282aab884a.1575273217.git.christophe.leroy@c-s.fr
2020-01-23powerpc/devicetrees: Change 'gpios' to 'cs-gpios' on fsl, spi nodesChristophe Leroy
Since commit 0f0581b24bd0 ("spi: fsl: Convert to use CS GPIO descriptors"), the prefered way to define chipselect GPIOs is using 'cs-gpios' property instead of the legacy 'gpios' property. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7556683b57d8ce100855857f03d1cd3d2903d045.1574943062.git.christophe.leroy@c-s.fr
2020-01-23powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.Christophe Leroy
Unlike standard powerpc, Powerpc 8xx doesn't have SPRN_DABR, but it has a breakpoint support based on a set of comparators which allow more flexibility. Commit 4ad8622dc548 ("powerpc/8xx: Implement hw_breakpoint") implemented breakpoints by emulating the DABR behaviour. It did this by setting one comparator the match 4 bytes at breakpoint address and the other comparator to match 4 bytes at breakpoint address + 4. Rewrite 8xx hw_breakpoint to make breakpoints match all addresses defined by the breakpoint address and length by making full use of comparators. Now, comparator E is set to match any address greater than breakpoint address minus one. Comparator F is set to match any address lower than breakpoint address plus breakpoint length. Addresses are aligned to 32 bits. When the breakpoint range starts at address 0, the breakpoint is set to match comparator F only. When the breakpoint range end at address 0xffffffff, the breakpoint is set to match comparator E only. Otherwise the breakpoint is set to match comparator E and F. At the same time, use registers bit names instead of hardcoded values. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05105deeaf63bc02151aea2cdeaf525534e0e9d4.1574790198.git.christophe.leroy@c-s.fr
2020-01-23powerpc/8xx: Fix permanently mapped IMMR region.Christophe Leroy
When not using large TLBs, the IMMR region is still mapped as a whole block in the FIXMAP area. Properly report that the IMMR region is block-mapped even when not using large TLBs. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/45f4f414bcd7198b0755cf4287ff216fbfc24b9d.1574774187.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Only enable PPC_CHECK_WX with STRICT_KERNEL_RWXChristophe Leroy
ptdump_check_wx() is called from mark_rodata_ro() which only exists when CONFIG_STRICT_KERNEL_RWX is selected. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/922d4939c735c6b52b4137838bcc066fffd4fc33.1578989545.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Fix W+X verificationChristophe Leroy
Verification cannot rely on simple bit checking because on some platforms PAGE_RW is 0, checking that a page is not W means checking that PAGE_RO is set instead of checking that PAGE_RW is not set. Use pte helpers instead of checking bits. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0d894839fdbb19070f0e1e4140363be4f2bb62fc.1578989540.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: Fix W+X verification call in mark_rodata_ro()Christophe Leroy
ptdump_check_wx() also have to be called when pages are mapped by blocks. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/37517da8310f4457f28921a4edb88fb21d27b62a.1578989531.git.christophe.leroy@c-s.fr
2020-01-23powerpc/ptdump: don't entirely rebuild kernel when selecting CONFIG_PPC_DEBUG_WXChristophe Leroy
Selecting CONFIG_PPC_DEBUG_WX only impacts ptdump and pgtable_32/64 init calls. Declaring related functions in asm/pgtable.h implies rebuilding almost everything. Move ptdump_check_wx() declaration in mm/mmu_decl.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bf34fd9dca61eadf9a134a9f89ebbc162cfd5f86.1578986011.git.christophe.leroy@c-s.fr
2020-01-16powerpc/64s: Reimplement power4_idle code in CNicholas Piggin
This implements the tricky tracing and soft irq handling bits in C, leaving the low level bit to asm. A functional difference is that this redirects the interrupt exit to a return stub to execute blr, rather than the lr address itself. This is probably barely measurable on real hardware, but it keeps the link stack balanced. Tested with QEMU. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Move power4_fixup_nap back into exceptions-64s.S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190711022404.18132-1-npiggin@gmail.com
2020-01-16powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLERussell Currey
I have tested this with the Radix MMU and everything seems to work, and the previous patch for Hash seems to fix everything too. STRICT_KERNEL_RWX should still be disabled by default for now. Please test STRICT_KERNEL_RWX + RELOCATABLE! Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191224064126.183670-2-ruscur@russell.cc
2020-01-16powerpc/book3s64/hash: Disable 16M linear mapping size if not alignedRussell Currey
With STRICT_KERNEL_RWX on in a relocatable kernel under the hash MMU, if the position the kernel is loaded at is not 16M aligned things go horribly wrong. Specifically hash__mark_initmem_nx() will call hash__change_memory_range() which then aligns down the start address, and due to the text not being 16M aligned causes some of the kernel text to be marked non-executable. We can avoid this when selecting the linear mapping size, so do so and print a warning. I tested this for various alignments and as long as the position is 64K aligned it's fine (the base requirement for powerpc). Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Add details of the failure mode] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191224064126.183670-1-ruscur@russell.cc
2020-01-14powerpc/pseries: Advance pfn if section is not present in lmb_is_removable()Pingfan Liu
In lmb_is_removable(), if a section is not present, it should continue to test the rest of the sections in the block. But the current code fails to do so. Fixes: 51925fb3c5c9 ("powerpc/pseries: Implement memory hotplug remove in the kernel") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1578632042-12415-1-git-send-email-kernelfans@gmail.com
2020-01-14powerpc/xmon: don't access ASDR in VMsSukadev Bhattiprolu
ASDR is HV-privileged and must only be accessed in HV-mode. Fixes a Program Check (0x700) when xmon in a VM dumps SPRs. Fixes: d1e1b351f50f ("powerpc/xmon: Add ISA v3.0 SPRs to SPR dump") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200107021633.GB29843@us.ibm.com
2020-01-07powerpc/mpc85xx: also write addr_h to spin table for 64bit boot entryBai Yingjie
CPU like P4080 has 36bit physical address, its DDR physical start address can be configured above 4G by LAW registers. For such systems in which their physical memory start address was configured higher than 4G, we need also to write addr_h into the spin table of the target secondary CPU, so that addr_h and addr_l together represent a 64bit physical address. Otherwise the secondary core can not get correct entry to start from. Signed-off-by: Bai Yingjie <byj.tea@gmail.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200106042957.26494-2-yingjie_bai@126.com
2020-01-07powerpc32/booke: consistently return phys_addr_t in __pa()Bai Yingjie
When CONFIG_RELOCATABLE=y is set, VIRT_PHYS_OFFSET is a 64bit variable, thus __pa() returns as 64bit value. But when CONFIG_RELOCATABLE=n, __pa() returns 32bit value. When CONFIG_PHYS_64BIT is set, __pa() should consistently return as 64bit value irrelevant to CONFIG_RELOCATABLE. So we'd make __pa() consistently return phys_addr_t, which is 64bit when CONFIG_PHYS_64BIT is set. Signed-off-by: Bai Yingjie <byj.tea@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200106042957.26494-1-yingjie_bai@126.com
2020-01-07powerpc/powernv: use resource_sizeJulia Lawall
Use resource_size rather than a verbose computation on the end and start fields. The semantic patch that makes these changes is as follows: (http://coccinelle.lip6.fr/) <smpl> @@ struct resource ptr; @@ - (ptr.end - ptr.start + 1) + resource_size(&ptr) </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1577900990-8588-11-git-send-email-Julia.Lawall@inria.fr
2020-01-07powerpc/83xx: use resource_sizeJulia Lawall
Use resource_size rather than a verbose computation on the end and start fields. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) <smpl> @@ struct resource ptr; @@ - (ptr.end - ptr.start + 1) + resource_size(&ptr) </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1577900990-8588-6-git-send-email-Julia.Lawall@inria.fr
2020-01-07powerpc/mpic: constify copied structureJulia Lawall
The mpic_ipi_chip and mpic_irq_ht_chip structures are only copied into other structures, so make them const. The opportunity for this change was found using Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1577864614-5543-10-git-send-email-Julia.Lawall@inria.fr
2020-01-06powerpc/85xx: Get twr_p102x to compile againSebastian Andrzej Siewior
With CONFIG_QUICC_ENGINE enabled and CONFIG_UCC_GETH + CONFIG_SERIAL_QE disabled we have an unused variable (np). The code won't compile with -Werror. Move the np variable to the block where it is actually used. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191219151602.1908411-1-bigeasy@linutronix.de
2020-01-06powerpc/pseries/svm: Allow IOMMU to work in SVMAlexey Kardashevskiy
H_PUT_TCE_INDIRECT uses a shared page to send up to 512 TCE to a hypervisor in a single hypercall. This does not work for secure VMs as the page needs to be shared or the VM should use H_PUT_TCE instead. This disables H_PUT_TCE_INDIRECT by clearing the FW_FEATURE_PUT_TCE_IND feature bit so SVMs will map TCEs using H_PUT_TCE. This is not a part of init_svm() as it is called too late after FW patching is done and may result in a warning like this: [ 3.727716] Firmware features changed after feature patching! [ 3.727965] WARNING: CPU: 0 PID: 1 at (...)arch/powerpc/lib/feature-fixups.c:466 check_features+0xa4/0xc0 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191216041924.42318-5-aik@ozlabs.ru
2020-01-06powerpc/pseries/iommu: Separate FW_FEATURE_MULTITCE to put/stuff featuresAlexey Kardashevskiy
H_PUT_TCE_INDIRECT allows packing up to 512 TCE updates into a single hypercall; H_STUFF_TCE can clear lots in a single hypercall too. However, unlike H_STUFF_TCE (which writes the same TCE to all entries), H_PUT_TCE_INDIRECT uses a 4K page with new TCEs. In a secure VM environment this means sharing a secure VM page with a hypervisor which we would rather avoid. This splits the FW_FEATURE_MULTITCE feature into FW_FEATURE_PUT_TCE_IND and FW_FEATURE_STUFF_TCE. "hcall-multi-tce" in the "/rtas/ibm,hypertas-functions" device tree property sets both; the "multitce=off" kernel command line parameter disables both. This should not cause behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191216041924.42318-4-aik@ozlabs.ru
2020-01-06powerpc/pseries: Allow not having ibm, hypertas-functions::hcall-multi-tce ↵Alexey Kardashevskiy
for DDW By default a pseries guest supports a H_PUT_TCE hypercall which maps a single IOMMU page in a DMA window. Additionally the hypervisor may support H_PUT_TCE_INDIRECT/H_STUFF_TCE which update multiple TCEs at once; this is advertised via the device tree /rtas/ibm,hypertas-functions property which Linux converts to FW_FEATURE_MULTITCE. FW_FEATURE_MULTITCE is checked when dma_iommu_ops is used; however the code managing the huge DMA window (DDW) ignores it and calls H_PUT_TCE_INDIRECT even if it is explicitly disabled via the "multitce=off" kernel command line parameter. This adds FW_FEATURE_MULTITCE checking to the DDW code path. This changes tce_build_pSeriesLP to take liobn and page size as the huge window does not have iommu_table descriptor which usually the place to store these numbers. Fixes: 4e8b0cf46b25 ("powerpc/pseries: Add support for dynamic dma windows") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191216041924.42318-3-aik@ozlabs.ru
2020-01-06Revert "powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests"Ram Pai
This reverts commit edea902c1c1efb855f77e041f9daf1abe7a9768a. At the time the change allowed direct DMA ops for secure VMs; however since then we switched on using SWIOTLB backed with IOMMU (direct mapping) and to make this work, we need dma_iommu_ops which handles all cases including TCE mapping I/O pages in the presence of an IOMMU. Fixes: edea902c1c1e ("powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests") Signed-off-by: Ram Pai <linuxram@us.ibm.com> [aik: added "revert" and "fixes:"] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191216041924.42318-2-aik@ozlabs.ru
2020-01-06powerpc/pseries: Remove redundant select of PPC_DOORBELLMichael Ellerman
Commit d4e58e5928f8 ("powerpc/powernv: Enable POWER8 doorbell IPIs") added a select of PPC_DOORBELL to PPC_PSERIES, but it already had a select of PPC_DOORBELL. One is enough. Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191219125840.32592-1-mpe@ellerman.id.au
2020-01-06powerpc/512x: Use dma_request_chan() instead dma_request_slave_channel()Peter Ujfalusi
dma_request_slave_channel() is a wrapper on top of dma_request_chan() eating up the error code. By using dma_request_chan() directly the driver can support deferred probing against DMA. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191217073730.21249-1-peter.ujfalusi@ti.com
2020-01-06powerpc/pci: Remove pcibios_setup_bus_devices()Oliver O'Halloran
With the previous patch applied pcibios_setup_device() will always be run when pcibios_bus_add_device() is called. There are several code paths where pcibios_setup_bus_device() is still called (the PowerPC specific PCI hotplug support is one) so with just the previous patch applied the setup can be run multiple times on a device, once before the device is added to the bus and once after. There's no need to run the setup in the early case any more so just remove it entirely. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191028085424.12006-3-oohall@gmail.com
2020-01-06powerpc/pci: Fix pcibios_setup_device() orderingShawn Anastasio
Move PCI device setup from pcibios_add_device() and pcibios_fixup_bus() to pcibios_bus_add_device(). This ensures that platform-specific DMA and IOMMU setup occurs after the device has been registered in sysfs, which is a requirement for IOMMU group assignment to work This fixes IOMMU group assignment for hotplugged devices on pseries, where the existing behavior results in IOMMU assignment before registration. Thanks to Lukas Wunner <lukas@wunner.de> for the suggestion. Signed-off-by: Shawn Anastasio <shawn@anastas.io> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191028085424.12006-2-oohall@gmail.com
2020-01-06powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE numberOliver O'Halloran
On pseries there is a bug with adding hotplugged devices to an IOMMU group. For a number of dumb reasons fixing that bug first requires re-working how VFs are configured on PowerNV. For background, on PowerNV we use the pcibios_sriov_enable() hook to do two things: 1. Create a pci_dn structure for each of the VFs, and 2. Configure the PHB's internal BARs so the MMIO range for each VF maps to a unique PE. Roughly speaking a PE is the hardware counterpart to a Linux IOMMU group since all the devices in a PE share the same IOMMU table. A PE also defines the set of devices that should be isolated in response to a PCI error (i.e. bad DMA, UR/CA, AER events, etc). When isolated all MMIO and DMA traffic to and from devicein the PE is blocked by the root complex until the PE is recovered by the OS. The requirement to block MMIO causes a giant headache because the P8 PHB generally uses a fixed mapping between MMIO addresses and PEs. As a result we need to delay configuring the IOMMU groups for device until after MMIO resources are assigned. For physical devices (i.e. non-VFs) the PE assignment is done in pcibios_setup_bridge() which is called immediately after the MMIO resources for downstream devices (and the bridge's windows) are assigned. For VFs the setup is more complicated because: a) pcibios_setup_bridge() is not called again when VFs are activated, and b) The pci_dev for VFs are created by generic code which runs after pcibios_sriov_enable() is called. The work around for this is a two step process: 1. A fixup in pcibios_add_device() is used to initialised the cached pe_number in pci_dn, then 2. A bus notifier then adds the device to the IOMMU group for the PE specified in pci_dn->pe_number. A side effect fixing the pseries bug mentioned in the first paragraph is moving the fixup out of pcibios_add_device() and into pcibios_bus_add_device(), which is called much later. This results in step 2. failing because pci_dn->pe_number won't be initialised when the bus notifier is run. We can fix this by removing the need for the fixup. The PE for a VF is known before the VF is even scanned so we can initialise pci_dn->pe_number pcibios_sriov_enable() instead. Unfortunately, moving the initialisation causes two problems: 1. We trip the WARN_ON() in the current fixup code, and 2. The EEH core clears pdn->pe_number when recovering a VF and relies on the fixup to correctly re-set it. The only justification for either of these is a comment in eeh_rmv_device() suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order for the VF to be scanned. However, this comment appears to have no basis in reality. Both bugs can be fixed by just deleting the code. Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191028085424.12006-1-oohall@gmail.com
2020-01-06powerpc/papr_scm: Update debug messageAneesh Kumar K.V
Resource struct p->res is assigned later. Avoid using %pR before the resource struct is assigned. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191202063855.154321-1-aneesh.kumar@linux.ibm.com
2020-01-06powerpc/44x: Adjust indentation in ibm4xx_denali_fixup_memsizeNathan Chancellor
Clang warns: ../arch/powerpc/boot/4xx.c:231:3: warning: misleading indentation; statement is not part of the previous 'else' [-Wmisleading-indentation] val = SDRAM0_READ(DDR0_42); ^ ../arch/powerpc/boot/4xx.c:227:2: note: previous statement is here else ^ This is because there is a space at the beginning of this line; remove it so that the indentation is consistent according to the Linux kernel coding style and clang no longer warns. Fixes: d23f5099297c ("[POWERPC] 4xx: Adds decoding of 440SPE memory size to boot wrapper library") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/780 Link: https://lore.kernel.org/r/20191209200338.12546-1-natechancellor@gmail.com
2020-01-06powerpc/64: Use {SAVE,REST}_NVGPRS macrosJordan Niethe
In entry_64.S there are places that open code saving and restoring the non-volatile registers. There are already macros for doing this so use them. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211023552.16480-1-jniethe5@gmail.com
2020-01-05Merge tag 'riscv/for-v5.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Several fixes for RISC-V: - Fix function graph trace support - Prefix the CSR IRQ_* macro names with "RV_", to avoid collisions with macros elsewhere in the Linux kernel tree named "IRQ_TIMER" - Use __pa_symbol() when computing the physical address of a kernel symbol, rather than __pa() - Mark the RISC-V port as supporting GCOV One DT addition: - Describe the L2 cache controller in the FU540 DT file One documentation update: - Add patch acceptance guideline documentation" * tag 'riscv/for-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: Documentation: riscv: add patch acceptance guidelines riscv: prefix IRQ_ macro names with an RV_ namespace clocksource: riscv: add notrace to riscv_sched_clock riscv: ftrace: correct the condition logic in function graph tracer riscv: dts: Add DT support for SiFive L2 cache controller riscv: gcov: enable gcov for RISC-V riscv: mm: use __pa_symbol for kernel symbols
2020-01-04riscv: prefix IRQ_ macro names with an RV_ namespacePaul Walmsley
"IRQ_TIMER", used in the arch/riscv CSR header file, is a sufficiently generic macro name that it's used by several source files across the Linux code base. Some of these other files ultimately include the arch/riscv CSR include file, causing collisions. Fix by prefixing the RISC-V csr.h IRQ_ macro names with an RV_ prefix. Fixes: a4c3733d32a72 ("riscv: abstract out CSR names for supervisor vs machine mode") Reported-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-04Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: hexagon: define ioremap_uc ocfs2: fix the crash due to call ocfs2_get_dlm_debug once less ocfs2: call journal flush to mark journal as empty after journal recovery when mount mm/hugetlb: defer freeing of huge pages if in non-task context mm/gup: fix memory leak in __gup_benchmark_ioctl mm/oom: fix pgtables units mismatch in Killed process message fs/posix_acl.c: fix kernel-doc warnings hexagon: work around compiler crash hexagon: parenthesize registers in asm predicates fs/namespace.c: make to_mnt_ns() static fs/nsfs.c: include headers for missing declarations fs/direct-io.c: include fs/internal.h for missing prototype mm: move_pages: return valid node id in status if the page is already on the target node memcg: account security cred as well to kmemcg kcov: fix struct layout for kcov_remote_arg mm/zsmalloc.c: fix the migrated zspage statistics. mm/memory_hotplug: shrink zones when offlining memory
2020-01-04Merge tag 'mips_fixes_5.5_1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A collection of MIPS fixes: - Fill the struct cacheinfo shared_cpu_map field with sensible values, notably avoiding issues with perf which was unhappy in the absence of these values. - A boot fix for Loongson 2E & 2F machines which was fallout from some refactoring performed this cycle. - A Kconfig dependency fix for the Loongson CPU HWMon driver. - A couple of VDSO fixes, ensuring gettimeofday() behaves appropriately for kernel configurations that don't include support for a clocksource the VDSO can use & fixing the calling convention for the n32 & n64 VDSOs which would previously clobber the $gp/$28 register. - A build fix for vmlinuz compressed images which were inappropriately building with -fsanitize-coverage despite not being part of the kernel proper, then failing to link due to the missing __sanitizer_cov_trace_pc() function. - A couple of eBPF JIT fixes, including disabling it for MIPS32 due to a large number of issues with the code generated there & reflecting ISA dependencies in Kconfig to enforce that systems which don't support the JIT must include the interpreter" * tag 'mips_fixes_5.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Avoid VDSO ABI breakage due to global register variable MIPS: BPF: eBPF JIT: check for MIPS ISA compliance in Kconfig MIPS: BPF: Disable MIPS32 eBPF JIT MIPS: Prevent link failure with kcov instrumentation MIPS: Kconfig: Use correct form for 'depends on' mips: Fix gettimeofday() in the vdso library MIPS: Fix boot on Fuloong2 systems mips: cacheinfo: report shared CPU map
2020-01-04hexagon: define ioremap_ucNick Desaulniers
Similar to commit 38e45d81d14e ("sparc64: implement ioremap_uc") define ioremap_uc for hexagon to avoid errors from -Wimplicit-function-definition. Link: http://lkml.kernel.org/r/20191209222956.239798-2-ndesaulniers@google.com Link: https://github.com/ClangBuiltLinux/linux/issues/797 Fixes: e537654b7039 ("lib: devres: add a helper function for ioremap_uc") Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Brian Cain <bcain@codeaurora.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Tuowen Zhao <ztuowen@gmail.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Will Deacon <will@kernel.org> Cc: Richard Fontana <rfontana@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04hexagon: work around compiler crashNick Desaulniers
Clang cannot translate the string "r30" into a valid register yet. Link: https://github.com/ClangBuiltLinux/linux/issues/755 Link: http://lkml.kernel.org/r/20191028155722.23419-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Sid Manning <sidneym@quicinc.com> Reviewed-by: Brian Cain <bcain@codeaurora.org> Cc: Allison Randal <allison@lohutok.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Richard Fontana <rfontana@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04hexagon: parenthesize registers in asm predicatesNick Desaulniers
Hexagon requires that register predicates in assembly be parenthesized. Link: https://github.com/ClangBuiltLinux/linux/issues/754 Link: http://lkml.kernel.org/r/20191209222956.239798-3-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Sid Manning <sidneym@codeaurora.org> Acked-by: Brian Cain <bcain@codeaurora.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Tuowen Zhao <ztuowen@gmail.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Will Deacon <will@kernel.org> Cc: Richard Fontana <rfontana@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04mm/memory_hotplug: shrink zones when offlining memoryDavid Hildenbrand
We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: <stable@vger.kernel.org> [5.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-03Merge tag 'powerpc-5.5-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Two more powerpc fixes for 5.5: - One commit to fix a build error when CONFIG_JUMP_LABEL=n, introduced by our recent fix to is_shared_processor(). - A commit marking some SLB related functions as notrace, as tracing them triggers warnings. Thanks to Jason A Donenfeld" * tag 'powerpc-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/spinlocks: Include correct header for static key powerpc/mm: Mark get_slice_psize() & slice_addr_is_low() as notrace
2020-01-03riscv: ftrace: correct the condition logic in function graph tracerZong Li
The condition should be logical NOT to assign the hook address to parent address. Because the return value 0 of function_graph_enter upon success. Fixes: e949b6db51dc (riscv/function_graph: Simplify with function_graph_enter()) Signed-off-by: Zong Li <zong.li@sifive.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: stable@vger.kernel.org Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-03riscv: dts: Add DT support for SiFive L2 cache controllerYash Shah
Add the L2 cache controller DT node in SiFive FU540 soc-specific DT file Signed-off-by: Yash Shah <yash.shah@sifive.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-03riscv: gcov: enable gcov for RISC-VZong Li
This patch enables GCOV code coverage measurement on RISC-V. Lightly tested on QEMU and Hifive Unleashed board, seems to work as expected. Signed-off-by: Zong Li <zong.li@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-03riscv: mm: use __pa_symbol for kernel symbolsZong Li
__pa_symbol is the marcro that should be used for kernel symbols. It is also a pre-requisite for DEBUG_VIRTUAL which will do bounds checking. Signed-off-by: Zong Li <zong.li@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-02MIPS: Avoid VDSO ABI breakage due to global register variablePaul Burton
Declaring __current_thread_info as a global register variable has the effect of preventing GCC from saving & restoring its value in cases where the ABI would typically do so. To quote GCC documentation: > If the register is a call-saved register, call ABI is affected: the > register will not be restored in function epilogue sequences after the > variable has been assigned. Therefore, functions cannot safely return > to callers that assume standard ABI. When our position independent VDSO is built for the n32 or n64 ABIs all functions it exposes should be preserving the value of $gp/$28 for their caller, but in the presence of the __current_thread_info global register variable GCC stops doing so & simply clobbers $gp/$28 when calculating the address of the GOT. In cases where the VDSO returns success this problem will typically be masked by the caller in libc returning & restoring $gp/$28 itself, but that is by no means guaranteed. In cases where the VDSO returns an error libc will typically contain a fallback path which will now fail (typically with a bad memory access) if it attempts anything which relies upon the value of $gp/$28 - eg. accessing anything via the GOT. One fix for this would be to move the declaration of __current_thread_info inside the current_thread_info() function, demoting it from global register variable to local register variable & avoiding inadvertently creating a non-standard calling ABI for the VDSO. Unfortunately this causes issues for clang, which doesn't support local register variables as pointed out by commit fe92da0f355e ("MIPS: Changed current_thread_info() to an equivalent supported by both clang and GCC") which introduced the global register variable before we had a VDSO to worry about. Instead, fix this by continuing to use the global register variable for the kernel proper but declare __current_thread_info as a simple extern variable when building the VDSO. It should never be referenced, and will cause a link error if it is. This resolves the calling convention issue for the VDSO without having any impact upon the build of the kernel itself for either clang or gcc. Signed-off-by: Paul Burton <paulburton@kernel.org> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <christian.brauner@canonical.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> # v4.4+ Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
2019-12-30powerpc/spinlocks: Include correct header for static keyJason A. Donenfeld
Recently, the spinlock implementation grew a static key optimization, but the jump_label.h header include was left out, leading to build errors: linux/arch/powerpc/include/asm/spinlock.h:44:7: error: implicit declaration of function ‘static_branch_unlikely’ 44 | if (!static_branch_unlikely(&shared_processor)) This commit adds the missing header. mpe: The build break is only seen with CONFIG_JUMP_LABEL=n. Fixes: 656c21d6af5d ("powerpc/shared: Use static key to detect shared processor") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191223133147.129983-1-Jason@zx2c4.com
2019-12-27riscv: export flush_icache_all to modulesOlof Johansson
This is needed by LKDTM (crash dump test module), it calls flush_icache_range(), which on RISC-V turns into flush_icache_all(). On other architectures, the actual implementation is exported, so follow that precedence and export it here too. Fixes build of CONFIG_LKDTM that fails with: ERROR: "flush_icache_all" [drivers/misc/lkdtm/lkdtm.ko] undefined! Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>