summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2018-10-31powerpc/xmon: Relax frame size for clangJoel Stanley
When building with clang (8 trunk, 7.0 release) the frame size limit is hit: arch/powerpc/xmon/xmon.c:452:12: warning: stack frame size of 2576 bytes in function 'xmon_core' [-Wframe-larger-than=] Some investigation by Naveen indicates this is due to clang saving the addresses to printf format strings on the stack. While this issue is investigated, bump up the frame size limit for xmon when building with clang. Link: https://github.com/ClangBuiltLinux/linux/issues/252 Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-29Merge branch 'next' of ↵Michael Ellerman
https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Updates from Scott: "This contains a couple device tree updates, and a fix for a missing prototype warning."
2018-10-26powerpc/process: Fix flush_all_to_thread for SPEFelipe Rechia
Fix a bug introduced by the creation of flush_all_to_thread() for processors that have SPE (Signal Processing Engine) and use it to compute floating-point operations. >From userspace perspective, the problem was seen in attempts of computing floating-point operations which should generate exceptions. For example: fork(); float x = 0.0 / 0.0; isnan(x); // forked process returns False (should be True) The operation above also should always cause the SPEFSCR FINV bit to be set. However, the SPE floating-point exceptions were turned off after a fork(). Kernel versions prior to the bug used flush_spe_to_thread(), which first saves SPEFSCR register values in tsk->thread and then calls giveup_spe(tsk). After commit 579e633e764e, the save_all() function was called first to giveup_spe(), and then the SPEFSCR register values were saved in tsk->thread. This would save the SPEFSCR register values after disabling SPE for that thread, causing the bug described above. Fixes 579e633e764e ("powerpc: create flush_all_to_thread()") Signed-off-by: Felipe Rechia <felipe.rechia@datacom.com.br> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/pseries: add missing cpumask.h include fileTyrel Datwyler
Build error is encountered when inlcuding <asm/rtas.h> if no explicit or implicit include of cpumask.h exists in the including file. In file included from arch/powerpc/platforms/pseries/hotplug-pci.c:3:0: ./arch/powerpc/include/asm/rtas.h:360:34: error: unknown type name 'cpumask_var_t' extern int rtas_online_cpus_mask(cpumask_var_t cpus); ^ ./arch/powerpc/include/asm/rtas.h:361:35: error: unknown type name 'cpumask_var_t' extern int rtas_offline_cpus_mask(cpumask_var_t cpus); Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26KVM: PPC: Use exported tb_to_ns() function in decrementer emulationPaul Mackerras
This changes the KVM code that emulates the decrementer function to do the conversion of decrementer values to time intervals in nanoseconds by calling the tb_to_ns() function exported by the powerpc timer code, in preference to open-coded arithmetic using values from the decrementer_clockevent struct. Similarly, the HV-KVM code that did the same conversion using arithmetic on tb_ticks_per_sec also now uses tb_to_ns(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/pseries: Export maximum memory valueAravinda Prasad
This patch exports the maximum possible amount of memory configured on the system via /proc/powerpc/lparcfg. Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/8xx: Use patch_site for perf counters setupChristophe Leroy
The 8xx TLB miss routines are patched when (de)activating perf counters. This patch uses the new patch_site functionality in order to get a better code readability and avoid a label mess when dumping the code with 'objdump -d' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/8xx: Use patch_site for memory setup patchingChristophe Leroy
The 8xx TLB miss routines are patched at startup at several places. This patch uses the new patch_site functionality in order to get a better code readability and avoid a label mess when dumping the code with 'objdump -d' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26powerpc/code-patching: Add a helper to get the address of a patch_siteChristophe Leroy
This patch adds a helper to get the address of a patch_site. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Call it "patch site" addr] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"Christophe Leroy
This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce. That commit was buggy, as it used rlwinm instead of rlwimi. Instead of fixing that bug, we revert the previous commit in order to reduce the dependency between L1 entries and L2 entries Fixes: 4f94b2c7462d9 ("powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-22powerpc/8xx: add missing header in 8xx_mmu.cChristophe Leroy
arch/powerpc/mm/8xx_mmu.c:174:6: error: no previous prototype for ‘set_context’ [-Werror=missing-prototypes] void set_context(unsigned long id, pgd_t *pgd) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-22powerpc/8xx: Add DT node for using the SEC engine of the MPC885Christophe Leroy
The MPC885 has SEC engine version 1.2 with the following details: - Number of Crypto channels: 1 - Exec Units: DEU, MDEU and AESU - Available descriptors: 00010, 00100, 00110, 01000, 11000, 11010 It is also supposed to have descriptor 00000, but it doesn't work properly so we keep it out for the moment. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-21powerpc/msi: Fix compile error on mpc83xxChristophe Leroy
mpic_get_primary_version() is not defined when not using MPIC. The compile error log like: arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe': fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version' Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Radu Rendec <radu.rendec@gmail.com> Fixes: 807d38b73b6 ("powerpc/mpic: Add get_version API both for internal and external use") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-21powerpc: Fix stack protector crashes on CPU hotplugMichael Ellerman
Recently in commit 7241d26e8175 ("powerpc/64: properly initialise the stackprotector canary on SMP.") we fixed a crash with stack protector on SMP by initialising the stack canary in cpu_idle_thread_init(). But this can also causes crashes, when a CPU comes back online after being offline: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: pnv_smp_cpu_kill_self+0x2a0/0x2b0 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.0-rc3-gcc-7.3.1-00168-g4ffe713b7587 #94 Call Trace: dump_stack+0xb0/0xf4 (unreliable) panic+0x144/0x328 __stack_chk_fail+0x2c/0x30 pnv_smp_cpu_kill_self+0x2a0/0x2b0 cpu_die+0x48/0x70 arch_cpu_idle_dead+0x20/0x40 do_idle+0x274/0x390 cpu_startup_entry+0x38/0x50 start_secondary+0x5e4/0x600 start_secondary_prolog+0x10/0x14 Looking at the stack we see that the canary value in the stack frame doesn't match the canary in the task/paca. That is because we have reinitialised the task/paca value, but then the CPU coming online has returned into a function using the old canary value. That causes the comparison to fail. Instead we can call boot_init_stack_canary() from start_secondary() which never returns. This is essentially what the generic code does in cpu_startup_entry() under #ifdef X86, we should make that non-x86 specific in a future patch. Fixes: 7241d26e8175 ("powerpc/64: properly initialise the stackprotector canary on SMP.") Reported-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
2018-10-20powerpc/dts/fsl: t2080rdb: reorder the Cortina PHY XFI lanesCamelia Groza
According to the T2080RDB schematics, for the CS4315 PHY, the XFI 1 lane is connected to SFP 2 and the XFI 2 lane is connected to SFP 1. Change the device tree to reflect the correct PHY order and port association. Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-20powerpc/traps: restore recoverability of machine_check interruptsChristophe Leroy
commit b96672dd840f ("powerpc: Machine check interrupt is a non- maskable interrupt") added a call to nmi_enter() at the beginning of machine check restart exception handler. Due to that, in_interrupt() always returns true regardless of the state before entering the exception, and die() panics even when the system was not already in interrupt. This patch calls nmi_exit() before calling die() in order to restore the interrupt state we had before calling nmi_enter() Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/64/module: REL32 relocation range checkNicholas Piggin
The recent module relocation overflow crash demonstrated that we have no range checking on REL32 relative relocations. This patch implements a basic check, the same kernel that previously oopsed and rebooted now continues with some of these errors when loading the module: module_64: x_tables: REL32 527703503449812 out of range! Possibly other relocations (ADDR32, REL16, TOC16, etc.) should also have overflow checks. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmdNicholas Piggin
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: Fix page table dump to work on RadixMichael Ellerman
When we're running on Book3S with the Radix MMU enabled the page table dump currently prints the wrong addresses because it uses the wrong start address. Fix it to use PAGE_OFFSET rather than KERN_VIRT_START. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Display if mappings are exec or notMichael Ellerman
At boot we print the ranges we've mapped for the linear mapping and what page size we've used. Also track whether the range is mapped executable or not and display that as well. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Simplify split mapping logicMichael Ellerman
If we look closely at the logic in create_physical_mapping(), when we're doing STRICT_KERNEL_RWX, we do the following steps: - determine the gap from where we are to the end of the range - choose an appropriate mapping_size based on the gap - check if that mapping_size would overlap the __init_begin boundary, and if not choose an appropriate mapping_size We can simplify the logic by taking the __init_begin boundary into account when we calculate the initial gap. So add a next_boundary() function which tells us what the next boundary is, either the __init_begin boundary or end. In future we can add more boundaries. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Remove the retry in the split mapping logicMichael Ellerman
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. The current logic uses a goto inside the for loop, which works, but is hard to reason about. When we hit the goto retry case we set max_mapping_size to PMD_SIZE and go back to the start. Setting max_mapping_size means we skip the PUD case and go to the PMD case. We know we will pass the alignment and gap checks because the only reason we are there is we hit the goto retry, and that is guarded by mapping_size == PUD_SIZE, which means addr is PUD aligned and gap is greater or equal to PUD_SIZE. So the only part of the check that can fail is the mmu_psize_defs check for the 2M page size. If we just duplicate that check we can avoid the goto, and we get the same result. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Fix small page at boundary when splittingMichael Ellerman
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. Currently we always use a small page at the text/data boundary, even when that's not necessary: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages This is because the check that the mapping crosses the __init_begin boundary is too strict, it also returns true when we map exactly up to the boundary. So fix it to check that the mapping would actually map past __init_begin, and with that we see: Mapped 0x0000000000000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Fix overuse of small pages in splitting logicMichael Ellerman
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. But the current logic uses small pages for the entire text section, regardless of whether a larger page size would fit. eg. with the boundary at 16M we could use 2M pages, but instead we use 64K pages up to the 16M boundary: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages This is because the test is checking if addr is < __init_begin and addr + mapping_size is >= _stext. But that is true for all pages between _stext and __init_begin. Instead what we want to check is if we are crossing the text/data boundary, which is at __init_begin. With that fixed we see: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we're correctly using 2MB pages below __init_begin, but we still drop down to 64K pages unnecessarily at the boundary. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm/radix: Fix off-by-one in split mapping logicMichael Ellerman
When we have CONFIG_STRICT_KERNEL_RWX enabled, we try to split the kernel linear (1:1) mapping so that the kernel text is in a separate page to kernel data, so we can mark the former read-only. We could achieve that just by always using 64K pages for the linear mapping, but we try to be smarter. Instead we use huge pages when possible, and only switch to smaller pages when necessary. However we have an off-by-one bug in that logic, which causes us to calculate the wrong boundary between text and data. For example with the end of the kernel text at 16M we see: radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we mapped from 0 to 18M with 64K pages, even though the boundary between text and data is at 16M. With the fix we see we're correctly hitting the 16M boundary: radix-mmu: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/ftrace: Handle large kernel configsNaveen N. Rao
Currently, we expect to be able to reach ftrace_caller() from all ftrace-enabled functions through a single relative branch. With large kernel configs, we see functions outside of 32MB of ftrace_caller() causing ftrace_init() to bail. In such configurations, gcc/ld emits two types of trampolines for mcount(): 1. A long_branch, which has a single branch to mcount() for functions that are one hop away from mcount(): c0000000019e8544 <00031b56.long_branch._mcount>: c0000000019e8544: 4a 69 3f ac b c00000000007c4f0 <._mcount> 2. A plt_branch, for functions that are farther away from mcount(): c0000000051f33f8 <0008ba04.plt_branch._mcount>: c0000000051f33f8: 3d 82 ff a4 addis r12,r2,-92 c0000000051f33fc: e9 8c 04 20 ld r12,1056(r12) c0000000051f3400: 7d 89 03 a6 mtctr r12 c0000000051f3404: 4e 80 04 20 bctr We can reuse those trampolines for ftrace if we can have those trampolines go to ftrace_caller() instead. However, with ABIv2, we cannot depend on r2 being valid. As such, we use only the long_branch trampolines by patching those to instead branch to ftrace_caller or ftrace_regs_caller. In addition, we add additional trampolines around .text and .init.text to catch locations that are covered by the plt branches. This allows ftrace to work with most large kernel configurations. For now, we always patch the trampolines to go to ftrace_regs_caller, which is slightly inefficient. This can be optimized further at a later point. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: Fix WARN_ON with THP NUMA migrationAneesh Kumar K.V
WARNING: CPU: 12 PID: 4322 at /arch/powerpc/mm/pgtable-book3s64.c:76 set_pmd_at+0x4c/0x2b0 Modules linked in: CPU: 12 PID: 4322 Comm: qemu-system-ppc Tainted: G W 4.19.0-rc3-00758-g8f0c636b0542 #36 NIP: c0000000000872fc LR: c000000000484eec CTR: 0000000000000000 REGS: c000003fba876fe0 TRAP: 0700 Tainted: G W (4.19.0-rc3-00758-g8f0c636b0542) MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 24282884 XER: 00000000 CFAR: c000000000484ee8 IRQMASK: 0 GPR00: c000000000484eec c000003fba877268 c000000001f0ec00 c000003fbd229f80 GPR04: 00007c8fe8e00000 c000003f864c5a38 860300853e0000c0 0000000000000080 GPR08: 0000000080000000 0000000000000001 0401000000000080 0000000000000001 GPR12: 0000000000002000 c000003fffff5400 c000003fce292000 00007c9024570000 GPR16: 0000000000000000 0000000000ffffff 0000000000000001 c000000001885950 GPR20: 0000000000000000 001ffffc0004807c 0000000000000008 c000000001f49d05 GPR24: 00007c8fe8e00000 c0000000020f2468 ffffffffffffffff c000003fcd33b090 GPR28: 00007c8fe8e00000 c000003fbd229f80 c000003f864c5a38 860300853e0000c0 NIP [c0000000000872fc] set_pmd_at+0x4c/0x2b0 LR [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20 Call Trace: [c000003fba877268] [c00000000045931c] mpol_misplaced+0x1bc/0x230 (unreliable) [c000003fba8772c8] [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20 [c000003fba877398] [c00000000040d344] __handle_mm_fault+0x5e4/0x2300 [c000003fba8774d8] [c00000000040f400] handle_mm_fault+0x3a0/0x420 [c000003fba877528] [c0000000003ff6f4] __get_user_pages+0x2e4/0x560 [c000003fba877628] [c000000000400314] get_user_pages_unlocked+0x104/0x2a0 [c000003fba8776c8] [c000000000118f44] __gfn_to_pfn_memslot+0x284/0x6a0 [c000003fba877748] [c0000000001463a0] kvmppc_book3s_radix_page_fault+0x360/0x12d0 [c000003fba877838] [c000000000142228] kvmppc_book3s_hv_page_fault+0x48/0x1300 [c000003fba877988] [c00000000013dc08] kvmppc_vcpu_run_hv+0x1808/0x1b50 [c000003fba877af8] [c000000000126b44] kvmppc_vcpu_run+0x34/0x50 [c000003fba877b18] [c000000000123268] kvm_arch_vcpu_ioctl_run+0x288/0x2d0 [c000003fba877b98] [c00000000011253c] kvm_vcpu_ioctl+0x1fc/0x8c0 [c000003fba877d08] [c0000000004e9b24] do_vfs_ioctl+0xa44/0xae0 [c000003fba877db8] [c0000000004e9c44] ksys_ioctl+0x84/0xf0 [c000003fba877e08] [c0000000004e9cd8] sys_ioctl+0x28/0x80 We removed the pte_protnone check earlier with the understanding that we mark the pte invalid before the set_pte/set_pmd usage. But the huge pmd autonuma still use the set_pmd_at directly. This is ok because a protnone pte won't have translation cache in TLB. Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selectedChristophe Leroy
If CONFIG_PPC_SPLPAR is not selected, steal_time will always be NUL, so accounting it is pointless Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64Christophe Leroy
scaled cputime is only meaningfull when the processor has SPURR and/or PURR, which means only on PPC64. Removing it on PPC32 significantly reduces the size of vtime_account_system() and vtime_account_idle() on an 8xx: Before: 00000000 l F .text 000000a8 vtime_delta 00000280 g F .text 0000010c vtime_account_system 0000038c g F .text 00000048 vtime_account_idle After: (vtime_delta gets inlined inside the two functions) 000001d8 g F .text 000000a0 vtime_account_system 00000278 g F .text 00000038 vtime_account_idle In terms of performance, we also get approximatly 7% improvement on task switch. The following small benchmark app is run with perf stat: void *thread(void *arg) { int i; for (i = 0; i < atoi((char*)arg); i++) pthread_yield(); } int main(int argc, char **argv) { pthread_t th1, th2; pthread_create(&th1, NULL, thread, argv[1]); pthread_create(&th2, NULL, thread, argv[1]); pthread_join(th1, NULL); pthread_join(th2, NULL); return 0; } Before the patch: Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs): 8228.476465 task-clock (msec) # 0.954 CPUs utilized ( +- 0.23% ) 200004 context-switches # 0.024 M/sec ( +- 0.00% ) After the patch: Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs): 7649.070444 task-clock (msec) # 0.955 CPUs utilized ( +- 0.27% ) 200004 context-switches # 0.026 M/sec ( +- 0.00% ) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/time: isolate scaled cputime accounting in dedicated functions.Christophe Leroy
scaled cputime is only meaningfull when the processor has SPURR and/or PURR, which means only on PPC64. In preparation of the following patch that will remove CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC32, this patch moves all scaled cputing accounting logic into dedicated functions. This patch doesn't change any functionality. It's only code reorganisation. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/kgdb: add kgdb_arch_set/remove_breakpoint()Christophe Leroy
Generic implementation fails to remove breakpoints after init when CONFIG_STRICT_KERNEL_RWX is selected: [ 13.251285] KGDB: BP remove failed: c001c338 [ 13.259587] kgdbts: ERROR PUT: end of test buffer on 'do_fork_test' line 8 expected OK got $E14#aa [ 13.268969] KGDB: re-enter exception: ALL breakpoints killed [ 13.275099] CPU: 0 PID: 1 Comm: init Not tainted 4.18.0-g82bbb913ffd8 #860 [ 13.282836] Call Trace: [ 13.285313] [c60e1ba0] [c0080ef0] kgdb_handle_exception+0x6f4/0x720 (unreliable) [ 13.292618] [c60e1c30] [c000e97c] kgdb_handle_breakpoint+0x3c/0x98 [ 13.298709] [c60e1c40] [c000af54] program_check_exception+0x104/0x700 [ 13.305083] [c60e1c60] [c000e45c] ret_from_except_full+0x0/0x4 [ 13.310845] [c60e1d20] [c02a22ac] run_simple_test+0x2b4/0x2d4 [ 13.316532] [c60e1d30] [c0081698] put_packet+0xb8/0x158 [ 13.321694] [c60e1d60] [c00820b4] gdb_serial_stub+0x230/0xc4c [ 13.327374] [c60e1dc0] [c0080af8] kgdb_handle_exception+0x2fc/0x720 [ 13.333573] [c60e1e50] [c000e928] kgdb_singlestep+0xb4/0xcc [ 13.339068] [c60e1e70] [c000ae1c] single_step_exception+0x90/0xac [ 13.345100] [c60e1e80] [c000e45c] ret_from_except_full+0x0/0x4 [ 13.350865] [c60e1f40] [c000e11c] ret_from_syscall+0x0/0x38 [ 13.356346] Kernel panic - not syncing: Recursive entry to debugger This patch creates powerpc specific version of kgdb_arch_set_breakpoint() and kgdb_arch_remove_breakpoint() using patch_instruction() Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/sysdev/ipic: check primary_ipic NULL pointer before using itChristophe Leroy
ipic_get_mcp_status() is used by targets implementing NMI watchdog in target specific machine check handler in order to known whether a machine check results from a watchdog NMI reset. In case of very early machine check, primary_ipic pointer might not have been set yet, so ipic_get_mcp_status() needs to check it for nullity before using it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: fix always true/false warning in slice.cChristophe Leroy
This patch fixes the following warnings (obtained with make W=1). arch/powerpc/mm/slice.c: In function 'slice_range_to_mask': arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (start < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits] if ((start + len) > SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c: In function 'slice_mask_for_free': arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (high_limit <= SLICE_LOW_TOP) ^ arch/powerpc/mm/slice.c: In function 'slice_check_range_fits': arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (start < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits] if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) { ^ arch/powerpc/mm/slice.c: In function 'slice_scan_available': arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (addr < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c: In function 'get_slice_psize': arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (addr < SLICE_LOW_TOP) { ^ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: fix missing prototypes in slice.cChristophe Leroy
This patch fixes the following warnings (obtained with make W=1). arch/powerpc/mm/slice.c: At top level: arch/powerpc/mm/slice.c:682:15: error: no previous prototype for 'arch_get_unmapped_area' [-Werror=missing-prototypes] unsigned long arch_get_unmapped_area(struct file *filp, ^ arch/powerpc/mm/slice.c:692:15: error: no previous prototype for 'arch_get_unmapped_area_topdown' [-Werror=missing-prototypes] unsigned long arch_get_unmapped_area_topdown(struct file *filp, ^ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: Trace tlbia instructionChristophe Leroy
Add a trace point for tlbia (Translation Lookaside Buffer Invalidate All) instruction. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/mm: Add missing tracepoint for tlbieChristophe Leroy
commit 0428491cba927 ("powerpc/mm: Trace tlbie(l) instructions") added tracepoints for tlbie calls, but _tlbil_va() was forgotten Fixes: 0428491cba927 ("powerpc/mm: Trace tlbie(l) instructions") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/book3s64: fix dump_linuxpagetables "present" flagChristophe Leroy
Since commit bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid."), _PAGE_PRESENT doesn't mean exactly that a page is present. A page is also considered preset when _PAGE_INVALID is set. This patch changes the meaning of "present" and adds a status "valid" associated to the _PAGE_PRESENT flag. Fixes: bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc/pseries: Export raw per-CPU VPA data via debugfsAravinda Prasad
This patch exports the raw per-CPU VPA data via debugfs. A per-CPU file is created which exports the VPA data of that CPU to help debug some of the VPA related issues or to analyze the per-CPU VPA related statistics. v3: Removed offline CPU check. v2: Included offline CPU check and other review comments. Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc64/module elfv1: Set opd addresses after module relocationNaveen N. Rao
module_frob_arch_sections() is called before the module is moved to its final location. The function descriptor section addresses we are setting here are thus invalid. Fix this by processing opd section during module_finalize() Fixes: 5633e85b2c313 ("powerpc64: Add .opd based function descriptor dereference") Cc: stable@vger.kernel.org # v4.16 Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20powerpc: Add support for function error injectionNaveen N. Rao
We implement regs_set_return_value() and override_function_with_return() for this purpose. On powerpc, a return from a function (blr) just branches to the location contained in the link register. So, we can just update pt_regs rather than redirecting execution to a dummy function that returns. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/time: Fix clockevent_decrementer initalisation for PR KVMMichael Ellerman
In the recent commit 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") we changed the way we initialise the decrementer clockevent(s). We no longer initialise the mult & shift values of decrementer_clockevent itself. This has the effect of breaking PR KVM, because it uses those values in kvmppc_emulate_dec(). The symptom is guest kernels spin forever mid-way through boot. For now fix it by assigning back to decrementer_clockevent the mult and shift values. Fixes: 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/aout: Fix struct user definition to use user_pt_regsMichael Ellerman
I'm pretty sure this is dead code, it's only used by the a.out core dump code, and we don't support a.out. We should remove it. But while it's in the tree it should be using the ABI version of pt_regs which is called user_pt_regs in the kernel, because the whole struct is written to the core dump and so its size shouldn't change. Note this isn't a uapi header so we don't need an ifdef. Fixes: 002af9391bfb ("powerpc: Split user/kernel definitions of struct pt_regs") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/uapi: Fix sigcontext definition to use user_pt_regsMichael Ellerman
My recent patch to split pt_regs between user and kernel missed the usage in struct sigcontext. Because this is a user visible struct it should be using the user visible definition, which when we're building for the kernel is called struct user_pt_regs. As far as I can see this hasn't actually caused a bug (yet), because we don't use the sizeof() the sigcontext->regs anywhere. But we should still fix it to avoid confusion and future bugs. Fixes: 002af9391bfb ("powerpc: Split user/kernel definitions of struct pt_regs") Reported-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/io: remove old GCC version implementationChristophe Leroy
GCC 4.6 is the minimum supported now. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc: Add -Werror at arch/powerpc levelMichael Ellerman
Back when I added -Werror in commit ba55bd74360e ("powerpc: Add configurable -Werror for arch/powerpc") I did it by adding it to most of the arch Makefiles. At the time we excluded math-emu, because apparently it didn't build cleanly. But that seems to have been fixed somewhere in the interim. So move the -Werror addition to the top-level of the arch, this saves us from repeating it in every Makefile and means we won't forget to add it to any new sub-dirs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc: Move core kernel logic into arch/powerpc/KbuildMichael Ellerman
This is a nice cleanup, arch/powerpc/Makefile is long and messy so moving this out helps a little. It also allows us to do: $ make arch/powerpc Which can be helpful if you just want to compile test some changes to arch code and not link everything. Finally it also gives us a single place to do subdir-cc-flags assignments which affect the whole of arch/powerpc, which we will do in a future patch. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/traps: remove redundant in_interrupt panic in die()Christophe Leroy
do_exit() already includes a test to panic() is in_interrupt() This patch removes powerpc one which is redundant. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/prom_init: Generate "phandle" instead of "linux, phandle"Benjamin Herrenschmidt
When creating the boot-time FDT from an actual Open Firmware live tree, let's generate "phandle" properties for the phandles instead of the old deprecated "linux,phandle". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Unsplit warning printf()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc: Check prom_init for disallowed sectionsBenjamin Herrenschmidt
prom_init.c must not modify the kernel image outside of the .bss.prominit section. Thus make sure that prom_init.o doesn't have anything in any of these: .data .bss .init.data Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/prom_init: Move __prombss to it's own section and store it in .bssBenjamin Herrenschmidt
This makes __prombss its own section, and for now store it in .bss. This will give us the ability later to store it elsewhere and/or free it after boot (it's about 8KB). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>