summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2017-10-21powerpc/powernv: Enable TM without suspend if possibleMichael Ellerman
Some Power9 revisions can run in a mode where TM operates without suspended state. If we find ourself on a CPU that might be in this mode, we query OPAL to check, and if so we reenable TM in CPU features, and enable a new user feature to signal to userspace that we are in this mode. We do not enable the "normal" user feature, PPC_FEATURE2_HTM, but we do enable PPC_FEATURE2_HTM_NOSC because that indicates to userspace that the kernel will abort transactions on syscall entry, which is true regardless of the suspend mode. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-20powerpc: Add PPC_FEATURE2_HTM_NO_SUSPENDMichael Ellerman
Some CPUs can operate in a mode where TM (Transactional Memory) is enabled but the suspended state of TM is disabled. In this mode tsuspend does not enter suspended state, instead the transaction is aborted. Similarly any other event that would lead to suspended state instead aborts the transaction. There is also an ABI change, in that in this mode processes are not allowed to sigreturn with an MSR that would lead to suspended state, Linux will instead return an error to the sigreturn syscall. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-20powerpc/tm: Add commandline option to disable hardware transactional memoryCyril Bur
Currently the kernel relies on firmware to inform it whether or not the CPU supports HTM and as long as the kernel was built with CONFIG_PPC_TRANSACTIONAL_MEM=y then it will allow userspace to make use of the facility. There may be situations where it would be advantageous for the kernel to not allow userspace to use HTM, currently the only way to achieve this is to recompile the kernel with CONFIG_PPC_TRANSACTIONAL_MEM=n. This patch adds a simple commandline option so that HTM can be disabled at boot time. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Simplify to a bool, move to prom.c, put doco in the right place. Always disable, regardless of initial state, to avoid user confusion.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-20Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Bring in some KVM commits we need (the TM one in particular).
2017-10-20KVM: PPC: Tie KVM_CAP_PPC_HTM to the user-visible TM featureMichael Ellerman
Currently we use CPU_FTR_TM to decide if the CPU/kernel can support TM (Transactional Memory), and if it's true we advertise that to Qemu (or similar) via KVM_CAP_PPC_HTM. PPC_FEATURE2_HTM is the user-visible feature bit, which indicates that the CPU and kernel can support TM. Currently CPU_FTR_TM and PPC_FEATURE2_HTM always have the same value, either true or false, so using the former for KVM_CAP_PPC_HTM is correct. However some Power9 CPUs can operate in a mode where TM is enabled but TM suspended state is disabled. In this mode CPU_FTR_TM is true, but PPC_FEATURE2_HTM is false. Instead a different PPC_FEATURE2 bit is set, to indicate that this different mode of TM is available. It is not safe to let guests use TM as-is, when the CPU is in this mode. So to prevent that from happening, use PPC_FEATURE2_HTM to determine the value of KVM_CAP_PPC_HTM. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-19Revert "KVM: PPC: Book3S HV: POWER9 does not require secondary thread ↵Paul Mackerras
management" This reverts commit 94a04bc25a2c6296bd0c5e82c10e8231c2b11f77. In order to run HPT guests on a radix POWER9 host, we will have to run the host in single-threaded mode, because POWER9 processors do not currently support running some threads of a core in HPT mode while others are in radix mode ("mixed mode"). That means that we will need the same mechanisms that are used on POWER8 to make the secondary threads available to KVM, which were disabled on POWER9 by commit 94a04bc25a2c. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/vphn: Fix numa update end-loop bugMichael Bringmann
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch fixes an end-of-updates processing problem observed occasionally in numa_update_cpu_topology(). Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/hotplug: Improve responsiveness of hotplug changeMichael Bringmann
powerpc/hotplug: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. During hotplug CPU operations, this patch resets the timer on topology update work function to a small value to better ensure that the CPU topology is detected and configured sooner. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/vphn: Improve recognition of PRRN/VPHNMichael Bringmann
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch updates the initialization checks to independently recognize PRRN or VPHN support. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/vphn: Update CPU topology when VPHN enabledMichael Bringmann
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch corrects the currently broken capability to set the topology for shared CPUs in LPARs. At boot time for shared CPU lpars, the topology for each CPU was being set to node zero. Now when numa_update_cpu_topology() is called appropriately, the Virtual Processor Home Node (VPHN) capabilities information provided by the pHyp allows the appropriate node in the shared configuration to be selected for the CPU. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/mce: hookup memory_failure for UE errorsBalbir Singh
If we are in user space and hit a UE error, we now have the basic infrastructure to walk the page tables and find out the effective address that was accessed, since the DAR is not valid. We use a work_queue content to hookup the bad pfn, any other context causes problems, since memory_failure itself can call into schedule() via lru_drain_ bits. We could probably poison the struct page to avoid a race between detection and taking corrective action. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/mce: Hookup ierror (instruction) UE errorsBalbir Singh
Hookup instruction errors (UE) for memory offling via memory_failure() in a manner similar to load/store errors (derror). Since we have access to the NIP, the conversion is a one step process in this case. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/mce: Hookup derror (load/store) UE errorsBalbir Singh
Extract physical_address for UE errors by walking the page tables for the mm and address at the NIP, to extract the instruction. Then use the instruction to find the effective address via analyse_instr(). We might have page table walking races, but we expect them to be rare, the physical address extraction is best effort. The idea is to then hook up this infrastructure to memory failure eventually. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/mce: Align the print of physical address betterBalbir Singh
Use the same alignment as Effective address. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-16powerpc/mce: Remove unused function get_mce_fault_addr()Balbir Singh
There are no users of get_mce_fault_addr() since commit 1363875bdb63 ("powerpc/64s: fix handling of non-synchronous machine checks") removed the last usage. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-13powerpc/modules: Use WARN_ON() in stub_for_addr()Kamalesh Babulal
Use WARN_ON(), while running out of stubs in stub_for_addr() and abort loading of the module instead of BUG_ON(). Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc: get_wchan(): solve possible race scenario due to parallel wakeupKautuk Consul
Add a check for p->state == TASK_RUNNING so that any wake-ups on task_struct p in the interim lead to 0 being returned by get_wchan(). Signed-off-by: Kautuk Consul <kautuk.consul.1980@gmail.com> [mpe: Confirmed other architectures do similar] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc: Always initialize input array when calling epapr_hypercall()Seth Forshee
Several callers to epapr_hypercall() pass an uninitialized stack allocated array for the input arguments, presumably because they have no input arguments. However this can produce errors like this one arch/powerpc/include/asm/epapr_hcalls.h:470:42: error: 'in' may be used uninitialized in this function [-Werror=maybe-uninitialized] unsigned long register r3 asm("r3") = in[0]; ~~^~~ Fix callers to this function to always zero-initialize the input arguments array to prevent this. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc: Add PPC_EMULATED_STATS to powernv_defconfigMichael Neuling
This is useful, especially for developers. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc/xmon: Add option to show uptime informationGuilherme G. Piccoli
It might be useful to quickly get the uptime of a running system on xmon, without needing to grab data from memory and doing math on struct addresses. For example, it'd be useful to check for how long after a crash a system is on xmon shell or if some test was started after the first test crashed (and this 2nd test crashed too into xmon). This small patch adds the 'U' command, to accomplish this. Suggested-by: Murilo Fossa Vicentini <muvic@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> [mpe: Display units (seconds), add sync()/__delay() sequence] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc/powernv: Make opal_event_shutdown() callable from IRQ contextMichael Ellerman
In opal_event_shutdown() we free all the IRQs hanging off the opal_event_irqchip. However it's not safe to do so if we're called from IRQ context, because free_irq() wants to synchronise versus IRQ context. This can lead to warnings and a stuck system. For example from sysrq-b: Trying to free IRQ 17 from IRQ context! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/irq/manage.c:1461 __free_irq+0x398/0x8d0 ... NIP __free_irq+0x398/0x8d0 LR __free_irq+0x394/0x8d0 Call Trace: __free_irq+0x394/0x8d0 (unreliable) free_irq+0xa4/0x140 opal_event_shutdown+0x128/0x180 opal_shutdown+0x1c/0xb0 pnv_shutdown+0x20/0x40 machine_restart+0x38/0x90 emergency_restart+0x28/0x40 sysrq_handle_reboot+0x24/0x40 __handle_sysrq+0x198/0x590 hvc_poll+0x48c/0x8c0 hvc_handle_interrupt+0x1c/0x50 __handle_irq_event_percpu+0xe8/0x6e0 handle_irq_event_percpu+0x34/0xe0 handle_irq_event+0xc4/0x210 handle_level_irq+0x250/0x770 generic_handle_irq+0x5c/0xa0 opal_handle_events+0x11c/0x240 opal_interrupt+0x38/0x50 __handle_irq_event_percpu+0xe8/0x6e0 handle_irq_event_percpu+0x34/0xe0 handle_irq_event+0xc4/0x210 handle_fasteoi_irq+0x174/0xa10 generic_handle_irq+0x5c/0xa0 __do_irq+0xbc/0x4e0 call_do_irq+0x14/0x24 do_IRQ+0x18c/0x540 hardware_interrupt_common+0x158/0x180 We can avoid that by using disable_irq_nosync() rather than free_irq(). Although it doesn't fully free the IRQ, it should be sufficient when we're shutting down, particularly in an emergency. Add an in_interrupt() check and use free_irq() when we're shutting down normally. It's probably OK to use disable_irq_nosync() in that case too, but for now it's safer to leave that behaviour as-is. Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events") Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-05powerpc/jprobes: Validate break handler invocation as being due to a ↵Naveen N. Rao
jprobe_return() Fix a circa 2005 FIXME by implementing a check to ensure that we actually got into the jprobe break handler() due to the trap in jprobe_return(). Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-05powerpc/jprobes: Disable preemption when triggered through ftraceNaveen N. Rao
KPROBES_SANITY_TEST throws the below splat when CONFIG_PREEMPT is enabled: Kprobe smoke test: started DEBUG_LOCKS_WARN_ON(val > preempt_count()) ------------[ cut here ]------------ WARNING: CPU: 19 PID: 1 at kernel/sched/core.c:3094 preempt_count_sub+0xcc/0x140 Modules linked in: CPU: 19 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc7-nnr+ #97 task: c0000000fea80000 task.stack: c0000000feb00000 NIP: c00000000011d3dc LR: c00000000011d3d8 CTR: c000000000a090d0 REGS: c0000000feb03400 TRAP: 0700 Not tainted (4.13.0-rc7-nnr+) MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 28000282 XER: 00000000 CFAR: c00000000015aa18 SOFTE: 0 <snip> NIP preempt_count_sub+0xcc/0x140 LR preempt_count_sub+0xc8/0x140 Call Trace: preempt_count_sub+0xc8/0x140 (unreliable) kprobe_handler+0x228/0x4b0 program_check_exception+0x58/0x3b0 program_check_common+0x16c/0x170 --- interrupt: 0 at kprobe_target+0x8/0x20 LR = init_test_probes+0x248/0x7d0 kp+0x0/0x80 (unreliable) livepatch_handler+0x38/0x74 init_kprobes+0x1d8/0x208 do_one_initcall+0x68/0x1d0 kernel_init_freeable+0x298/0x374 kernel_init+0x24/0x160 ret_from_kernel_thread+0x5c/0x70 Instruction dump: 419effdc 3d22001b 39299240 81290000 2f890000 409effc8 3c82ffcb 3c62ffcb 3884bc68 3863bc18 4803d5fd 60000000 <0fe00000> 4bffffa8 60000000 60000000 ---[ end trace 432dd46b4ce3d29f ]--- Kprobe smoke test: passed successfully The issue is that we aren't disabling preemption in kprobe_ftrace_handler(). Disable it. Fixes: ead514d5fb30a0 ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Trim oops a little for formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/kprobes: Fix warnings from __this_cpu_read() on preempt kernelsNaveen N. Rao
Kamalesh pointed out that we are getting the below call traces with livepatched functions when we enable CONFIG_PREEMPT: [ 495.470721] BUG: using __this_cpu_read() in preemptible [00000000] code: cat/8394 [ 495.471167] caller is is_current_kprobe_addr+0x30/0x90 [ 495.471171] CPU: 4 PID: 8394 Comm: cat Tainted: G K 4.13.0-rc7-nnr+ #95 [ 495.471173] Call Trace: [ 495.471178] [c00000008fd9b960] [c0000000009f039c] dump_stack+0xec/0x160 (unreliable) [ 495.471184] [c00000008fd9b9a0] [c00000000059169c] check_preemption_disabled+0x15c/0x170 [ 495.471187] [c00000008fd9ba30] [c000000000046460] is_current_kprobe_addr+0x30/0x90 [ 495.471191] [c00000008fd9ba60] [c00000000004e9a0] ftrace_call+0x1c/0xb8 [ 495.471195] [c00000008fd9bc30] [c000000000376fd8] seq_read+0x238/0x5c0 [ 495.471199] [c00000008fd9bcd0] [c0000000003cfd78] proc_reg_read+0x88/0xd0 [ 495.471203] [c00000008fd9bd00] [c00000000033e5d4] __vfs_read+0x44/0x1b0 [ 495.471206] [c00000008fd9bd90] [c0000000003402ec] vfs_read+0xbc/0x1b0 [ 495.471210] [c00000008fd9bde0] [c000000000342138] SyS_read+0x68/0x110 [ 495.471214] [c00000008fd9be30] [c00000000000bc6c] system_call+0x58/0x6c Commit c05b8c4474c030 ("powerpc/kprobes: Skip livepatch_handler() for jprobes") introduced a helper is_current_kprobe_addr() to help determine if the current function has been livepatched or if it has a jprobe installed, both of which modify the NIP. This was subsequently renamed to __is_active_jprobe(). In the case of a jprobe, kprobe_ftrace_handler() disables pre-emption before calling into setjmp_pre_handler() which returns without disabling pre-emption. This is done to ensure that the jprobe handler won't disappear beneath us if the jprobe is unregistered between the setjmp_pre_handler() and the subsequent longjmp_break_handler() called from the jprobe handler. Due to this, we can use __this_cpu_read() in __is_active_jprobe() with the pre-emption check as we know that pre-emption will be disabled. However, if this function has been livepatched, we are still doing this check and when we do so, pre-emption won't necessarily be disabled. This results in the call trace shown above. Fix this by only invoking __is_active_jprobe() when pre-emption is disabled. And since we now guard this within a pre-emption check, we can instead use raw_cpu_read() to get the current_kprobe value skipping the check done by __this_cpu_read(). Fixes: c05b8c4474c030 ("powerpc/kprobes: Skip livepatch_handler() for jprobes") Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/kprobes: Clean up jprobe detection in livepatch handlerNaveen N. Rao
In commit c05b8c4474c03 ("powerpc/kprobes: Skip livepatch_handler() for jprobes"), we added a helper is_current_kprobe_addr() to help detect if the modified regs->nip was due to a jprobe or livepatch. Masami felt that the function name was not quite clear. To that end, this patch renames is_current_kprobe_addr() to __is_active_jprobe() and adds a comment to (hopefully) better clarify the purpose of this helper. The helper has also now been moved to kprobes-ftrace.c so that it is only available for KPROBES_ON_FTRACE. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/kprobes: Do not suppress instruction emulation if a single run failedNaveen N. Rao
Currently, we disable instruction emulation if emulate_step() fails for any reason. However, such failures could be transient and specific to a particular run. Instead, only disable instruction emulation if we have never been able to emulate this. If we had emulated this instruction successfully at least once, then we single step only this probe hit and continue to try emulating the instruction in subsequent probe hits. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/kprobes: Some cosmetic updates to try_to_emulate()Naveen N. Rao
1. This is only used in kprobes.c, so make it static. 2. Remove the un-necessary (ret == 0) comparison in the else clause. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/configs: Add Skiroot defconfigJoel Stanley
This configuration is used by the OpenPower firmware for it's Linux-as-bootloader implementation. Also known as the Petitboot kernel, this configuration broke in 4.12 (CPU_HOTPLUG=n), so add it to the upstream tree in order to get better coverage. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Fix fixed-point shift instructions that set CA32Sandipan Das
This fixes the emulated behaviour of existing fixed-point shift right algebraic instructions that are supposed to set both the CA and CA32 bits of XER when running on a system that is compliant with POWER ISA v3.0 independent of whether the system is executing in 32-bit mode or 64-bit mode. The following instructions are affected: * Shift Right Algebraic Word Immediate (srawi[.]) * Shift Right Algebraic Word (sraw[.]) * Shift Right Algebraic Doubleword Immediate (sradi[.]) * Shift Right Algebraic Doubleword (srad[.]) Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Fix fixed-point arithmetic instructions that set CA32Sandipan Das
There are existing fixed-point arithmetic instructions that always set the CA bit of XER to reflect the carry out of bit 0 in 64-bit mode and out of bit 32 in 32-bit mode. In ISA v3.0, these instructions also always set the CA32 bit of XER to reflect the carry out of bit 32. This fixes the emulated behaviour of such instructions when running on a system that is compliant with POWER ISA v3.0. The following instructions are affected: * Add Immediate Carrying (addic) * Add Immediate Carrying and Record (addic.) * Subtract From Immediate Carrying (subfic) * Add Carrying (addc[.]) * Subtract From Carrying (subfc[.]) * Add Extended (adde[.]) * Subtract From Extended (subfe[.]) * Add to Minus One Extended (addme[.]) * Subtract From Minus One Extended (subfme[.]) * Add to Zero Extended (addze[.]) * Subtract From Zero Extended (subfze[.]) Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Add XER bits introduced in POWER ISA v3.0Sandipan Das
This adds definitions for the OV32 and CA32 bits of XER that were introduced in POWER ISA v3.0. There are some existing instructions that currently set the OV and CA bits based on certain conditions. The emulation behaviour of all these instructions needs to be updated to set these new bits accordingly. Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/powermac: Use setup_timer() helperAllen Pais
Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/6xx: Use setup_timer() helperAllen Pais
Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/oprofile: Use setup_timer() helperAllen Pais
Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/powernv: Use early_radix_enabled in POWER9 tlb flushNicholas Piggin
This code is used at boot and machine checks, so it should be using early_radix_enabled() (which is usable any time). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESETNicholas Piggin
This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal) system similarly to the hcall NMI IPI on pseries guests, when the platform/firmware supports it. This is an example of CPU10 spinning with interrupts hard disabled: Watchdog CPU:32 detected Hard LOCKUP other CPUS:10 Watchdog CPU:10 Hard LOCKUP CPU: 10 PID: 4410 Comm: bash Not tainted 4.13.0-rc7-00074-ge89ce1f89f62-dirty #34 task: c0000003a82b4400 task.stack: c0000003af55c000 NIP: c0000000000a7b38 LR: c000000000659044 CTR: c0000000000a7b00 REGS: c00000000fd23d80 TRAP: 0100 Not tainted (4.13.0-rc7-00074-ge89ce1f89f62-dirty) MSR: 90000000000c1033 <SF,HV,ME,IR,DR,RI,LE> CR: 28422222 XER: 20000000 CFAR: c0000000000a7b38 SOFTE: 0 GPR00: c000000000659044 c0000003af55fbb0 c000000001072a00 0000000000000078 GPR04: c0000003c81b5c80 c0000003c81cc7e8 9000000000009033 0000000000000000 GPR08: 0000000000000000 c0000000000a7b00 0000000000000001 9000000000001003 GPR12: c0000000000a7b00 c00000000fd83200 0000000010180df8 0000000010189e60 GPR16: 0000000010189ed8 0000000010151270 000000001018bd88 000000001018de78 GPR20: 00000000370a0668 0000000000000001 00000000101645e0 0000000010163c10 GPR24: 00007fffd14d6294 00007fffd14d6290 c000000000fba6f0 0000000000000004 GPR28: c000000000f351d8 0000000000000078 c000000000f4095c 0000000000000000 NIP [c0000000000a7b38] sysrq_handle_xmon+0x38/0x40 LR [c000000000659044] __handle_sysrq+0xe4/0x270 Call Trace: [c0000003af55fbd0] [c000000000659044] __handle_sysrq+0xe4/0x270 [c0000003af55fc70] [c000000000659810] write_sysrq_trigger+0x70/0xa0 [c0000003af55fca0] [c0000000003da650] proc_reg_write+0xb0/0x110 [c0000003af55fcf0] [c0000000003423bc] __vfs_write+0x6c/0x1b0 [c0000003af55fd90] [c000000000344398] vfs_write+0xd8/0x240 [c0000003af55fde0] [c00000000034632c] SyS_write+0x6c/0x110 [c0000003af55fe30] [c00000000000b220] system_call+0x58/0x6c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use kernel types for opal_signal_system_reset()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/64s: Implement system reset idle wakeup reasonNicholas Piggin
It is possible to wake from idle due to a system reset exception, in which case the CPU takes a system reset interrupt to wake from idle, with system reset as the wakeup reason. The regular (not idle wakeup) system reset interrupt handler must be invoked in this case, otherwise the system reset interrupt is lost. Handle the system reset interrupt immediately after CPU state has been restored. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/xmon: Avoid tripping SMP hardlockup watchdogNicholas Piggin
The SMP hardlockup watchdog cross-checks other CPUs for lockups, which causes xmon headaches because it's assuming interrupts hard disabled means no watchdog troubles. Try to improve that by calling touch_nmi_watchdog() in obvious places where secondaries are spinning. Also annotate these spin loops with spin_begin/end calls. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdogNicholas Piggin
In xmon, touch_nmi_watchdog() is not expected to be checking that other CPUs have not touched the watchdog, so the code will just call touch_nmi_watchdog() once before re-enabling hard interrupts. Just update our CPU's state, and ignore apparently stuck SMP threads. Arguably touch_nmi_watchdog should check for SMP lockups, and callers should be fixed, but that's not trivial for the input code of xmon. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/watchdog: Do not backtrace locked CPUs twice if allcpus backtrace is ↵Nicholas Piggin
enabled If sysctl_hardlockup_all_cpu_backtrace is enabled, there is no need to IPI stuck CPUs for backtrace before trigger_allbutself_cpu_backtrace(), which does the same thing again. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/watchdog: Do not panic from locked CPU's IPI handlerNicholas Piggin
The SMP watchdog will detect locked CPUs and IPI them to print a backtrace and registers. If panic on hard lockup is enabled, do not panic from this handler, because that can cause recursion into the IPI layer during the panic. The caller already panics in this case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-28cxl: Enable global TLBIs for cxl contextsFrederic Barrat
The PSL and nMMU need to see all TLB invalidations for the memory contexts used on the adapter. For the hash memory model, it is done by making all TLBIs global as soon as the cxl driver is in use. For radix, we need something similar, but we can refine and only convert to global the invalidations for contexts actually used by the device. The new mm_context_add_copro() API increments the 'active_cpus' count for the contexts attached to the cxl adapter. As soon as there's more than 1 active cpu, the TLBIs for the context become global. Active cpu count must be decremented when detaching to restore locality if possible and to avoid overflowing the counter. The hash memory model support is somewhat limited, as we can't decrement the active cpus count when mm_context_remove_copro() is called, because we can't flush the TLB for a mm on hash. So TLBIs remain global on hash. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Fixes: f24be42aab37 ("cxl: Add psl9 specific code") Tested-by: Alistair Popple <alistair@popple.id.au> [mpe: Fold in updated comment on the barrier from Fred] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-28powerpc/mm: Export flush_all_mm()Frederic Barrat
With the optimizations introduced by commit a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk cache (PWC) with radix. This patch introduces flush_all_mm(), which flushes everything, TLB and PWC, for a given mm. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> [mpe: Add a WARN_ON_ONCE() in the empty hash routines] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-27powerpc/64s: Add workaround for P9 vector CI load issueMichael Neuling
POWER9 DD2.1 and earlier has an issue where some cache inhibited vector load will return bad data. The workaround is two part, one firmware/microcode part triggers HMI interrupts when hitting such loads, the other part is this patch which then emulates the instructions in Linux. The affected instructions are limited to lxvd2x, lxvw4x, lxvb16x and lxvh8x. When an instruction triggers the HMI, all threads in the core will be sent to the HMI handler, not just the one running the vector load. In general, these spurious HMIs are detected by the emulation code and we just return back to the running process. Unfortunately, if a spurious interrupt occurs on a vector load that's to normal memory we have no way to detect that it's spurious (unless we walk the page tables, which is very expensive). In this case we emulate the load but we need do so using a vector load itself to ensure 128bit atomicity is preserved. Some additional debugfs emulated instruction counters are added also. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Switch CONFIG_PPC_BOOK3S_64 to CONFIG_VSX to unbreak the build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-26powerpc/powernv: Rework EEH initialization on powernvBenjamin Herrenschmidt
Remove the post_init callback which is only used by powernv, we can just call it explicitly from the powernv code. This partially kills the ability to "disable" eeh at runtime via debugfs as this was calling that same callback again, but this is both unused and broken in several ways. If we want to revive it, we need to create a dedicated enable/disable callback on the backend that does the right thing. Let the bulk of eeh initialize normally at core_initcall() like it does on pseries by removing the hack in eeh_init() that delays it. Instead we make sure our eeh->probe cleanly bails out of the PEs haven't been created yet and we force a re-probe where we used to call eeh_init() again. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-24Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Another round of CR3/PCID related fixes (I think this addresses all but one of the known problems with PCID support), an objtool fix plus a Clang fix that (finally) solves all Clang quirks to build a bootable x86 kernel as-is" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Fix inline asm call constraints for Clang objtool: Handle another GCC stack pointer adjustment bug x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUs x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code x86/mm: Factor out CR3-building code
2017-09-24Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull address-limit checking fixes from Ingo Molnar: "This fixes a number of bugs in the address-limit (USER_DS) checks that got introduced in the merge window, (mostly) affecting the ARM and ARM64 platforms" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/syscalls: Move address limit check in loop arm/syscalls: Optimize address limit check Revert "arm/syscalls: Check address limit on user-mode return" syscalls: Use CHECK_DATA_CORRUPTION for addr_limit_user_check
2017-09-23Merge branch 'parisc-4.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert multiple byte-accesses into one word-access. - Add missing HWPOISON page fault handler code. I completely missed that when I added HWPOISON support during this merge window and it only showed up now with the madvise07 LTP test case. - Fix backtrace unwinding to stop when stack start has been reached. - Issue warning if initrd has been loaded into memory regions with broken RAM modules. - Fix HPMC handler (parisc hardware fault handler) to comply with architecture specification. - Avoid compiler warnings about too large frame sizes. - Minor init-section fixes. * 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Unbreak bootloader due to gcc-7 optimizations parisc: Reintroduce option to gzip-compress the kernel parisc: Add HWPOISON page fault handler code parisc: Move init_per_cpu() into init section parisc: Check if initrd was loaded into broken RAM parisc: Add PDCE_CHECK instruction to HPMC handler parisc: Add wrapper for pdc_instr() firmware function parisc: Move start_parisc() into init section parisc: Stop unwinding at start of stack parisc: Fix too large frame size warnings
2017-09-23x86/asm: Fix inline asm call constraints for ClangJosh Poimboeuf
For inline asm statements which have a CALL instruction, we list the stack pointer as a constraint to convince GCC to ensure the frame pointer is set up first: static inline void foo() { register void *__sp asm(_ASM_SP); asm("call bar" : "+r" (__sp)) } Unfortunately, that pattern causes Clang to corrupt the stack pointer. The fix is easy: convert the stack pointer register variable to a global variable. It should be noted that the end result is different based on the GCC version. With GCC 6.4, this patch has exactly the same result as before: defconfig defconfig-nofp distro distro-nofp before 9820389 9491555 8816046 8516940 after 9820389 9491555 8816046 8516940 With GCC 7.2, however, GCC's behavior has changed. It now changes its behavior based on the conversion of the register variable to a global. That somehow convinces it to *always* set up the frame pointer before inserting *any* inline asm. (Therefore, listing the variable as an output constraint is a no-op and is no longer necessary.) It's a bit overkill, but the performance impact should be negligible. And in fact, there's a nice improvement with frame pointers disabled: defconfig defconfig-nofp distro distro-nofp before 9796316 9468236 9076191 8790305 after 9796957 9464267 9076381 8785949 So in summary, while listing the stack pointer as an output constraint is no longer necessary for newer versions of GCC, it's still needed for older versions. Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-22Merge tag 'pci-v4.14-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix endpoint "end of test" interrupt issue (introduced in v4.14-rc1) (John Keeping) - fix MIPS use-after-free map_irq() issue (introduced in v4.14-rc1) (Lorenzo Pieralisi) * tag 'pci-v4.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: endpoint: Use correct "end of test" interrupt MIPS: PCI: Move map_irq() hooks out of initdata