summaryrefslogtreecommitdiff
path: root/arch/powerpc/sysdev/xics
AgeCommit message (Collapse)Author
2020-08-25powerpc/icp-hv: Fix missing of_node_put() in success pathNicholas Mc Guire
Both of_find_compatible_node() and of_find_node_by_type() will return a refcounted node on success - thus for the success path the node must be explicitly released with a of_node_put(). Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
2020-06-02powerpc/rtas: Implement reentrant rtas callLeonardo Bras
Implement rtas_call_reentrant() for reentrant rtas-calls: "ibm,int-on", "ibm,int-off",ibm,get-xive" and "ibm,set-xive". On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4, items 2 and 3 say: 2 - For the PowerPC External Interrupt option: The * call must be reentrant to the number of processors on the platform. 3 - For the PowerPC External Interrupt option: The * argument call buffer for each simultaneous call must be physically unique. So, these rtas-calls can be called in a lockless way, if using a different buffer for each cpu doing such rtas call. For this, it was suggested to add the buffer (struct rtas_args) in the PACA struct, so each cpu can have it's own buffer. The PACA struct received a pointer to rtas buffer, which is allocated in the memory range available to rtas 32-bit. Reentrant rtas calls are useful to avoid deadlocks in crashing, where rtas-calls are needed, but some other thread crashed holding the rtas.lock. This is a backtrace of a deadlock from a kdump testing environment: #0 arch_spin_lock #1 lock_rtas () #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0) #3 ics_rtas_mask_real_irq (hw_irq=4100) #4 machine_kexec_mask_interrupts #5 default_machine_crash_shutdown #6 machine_crash_shutdown #7 __crash_kexec #8 crash_kexec #9 oops_end Signed-off-by: Leonardo Bras <leobras.c@gmail.com> [mpe: Move under #ifdef PSERIES to avoid build breakage] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200518234245.200672-3-leobras.c@gmail.com
2019-09-24KVM: PPC: Book3S HV: use smp_mb() when setting/clearing host_ipi flagMichael Roth
On a 2-socket Power9 system with 32 cores/128 threads (SMT4) and 1TB of memory running the following guest configs: guest A: - 224GB of memory - 56 VCPUs (sockets=1,cores=28,threads=2), where: VCPUs 0-1 are pinned to CPUs 0-3, VCPUs 2-3 are pinned to CPUs 4-7, ... VCPUs 54-55 are pinned to CPUs 108-111 guest B: - 4GB of memory - 4 VCPUs (sockets=1,cores=4,threads=1) with the following workloads (with KSM and THP enabled in all): guest A: stress --cpu 40 --io 20 --vm 20 --vm-bytes 512M guest B: stress --cpu 4 --io 4 --vm 4 --vm-bytes 512M host: stress --cpu 4 --io 4 --vm 2 --vm-bytes 256M the below soft-lockup traces were observed after an hour or so and persisted until the host was reset (this was found to be reliably reproducible for this configuration, for kernels 4.15, 4.18, 5.0, and 5.3-rc5): [ 1253.183290] rcu: INFO: rcu_sched self-detected stall on CPU [ 1253.183319] rcu: 124-....: (5250 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=1941 [ 1256.287426] watchdog: BUG: soft lockup - CPU#105 stuck for 23s! [CPU 52/KVM:19709] [ 1264.075773] watchdog: BUG: soft lockup - CPU#24 stuck for 23s! [worker:19913] [ 1264.079769] watchdog: BUG: soft lockup - CPU#31 stuck for 23s! [worker:20331] [ 1264.095770] watchdog: BUG: soft lockup - CPU#45 stuck for 23s! [worker:20338] [ 1264.131773] watchdog: BUG: soft lockup - CPU#64 stuck for 23s! [avocado:19525] [ 1280.408480] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791] [ 1316.198012] rcu: INFO: rcu_sched self-detected stall on CPU [ 1316.198032] rcu: 124-....: (21003 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=8243 [ 1340.411024] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791] [ 1379.212609] rcu: INFO: rcu_sched self-detected stall on CPU [ 1379.212629] rcu: 124-....: (36756 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=14714 [ 1404.413615] watchdog: BUG: soft lockup - CPU#124 stuck for 22s! [ksmd:791] [ 1442.227095] rcu: INFO: rcu_sched self-detected stall on CPU [ 1442.227115] rcu: 124-....: (52509 ticks this GP) idle=10a/1/0x4000000000000002 softirq=5408/5408 fqs=21403 [ 1455.111787] INFO: task worker:19907 blocked for more than 120 seconds. [ 1455.111822] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.111833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.111884] INFO: task worker:19908 blocked for more than 120 seconds. [ 1455.111905] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.111925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.111966] INFO: task worker:20328 blocked for more than 120 seconds. [ 1455.111986] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.111998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.112048] INFO: task worker:20330 blocked for more than 120 seconds. [ 1455.112068] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.112097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.112138] INFO: task worker:20332 blocked for more than 120 seconds. [ 1455.112159] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.112179] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.112210] INFO: task worker:20333 blocked for more than 120 seconds. [ 1455.112231] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.112242] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.112282] INFO: task worker:20335 blocked for more than 120 seconds. [ 1455.112303] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 [ 1455.112332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1455.112372] INFO: task worker:20336 blocked for more than 120 seconds. [ 1455.112392] Tainted: G L 5.3.0-rc5-mdr-vanilla+ #1 CPUs 45, 24, and 124 are stuck on spin locks, likely held by CPUs 105 and 31. CPUs 105 and 31 are stuck in smp_call_function_many(), waiting on target CPU 42. For instance: # CPU 105 registers (via xmon) R00 = c00000000020b20c R16 = 00007d1bcd800000 R01 = c00000363eaa7970 R17 = 0000000000000001 R02 = c0000000019b3a00 R18 = 000000000000006b R03 = 000000000000002a R19 = 00007d537d7aecf0 R04 = 000000000000002a R20 = 60000000000000e0 R05 = 000000000000002a R21 = 0801000000000080 R06 = c0002073fb0caa08 R22 = 0000000000000d60 R07 = c0000000019ddd78 R23 = 0000000000000001 R08 = 000000000000002a R24 = c00000000147a700 R09 = 0000000000000001 R25 = c0002073fb0ca908 R10 = c000008ffeb4e660 R26 = 0000000000000000 R11 = c0002073fb0ca900 R27 = c0000000019e2464 R12 = c000000000050790 R28 = c0000000000812b0 R13 = c000207fff623e00 R29 = c0002073fb0ca808 R14 = 00007d1bbee00000 R30 = c0002073fb0ca800 R15 = 00007d1bcd600000 R31 = 0000000000000800 pc = c00000000020b260 smp_call_function_many+0x3d0/0x460 cfar= c00000000020b270 smp_call_function_many+0x3e0/0x460 lr = c00000000020b20c smp_call_function_many+0x37c/0x460 msr = 900000010288b033 cr = 44024824 ctr = c000000000050790 xer = 0000000000000000 trap = 100 CPU 42 is running normally, doing VCPU work: # CPU 42 stack trace (via xmon) [link register ] c00800001be17188 kvmppc_book3s_radix_page_fault+0x90/0x2b0 [kvm_hv] [c000008ed3343820] c000008ed3343850 (unreliable) [c000008ed33438d0] c00800001be11b6c kvmppc_book3s_hv_page_fault+0x264/0xe30 [kvm_hv] [c000008ed33439d0] c00800001be0d7b4 kvmppc_vcpu_run_hv+0x8dc/0xb50 [kvm_hv] [c000008ed3343ae0] c00800001c10891c kvmppc_vcpu_run+0x34/0x48 [kvm] [c000008ed3343b00] c00800001c10475c kvm_arch_vcpu_ioctl_run+0x244/0x420 [kvm] [c000008ed3343b90] c00800001c0f5a78 kvm_vcpu_ioctl+0x470/0x7c8 [kvm] [c000008ed3343d00] c000000000475450 do_vfs_ioctl+0xe0/0xc70 [c000008ed3343db0] c0000000004760e4 ksys_ioctl+0x104/0x120 [c000008ed3343e00] c000000000476128 sys_ioctl+0x28/0x80 [c000008ed3343e20] c00000000000b388 system_call+0x5c/0x70 --- Exception: c00 (System Call) at 00007d545cfd7694 SP (7d53ff7edf50) is in userspace It was subsequently found that ipi_message[PPC_MSG_CALL_FUNCTION] was set for CPU 42 by at least 1 of the CPUs waiting in smp_call_function_many(), but somehow the corresponding call_single_queue entries were never processed by CPU 42, causing the callers to spin in csd_lock_wait() indefinitely. Nick Piggin suggested something similar to the following sequence as a possible explanation (interleaving of CALL_FUNCTION/RESCHEDULE IPI messages seems to be most common, but any mix of CALL_FUNCTION and !CALL_FUNCTION messages could trigger it): CPU X: smp_muxed_ipi_set_message(): X: smp_mb() X: message[RESCHEDULE] = 1 X: doorbell_global_ipi(42): X: kvmppc_set_host_ipi(42, 1) X: ppc_msgsnd_sync()/smp_mb() X: ppc_msgsnd() -> 42 42: doorbell_exception(): // from CPU X 42: ppc_msgsync() 105: smp_muxed_ipi_set_message(): 105: smb_mb() // STORE DEFERRED DUE TO RE-ORDERING --105: message[CALL_FUNCTION] = 1 | 105: doorbell_global_ipi(42): | 105: kvmppc_set_host_ipi(42, 1) | 42: kvmppc_set_host_ipi(42, 0) | 42: smp_ipi_demux_relaxed() | 42: // returns to executing guest | // RE-ORDERED STORE COMPLETES ->105: message[CALL_FUNCTION] = 1 105: ppc_msgsnd_sync()/smp_mb() 105: ppc_msgsnd() -> 42 42: local_paca->kvm_hstate.host_ipi == 0 // IPI ignored 105: // hangs waiting on 42 to process messages/call_single_queue This can be prevented with an smp_mb() at the beginning of kvmppc_set_host_ipi(), such that stores to message[<type>] (or other state indicated by the host_ipi flag) are ordered vs. the store to to host_ipi. However, doing so might still allow for the following scenario (not yet observed): CPU X: smp_muxed_ipi_set_message(): X: smp_mb() X: message[RESCHEDULE] = 1 X: doorbell_global_ipi(42): X: kvmppc_set_host_ipi(42, 1) X: ppc_msgsnd_sync()/smp_mb() X: ppc_msgsnd() -> 42 42: doorbell_exception(): // from CPU X 42: ppc_msgsync() // STORE DEFERRED DUE TO RE-ORDERING -- 42: kvmppc_set_host_ipi(42, 0) | 42: smp_ipi_demux_relaxed() | 105: smp_muxed_ipi_set_message(): | 105: smb_mb() | 105: message[CALL_FUNCTION] = 1 | 105: doorbell_global_ipi(42): | 105: kvmppc_set_host_ipi(42, 1) | // RE-ORDERED STORE COMPLETES -> 42: kvmppc_set_host_ipi(42, 0) 42: // returns to executing guest 105: ppc_msgsnd_sync()/smp_mb() 105: ppc_msgsnd() -> 42 42: local_paca->kvm_hstate.host_ipi == 0 // IPI ignored 105: // hangs waiting on 42 to process messages/call_single_queue Fixing this scenario would require an smp_mb() *after* clearing host_ipi flag in kvmppc_set_host_ipi() to order the store vs. subsequent processing of IPI messages. To handle both cases, this patch splits kvmppc_set_host_ipi() into separate set/clear functions, where we execute smp_mb() prior to setting host_ipi flag, and after clearing host_ipi flag. These functions pair with each other to synchronize the sender and receiver sides. With that change in place the above workload ran for 20 hours without triggering any lock-ups. Fixes: 755563bc79c7 ("powerpc/powernv: Fixes for hypervisor doorbell handling") # v4.0 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190911223155.16045-1-mdroth@linux.vnet.ibm.com
2019-07-13Merge tag 'powerpc-5.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well as some other functions only used by drivers that haven't (yet?) made it upstream. - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e mem: ...) which could lead to register corruption and kernel crashes. - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc when using the Radix MMU. - A large but incremental rewrite of our exception handling code to use gas macros rather than multiple levels of nested CPP macros. And the usual small fixes, cleanups and improvements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung Bauermann, YueHaibing" * tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits) powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state. powerpc/eeh: Handle hugepages in ioremap space ocxl: Update for AFU descriptor template version 1.1 powerpc/boot: pass CONFIG options in a simpler and more robust way powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h powerpc/irq: Don't WARN continuously in arch_local_irq_restore() powerpc/module64: Use symbolic instructions names. powerpc/module32: Use symbolic instructions names. powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Add lzo support for uImage powerpc/boot: Add lzma support for uImage powerpc/boot: don't force gzipped uImage powerpc/8xx: Add microcode patch to move SMC parameter RAM. powerpc/8xx: Use IO accessors in microcode programming. powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c powerpc/8xx: refactor programming of microcode CPM params. powerpc/8xx: refactor printing of microcode patch name. powerpc/8xx: Refactor microcode write powerpc/8xx: refactor writing of CPM microcode arrays ...
2019-07-04powerpc/Kconfig: Clean up formattingEnrico Weigelt, metux IT consult
Formatting of Kconfig files doesn't look so pretty, so let the Great White Handkerchief come around and clean it up. Also convert "---help---" as requested. Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-19powerpc: Add -Werror at arch/powerpc levelMichael Ellerman
Back when I added -Werror in commit ba55bd74360e ("powerpc: Add configurable -Werror for arch/powerpc") I did it by adding it to most of the arch Makefiles. At the time we excluded math-emu, because apparently it didn't build cleanly. But that seems to have been fixed somewhere in the interim. So move the -Werror addition to the top-level of the arch, this saves us from repeating it in every Makefile and means we won't forget to add it to any new sub-dirs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-03powerpc/xics: Add missing of_node_put() in error pathYueHaibing
The device node obtained with of_find_compatible_node() should be released by calling of_node_put(). But it was not released when of_get_property() failed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> [mpe: Invert the sense of the if so we only need one return path] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30powerpc/64: Use array of paca pointers and allocate pacas individuallyNicholas Piggin
Change the paca array into an array of pointers to pacas. Allocate pacas individually. This allows flexibility in where the PACAs are allocated. Future work will allocate them node-local. Platforms that don't have address limits on PACAs would be able to defer PACA allocations until later in boot rather than allocate all possible ones up-front then freeing unused. This is slightly more overhead (one additional indirection) for cross CPU paca references, but those aren't too common. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-04powerpc: Use pr_warn instead of pr_warningJoe Perches
At some point, pr_warning will be removed so all logging messages use a consistent <prefix>_warn style. Update arch/powerpc/ Miscellanea: o Coalesce formats o Realign arguments o Use %s, __func__ instead of embedded function names o Remove unnecessary line continuations Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Geoff Levand <geoff@infradead.org> [mpe: Rebase due to some %pOF changes.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-26powerpc/powernv: Fix oops on P9 DD1 in cause_ipi()Michael Ellerman
Recently we merged the native xive support for Power9, and then separately some reworks for doorbell IPI support. In isolation both series were OK, but the merged result had a bug in one case. On P9 DD1 we use pnv_p9_dd1_cause_ipi() which tries to use doorbells, and then falls back to the interrupt controller. However the fallback is implemented by calling icp_ops->cause_ipi. But now that xive support is merged we might be using xive, in which case icp_ops is not initialised, it's a xics specific structure. This leads to an oops such as: Unable to handle kernel paging request for data at address 0x00000028 Oops: Kernel access of bad area, sig: 11 [#1] NIP pnv_p9_dd1_cause_ipi+0x74/0xe0 LR smp_muxed_ipi_message_pass+0x54/0x70 To fix it, rather than using icp_ops which might be NULL, have both xics and xive set smp_ops->cause_ipi, and then in the powernv code we save that as ic_cause_ipi before overriding smp_ops->cause_ipi. For paranoia add a WARN_ON() to check if somehow smp_ops->cause_ipi is NULL. Fixes: b866cc2199d6 ("powerpc: Change the doorbell IPI calling convention") Tested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-13powerpc: Change the doorbell IPI calling conventionNicholas Piggin
Change the doorbell callers to know about their msgsnd addressing, rather than have them set a per-cpu target data tag at boot that gets sent to the cause_ipi functions. The data is only used for doorbell IPI functions, no other IPI types, so it makes sense to keep that detail local to doorbell. Have the platform code understand doorbell IPIs, rather than the interrupt controller code understand them. Platform code can look at capabilities it has available and decide which to use. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-10powerpc: Consolidate variants of real-mode MMIOsBenjamin Herrenschmidt
We have all sort of variants of MMIO accessors for the real mode instructions. This creates a clean set of accessors based on Linux normal naming conventions, replacing all occurrences of the old ones in the tree. I have purposefully removed the "out/in" variants in favor of only including __raw variants. Any code using these is already pretty much hand tuned to operate in a very specific environment. I've fixed up the 2 users (only one of them actually needed a barrier in the first place). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06powerpc/xics: Work around limitations of OPAL XICS priority handlingBalbir Singh
The CPPR (Current Processor Priority Register) of a XICS interrupt presentation controller contains a value N, such that only interrupts with a priority "more favoured" than N will be received by the CPU, where "more favoured" means "less than". So if the CPPR has the value 5 then only interrupts with a priority of 0-4 inclusive will be received. In theory the CPPR can support a value of 0 to 255 inclusive. In practice Linux only uses values of 0, 4, 5 and 0xff. Setting the CPPR to 0 rejects all interrupts, setting it to 0xff allows all interrupts. The values 4 and 5 are used to differentiate IPIs from external interrupts. Setting the CPPR to 5 allows IPIs to be received but not external interrupts. The CPPR emulation in the OPAL XICS implementation only directly supports priorities 0 and 0xff. All other priorities are considered equivalent, and mapped to a single priority value internally. This means when using icp-opal we can not allow IPIs but not externals. This breaks Linux's use of priority values when a CPU is hot unplugged. After migrating IRQs away from the CPU that is being offlined, we set the priority to 5, meaning we still want the offline CPU to receive IPIs. But the effect of the OPAL XICS emulation's use of a single priority value is that all interrupts are rejected by the CPU. With the CPU offline, and not receiving IPIs, we may not be able to wake it up to bring it back online. The first part of the fix is in icp_opal_set_cpu_priority(). CPPR values of 0 to 4 inclusive will correctly cause all interrupts to be rejected, so we pass those CPPR values through to OPAL. However if we are called with a CPPR of 5 or greater, the caller is expecting to be able to allow IPIs but not external interrupts. We know this doesn't work, so instead of rejecting all interrupts we choose the opposite which is to allow all interrupts. This is still not correct behaviour, but we know for the only existing caller (xics_migrate_irqs_away()), that it is the better option. The other part of the fix is in xics_migrate_irqs_away(). Instead of setting priority (CPPR) to 0, and then back to 5 before migrating IRQs, we migrate the IRQs before setting the priority back to 5. This should have no effect on an ICP backend with a working set_priority(), and on icp-opal it means we will keep all interrupts blocked until after we've finished doing the IRQ migration. Additionally we wait for 5ms after doing the migration to make sure there are no IRQs in flight. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Cc: stable@vger.kernel.org # v4.8+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Rewrote comments and change log, change delay to 5ms] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09powerpc/powernv: Properly set "host-ipi" on IPIsBenjamin Herrenschmidt
Otherwise KVM will fail to pass them through to the host Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09powerpc/powernv: Fix CPU hotplug to handle waking on HVIBenjamin Herrenschmidt
The IPIs come in as HVI not EE, so we need to test the appropriate SRR1 bits. The encoding is such that it won't have false positives on P7 and P8 so we can just test it like that. We also need to handle the icp-opal variant of the flush. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-17powerpc/icp-opal: Fix missing KVM case and harden replayBenjamin Herrenschmidt
The icp-opal call is missing the code from icp-native to recover interrupts snatched by KVM. Without that, when running KVM, we can get into a situation where an interrupt is lost and the CPU stuck with an elevated CPPR. Also harden replay by always checking the return from opal_int_eoi(). Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-07Merge tag 'powerpc-4.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin) - Use gas sections for arranging exception vectors et. al. - Large set of TM cleanups and selftests (Cyril Bur) - Enable transactional memory (TM) lazily for userspace (Cyril Bur) - Support for XZ compression in the zImage wrapper (Oliver O'Halloran) - Add support for bpf constant blinding (Naveen N. Rao) - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens) Fixes: - Ensure .mem(init|exit).text are within _stext/_etext (Michael Ellerman) - xmon: Don't use ld on 32-bit (Michael Ellerman) - vdso64: Use double word compare on pointers (Anton Blanchard) - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui) - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy) - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V) - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan) - Replay hypervisor maintenance interrupt first (Nicholas Piggin) Various performance optimisations (Anton Blanchard): - Align hot loops of memset() and backwards_memcpy() - During context switch, check before setting mm_cpumask - Remove static branch prediction in atomic{, 64}_add_unless - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian - Set default CPU type to POWER8 for little endian builds Cleanups & features: - Sparse fixes/cleanups (Daniel Axtens) - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras) - Radix MMU fixups for POWER9 (Aneesh Kumar K.V) - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU) (Simon Guo) - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin) - Optimise MSR handling in exception handling (Nicholas Piggin) - Support for kexec with Radix MMU (Benjamin Herrenschmidt) - powernv EEH fixes (Russell Currey) - Suprise PCI hotplug support for powernv (Gavin Shan) - Endian/sparse fixes for powernv PCI (Gavin Shan) - Defconfig updates (Anton Blanchard) - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh) - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat) - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan) - Fix HV facility unavailable to use correct handler (Nicholas Piggin) - Remove unnecessary syscall trampoline (Nicholas Piggin) - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman) - Quieten EEH message when no adapters are found (Anton Blanchard) - powernv: Add PHB register dump debugfs handle (Russell Currey) - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin) - Document the syscall ABI (Nicholas Piggin) - MAINTAINERS: Update cxl maintainers (Michael Neuling) - powerpc: Remove all usages of NO_IRQ (Michael Ellerman) Minor cleanups: - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo" * tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits) powerpc/bpf: Add support for bpf constant blinding powerpc/bpf: Implement support for tail calls powerpc/bpf: Introduce accessors for using the tmp local stack space powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n powerpc: tm: Enable transactional memory (TM) lazily for userspace powerpc/tm: Add TM Unavailable Exception powerpc: Remove do_load_up_transact_{fpu,altivec} powerpc: tm: Rename transct_(*) to ck(\1)_state powerpc: tm: Always use fp_state and vr_state to store live registers selftests/powerpc: Add checks for transactional VSXs in signal contexts selftests/powerpc: Add checks for transactional VMXs in signal contexts selftests/powerpc: Add checks for transactional FPUs in signal contexts selftests/powerpc: Add checks for transactional GPRs in signal contexts selftests/powerpc: Check that signals always get delivered selftests/powerpc: Add TM tcheck helpers in C selftests/powerpc: Allow tests to extend their kill timeout selftests/powerpc: Introduce GPR asm helper header file selftests/powerpc: Move VMX stack frame macros to header file selftests/powerpc: Rework FPU stack placement macros and move to header file selftests/powerpc: Check for VSX preservation across userspace preemption ...
2016-09-20powerpc: Remove all usages of NO_IRQMichael Ellerman
NO_IRQ has been == 0 on powerpc for just over ten years (since commit 0ebfff1491ef ("[POWERPC] Add new interrupt mapping core and change platforms to use it")). It's also 0 on most other arches. Although it's fairly harmless, every now and then it causes confusion when a driver is built on powerpc and another arch which doesn't define NO_IRQ. There's at least 6 definitions of NO_IRQ in drivers/, at least some of which are to work around that problem. So we'd like to remove it. This is fairly trivial in the arch code, we just convert: if (irq == NO_IRQ) to if (!irq) if (irq != NO_IRQ) to if (irq) irq = NO_IRQ; to irq = 0; return NO_IRQ; to return 0; And a few other odd cases as well. At least for now we keep the #define NO_IRQ, because there is driver code that uses NO_IRQ and the fixes to remove those will go via other trees. Note we also change some occurrences in PPC sound drivers, drivers/ps3, and drivers/macintosh. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-06powerpc/xics/opal: Fix processor numbers in OPAL ICPBenjamin Herrenschmidt
When using the OPAL ICP backend we incorrectly pass Linux CPU numbers rather than HW CPU numbers to OPAL. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09powerpc/xics: Properly set Edge/Level type and enable resendBenjamin Herrenschmidt
This sets the type of the interrupt appropriately. We set it as follow: - If not mapped from the device-tree, we use edge. This is the case of the virtual interrupts and PCI MSIs for example. - If mapped from the device-tree and #interrupt-cells is 2 (PAPR compliant), we use the second cell to set the appropriate type - If mapped from the device-tree and #interrupt-cells is 1 (current OPAL on P8 does that), we assume level sensitive since those are typically going to be the PSI LSIs which are level sensitive. Additionally, we mark the interrupts requested via the opal_interrupts property all level. This is a bit fishy but the best we can do until we fix OPAL to properly expose them with a complete descriptor. It is also correct for the current HW anyway as OPAL interrupts are currently PCI error and PSI interrupts which are level. Finally now that edge interrupts are properly identified, we can enable CONFIG_HARDIRQS_SW_RESEND which will make the core re-send them if they occur while masked, which some drivers rely upon. This fixes issues with lost interrupts on some Mellanox adapters. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17powerpc/xics: Add ICP OPAL backendBenjamin Herrenschmidt
This adds a new XICS backend that uses OPAL calls, which can be used when we don't have native support for the platform interrupt controller. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/xics: Add icp_native_cause_ipi_rmSuresh Warrier
Function to cause an IPI by directly updating the MFFR register in the XICS. The function is meant for real-mode callers since they cannot use the smp_ops->cause_ipi function which uses an ioremapped address. Normal usage is for the the KVM real mode code to set the IPI message using smp_muxed_ipi_message_pass and then invoke icp_native_cause_ipi_rm to cause the actual IPI. The function requires kvm_hstate.xics_phys to have been initialized with the physical address of XICS. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2015-09-15powerpc, irq: Use access helper irq_data_get_affinity_mask()Jiang Liu
Use access helper irq_data_get_affinity_mask() so we can move the affinity mask to irq_common_data. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-30genirq/irqdomain: Allow irq domain aliasingMarc Zyngier
It is not uncommon (at least with the ARM stuff) to have a piece of hardware that implements different flavours of "interrupts". A typical example of this is the GICv3 ITS, which implements standard PCI/MSI support, but also some form of "generic MSI". So far, the PCI/MSI domain is registered using the ITS device_node, so that irq_find_host can return it. On the contrary, the raw MSI domain is not registered with an device_node, making it impossible to be looked up by another subsystem (obviously, using the same device_node twice would only result in confusion, as it is not defined which one irq_find_host would return). A solution to this is to "type" domains that may be aliasing, and to be able to lookup an device_node that matches a given type. For this, we introduce irq_find_matching_host() as a superset of irq_find_host: struct irq_domain *irq_find_matching_host(struct device_node *node, enum irq_domain_bus_token bus_token); where bus_token is the "type" we want to match the domain against (so far, only DOMAIN_BUS_ANY is defined). This result in some moderately invasive changes on the PPC side (which is the only user of the .match method). This has otherwise no functionnal change. Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: <linux-arm-kernel@lists.infradead.org> Cc: Yijing Wang <wangyijing@huawei.com> Cc: Ma Jun <majun258@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Duc Dang <dhdang@apm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1438091186-10244-2-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-16treewide: Use helper function to access irq_data->msi_descJiang Liu
Use irq_data access helper to access irq_data->msi_desc, so we can move msi_desc from struct irq_data into struct irq_common_data later. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Chris Metcalf <cmetcalf@ezchip.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-17powerpc: Make doorbell check preemption safeShreyas B. Prabhu
Doorbell can be used to cause ipi on cpus which are sibling threads on the same core. So icp_native_cause_ipi checks if the destination cpu is a sibling thread of the current cpu and uses doorbell in such cases. But while running with CONFIG_PREEMPT=y, since this section is preemtible, we can run into issues if after we check if the destination cpu is a sibling cpu, the task gets migrated from a sibling cpu to a cpu on another core. Fix this by using get_cpu()/ put_cpu() Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02powerpc: Use irq_desc_get_xxx() to avoid redundant lookup of irq_descJiang Liu
Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc while we already have a pointer to corresponding irq_desc. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-11powerpc: Constify irq_domain_opsKrzysztof Kozlowski
The irq_domain_ops are not modified by the driver and the irqdomain core code accepts pointer to a const data. Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-10powerpc: Drop return value of smp_ops->probe()Michael Ellerman
smp_ops->probe() is currently supposed to return the number of cpus in the system. The last actual usage of the value was removed in May 2007 in e147ec8f1808 "[POWERPC] Simplify smp_space_timers". We still passed the value around until June 2010 when even that was finally removed in c1aa687d499a "powerpc: Clean up obsolete code relating to decrementer and timebase". So drop that requirement, probe() now returns void, and update all implementations. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-13powerpc: use %*pb[l] to print bitmaps including cpumasks and nodemasksTejun Heo
printk and friends can now format bitmaps using '%*pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * Spurious if (len > 1) test dropped from shared_cpu_map_show(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-23powerpc: Replace cpumask_weight(cpu_possible_mask) with num_possible_cpus()Emil Medve
num_possible_cpus() is just a shorthand for it. Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-11Merge tag 'powerpc-3.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Some nice cleanups like removing bootmem, and removal of __get_cpu_var(). There is one patch to mm/gup.c. This is the generic GUP implementation, but is only used by us and arm(64). We have an ack from Steve Capper, and although we didn't get an ack from Andrew he told us to take the patch through the powerpc tree. There's one cxl patch. This is in drivers/misc, but Greg said he was happy for us to manage fixes for it. There is an infrastructure patch to support an IPMI driver for OPAL. There is also an RTC driver for OPAL. We weren't able to get any response from the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver. The usual batch of Freescale updates from Scott" * tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits) powerpc/powernv: Return to cpu offline loop when finished in KVM guest powerpc/book3s: Fix partial invalidation of TLBs in MCE code. powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault powerpc/xmon: Cleanup the breakpoint flags powerpc/xmon: Enable HW instruction breakpoint on POWER8 powerpc/mm/thp: Use tlbiel if possible powerpc/mm/thp: Remove code duplication powerpc/mm/hugetlb: Sanity check gigantic hugepage count powerpc/oprofile: Disable pagefaults during user stack read powerpc/mm: Check for matching hpte without taking hpte lock powerpc: Drop useless warning in eeh_init() powerpc/powernv: Cleanup unused MCE definitions/declarations. powerpc/eeh: Dump PHB diag-data early powerpc/eeh: Recover EEH error on ownership change for BCM5719 powerpc/eeh: Set EEH_PE_RESET on PE reset powerpc/eeh: Refactor eeh_reset_pe() powerpc: Remove more traces of bootmem powerpc/pseries: Initialise nvram_pstore_info's buf_lock cxl: Name interrupts in /proc/interrupt cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning ...
2014-11-23PCI/MSI: Rename mask/unmask_msi_irq treewideThomas Gleixner
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage sites. The conversion helper functions are kept around to avoid conflicts in next and will be removed after merging into mainline. Coccinelle assisted conversion. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: x86@kernel.org Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Mohit Kumar <mohit.kumar@st.com> Cc: Simon Horman <horms@verge.net.au> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Yijing Wang <wangyijing@huawei.com>
2014-11-03powerpc: Replace __get_cpu_var usesChristoph Lameter
This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25powerpc/powernv: Don't call generic code on offline cpusPaul Mackerras
On PowerNV platforms, when a CPU is offline, we put it into nap mode. It's possible that the CPU wakes up from nap mode while it is still offline due to a stray IPI. A misdirected device interrupt could also potentially cause it to wake up. In that circumstance, we need to clear the interrupt so that the CPU can go back to nap mode. In the past the clearing of the interrupt was accomplished by briefly enabling interrupts and allowing the normal interrupt handling code (do_IRQ() etc.) to handle the interrupt. This has the problem that this code calls irq_enter() and irq_exit(), which call functions such as account_system_vtime() which use RCU internally. Use of RCU is not permitted on offline CPUs and will trigger errors if RCU checking is enabled. To avoid calling into any generic code which might use RCU, we adopt a different method of clearing interrupts on offline CPUs. Since we are on the PowerNV platform, we know that the system interrupt controller is a XICS being driven directly (i.e. not via hcalls) by the kernel. Hence this adds a new icp_native_flush_interrupt() function to the native-mode XICS driver and arranges to call that when an offline CPU is woken from nap. This new function reads the interrupt from the XICS. If it is an IPI, it clears the IPI; if it is a device interrupt, it prints a warning and disables the source. Then it does the end-of-interrupt processing for the interrupt. The other thing that briefly enabling interrupts did was to check and clear the irq_happened flag in this CPU's PACA. Therefore, after flushing the interrupt from the XICS, we also clear all bits except the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the irq_happened flag. The PACA_IRQ_HARD_DIS flag is set by power7_nap() and is left set to indicate that interrupts are hard disabled. This means we then have to ignore that flag in power7_nap(), which is reasonable since it doesn't indicate that any interrupt event needs servicing. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-06-11powerpc/powernv: Enable POWER8 doorbell IPIsMichael Neuling
This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15powerpc: Delete non-required instances of include <linux/init.h>Paul Gortmaker
None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. The one instance where we add an include for init.h covers off a case where that file was implicitly getting it from another header which itself didn't need it. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-11powerpc/powernv: Fix endian issues in OPAL ICS backendBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14powerpc: Fix some endian issues in xics codeAnton Blanchard
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14powerpc: Add some endian annotations to time and xics codeAnton Blanchard
Fix a couple of sparse warnings. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Fix irq_set_affinity() return valuesAlexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-05Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Gleb Natapov: "Highlights of the updates are: general: - new emulated device API - legacy device assignment is now optional - irqfd interface is more generic and can be shared between arches x86: - VMCS shadow support and other nested VMX improvements - APIC virtualization and Posted Interrupt hardware support - Optimize mmio spte zapping ppc: - BookE: in-kernel MPIC emulation with irqfd support - Book3S: in-kernel XICS emulation (incomplete) - Book3S: HV: migration fixes - BookE: more debug support preparation - BookE: e6500 support ARM: - reworking of Hyp idmaps s390: - ioeventfd for virtio-ccw And many other bug fixes, cleanups and improvements" * tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) kvm: Add compat_ioctl for device control API KVM: x86: Account for failing enable_irq_window for NMI window request KVM: PPC: Book3S: Add API for in-kernel XICS emulation kvm/ppc/mpic: fix missing unlock in set_base_addr() kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write kvm/ppc/mpic: remove users kvm/ppc/mpic: fix mmio region lists when multiple guests used kvm/ppc/mpic: remove default routes from documentation kvm: KVM_CAP_IOMMU only available with device assignment ARM: KVM: iterate over all CPUs for CPU compatibility check KVM: ARM: Fix spelling in error message ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally KVM: ARM: Fix API documentation for ONE_REG encoding ARM: KVM: promote vfp_host pointer to generic host cpu context ARM: KVM: add architecture specific hook for capabilities ARM: KVM: perform HYP initilization for hotplugged CPUs ARM: KVM: switch to a dual-step HYP init code ARM: KVM: rework HYP page table freeing ARM: KVM: enforce maximum size for identity mapped code ARM: KVM: move to a KVM provided HYP idmap ...
2013-04-26KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVMBenjamin Herrenschmidt
Currently, we wake up a CPU by sending a host IPI with smp_send_reschedule() to thread 0 of that core, which will take all threads out of the guest, and cause them to re-evaluate their interrupt status on the way back in. This adds a mechanism to differentiate real host IPIs from IPIs sent by KVM for guest threads to poke each other, in order to target the guest threads precisely when possible and avoid that global switch of the core to host state. We then use this new facility in the in-kernel XICS code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26powerpc/powernv: Patch MSI EOI handler on P8Gavin Shan
The EOI handler of MSI/MSI-X interrupts for P8 (PHB3) need additional steps to handle the P/Q bits in IVE before EOIing the corresponding interrupt. The patch changes the EOI handler to cover that. we have individual IRQ chip in each PHB instance. During the MSI IRQ setup time, the IRQ chip is copied over from the original one for that IRQ, and the EOI handler is patched with the one that will handle the P/Q bits (As Ben suggested). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-08powerpc: fix ics_rtas_init and start_secondary section mismatchDaniel Borkmann
It seems, we're fine with just annotating the two functions. Thus, this fixes the following build warnings on ppc64: WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1664): The function .ics_rtas_init() references the function __init .xics_register_ics(). This is often because .ics_rtas_init lacks a __init annotation or the annotation of .xics_register_ics is wrong. WARNING: arch/powerpc/sysdev/built-in.o(.text+0x6044): The function .ics_rtas_init() references the function __init .xics_register_ics(). This is often because .ics_rtas_init lacks a __init annotation or the annotation of .xics_register_ics is wrong. WARNING: arch/powerpc/kernel/built-in.o(.text+0x2db30): The function .start_secondary() references the function __cpuinit .vdso_getcpu_init(). This is often because .start_secondary lacks a __cpuinit annotation or the annotation of .vdso_getcpu_init is wrong. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Make sure IPI handlers see data written by IPI sendersPaul Mackerras
We have been observing hangs, both of KVM guest vcpu tasks and more generally, where a process that is woken doesn't properly wake up and continue to run, but instead sticks in TASK_WAKING state. This happens because the update of rq->wake_list in ttwu_queue_remote() is not ordered with the update of ipi_message in smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in scheduler_ipi() is not ordered with the reading of ipi_message in smp_ipi_demux(). Thus it is possible for the IPI receiver not to see the updated rq->wake_list and therefore conclude that there is nothing for it to do. In order to make sure that anything done before smp_send_reschedule() is ordered before anything done in the resulting call to scheduler_ipi(), this adds barriers in smp_muxed_message_pass() and smp_ipi_demux(). The barrier in smp_muxed_message_pass() is a full barrier to ensure that there is a full ordering between the smp_send_reschedule() caller and scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than xchg_local() because xchg() includes release and acquire barriers. Using xchg() rather than xchg_local() makes sense given that ipi_message is not just accessed locally. This moves the barrier between setting the message and calling the cause_ipi() function into the individual cause_ipi implementations. Most of them -- those that used outb, out_8 or similar -- already had a full barrier because out_8 etc. include a sync before the MMIO store. This adds an explicit barrier in the two remaining cases. These changes made no measurable difference to the speed of IPIs as measured using a simple ping-pong latency test across two CPUs on different cores of a POWER7 machine. The analysis of the reason why processes were not waking up properly is due to Milton Miller. Cc: stable@vger.kernel.org # v3.0+ Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11irqdomain: Eliminate dedicated radix lookup functionsGrant Likely
In preparation to remove the slow revmap path, eliminate the public radix revmap lookup functions. This simplifies the code and makes the slowpath removal patch a lot simpler. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <rob.herring@calxeda.com>
2012-07-11irqdomain: Always update revmap when setting up a virqGrant Likely
At irq_setup_virq() time all of the data needed to update the reverse map is available, but the current code ignores it and relies upon the slow path to insert revmap records. This patch adds revmap updating to the setup path so the slow path will no longer be necessary. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <rob.herring@calxeda.com>