diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-08-17 11:32:50 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-08-17 11:32:50 -0700 |
commit | 5e2d059b52e397d9ac42f4c4d9d9a841887b5818 (patch) | |
tree | c8cd8fd7187113be33e29fcc75f45a8bbc27e6b2 /Documentation | |
parent | d190775206d06397a9309421cac5ba2f2c243521 (diff) | |
parent | a2dc009afa9ae8b92305be7728676562a104cb40 (diff) |
Merge tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- A fix for a bug in our page table fragment allocator, where a page
table page could be freed and reallocated for something else while
still in use, leading to memory corruption etc. The fix reuses
pt_mm in struct page (x86 only) for a powerpc only refcount.
- Fixes to our pkey support. Several are user-visible changes, but
bring us in to line with x86 behaviour and/or fix outright bugs.
Thanks to Florian Weimer for reporting many of these.
- A series to improve the hvc driver & related OPAL console code,
which have been seen to cause hardlockups at times. The hvc driver
changes in particular have been in linux-next for ~month.
- Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y.
- Remove Power8 DD1 and Power9 DD1 support, neither chip should be in
use anywhere other than as a paper weight.
- An optimised memcmp implementation using Power7-or-later VMX
instructions
- Support for barrier_nospec on some NXP CPUs.
- Support for flushing the count cache on context switch on some IBM
CPUs (controlled by firmware), as a Spectre v2 mitigation.
- A series to enhance the information we print on unhandled signals
to bring it into line with other arches, including showing the
offending VMA and dumping the instructions around the fault.
Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey
Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan,
Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski,
Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng,
Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph
Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave
Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer,
Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand,
Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel
Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh
Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues,
Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha,
Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul
Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza
Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood,
Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago
Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat
Rao, zhong jiang"
* tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits)
powerpc/mm/book3s/radix: Add mapping statistics
powerpc/uaccess: Enable get_user(u64, *p) on 32-bit
powerpc/mm/hash: Remove unnecessary do { } while(0) loop
powerpc/64s: move machine check SLB flushing to mm/slb.c
powerpc/powernv/idle: Fix build error
powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range
powerpc/mm: remove warning about ‘type’ being set
powerpc/32: Include setup.h header file to fix warnings
powerpc: Move `path` variable inside DEBUG_PROM
powerpc/powermac: Make some functions static
powerpc/powermac: Remove variable x that's never read
cxl: remove a dead branch
powerpc/powermac: Add missing include of header pmac.h
powerpc/kexec: Use common error handling code in setup_new_fdt()
powerpc/xmon: Add address lookup for percpu symbols
powerpc/mm: remove huge_pte_offset_and_shift() prototype
powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled
powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
powerpc/fadump: handle crash memory ranges array index overflow
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/ppc-memtrace | 9 | ||||
-rw-r--r-- | Documentation/admin-guide/kernel-parameters.txt | 4 | ||||
-rw-r--r-- | Documentation/hwmon/ibmpowernv | 43 | ||||
-rw-r--r-- | Documentation/powerpc/DAWR-POWER9.txt | 58 | ||||
-rw-r--r-- | Documentation/powerpc/transactional_memory.txt | 44 |
5 files changed, 152 insertions, 6 deletions
diff --git a/Documentation/ABI/testing/ppc-memtrace b/Documentation/ABI/testing/ppc-memtrace index 2e8b93741270..9606aed33137 100644 --- a/Documentation/ABI/testing/ppc-memtrace +++ b/Documentation/ABI/testing/ppc-memtrace @@ -13,10 +13,11 @@ Contact: linuxppc-dev@lists.ozlabs.org Description: Write an integer containing the size in bytes of the memory you want removed from each NUMA node to this file - it must be aligned to the memblock size. This amount of RAM will be removed - from the kernel mappings and the following debugfs files will be - created. This can only be successfully done once per boot. Once - memory is successfully removed from each node, the following - files are created. + from each NUMA node in the kernel mappings and the following + debugfs files will be created. Once memory is successfully + removed from each node, the following files are created. To + re-add memory to the kernel, echo 0 into this file (it will be + automatically onlined). What: /sys/kernel/debug/powerpc/memtrace/<node-id> Date: Aug 2017 diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index adafe47ac376..d60b169bb54b 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2784,6 +2784,10 @@ nosmt=force: Force disable SMT, cannot be undone via the sysfs control file. + nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds + check bypass). With this option data leaks are possible + in the system. + nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 (indirect branch prediction) vulnerability. System may allow data leaks with this option, which is equivalent diff --git a/Documentation/hwmon/ibmpowernv b/Documentation/hwmon/ibmpowernv index 8826ba29db36..56468258711f 100644 --- a/Documentation/hwmon/ibmpowernv +++ b/Documentation/hwmon/ibmpowernv @@ -33,9 +33,48 @@ fanX_input Measured RPM value. fanX_min Threshold RPM for alert generation. fanX_fault 0: No fail condition 1: Failing fan + tempX_input Measured ambient temperature. tempX_max Threshold ambient temperature for alert generation. -inX_input Measured power supply voltage +tempX_highest Historical maximum temperature +tempX_lowest Historical minimum temperature +tempX_enable Enable/disable all temperature sensors belonging to the + sub-group. In POWER9, this attribute corresponds to + each OCC. Using this attribute each OCC can be asked to + disable/enable all of its temperature sensors. + 1: Enable + 0: Disable + +inX_input Measured power supply voltage (millivolt) inX_fault 0: No fail condition. 1: Failing power supply. -power1_input System power consumption (microWatt) +inX_highest Historical maximum voltage +inX_lowest Historical minimum voltage +inX_enable Enable/disable all voltage sensors belonging to the + sub-group. In POWER9, this attribute corresponds to + each OCC. Using this attribute each OCC can be asked to + disable/enable all of its voltage sensors. + 1: Enable + 0: Disable + +powerX_input Power consumption (microWatt) +powerX_input_highest Historical maximum power +powerX_input_lowest Historical minimum power +powerX_enable Enable/disable all power sensors belonging to the + sub-group. In POWER9, this attribute corresponds to + each OCC. Using this attribute each OCC can be asked to + disable/enable all of its power sensors. + 1: Enable + 0: Disable + +currX_input Measured current (milliampere) +currX_highest Historical maximum current +currX_lowest Historical minimum current +currX_enable Enable/disable all current sensors belonging to the + sub-group. In POWER9, this attribute corresponds to + each OCC. Using this attribute each OCC can be asked to + disable/enable all of its current sensors. + 1: Enable + 0: Disable + +energyX_input Cumulative energy (microJoule) diff --git a/Documentation/powerpc/DAWR-POWER9.txt b/Documentation/powerpc/DAWR-POWER9.txt new file mode 100644 index 000000000000..2feaa6619658 --- /dev/null +++ b/Documentation/powerpc/DAWR-POWER9.txt @@ -0,0 +1,58 @@ +DAWR issues on POWER9 +============================ + +On POWER9 the DAWR can cause a checkstop if it points to cache +inhibited (CI) memory. Currently Linux has no way to disinguish CI +memory when configuring the DAWR, so (for now) the DAWR is disabled by +this commit: + + commit 9654153158d3e0684a1bdb76dbababdb7111d5a0 + Author: Michael Neuling <mikey@neuling.org> + Date: Tue Mar 27 15:37:24 2018 +1100 + powerpc: Disable DAWR in the base POWER9 CPU features + +Technical Details: +============================ + +DAWR has 6 different ways of being set. +1) ptrace +2) h_set_mode(DAWR) +3) h_set_dabr() +4) kvmppc_set_one_reg() +5) xmon + +For ptrace, we now advertise zero breakpoints on POWER9 via the +PPC_PTRACE_GETHWDBGINFO call. This results in GDB falling back to +software emulation of the watchpoint (which is slow). + +h_set_mode(DAWR) and h_set_dabr() will now return an error to the +guest on a POWER9 host. Current Linux guests ignore this error, so +they will silently not get the DAWR. + +kvmppc_set_one_reg() will store the value in the vcpu but won't +actually set it on POWER9 hardware. This is done so we don't break +migration from POWER8 to POWER9, at the cost of silently losing the +DAWR on the migration. + +For xmon, the 'bd' command will return an error on P9. + +Consequences for users +============================ + +For GDB watchpoints (ie 'watch' command) on POWER9 bare metal , GDB +will accept the command. Unfortunately since there is no hardware +support for the watchpoint, GDB will software emulate the watchpoint +making it run very slowly. + +The same will also be true for any guests started on a POWER9 +host. The watchpoint will fail and GDB will fall back to software +emulation. + +If a guest is started on a POWER8 host, GDB will accept the watchpoint +and configure the hardware to use the DAWR. This will run at full +speed since it can use the hardware emulation. Unfortunately if this +guest is migrated to a POWER9 host, the watchpoint will be lost on the +POWER9. Loads and stores to the watchpoint locations will not be +trapped in GDB. The watchpoint is remembered, so if the guest is +migrated back to the POWER8 host, it will start working again. + diff --git a/Documentation/powerpc/transactional_memory.txt b/Documentation/powerpc/transactional_memory.txt index e32fdbb4c9a7..52c023e14f26 100644 --- a/Documentation/powerpc/transactional_memory.txt +++ b/Documentation/powerpc/transactional_memory.txt @@ -198,3 +198,47 @@ presented). The transaction cannot then be continued and will take the failure handler route. Furthermore, the transactional 2nd register state will be inaccessible. GDB can currently be used on programs using TM, but not sensibly in parts within transactions. + +POWER9 +====== + +TM on POWER9 has issues with storing the complete register state. This +is described in this commit: + + commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7 + Author: Paul Mackerras <paulus@ozlabs.org> + Date: Wed Mar 21 21:32:01 2018 +1100 + KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 + +To account for this different POWER9 chips have TM enabled in +different ways. + +On POWER9N DD2.01 and below, TM is disabled. ie +HWCAP2[PPC_FEATURE2_HTM] is not set. + +On POWER9N DD2.1 TM is configured by firmware to always abort a +transaction when tm suspend occurs. So tsuspend will cause a +transaction to be aborted and rolled back. Kernel exceptions will also +cause the transaction to be aborted and rolled back and the exception +will not occur. If userspace constructs a sigcontext that enables TM +suspend, the sigcontext will be rejected by the kernel. This mode is +advertised to users with HWCAP2[PPC_FEATURE2_HTM_NO_SUSPEND] set. +HWCAP2[PPC_FEATURE2_HTM] is not set in this mode. + +On POWER9N DD2.2 and above, KVM and POWERVM emulate TM for guests (as +described in commit 4bb3c7a0208f), hence TM is enabled for guests +ie. HWCAP2[PPC_FEATURE2_HTM] is set for guest userspace. Guests that +makes heavy use of TM suspend (tsuspend or kernel suspend) will result +in traps into the hypervisor and hence will suffer a performance +degradation. Host userspace has TM disabled +ie. HWCAP2[PPC_FEATURE2_HTM] is not set. (although we make enable it +at some point in the future if we bring the emulation into host +userspace context switching). + +POWER9C DD1.2 and above are only available with POWERVM and hence +Linux only runs as a guest. On these systems TM is emulated like on +POWER9N DD2.2. + +Guest migration from POWER8 to POWER9 will work with POWER9N DD2.2 and +POWER9C DD1.2. Since earlier POWER9 processors don't support TM +emulation, migration from POWER8 to POWER9 is not supported there. |