summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2021-06-21powerpc/lib/code-patching: Set up Strict RWX patching earlierJordan Niethe
setup_text_poke_area() is a late init call so it runs before mark_rodata_ro() and after the init calls. This lets all the init code patching simply write to their locations. In the future, kprobes is going to allocate its instruction pages RO which means they will need setup_text__poke_area() to have been already called for their code patching. However, init_kprobes() (which allocates and patches some instruction pages) is an early init call so it happens before setup_text__poke_area(). start_kernel() calls poking_init() before any of the init calls. On powerpc, poking_init() is currently a nop. setup_text_poke_area() relies on kernel virtual memory, cpu hotplug and per_cpu_areas being setup. setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called before poking_init(). Turn setup_text_poke_area() into poking_init(). Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Russell Currey <ruscur@russell.cc> [mpe: Fold in missing prototype for poking_init() from lkp] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-3-jniethe5@gmail.com
2021-06-21powerpc/mm: Implement set_memory() routinesRussell Currey
The set_memory_{ro/rw/nx/x}() functions are required for STRICT_MODULE_RWX, and are generally useful primitives to have. This implementation is designed to be generic across powerpc's many MMUs. It's possible that this could be optimised to be faster for specific MMUs. This implementation does not handle cases where the caller is attempting to change the mapping of the page it is executing from, or if another CPU is concurrently using the page being altered. These cases likely shouldn't happen, but a more complex implementation with MMU-specific code could safely handle them. On hash, the linear mapping is not kept in the linux pagetable, so this will not change the protection if used on that range. Currently these functions are not used on the linear map so just WARN for now. apply_to_existing_page_range() does not work on huge pages so for now disallow changing the protection of huge pages. [jpn: - Allow set memory functions to be used without Strict RWX - Hash: Disallow certain regions - Have change_page_attr() take function pointers to manipulate ptes - Radix: Add ptesync after set_pte_at()] Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210609013431.9805-2-jniethe5@gmail.com
2021-06-21powerpc/pesries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICSNicholas Piggin
This allows the hypervisor / firmware to describe this workarounds to the guest. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-4-npiggin@gmail.com
2021-06-21powerpc/security: Add a security feature for STF barrierNicholas Piggin
Rather than tying this mitigation to RFI L1D flush requirement, add a new bit for it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-3-npiggin@gmail.com
2021-06-21powerpc/pseries: Get entry and uaccess flush required bits from ↵Nicholas Piggin
H_GET_CPU_CHARACTERISTICS This allows the hypervisor / firmware to describe these workarounds to the guest. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-2-npiggin@gmail.com
2021-06-21powerpc/boot: add zImage.lds to targetsNicholas Piggin
This prevents spurious rebuilds of the lds and then wrappers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210611111104.1058991-1-npiggin@gmail.com
2021-06-21powerpc/powernv: Fix machine check reporting of async store errorsNicholas Piggin
POWER9 and POWER10 asynchronous machine checks due to stores have their cause reported in SRR1 but SRR1[42] is set, which in other cases indicates DSISR cause. Check for these cases and clear SRR1[42], so the cause matching uses the i-side (SRR1) table. Fixes: 7b9f71f974a1 ("powerpc/64s: POWER9 machine check handler") Fixes: 201220bb0e8c ("powerpc/powernv: Machine check handler for POWER10") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210517140355.2325406-1-npiggin@gmail.com
2021-06-20powerpc/pseries/vas: Setup IRQ and fault handlingHaren Myneni
NX generates an interrupt when sees a fault on the user space buffer and the hypervisor forwards that interrupt to OS. Then the kernel handles the interrupt by issuing H_GET_NX_FAULT hcall to retrieve the fault CRB information. This patch also adds changes to setup and free IRQ per each window and also handles the fault by updating the CSB. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b8fc66dcb783d06a099a303e5cfc69087bb3357a.camel@linux.ibm.com
2021-06-20powerpc/pseries/vas: Integrate API with open/close windowsHaren Myneni
This patch adds VAS window allocatioa/close with the corresponding hcalls. Also changes to integrate with the existing user space VAS API and provide register/unregister functions to NX pseries driver. The driver register function is used to create the user space interface (/dev/crypto/nx-gzip) and unregister to remove this entry. The user space process opens this device node and makes an ioctl to allocate VAS window. The close interface is used to deallocate window. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e8d956bace3f182c4d2e66e343ff37cb0391d1fd.camel@linux.ibm.com
2021-06-20powerpc/pseries/vas: Implement getting capabilities from hypervisorHaren Myneni
The hypervisor provides VAS capabilities for GZIP default and QoS features. These capabilities gives information for the specific features such as total number of credits available in LPAR, maximum credits allowed per window, maximum credits allowed in LPAR, whether usermode copy/paste is supported, and etc. This patch adds the following: - Retrieve all features that are provided by hypervisor using H_QUERY_VAS_CAPABILITIES hcall with 0 as feature type. - Retrieve capabilities for the specific feature using the same hcall and the feature type (1 for QoS and 2 for default type). Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/177c88608cb88f7b87d1c546103f18cec6c056b4.camel@linux.ibm.com
2021-06-20powerpc/pseries/vas: Add hcall wrappers for VAS handlingHaren Myneni
This patch adds the following hcall wrapper functions to allocate, modify and deallocate VAS windows, and retrieve VAS capabilities. H_ALLOCATE_VAS_WINDOW: Allocate VAS window H_DEALLOCATE_VAS_WINDOW: Close VAS window H_MODIFY_VAS_WINDOW: Setup window before using H_QUERY_VAS_CAPABILITIES: Get VAS capabilities Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/40fb02a4d56ca4e240b074a15029082055be5997.camel@linux.ibm.com
2021-06-20powerpc/vas: Define QoS credit flag to allocate windowHaren Myneni
PowerVM introduces two different type of credits: Default and Quality of service (QoS). The total number of default credits available on each LPAR depends on CPU resources configured. But these credits can be shared or over-committed across LPARs in shared mode which can result in paste command failure (RMA_busy). To avoid NX HW contention, the hypervisor ntroduces QoS credit type which makes sure guaranteed access to NX esources. The system admins can assign QoS credits or each LPAR via HMC. Default credit type is used to allocate a VAS window by default as on PowerVM implementation. But the process can pass VAS_TX_WIN_FLAG_QOS_CREDIT flag with VAS_TX_WIN_OPEN ioctl to open QoS type window. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aa950b7b8e8077364267720274a7b9ec34e76e73.camel@linux.ibm.com
2021-06-20powerpc/pseries/vas: Define VAS/NXGZIP hcalls and structsHaren Myneni
This patch adds hcalls and other definitions. Also define structs that are used in VAS implementation on PowerVM. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b4b8c594c27ee4aa6be9dc6dc4ee7331571cbbe8.camel@linux.ibm.com
2021-06-20powerpc/vas: Define and use common vas_window structHaren Myneni
Many elements in vas_struct are used on PowerNV and PowerVM platforms. vas_window is used for both TX and RX windows on PowerNV and for TX windows on PowerVM. So some elements are specific to these platforms. So this patch defines common vas_window and platform specific window structs (pnv_vas_window on PowerNV). Also adds the corresponding changes in PowerNV vas code. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1698c35c158dfe52c6d2166667823d3d4a463353.camel@linux.ibm.com
2021-06-20powerpc/vas: Move update_csb/dump_crb to common book3s platformHaren Myneni
If a coprocessor encounters an error translating an address, the VAS will cause an interrupt in the host. The kernel processes the fault by updating CSB. This functionality is same for both powerNV and pseries. So this patch moves these functions to common vas-api.c and the actual functionality is not changed. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bf8d5b0770fa1ef5cba88c96580caa08d999d3b5.camel@linux.ibm.com
2021-06-20powerpc/vas: Create take/drop pid and mm reference functionsHaren Myneni
Take pid and mm references when each window opens and drops during close. This functionality is needed for powerNV and pseries. So this patch defines the existing code as functions in common book3s platform vas-api.c Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2fa40df962250a737c804e58202924717b39e381.camel@linux.ibm.com
2021-06-20powerpc/vas: Add platform specific user window operationsHaren Myneni
PowerNV uses registers to open/close VAS windows, and getting the paste address. Whereas the hypervisor calls are used on PowerVM. This patch adds the platform specific user space window operations and register with the common VAS user space interface. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f85091f4ace67f951ac04d60394d67b21e2f5d3c.camel@linux.ibm.com
2021-06-20powerpc/powernv/vas: Rename register/unregister functionsHaren Myneni
powerNV and pseries drivers register / unregister to the corresponding platform specific VAS separately. Then these VAS functions call the common API with the specific window operations. So rename powerNV VAS API register/unregister functions. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9db00d58dbdcb7cfc07a1df95f3d2a9e3e5d746a.camel@linux.ibm.com
2021-06-20powerpc/vas: Move VAS API to book3s common platformHaren Myneni
The pseries platform will share vas and nx code and interfaces with the PowerNV platform, so create the arch/powerpc/platforms/book3s/ directory and move VAS API code there. Functionality is not changed. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e05c8db17b9eabe3545b902d034238e4c6c08180.camel@linux.ibm.com
2021-06-20powerpc/powernv/vas: Release reference to tgid during window closeHaren Myneni
The kernel handles the NX fault by updating CSB or sending signal to process. In multithread applications, children can open VAS windows and can exit without closing them. But the parent can continue to send NX requests with these windows. To prevent pid reuse, reference will be taken on pid and tgid when the window is opened and release them during window close. The current code is not releasing the tgid reference which can cause pid leak and this patch fixes the issue. Fixes: db1c08a740635 ("powerpc/vas: Take reference to PID and mm for user space windows") Cc: stable@vger.kernel.org # 5.8+ Reported-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6020fc4d444864fe20f7dcdc5edfe53e67480a1c.camel@linux.ibm.com
2021-06-17Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge some powerpc KVM patches from our topic branch. In particular this brings in Nick's big series rewriting parts of the guest entry/exit path in C. Conflicts: arch/powerpc/kernel/security.c arch/powerpc/kvm/book3s_hv_rmhandlers.S
2021-06-17powerpc/mm/book3s64: Fix possible build errorAneesh Kumar K.V
Update _tlbiel_pid() such that we can avoid build errors like below when using this function in other places. arch/powerpc/mm/book3s64/radix_tlb.c: In function ‘__radix__flush_tlb_range_psize’: arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably does not match constraints 114 | asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) | ^~~ arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in ‘asm’ make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] Error 1 m With this fix, we can also drop the __always_inline in __radix_flush_tlb_range_psize which was added by commit e12d6d7d46a6 ("powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210610083639.387365-1-aneesh.kumar@linux.ibm.com
2021-06-17powerpc/signal64: Don't read sigaction arguments back from user memoryMichael Ellerman
When delivering a signal to a sigaction style handler (SA_SIGINFO), we pass pointers to the siginfo and ucontext via r4 and r5. Currently we populate the values in those registers by reading the pointers out of the sigframe in user memory, even though the values in user memory were written by the kernel just prior: unsafe_put_user(&frame->info, &frame->pinfo, badframe_block); unsafe_put_user(&frame->uc, &frame->puc, badframe_block); ... if (ksig->ka.sa.sa_flags & SA_SIGINFO) { err |= get_user(regs->gpr[4], (unsigned long __user *)&frame->pinfo); err |= get_user(regs->gpr[5], (unsigned long __user *)&frame->puc); ie. we write &frame->info into frame->pinfo, and then read frame->pinfo back into r4, and similarly for &frame->uc. The code has always been like this, since linux-fullhistory commit d4f2d95eca2c ("Forward port of 2.4 ppc64 signal changes."). There's no reason for us to read the values back from user memory, rather than just setting the value in the gpr[4/5] directly. In fact reading the value back from user memory opens up the possibility of another user thread changing the values before we read them back. Although any process doing that would be racing against the kernel delivering the signal, and would risk corrupting the stack, so that would be a userspace bug. Note that this is 64-bit only code, so there's no subtlety with the size of pointers differing between kernel and user. Also the frame variable is not modified to point elsewhere during the function. In the past reading the values back from user memory was not costly, but now that we have KUAP on some CPUs it is, so we'd rather avoid it for that reason too. So change the code to just set the values directly, using the same values we have written to the sigframe previously in the function. Note also that this matches what our 32-bit signal code does. Using a version of will-it-scale's signal1_threads that sets SA_SIGINFO, this results in a ~4% increase in signals per second on a Power9, from 229,777 to 239,766. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210610072949.3198522-1-mpe@ellerman.id.au
2021-06-17powerpc/watchdog: include linux/processor.h for spin_until_condSudeep Holla
This implementation uses spin_until_cond in wd_smp_lock including neither linux/processor.h nor asm/processor.h This patch includes linux/processor.h here for spin_until_cond usage. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5e8d2d50f301a346040362028c2ecba40685de9e.1623438544.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/64: drop redundant defination of spin_until_condSudeep Holla
linux/processor.h has exactly same defination for spin_until_cond. Drop the redundant defination in asm/processor.h Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1fff2054e5dfc00329804dbd3f2a91667c9a8aff.1623438544.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/signal32: Remove impossible #ifdef combinationsChristophe Leroy
PPC_TRANSACTIONAL_MEM is only on book3s/64 SPE is only on booke PPC_TRANSACTIONAL_MEM selects ALTIVEC and VSX Therefore, within PPC_TRANSACTIONAL_MEM sections, ALTIVEC and VSX are always defined while SPE never is. Remove all SPE code and all #ifdef ALTIVEC and VSX in tm functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a069a348ee3c2fe3123a5a93695c2b35dc42cb40.1623340691.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32: Display modules range in virtual memory layoutChristophe Leroy
book3s/32 and 8xx don't use vmalloc for modules. Print the modules area at startup as part of the virtual memory layout: [ 0.000000] Kernel virtual memory layout: [ 0.000000] * 0xffafc000..0xffffc000 : fixmap [ 0.000000] * 0xc9000000..0xffafc000 : vmalloc & ioremap [ 0.000000] * 0xb0000000..0xc0000000 : modules [ 0.000000] Memory: 118480K/131072K available (7152K kernel code, 2320K rwdata, 1328K rodata, 368K init, 854K bss, 12592K reserved, 0K cma-reserved) Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/98394503e92d6fd6d8f657e0b263b32f21cf2790.1623438478.git.christophe.leroy@csgroup.eu
2021-06-17powerpc: make stack walking KASAN-safeDaniel Axtens
Make our stack-walking code KASAN-safe by using __no_sanitize_address. Generic code, arm64, s390 and x86 all make accesses unchecked for similar sorts of reasons: when unwinding a stack, we might touch memory that KASAN has marked as being out-of-bounds. In ppc64 KASAN development, I hit this sometimes when checking for an exception frame - because we're checking an arbitrary offset into the stack frame. See commit 20955746320e ("s390/kasan: avoid false positives during stack unwind"), commit bcaf669b4bdb ("arm64: disable kasan when accessing frame->fp in unwind_frame"), commit 91e08ab0c851 ("x86/dumpstack: Prevent KASAN false positive warnings") and commit 6e22c8366416 ("tracing, kasan: Silence Kasan warning in check_stack of stack_tracer"). Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210614120907.1952321-1-dja@axtens.net
2021-06-17powerpc: Move update_power8_hid0() into its only userChristophe Leroy
update_power8_hid0() is used only by powernv platform subcore.c Move it there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/37f41d74faa0c66f90b373e243e8b1ee37a1f6fa.1623219019.git.christophe.leroy@csgroup.eu
2021-06-17powerpc: Remove proc_trap()Christophe Leroy
proc_trap() has never been used, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/827944ea12d470c2f862635f48b5ee6c1520351f.1623217909.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32: Remove __main()Christophe Leroy
Comment says that __main() is there to make GCC happy. It's been there since the implementation of ppc arch in Linux 1.3.45. ppc32 is the only architecture having that. Even ppc64 doesn't have it. Seems like GCC is still happy without it. Drop it for good. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d01028f8166b98584eec536b52f14c5e3f98ff6b.1623172922.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Rename PTE_SIZE to PTE_T_SIZEChristophe Leroy
PTE_SIZE means PTE page table size in most placed, whereas in hash_low.S in means size of one entry in the table. Rename it PTE_T_SIZE, and define it directly in hash_low.S instead of going through asm-offsets. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/83a008a9fd6cc3f2bbcb470f592555d260ed7a3d.1623063174.git.christophe.leroy@csgroup.eu
2021-06-17powerpc: Define swapper_pg_dir[] in CChristophe Leroy
Don't duplicate swapper_pg_dir[] in each platform's head.S Define it in mm/pgtable.c Define MAX_PTRS_PER_PGD because on book3s/64 PTRS_PER_PGD is not a constant. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5e3f1b8a4695c33ccc80aa3870e016bef32b85e1.1623063174.git.christophe.leroy@csgroup.eu
2021-06-17powerpc: Define empty_zero_page[] in CChristophe Leroy
At the time being, empty_zero_page[] is defined in each platform head.S. Define it in mm/mem.c instead, and put it in BSS section instead of the DATA section. Commit 5227cfa71f9e ("arm64: mm: place empty_zero_page in bss") explains why it is interesting to have it in BSS. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5838caffa269e0957c5a50cc85477876220298b0.1623063174.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Remove DEBUG_HARDERChristophe Leroy
DEBUG_HARDER is not user selectable. Remove it together with related messages. Also remove two pr_devel() messages that should likely have been pr_hard(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0f25109b0e12fdd1e6541dedbb2212cc53526a57.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Remove DEBUG_CLAMP_LAST_CONTEXTChristophe Leroy
DEBUG_CLAMP_LAST_CONTEXT was there in the old days to reduce number of contexts in order to ease debugging implementation of context switching, but that's been quite stable during years now. As it is not user selectable, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/da81837b452e8b9f1657b529b9c3050dc10b9770.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Remove DEBUG_MAP_CONSISTENCYChristophe Leroy
mmu_context handling has been there for years, so we would know if there was problems with maps. DEBUG_MAP_CONSISTENCY is not user selectable, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6fe2b88956db53f8d6ee221525b2c5dc6aec82c6.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Remove CONFIG_SMP #ifdefery in mmu_context.hChristophe Leroy
Everything can be done even when CONFIG_SMP is not selected. Just use IS_ENABLED() where relevant and rely on GCC to opt out unneeded code and variables when CONFIG_SMP is not set. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cc13b87b0f750a538621876ecc24c22a07e7c8be.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Convert set_context() to CChristophe Leroy
ppc8xx already has set_context() in C. Other ones have it in assembly. The only thing it does is to write the context id into SPRN_PID. Do it in C. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a5d0759064f3831c6b88af49ef5d3b05ba1c4dad.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/nohash: Refactor update of BDI2000 pointers in switch_mmu_context()Christophe Leroy
Instead of duplicating the update of BDI2000 pointers in set_context(), do it directly from switch_mmu_context(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4c54997edd3548fa54717915e7c6ebaf60f208c0.1622712515.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/kuap: Force inlining of all first level KUAP helpers.Christophe Leroy
All KUAP helpers defined in asm/kup.h are single line functions that should be inlined. But on book3s/32 build, we get many instances of <prevent_write_to_user.constprop.0>. Force inlining of those helpers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8479a862e165a57a855292d47e24c259a578f5a0.1622711627.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/kuap: Remove to/from/size parameters of prevent_user_access()Christophe Leroy
prevent_user_access() doesn't use anymore to/from/size parameters. Remove them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b7113662fd2c26e4c33e9d705de324bd3860822e.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/kuap: Remove KUAP_CURRENT_XXXChristophe Leroy
book3s/32 was the only user of KUAP_CURRENT_XXX. After rework of book3s/32 KUAP, it is not used anymore. Remove them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/549214ecf6887d965645e664520d4886663c5ffb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Activate KUAP and KUEP by defaultChristophe Leroy
Now that KUAP and KUEP have been significantly optimised and can be disabled at boot time using 'nosmap' and 'nosmep' kernel parameters, them can be active by default like in other powerpc platforms. It is still possible to disable them completely in the configuration. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/86c7c74a3ba5312daea7e9658b096e2bcc6f4b64.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Rework Kernel Userspace Access ProtectionChristophe Leroy
On book3s/32, KUAP is provided by toggling Ks bit in segment registers. One segment register addresses 256M of virtual memory. At the time being, KUAP implements a complex logic to apply the unlock/lock on the exact number of segments covering the user range to access, with saving the boundaries of the range of segments in a member of thread struct. But most if not all user accesses are within a single segment. Rework KUAP with a different approach: - Open only one segment, the one corresponding to the starting address of the range to be accessed. - If a second segment is involved, it will generate a page fault. The segment will then be open by the page fault handler. The kuap member of thread struct will now contain: - The start address of the current on going user access, that will be used to know which segment to lock at the end of the user access. - ~0 when no user access is open - ~1 when additionnal segments are opened by a page fault. Then, at lock time - When only one segment is open, close it. - When several segments are open, close all user segments. Almost 100% of the time, only one segment will be involved. In interrupts, inline the function that unlock/lock all segments, because not inlining them implies a lot of register save/restore. With the patch, writing value 128 in userspace in perf_copy_attr() is done with 16 instructions: 3890: 93 82 04 dc stw r28,1244(r2) 3894: 7d 20 e5 26 mfsrin r9,r28 3898: 55 29 00 80 rlwinm r9,r9,0,2,0 389c: 7d 20 e1 e4 mtsrin r9,r28 38a0: 4c 00 01 2c isync 38a4: 39 20 00 80 li r9,128 38a8: 91 3c 00 00 stw r9,0(r28) 38ac: 81 42 04 dc lwz r10,1244(r2) 38b0: 39 00 ff ff li r8,-1 38b4: 91 02 04 dc stw r8,1244(r2) 38b8: 2c 0a ff fe cmpwi r10,-2 38bc: 41 82 00 88 beq 3944 <perf_copy_attr+0x36c> 38c0: 7d 20 55 26 mfsrin r9,r10 38c4: 65 29 40 00 oris r9,r9,16384 38c8: 7d 20 51 e4 mtsrin r9,r10 38cc: 4c 00 01 2c isync ... 3944: 48 00 00 01 bl 3944 <perf_copy_attr+0x36c> 3944: R_PPC_REL24 kuap_lock_all_ool Before the patch it was 118 instructions. In reality only 42 are executed in most cases, but GCC is not able to see that a properly aligned user access cannot involve more than one segment. 5060: 39 1d 00 04 addi r8,r29,4 5064: 3d 20 b0 00 lis r9,-20480 5068: 7c 08 48 40 cmplw r8,r9 506c: 40 81 00 08 ble 5074 <perf_copy_attr+0x2cc> 5070: 3d 00 b0 00 lis r8,-20480 5074: 39 28 ff ff addi r9,r8,-1 5078: 57 aa 00 06 rlwinm r10,r29,0,0,3 507c: 55 29 27 3e rlwinm r9,r9,4,28,31 5080: 39 29 00 01 addi r9,r9,1 5084: 7d 29 53 78 or r9,r9,r10 5088: 91 22 04 dc stw r9,1244(r2) 508c: 7d 20 ed 26 mfsrin r9,r29 5090: 55 29 00 80 rlwinm r9,r9,0,2,0 5094: 7c 08 50 40 cmplw r8,r10 5098: 40 81 00 c0 ble 5158 <perf_copy_attr+0x3b0> 509c: 7d 46 50 f8 not r6,r10 50a0: 7c c6 42 14 add r6,r6,r8 50a4: 54 c6 27 be rlwinm r6,r6,4,30,31 50a8: 7d 20 51 e4 mtsrin r9,r10 50ac: 3c ea 10 00 addis r7,r10,4096 50b0: 39 29 01 11 addi r9,r9,273 50b4: 7f 88 38 40 cmplw cr7,r8,r7 50b8: 55 29 02 06 rlwinm r9,r9,0,8,3 50bc: 40 9d 00 9c ble cr7,5158 <perf_copy_attr+0x3b0> 50c0: 2f 86 00 00 cmpwi cr7,r6,0 50c4: 41 9e 00 4c beq cr7,5110 <perf_copy_attr+0x368> 50c8: 2f 86 00 01 cmpwi cr7,r6,1 50cc: 41 9e 00 2c beq cr7,50f8 <perf_copy_attr+0x350> 50d0: 2f 86 00 02 cmpwi cr7,r6,2 50d4: 41 9e 00 14 beq cr7,50e8 <perf_copy_attr+0x340> 50d8: 7d 20 39 e4 mtsrin r9,r7 50dc: 39 29 01 11 addi r9,r9,273 50e0: 3c e7 10 00 addis r7,r7,4096 50e4: 55 29 02 06 rlwinm r9,r9,0,8,3 50e8: 7d 20 39 e4 mtsrin r9,r7 50ec: 39 29 01 11 addi r9,r9,273 50f0: 3c e7 10 00 addis r7,r7,4096 50f4: 55 29 02 06 rlwinm r9,r9,0,8,3 50f8: 7d 20 39 e4 mtsrin r9,r7 50fc: 3c e7 10 00 addis r7,r7,4096 5100: 39 29 01 11 addi r9,r9,273 5104: 7f 88 38 40 cmplw cr7,r8,r7 5108: 55 29 02 06 rlwinm r9,r9,0,8,3 510c: 40 9d 00 4c ble cr7,5158 <perf_copy_attr+0x3b0> 5110: 7d 20 39 e4 mtsrin r9,r7 5114: 39 29 01 11 addi r9,r9,273 5118: 3c c7 10 00 addis r6,r7,4096 511c: 55 29 02 06 rlwinm r9,r9,0,8,3 5120: 7d 20 31 e4 mtsrin r9,r6 5124: 39 29 01 11 addi r9,r9,273 5128: 3c c6 10 00 addis r6,r6,4096 512c: 55 29 02 06 rlwinm r9,r9,0,8,3 5130: 7d 20 31 e4 mtsrin r9,r6 5134: 39 29 01 11 addi r9,r9,273 5138: 3c c7 30 00 addis r6,r7,12288 513c: 55 29 02 06 rlwinm r9,r9,0,8,3 5140: 7d 20 31 e4 mtsrin r9,r6 5144: 3c e7 40 00 addis r7,r7,16384 5148: 39 29 01 11 addi r9,r9,273 514c: 7f 88 38 40 cmplw cr7,r8,r7 5150: 55 29 02 06 rlwinm r9,r9,0,8,3 5154: 41 9d ff bc bgt cr7,5110 <perf_copy_attr+0x368> 5158: 4c 00 01 2c isync 515c: 39 20 00 80 li r9,128 5160: 91 3d 00 00 stw r9,0(r29) 5164: 38 e0 00 00 li r7,0 5168: 90 e2 04 dc stw r7,1244(r2) 516c: 7d 20 ed 26 mfsrin r9,r29 5170: 65 29 40 00 oris r9,r9,16384 5174: 40 81 00 c0 ble 5234 <perf_copy_attr+0x48c> 5178: 7d 47 50 f8 not r7,r10 517c: 7c e7 42 14 add r7,r7,r8 5180: 54 e7 27 be rlwinm r7,r7,4,30,31 5184: 7d 20 51 e4 mtsrin r9,r10 5188: 3d 4a 10 00 addis r10,r10,4096 518c: 39 29 01 11 addi r9,r9,273 5190: 7c 08 50 40 cmplw r8,r10 5194: 55 29 02 06 rlwinm r9,r9,0,8,3 5198: 40 81 00 9c ble 5234 <perf_copy_attr+0x48c> 519c: 2c 07 00 00 cmpwi r7,0 51a0: 41 82 00 4c beq 51ec <perf_copy_attr+0x444> 51a4: 2c 07 00 01 cmpwi r7,1 51a8: 41 82 00 2c beq 51d4 <perf_copy_attr+0x42c> 51ac: 2c 07 00 02 cmpwi r7,2 51b0: 41 82 00 14 beq 51c4 <perf_copy_attr+0x41c> 51b4: 7d 20 51 e4 mtsrin r9,r10 51b8: 39 29 01 11 addi r9,r9,273 51bc: 3d 4a 10 00 addis r10,r10,4096 51c0: 55 29 02 06 rlwinm r9,r9,0,8,3 51c4: 7d 20 51 e4 mtsrin r9,r10 51c8: 39 29 01 11 addi r9,r9,273 51cc: 3d 4a 10 00 addis r10,r10,4096 51d0: 55 29 02 06 rlwinm r9,r9,0,8,3 51d4: 7d 20 51 e4 mtsrin r9,r10 51d8: 3d 4a 10 00 addis r10,r10,4096 51dc: 39 29 01 11 addi r9,r9,273 51e0: 7c 08 50 40 cmplw r8,r10 51e4: 55 29 02 06 rlwinm r9,r9,0,8,3 51e8: 40 81 00 4c ble 5234 <perf_copy_attr+0x48c> 51ec: 7d 20 51 e4 mtsrin r9,r10 51f0: 39 29 01 11 addi r9,r9,273 51f4: 3c ea 10 00 addis r7,r10,4096 51f8: 55 29 02 06 rlwinm r9,r9,0,8,3 51fc: 7d 20 39 e4 mtsrin r9,r7 5200: 39 29 01 11 addi r9,r9,273 5204: 3c e7 10 00 addis r7,r7,4096 5208: 55 29 02 06 rlwinm r9,r9,0,8,3 520c: 7d 20 39 e4 mtsrin r9,r7 5210: 39 29 01 11 addi r9,r9,273 5214: 3c ea 30 00 addis r7,r10,12288 5218: 55 29 02 06 rlwinm r9,r9,0,8,3 521c: 7d 20 39 e4 mtsrin r9,r7 5220: 3d 4a 40 00 addis r10,r10,16384 5224: 39 29 01 11 addi r9,r9,273 5228: 7c 08 50 40 cmplw r8,r10 522c: 55 29 02 06 rlwinm r9,r9,0,8,3 5230: 41 81 ff bc bgt 51ec <perf_copy_attr+0x444> 5234: 4c 00 01 2c isync Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Export the ool handlers to fix build errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d9121f96a7c4302946839a0771f5d1daeeb6968c.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Allow disabling KUAP at boot timeChristophe Leroy
PPC64 uses MMU features to enable/disable KUAP at boot time. But feature fixups are applied way too early on PPC32. Now that all KUAP related actions are in C following the conversion of KUAP initial setup and context switch in C, static branches can be used to enable/disable KUAP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Export disable_kuap_key to fix build errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cd79e8008455fba5395d099f9bb1305c039b931c.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Allow disabling KUEP at boot timeChristophe Leroy
PPC64 uses MMU features to enable/disable KUEP at boot time. But feature fixups are applied way too early on PPC32. Now that all KUEP related actions are in C following the conversion of KUEP initial setup and context switch in C, static branches can be used to enable/disable KUEP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7745a2c3a08ec46302920a3f48d1cb9b5469dbbb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Initialise KUAP and KUEP in CChristophe Leroy
In order to selectively activate KUAP and KUEP in a following patch, perform KUAP and KUEP initialisation in C. Unlike PPC64, PPC32 doesn't have an early_setup_secondary(), so do it in start_secondary(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/87be72023448dd4e476744ed279b8c04b8d08a1c.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Simplify calculation of segment register contentChristophe Leroy
segment register has VSID on bits 8-31. Bits 4-7 are reserved, there is no requirement to set them to 0. VSIDs are calculated from VSID of SR0 by adding 0x111. Even with highest possible VSID which would be 0xFFFFF0, adding 16 times 0x111 results in 0x1001100. So, the reserved bits are never overflowed, no need to clear the reserved bits after each calculation. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ddc1cfd2ec8f3b2395c6a4d7f2b0c1aa1b1e64fb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Convert switch_mmu_context() to CChristophe Leroy
switch_mmu_context() does things that can easily be done in C. For updating user segments, we have update_user_segments(). As mentionned in commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C"), update_user_segments() has the loop unrolled which is a significant performance gain. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05c0875ad8220c03452c3a334946e207c6ca04d6.1622708530.git.christophe.leroy@csgroup.eu