summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2020-05-26Merge branch 'fixes' into nextMichael Ellerman
Merge our fixes branch from this cycle. It contains several important fixes we need in next for testing purposes, and also some that will conflict with upcoming changes.
2020-05-26Merge "Use hugepages to map kernel mem on 8xx" into nextMichael Ellerman
Merge Christophe's large series to use huge pages for the linear mapping on 8xx. From his cover letter: The main purpose of this big series is to: - reorganise huge page handling to avoid using mm_slices. - use huge pages to map kernel memory on the 8xx. The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M. It uses 2 Level page tables, PGD having 1024 entries, each entry covering 4M address space. Then each page table has 1024 entries. At the time being, page sizes are managed in PGD entries, implying the use of mm_slices as it can't mix several pages of the same size in one page table. The first purpose of this series is to reorganise things so that standard page tables can also handle 512k pages. This is done by adding a new _PAGE_HUGE flag which will be copied into the Level 1 entry in the TLB miss handler. That done, we have 2 types of pages: - PGD entries to regular page tables handling 4k/16k and 512k pages - PGD entries to hugepd tables handling 8M pages. There is no need to mix 8M pages with other sizes, because a 8M page will use more than what a single PGD covers. Then comes the second purpose of this series. At the time being, the 8xx has implemented special handling in the TLB miss handlers in order to transparently map kernel linear address space and the IMMR using huge pages by building the TLB entries in assembly at the time of the exception. As mm_slices is only for user space pages, and also because it would anyway not be convenient to slice kernel address space, it was not possible to use huge pages for kernel address space. But after step one of the series, it is now more flexible to use huge pages. This series drop all assembly 'just in time' handling of huge pages and use huge pages in page tables instead. Once the above is done, then comes icing on the cake: - Use huge pages for KASAN shadow mapping - Allow pinned TLBs with strict kernel rwx - Allow pinned TLBs with debug pagealloc Then, last but not least, those modifications for the 8xx allows the following improvement on book3s/32: - Mapping KASAN shadow with BATs - Allowing BATs with debug pagealloc All this allows to considerably simplify TLB miss handlers and associated initialisation. The overhead of reading page tables is negligible compared to the reduction of the miss handlers. While we were at touching pte_update(), some cleanup was done there too. Tested widely on 8xx and 832x. Boot tested on QEMU MAC99.
2020-05-26powerpc/32s: Implement dedicated kasan_init_region()Christophe Leroy
Implement a kasan_init_region() dedicated to book3s/32 that allocates KASAN regions using BATs. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/709e821602b48a1d7c211a9b156da26db98c3e9d.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Add a function to early map kernel via huge pagesChristophe Leroy
Add a function to early map kernel memory using huge pages. For 512k pages, just use standard page table and map in using 512k pages. For 8M pages, create a hugepd table and populate the two PGD entries with it. This function can only be used to create page tables at startup. Once the regular SLAB allocation functions replace memblock functions, this function cannot allocate new pages anymore. However it can still update existing mappings with new protections. hugepd_none() macro is moved into asm/hugetlb.h to be usable outside of mm/hugetlbpage.c early_pte_alloc_kernel() is made visible. _PAGE_HUGE flag is now displayed by ptdump. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Change ptdump display to use "huge"] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/68325bcd3b6f93127f7810418a2352c3519066d6.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Remove now unused TLB miss functionsChristophe Leroy
The code to setup linear and IMMR mapping via huge TLB entries is not called anymore. Remove it. Also remove the handling of removed code exits in the perf driver. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/75750d25849cb8e73ca519866bb892d7eb9649c0.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Add function to set pinned TLBsChristophe Leroy
Pinned TLBs cannot be modified when the MMU is enabled. Create a function to rewrite the pinned TLB entries with MMU off. To set pinned TLB, we have to turn off MMU, disable pinning, do a TLB flush (Either with tlbie and tlbia) then reprogam the TLB entries, enable pinning and turn on MMU. If using tlbie, it cleared entries in both instruction and data TLB regardless whether pinning is disabled or not. If using tlbia, it clears all entries of the TLB which has disabled pinning. To make it easy, just clear all entries in both TLBs, and reprogram them. The function takes two arguments, the top of the memory to consider and whether data is RO under _sinittext. When DEBUG_PAGEALLOC is set, the top is the end of kernel rodata. Otherwise, that's the top of physical RAM. Everything below _sinittext is set RX, over _sinittext that's RW. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c17806014bb1c06513ad1e1d510faea31984b177.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: MM_SLICE is not needed anymoreChristophe Leroy
As the 8xx now manages 512k pages in standard page tables, it doesn't need CONFIG_PPC_MM_SLICES anymore. Don't select it anymore and remove all related code. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/98e8ccd424476ea73cced2b89ba38eb2ed8144fb.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Only 8M pages are hugepte pages nowChristophe Leroy
512k pages are now standard pages, so only 8M pages are hugepte. No more handling of normal page tables through hugepd allocation and freeing, and hugepte helpers can also be simplified. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2c6135d57fb76eebf70673fbac3dc9e740767879.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Manage 512k huge pages as standard pages.Christophe Leroy
At the time being, 512k huge pages are handled through hugepd page tables. The PMD entry is flagged as a hugepd pointer and it means that only 512k hugepages can be managed in that 4M block. However, the hugepd table has the same size as a normal page table, and 512k entries can therefore be nested with normal pages. On the 8xx, TLB loading is performed by software and allthough the page tables are organised to match the L1 and L2 level defined by the HW, all TLB entries have both L1 and L2 independent entries. It means that even if two TLB entries are associated with the same PMD entry, they can be loaded with different values in L1 part. The L1 entry contains the page size (PS field): - 00 for 4k and 16 pages - 01 for 512k pages - 11 for 8M pages By adding a flag for hugepages in the PTE (_PAGE_HUGE) and copying it into the lower bit of PS, we can then manage 512k pages with normal page tables: - PMD entry has PS=11 for 8M pages - PMD entry has PS=00 for other pages. As a PMD entry covers 4M areas, a PMD will either point to a hugepd table having a single entry to an 8M page, or the PMD will point to a standard page table which will have either entries to 4k or 16k or 512k pages. For 512k pages, as the L1 entry will not know it is a 512k page before the PTE is read, there will be 128 entries in the PTE as if it was 4k pages. But when loading the TLB, it will be flagged as a 512k page. Note that we can't use pmd_ptr() in asm/nohash/32/pgtable.h because it is not defined yet. In ITLB miss, we keep the possibility to opt it out as when kernel text is pinned and no user hugepages are used, we can save several instruction by not using r11. In DTLB miss, that's just one instruction so it's not worth bothering with it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/002819e8e166bf81d24b24782d98de7c40905d8f.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.Christophe Leroy
Prepare ITLB handler to handle _PAGE_HUGE when CONFIG_HUGETLBFS is enabled. This means that the L1 entry has to be kept in r11 until L2 entry is read, in order to insert _PAGE_HUGE into it. Also move pgd_offset helpers before pte_update() as they will be needed there in next patch. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/21fd1de8fba781bededa9474a5a9374aefb1f849.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/8xx: Drop CONFIG_8xx_COPYBACK optionChristophe Leroy
CONFIG_8xx_COPYBACK was there to help disabling copyback cache mode for debuging hardware. But nobody will design new boards with 8xx now. All 8xx platforms select it, so make it the default and remove the option. Also remove the Mx_RESETVAL values which are pretty useless and hide the real value while reading code. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bcc968cda075516eb76e2f25e09821f582c566b4.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Reduce hugepd size for 8M hugepages on 8xxChristophe Leroy
Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW assistance") redefined pte_t as a struct of 4 pte_basic_t, because in 16K pages mode there are four identical entries in the page table. But hugepd entries for 8M pages require only one entry of size pte_basic_t. So there is no point in creating a cache for 4 entries page tables. Calculate PTE_T_ORDER using the size of pte_basic_t instead of pte_t. Define specific huge_pte helpers (set_huge_pte_at(), huge_pte_clear(), huge_ptep_set_wrprotect()) to write the pte in a single entry instead of using set_pte_at() which writes 4 identical entries in 16k pages mode. Also make sure that __ptep_set_access_flags() properly handle the huge_pte case. Define set_pte_filter() inline otherwise GCC doesn't inline it anymore because it is now used twice, and that gives a pretty suboptimal code because of pte_t being a struct of 4 entries. Those functions are also used for 512k pages which only require one entry as well allthough replicating it four times was harmless as 512k pages entries are spread every 128 bytes in the table. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/43050d1a0c2d6e1541cab9c1126fc80bc7015ebd.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Create a dedicated pte_update() for 8xxChristophe Leroy
pte_update() is a bit special for the 8xx. At the time being, that's an #ifdef inside the nohash/32 pte_update(). As we are going to make it even more special in the coming patches, create a dedicated version for pte_update() for 8xx. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a103be0099ac2360f8c44f4a1a63cc03713a1360.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64Christophe Leroy
PPC64 takes 3 additional parameters compared to PPC32: - mm - address - huge These 3 parameters will be needed in order to perform different action depending on the page size on the 8xx. Make pte_update() prototype identical for PPC32 and PPC64. This allows dropping an #ifdef in huge_ptep_get_and_clear(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/38111acf6841047a8addde37c63e92d611ee38c2.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Standardise __ptep_test_and_clear_young() params between PPC32 ↵Christophe Leroy
and PPC64 On PPC32, __ptep_test_and_clear_young() takes the mm->context.id In preparation of standardising pte_update() params between PPC32 and PPC64, __ptep_test_and_clear_young() need mm instead of mm->context.id Replace context param by mm. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0a65470e50a14373b7c2291184514aa982462255.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Refactor pte_update() on book3s/32Christophe Leroy
When CONFIG_PTE_64BIT is set, pte_update() operates on 'unsigned long long' When CONFIG_PTE_64BIT is not set, pte_update() operates on 'unsigned long' In asm/page.h, we have pte_basic_t which is 'unsigned long long' when CONFIG_PTE_64BIT is set and 'unsigned long' otherwise. Refactor pte_update() using pte_basic_t. While we are at it, drop the comment on 44x which is not applicable to book3s version of pte_update(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c78912bc8613fb249c3d80aeb1062796b5c49400.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Refactor pte_update() on nohash/32Christophe Leroy
When CONFIG_PTE_64BIT is set, pte_update() operates on 'unsigned long long' When CONFIG_PTE_64BIT is not set, pte_update() operates on 'unsigned long' In asm/page.h, we have pte_basic_t which is 'unsigned long long' when CONFIG_PTE_64BIT is set and 'unsigned long' otherwise. Refactor pte_update() using pte_basic_t. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/590d67994a2847cd9fe088f7d974499e3a18b6ac.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: PTE_ATOMIC_UPDATES is only for 40xChristophe Leroy
Only 40x still uses PTE_ATOMIC_UPDATES. 40x cannot not select CONFIG_PTE64_BIT. Drop handling of PTE_ATOMIC_UPDATES: - In nohash/64 - In nohash/32 for CONFIG_PTE_64BIT Keep PTE_ATOMIC_UPDATES only for nohash/32 for !CONFIG_PTE_64BIT Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d6f8e1f46583f1842de24581a68b0496feb15516.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26powerpc/mm: Allocate static page tables for fixmapChristophe Leroy
Allocate static page tables for the fixmap area. This allows setting mappings through page tables before memblock is ready. That's needed to use early_ioremap() early and to use standard page mappings with fixmap. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4f4b1412d34de6801b8e925cb88fc69d056ff536.1589866984.git.christophe.leroy@csgroup.eu
2020-05-22Merge tag 'powerpc-5.7-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - a revert of a recent change to the PTE bits for 32-bit BookS, which broke swap. - a "fix" to disable STRICT_KERNEL_RWX for 64-bit in Kconfig, as it's causing crashes for some people. Thanks to Christophe Leroy and Rui Salvaterra. * tag 'powerpc-5.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Disable STRICT_KERNEL_RWX Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."
2020-05-20powerpc/kasan: Declare kasan_init_region() weakChristophe Leroy
In order to alloc sub-arches to alloc KASAN regions using optimised methods (Huge pages on 8xx, BATs on BOOK3S, ...), declare kasan_init_region() weak. Also make kasan_init_shadow_page_tables() accessible from outside, so that it can be called from the specific kasan_init_region() functions if needed. And populate remaining KASAN address space only once performed the region mapping, to allow 8xx to allocate hugepd instead of standard page tables for mapping via 8M hugepages. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3c1ce419fa1b5a4171b92d7fb16455ca17e1b96d.1589866984.git.christophe.leroy@csgroup.eu
2020-05-20powerpc/kasan: Fix shadow pages allocation failureChristophe Leroy
Doing kasan pages allocation in MMU_init is too early, kernel doesn't have access yet to the entire memory space and memblock_alloc() fails when the kernel is a bit big. Do it from kasan_init() instead. Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c24163ee5d5f8cdf52fefa45055ceb35435b8f15.1589866984.git.christophe.leroy@csgroup.eu
2020-05-20powerpc/kasan: Fix issues by lowering KASAN_SHADOW_ENDChristophe Leroy
At the time being, KASAN_SHADOW_END is 0x100000000, which is 0 in 32 bits representation. This leads to a couple of issues: - kasan_remap_early_shadow_ro() does nothing because the comparison k_cur < k_end is always false. - In ptdump, address comparison for markers display fails and the marker's name is printed at the start of the KASAN area instead of being printed at the end. However, there is no need to shadow the KASAN shadow area itself, so the KASAN shadow area can stop shadowing memory at the start of itself. With a PAGE_OFFSET set to 0xc0000000, KASAN shadow area is then going from 0xf8000000 to 0xff000000. Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ae1a3c0d19a37410c209c3fc453634cfcc0ee318.1589866984.git.christophe.leroy@csgroup.eu
2020-05-20powerpc/64s/pgtable: fix an undefined behaviourQian Cai
Booting a power9 server with hash MMU could trigger an undefined behaviour because pud_offset(p4d, 0) will do, 0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10) Fix it by converting pud_index() and friends to static inline functions. UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15 shift exponent 34 is too large for 32-bit type 'int' CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13 Call Trace: dump_stack+0xf4/0x164 (unreliable) ubsan_epilogue+0x18/0x78 __ubsan_handle_shift_out_of_bounds+0x160/0x21c walk_pagetables+0x2cc/0x700 walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282 (inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311 ptdump_check_wx+0x8c/0xf0 mark_rodata_ro+0x48/0x80 kernel_init+0x74/0x194 ret_from_kernel_thread+0x5c/0x74 Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200306044852.3236-1-cai@lca.pw
2020-05-20powerpc/64s: Fix early_init_mmu section mismatchNicholas Piggin
Christian reports: MODPOST vmlinux.o WARNING: modpost: vmlinux.o(.text.unlikely+0x1a0): Section mismatch in reference from the function .early_init_mmu() to the function .init.text:.radix__early_init_mmu() The function .early_init_mmu() references the function __init .radix__early_init_mmu(). This is often because .early_init_mmu lacks a __init annotation or the annotation of .radix__early_init_mmu is wrong. WARNING: modpost: vmlinux.o(.text.unlikely+0x1ac): Section mismatch in reference from the function .early_init_mmu() to the function .init.text:.hash__early_init_mmu() The function .early_init_mmu() references the function __init .hash__early_init_mmu(). This is often because .early_init_mmu lacks a __init annotation or the annotation of .hash__early_init_mmu is wrong. The compiler is uninlining early_init_mmu and not putting it in an init section because there is no annotation. Add it. Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/r/20200429070247.1678172-1-npiggin@gmail.com
2020-05-20Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge our topic branch shared with the kvm-ppc tree. This brings in one commit that touches the XIVE interrupt controller logic across core and KVM code.
2020-05-20Merge branch 'topic/uaccess-ppc' into nextMichael Ellerman
Merge our uaccess-ppc topic branch. It is based on the uaccess topic branch that we're sharing with Viro. This includes the addition of user_[read|write]_access_begin(), as well as some powerpc specific changes to our uaccess routines that would conflict badly if merged separately.
2020-05-20Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."Christophe Leroy
This reverts commit 697ece78f8f749aeea40f2711389901f0974017a. The implementation of SWAP on powerpc requires page protection bits to not be one of the least significant PTE bits. Until the SWAP implementation is changed and this requirement voids, we have to keep at least _PAGE_RW outside of the 3 last bits. For now, revert to previous PTE bits order. A further rework may come later. Fixes: 697ece78f8f7 ("powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits.") Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b34706f8de87f84d135abb5f3ede6b6f16fb1f41.1589969799.git.christophe.leroy@csgroup.eu
2020-05-20Merge tag 'noinstr-x86-kvm-2020-05-16' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD
2020-05-19powerpc/watchpoint: Don't allow concurrent perf and ptrace eventsRavi Bangoria
With Book3s DAWR, ptrace and perf watchpoints on powerpc behaves differently. Ptrace watchpoint works in one-shot mode and generates signal before executing instruction. It's ptrace user's job to single-step the instruction and re-enable the watchpoint. OTOH, in case of perf watchpoint, kernel emulates/single-steps the instruction and then generates event. If perf and ptrace creates two events with same or overlapping address ranges, it's ambiguous to decide who should single-step the instruction. Because of this issue, don't allow perf and ptrace watchpoint at the same time if their address range overlaps. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-15-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Prepare handler to handle more than one watchpointRavi Bangoria
Currently we assume that we have only one watchpoint supported by hw. Get rid of that assumption and use dynamic loop instead. This should make supporting more watchpoints very easy. With more than one watchpoint, exception handler needs to know which DAWR caused the exception, and hw currently does not provide it. So we need sw logic for the same. To figure out which DAWR caused the exception, check all different combinations of user specified range, DAWR address range, actual access range and DAWRX constrains. For ex, if user specified range and actual access range overlaps but DAWRX is configured for readonly watchpoint and the instruction is store, this DAWR must not have caused exception. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Michael Neuling <mikey@neuling.org> [mpe: Unsplit multi-line printk() strings, fix some sparse warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200514111741.97993-14-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Use builtin ALIGN*() macrosRavi Bangoria
Currently we calculate hw aligned start and end addresses manually. Replace them with builtin ALIGN_DOWN() and ALIGN() macros. So far end_addr was inclusive but this patch makes it exclusive (by avoiding -1) for better readability. Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-13-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Convert thread_struct->hw_brk to an arrayRavi Bangoria
So far powerpc hw supported only one watchpoint. But Power10 is introducing 2nd DAWR. Convert thread_struct->hw_brk into an array. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-10-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Get watchpoint count dynamically while disabling themRavi Bangoria
Instead of disabling only one watchpoint, get num of available watchpoints dynamically and disable all of them. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-8-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Provide DAWR number to __set_breakpointRavi Bangoria
Introduce new parameter 'nr' to __set_breakpoint() which indicates which DAWR should be programed. Also convert current_brk variable to an array. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-7-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Provide DAWR number to set_dawrRavi Bangoria
Introduce new parameter 'nr' to set_dawr() which indicates which DAWR should be programed. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-6-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Introduce function to get nr watchpoints dynamicallyRavi Bangoria
So far we had only one watchpoint, so we have hardcoded HBP_NUM to 1. But Power10 is introducing 2nd DAWR and thus kernel should be able to dynamically find actual number of watchpoints supported by hw it's running on. Introduce function for the same. Also convert HBP_NUM macro to HBP_NUM_MAX, which will now represent maximum number of watchpoints supported by Powerpc. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-4-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Add SPRN macros for second DAWRRavi Bangoria
Power10 is introducing second DAWR. Add SPRN_ macros for the same. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-3-ravi.bangoria@linux.ibm.com
2020-05-19powerpc/watchpoint: Rename current DAWR macrosRavi Bangoria
Power10 is introducing second DAWR. Use real register names from ISA for current macros: s/SPRN_DAWR/SPRN_DAWR0/ s/SPRN_DAWRX/SPRN_DAWRX0/ Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Michael Neuling <mikey@neuling.org> Link: https://lore.kernel.org/r/20200514111741.97993-2-ravi.bangoria@linux.ibm.com
2020-05-19powerpc sstep: Add support for prefixed load/storesJordan Niethe
This adds emulation support for the following prefixed integer load/stores: * Prefixed Load Byte and Zero (plbz) * Prefixed Load Halfword and Zero (plhz) * Prefixed Load Halfword Algebraic (plha) * Prefixed Load Word and Zero (plwz) * Prefixed Load Word Algebraic (plwa) * Prefixed Load Doubleword (pld) * Prefixed Store Byte (pstb) * Prefixed Store Halfword (psth) * Prefixed Store Word (pstw) * Prefixed Store Doubleword (pstd) * Prefixed Load Quadword (plq) * Prefixed Store Quadword (pstq) the follow prefixed floating-point load/stores: * Prefixed Load Floating-Point Single (plfs) * Prefixed Load Floating-Point Double (plfd) * Prefixed Store Floating-Point Single (pstfs) * Prefixed Store Floating-Point Double (pstfd) and for the following prefixed VSX load/stores: * Prefixed Load VSX Scalar Doubleword (plxsd) * Prefixed Load VSX Scalar Single-Precision (plxssp) * Prefixed Load VSX Vector [0|1] (plxv, plxv0, plxv1) * Prefixed Store VSX Scalar Doubleword (pstxsd) * Prefixed Store VSX Scalar Single-Precision (pstxssp) * Prefixed Store VSX Vector [0|1] (pstxv, pstxv0, pstxv1) Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Balamuruhan S <bala24@linux.ibm.com> [mpe: Use CONFIG_PPC64 not __powerpc64__, use get_op()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-30-jniethe5@gmail.com
2020-05-19powerpc: Add prefixed instructions to instruction data typeJordan Niethe
For powerpc64, redefine the ppc_inst type so both word and prefixed instructions can be represented. On powerpc32 the type will remain the same. Update places which had assumed instructions to be 4 bytes long. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> [mpe: Rework the get_user_inst() macros to be parameterised, and don't assign to the dest if an error occurred. Use CONFIG_PPC64 not __powerpc64__ in a few places. Address other comments from Christophe. Fix some sparse complaints.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-24-jniethe5@gmail.com
2020-05-19powerpc: Define new SRR1 bits for a ISA v3.1Jordan Niethe
Add the BOUNDARY SRR1 bit definition for when the cause of an alignment exception is a prefixed instruction that crosses a 64-byte boundary. Add the PREFIXED SRR1 bit definition for exceptions caused by prefixed instructions. Bit 35 of SRR1 is called SRR1_ISI_N_OR_G. This name comes from it being used to indicate that an ISI was due to the access being no-exec or guarded. ISA v3.1 adds another purpose. It is also set if there is an access in a cache-inhibited location for prefixed instruction. Rename from SRR1_ISI_N_OR_G to SRR1_ISI_N_G_OR_CIP. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Alistair Popple <alistair@popple.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-23-jniethe5@gmail.com
2020-05-19powerpc: Enable Prefixed InstructionsAlistair Popple
Prefix instructions have their own FSCR bit which needs to enabled via a CPU feature. The kernel will save the FSCR for problem state but it needs to be enabled initially. If prefixed instructions are made unavailable by the [H]FSCR, attempting to use them will cause a facility unavailable exception. Add "PREFIX" to the facility_strings[]. Currently there are no prefixed instructions that are actually emulated by emulate_instruction() within facility_unavailable_exception(). However, when caused by a prefixed instructions the SRR1 PREFIXED bit is set. Prepare for dealing with emulated prefixed instructions by checking for this bit. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/20200506034050.24806-22-jniethe5@gmail.com
2020-05-19powerpc: Introduce a function for reporting instruction lengthJordan Niethe
Currently all instructions have the same length, but in preparation for prefixed instructions introduce a function for returning instruction length. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Alistair Popple <alistair@popple.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-18-jniethe5@gmail.com
2020-05-19powerpc: Define and use get_user_instr() et. al.Jordan Niethe
Define specialised get_user_instr(), __get_user_instr() and __get_user_instr_inatomic() macros for reading instructions from user and/or kernel space. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> [mpe: Squash in addition of get_user_instr() & __user annotations] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-17-jniethe5@gmail.com
2020-05-19powerpc: Add a probe_kernel_read_inst() functionJordan Niethe
Introduce a probe_kernel_read_inst() function to use in cases where probe_kernel_read() is used for getting an instruction. This will be more useful for prefixed instructions. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> [mpe: Don't write to *inst on error] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-15-jniethe5@gmail.com
2020-05-19powerpc: Add a probe_user_read_inst() functionJordan Niethe
Introduce a probe_user_read_inst() function to use in cases where probe_user_read() is used for getting an instruction. This will be more useful for prefixed instructions. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> [mpe: Don't write to *inst on error, fold in __user annotations] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-14-jniethe5@gmail.com
2020-05-19powerpc: Use a function for reading instructionsJordan Niethe
Prefixed instructions will mean there are instructions of different length. As a result dereferencing a pointer to an instruction will not necessarily give the desired result. Introduce a function for reading instructions from memory into the instruction data type. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Alistair Popple <alistair@popple.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-13-jniethe5@gmail.com
2020-05-19powerpc: Use a datatype for instructionsJordan Niethe
Currently unsigned ints are used to represent instructions on powerpc. This has worked well as instructions have always been 4 byte words. However, ISA v3.1 introduces some changes to instructions that mean this scheme will no longer work as well. This change is Prefixed Instructions. A prefixed instruction is made up of a word prefix followed by a word suffix to make an 8 byte double word instruction. No matter the endianness of the system the prefix always comes first. Prefixed instructions are only planned for powerpc64. Introduce a ppc_inst type to represent both prefixed and word instructions on powerpc64 while keeping it possible to exclusively have word instructions on powerpc32. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> [mpe: Fix compile error in emulate_spe()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-12-jniethe5@gmail.com
2020-05-19powerpc: Introduce functions for instruction equalityJordan Niethe
In preparation for an instruction data type that can not be directly used with the '==' operator use functions for checking equality. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Balamuruhan S <bala24@linux.ibm.com> Link: https://lore.kernel.org/r/20200506034050.24806-11-jniethe5@gmail.com