summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-24Merge branch 'for-next/caches' into for-next/coreWill Deacon
Big cleanup of our cache maintenance routines, which were confusingly named and inconsistent in their implementations. * for-next/caches: arm64: Rename arm64-internal cache maintenance functions arm64: Fix cache maintenance function comments arm64: sync_icache_aliases to take end parameter instead of size arm64: __clean_dcache_area_pou to take end parameter instead of size arm64: __clean_dcache_area_pop to take end parameter instead of size arm64: __clean_dcache_area_poc to take end parameter instead of size arm64: __flush_dcache_area to take end parameter instead of size arm64: dcache_by_line_op to take end parameter instead of size arm64: __inval_dcache_area to take end parameter instead of size arm64: Fix comments to refer to correct function __flush_icache_range arm64: Move documentation of dcache_by_line_op arm64: assembler: remove user_alt arm64: Downgrade flush_icache_range to invalidate arm64: Do not enable uaccess for invalidate_icache_range arm64: Do not enable uaccess for flush_icache_range arm64: Apply errata to swsusp_arch_suspend_exit arm64: assembler: add conditional cache fixups arm64: assembler: replace `kaddr` with `addr`
2021-06-24Merge branch 'for-next/build' into for-next/coreWill Deacon
Tweak linker flags so that GDB can understand vmlinux when using RELR relocations. * for-next/build: Makefile: fix GDB warning with CONFIG_RELR
2021-06-24Merge branch 'for-next/boot' into for-next/coreWill Deacon
Boot path cleanups to enable early initialisation of per-cpu operations needed by KCSAN. * for-next/boot: arm64: scs: Drop unused 'tmp' argument to scs_{load, save} asm macros arm64: smp: initialize cpu offset earlier arm64: smp: unify task and sp setup arm64: smp: remove stack from secondary_data arm64: smp: remove pointless secondary_data maintenance arm64: assembler: add set_this_cpu_offset
2021-06-24Merge branch 'for-next/stacktrace' into for-next/coreWill Deacon
Relax frame record alignment requirements to facilitate 8-byte alignment with KASAN and Clang. * for-next/stacktrace: arm64: stacktrace: Relax frame record alignment requirement to 8 bytes arm64: Change the on_*stack functions to take a size argument arm64: Implement stack trace termination record
2021-06-08Makefile: fix GDB warning with CONFIG_RELRNick Desaulniers
GDB produces the following warning when debugging kernels built with CONFIG_RELR: BFD: /android0/linux-next/vmlinux: unknown type [0x13] section `.relr.dyn' when loading a kernel built with CONFIG_RELR into GDB. It can also prevent debugging symbols using such relocations. Peter sugguests: [That flag] means that lld will use dynamic tags and section type numbers in the OS-specific range rather than the generic range. The kernel itself doesn't care about these numbers; it determines the location of the RELR section using symbols defined by a linker script. Link: https://github.com/ClangBuiltLinux/linux/issues/1057 Suggested-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20210522012626.2811297-1-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-27arm64: scs: Drop unused 'tmp' argument to scs_{load, save} asm macrosWill Deacon
The scs_load and scs_save asm macros don't make use of the mandatory 'tmp' register argument, so drop it and fix up the callers. Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20210527105529.21967-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: initialize cpu offset earlierMark Rutland
Now that we have a consistent place to initialize CPU context registers early in the boot path, let's also initialize the per-cpu offset here. This makes the primary and secondary boot paths more consistent, and allows for the use of per-cpu operations earlier, which will be necessary for instrumentation with KCSAN. Note that smp_prepare_boot_cpu() still needs to re-initialize CPU0's offset as immediately prior to this the per-cpu areas may be reallocated, and hence the boot-time offset may be stale. A comment is added to make this clear. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: unify task and sp setupMark Rutland
Once we enable the MMU, we have to initialize: * SP_EL0 to point at the active task * SP to point at the active task's stack * SCS_SP to point at the active task's shadow stack For all tasks (including init_task), this information can be derived from the task's task_struct. Let's unify __primary_switched and __secondary_switched to consistently acquire this information from the relevant task_struct. At the same time, let's fold this together with initializing a task's final frame. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: remove stack from secondary_dataMark Rutland
When we boot a secondary CPU, we pass it a task and a stack to use. As the stack is always the task's stack, which can be derived from the task, let's have the secondary CPU derive this itself and avoid passing redundant information. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: remove pointless secondary_data maintenanceMark Rutland
All reads and writes of secondary_data occur with the MMU on, using coherent attributes, so there's no need to perform any cache maintenance for this. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: assembler: add set_this_cpu_offsetMark Rutland
There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26Merge branch 'for-next/stacktrace' into for-next/bootWill Deacon
Merge in stack unwinding work to minimise conflicts in head.S. * for-next/stacktrace: arm64: stacktrace: Relax frame record alignment requirement to 8 bytes arm64: Change the on_*stack functions to take a size argument arm64: Implement stack trace termination record
2021-05-26arm64: stacktrace: Relax frame record alignment requirement to 8 bytesPeter Collingbourne
The AAPCS places no requirements on the alignment of the frame record. In theory it could be placed anywhere, although it seems sensible to require it to be aligned to 8 bytes. With an upcoming enhancement to tag-based KASAN Clang will begin creating frame records located at an address that is only aligned to 8 bytes. Accommodate such frame records in the stack unwinding code. As pointed out by Mark Rutland, the userspace stack unwinding code has the same problem, so fix it there as well. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ia22c375230e67ca055e9e4bb639383567f7ad268 Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210526174927.2477847-2-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: Change the on_*stack functions to take a size argumentPeter Collingbourne
unwind_frame() was previously implicitly checking that the frame record is in bounds of the stack by enforcing that FP is both aligned to 16 and in bounds of the stack. Once the FP alignment requirement is relaxed to 8 this will not be sufficient because it does not account for the case where FP points to 8 bytes before the end of the stack. Make the check explicit by changing the on_*stack functions to take a size argument and adjusting the callers to pass the appropriate sizes. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ib7a3eb3eea41b0687ffaba045ceb2012d077d8b4 Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210526174927.2477847-1-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Rename arm64-internal cache maintenance functionsFuad Tabba
Although naming across the codebase isn't that consistent, it tends to follow certain patterns. Moreover, the term "flush" isn't defined in the Arm Architecture reference manual, and might be interpreted to mean clean, invalidate, or both for a cache. Rename arm64-internal functions to make the naming internally consistent, as well as making it consistent with the Arm ARM, by specifying whether it applies to the instruction, data, or both caches, whether the operation is a clean, invalidate, or both. Also specify which point the operation applies to, i.e., to the point of unification (PoU), coherency (PoC), or persistence (PoP). This commit applies the following sed transformation to all files under arch/arm64: "s/\b__flush_cache_range\b/caches_clean_inval_pou_macro/g;"\ "s/\b__flush_icache_range\b/caches_clean_inval_pou/g;"\ "s/\binvalidate_icache_range\b/icache_inval_pou/g;"\ "s/\b__flush_dcache_area\b/dcache_clean_inval_poc/g;"\ "s/\b__inval_dcache_area\b/dcache_inval_poc/g;"\ "s/__clean_dcache_area_poc\b/dcache_clean_poc/g;"\ "s/\b__clean_dcache_area_pop\b/dcache_clean_pop/g;"\ "s/\b__clean_dcache_area_pou\b/dcache_clean_pou/g;"\ "s/\b__flush_cache_user_range\b/caches_clean_inval_user_pou/g;"\ "s/\b__flush_icache_all\b/icache_inval_all_pou/g;" Note that __clean_dcache_area_poc is deliberately missing a word boundary check at the beginning in order to match the efistub symbols in image-vars.h. Also note that, despite its name, __flush_icache_range operates on both instruction and data caches. The name change here reflects that. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-19-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Fix cache maintenance function commentsFuad Tabba
Fix and expand comments for the cache maintenance functions in cacheflush.h. Adds comments to functions that weren't described before. Explains what the functions do using Arm Architecture Reference Manual terminology. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-18-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: sync_icache_aliases to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-17-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __clean_dcache_area_pou to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-16-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __clean_dcache_area_pop to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-15-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __clean_dcache_area_poc to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. Because the code is shared with __dma_clean_area, it changes the parameters for that as well. However, __dma_clean_area is local to cache.S, so no other users are affected. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-14-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __flush_dcache_area to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-13-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: dcache_by_line_op to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-12-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __inval_dcache_area to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. Because the code is shared with __dma_inv_area, it changes the parameters for that as well. However, __dma_inv_area is local to cache.S, so no other users are affected. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-11-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Fix comments to refer to correct function __flush_icache_rangeFuad Tabba
Many comments refer to the function flush_icache_range, where the intent is in fact __flush_icache_range. Fix these comments to refer to the intended function. That's probably due to commit 3b8c9f1cdfc506e9 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings"), which renamed flush_icache_range() to __flush_icache_range() and added a wrapper. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-10-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Move documentation of dcache_by_line_opFuad Tabba
The comment describing the macro dcache_by_line_op is placed right before the previous macro of the one it describes, which is a bit confusing. Move it to the macro it describes (dcache_by_line_op). No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-9-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: assembler: remove user_altFuad Tabba
user_alt isn't being used anymore. It's also simpler and clearer to directly use alternative_insn and _cond_extable in-line when needed. Reported-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/20210520125735.GF17233@C02TD0UTHF1T.local/ Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-8-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Downgrade flush_icache_range to invalidateFuad Tabba
Since __flush_dcache_area is called right before, invalidate_icache_range is sufficient in this case. Rewrite the comment to better explain the rationale behind the cache maintenance operations used here. No functional change intended. Possible performance impact due to invalidating only the icache rather than invalidating and cleaning both caches. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/linux-arch/20200511110014.lb9PEahJ4hVOYrbwIb_qUHXyNy9KQzNFdb_I3YlzY6A@z/ Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-7-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Do not enable uaccess for invalidate_icache_rangeFuad Tabba
invalidate_icache_range() works on kernel addresses, and doesn't need uaccess. Remove the code that toggles uaccess_ttbr0_enable, as well as the code that emits an entry into the exception table (via the macro invalidate_icache_by_line). Changes return type of invalidate_icache_range() from int (which used to indicate a fault) to void, since it doesn't need uaccess and won't fault. Note that return value was never checked by any of the callers. No functional change intended. Possible performance impact due to the reduced number of instructions. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/linux-arch/20200511110014.lb9PEahJ4hVOYrbwIb_qUHXyNy9KQzNFdb_I3YlzY6A@z/ Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-6-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Do not enable uaccess for flush_icache_rangeFuad Tabba
__flush_icache_range works on kernel addresses, and doesn't need uaccess. The existing code is a side-effect of its current implementation with __flush_cache_user_range fallthrough. Instead of fallthrough to share the code, use a common macro for the two where the caller specifies an optional fixup label if user access is needed. If provided, this label would be used to generate an extable entry. Simplify the code to use dcache_by_line_op, instead of replicating much of its functionality. No functional change intended. Possible performance impact due to the reduced number of instructions. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will@kernel.org> Reported-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/linux-arch/20200511110014.lb9PEahJ4hVOYrbwIb_qUHXyNy9KQzNFdb_I3YlzY6A@z/ Link: https://lore.kernel.org/linux-arm-kernel/20210521121846.GB1040@C02TD0UTHF1T.local/ Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-5-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Apply errata to swsusp_arch_suspend_exitFuad Tabba
The Arm errata covered by ARM64_WORKAROUND_CLEAN_CACHE require that "dc cvau" instructions get promoted to "dc civac". Reported-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-4-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: assembler: add conditional cache fixupsMark Rutland
It would be helpful if we could use both `dcache_by_line_op` and `invalidate_icache_by_line` for user memory without accidentally fixing up unexpected faults when performing maintenance on kernel addresses. Let's make this possible by having both macros take an optional fixup label, and only generating an extable entry if a label is provided. At the same time, let's clean up the labels used to be globally unique using \@ as we do for other macros. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Cc: Ard Biesheuvel <aedb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-3-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: assembler: replace `kaddr` with `addr`Mark Rutland
The `__dcache_op_workaround_clean_cache` and `dcache_by_line_op` macros are only expected to be usedc on kernel memory, without a user fault fixup, and so we named their address variables `kaddr` to make this clear. Subseuqent patches will modify these to also work on user memory with an (optional) user fault fixup, where `kaddr` won't make as much sense. To aid the legibility of patches, this patch (only) replaces `kaddr` with `addr` as a preparatory step. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Cc: Ard Biesheuvel <aedb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-2-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Implement stack trace termination recordMadhavan T. Venkataraman
Reliable stacktracing requires that we identify when a stacktrace is terminated early. We can do this by ensuring all tasks have a final frame record at a known location on their task stack, and checking that this is the final frame record in the chain. We'd like to use task_pt_regs(task)->stackframe as the final frame record, as this is already setup upon exception entry from EL0. For kernel tasks we need to consistently reserve the pt_regs and point x29 at this, which we can do with small changes to __primary_switched, __secondary_switched, and copy_process(). Since the final frame record must be at a specific location, we must create the final frame record in __primary_switched and __secondary_switched rather than leaving this to start_kernel and secondary_start_kernel. Thus, __primary_switched and __secondary_switched will now show up in stacktraces for the idle tasks. Since the final frame record is now identified by its location rather than by its contents, we identify it at the start of unwind_frame(), before we read any values from it. External debuggers may terminate the stack trace when FP == 0. In the pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the debugger may print an extra record 0x0 at the end. While this is not pretty, this does not do any harm. This is a small price to pay for having reliable stack trace termination in the kernel. That said, gdb does not show the extra record probably because it uses DWARF and not frame pointers for stack traces. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> [Mark: rebase, use ASM_BUG(), update comments, update commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-23Linux 5.13-rc3Linus Torvalds
2021-05-23Merge tag 'perf-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two perf fixes: - Do not check the LBR_TOS MSR when setting up unrelated LBR MSRs as this can cause malfunction when TOS is not supported - Allocate the LBR XSAVE buffers along with the DS buffers upfront because allocating them when adding an event can deadlock" * tag 'perf-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/lbr: Remove cpuc->lbr_xsave allocation from atomic context perf/x86: Avoid touching LBR_TOS MSR for Arch LBR
2021-05-23Merge tag 'locking-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two locking fixes: - Invoke the lockdep tracepoints in the correct place so the ordering is correct again - Don't leave the mutex WAITER bit stale when the last waiter is dropping out early due to a signal as that forces all subsequent lock operations needlessly into the slowpath until it's cleaned up again" * tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal locking/lockdep: Correct calling tracepoints
2021-05-23Merge tag 'irq-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A few fixes for irqchip drivers: - Allocate interrupt descriptors correctly on Mainstone PXA when SPARSE_IRQ is enabled; otherwise the interrupt association fails - Make the APPLE AIC chip driver depend on APPLE - Remove redundant error output on devm_ioremap_resource() failure" * tag 'irq-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Remove redundant error printing irqchip/apple-aic: APPLE_AIC should depend on ARCH_APPLE ARM: PXA: Fix cplds irqdesc allocation when using legacy mode
2021-05-23Merge tag 'x86_urgent_for_v5.13_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix how SEV handles MMIO accesses by forwarding potential page faults instead of killing the machine and by using the accessors with the exact functionality needed when accessing memory. - Fix a confusion with Clang LTO compiler switches passed to the it - Handle the case gracefully when VMGEXIT has been executed in userspace * tag 'x86_urgent_for_v5.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev-es: Use __put_user()/__get_user() for data accesses x86/sev-es: Forward page-faults which happen during emulation x86/sev-es: Don't return NULL from sev_es_get_ghcb() x86/build: Fix location of '-plugin-opt=' flags x86/sev-es: Invalidate the GHCB after completing VMGEXIT x86/sev-es: Move sev_es_put_ghcb() in prep for follow on patch
2021-05-23Merge tag 'powerpc-5.13-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix breakage of strace (and other ptracers etc.) when using the new scv ABI (Power9 or later with glibc >= 2.33). - Fix early_ioremap() on 64-bit, which broke booting on some machines. Thanks to Dmitry V. Levin, Nicholas Piggin, Alexey Kardashevskiy, and Christophe Leroy. * tag 'powerpc-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls powerpc: Fix early setup to make early_ioremap() work
2021-05-22Merge tag 'kbuild-fixes-v5.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix short log indentation for tools builds - Fix dummy-tools to adjust to the latest stackprotector check * tag 'kbuild-fixes-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: dummy-tools: adjust to stricter stackprotector check scripts/jobserver-exec: Fix a typo ("envirnoment") tools build: Fix quiet cmd indentation
2021-05-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "10 patches. Subsystems affected by this patch series: mm (pagealloc, gup, kasan, and userfaultfd), ipc, selftests, watchdog, bitmap, procfs, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: userfaultfd: hugetlbfs: fix new flag usage in error path lib: kunit: suppress a compilation warning of frame size proc: remove Alexey from MAINTAINERS linux/bits.h: fix compilation error with GENMASK watchdog: reliable handling of timestamps kasan: slab: always reset the tag in get_freepointer_safe() tools/testing/selftests/exec: fix link error ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry Revert "mm/gup: check page posion status for coredump." mm/shuffle: fix section mismatch warning
2021-05-22userfaultfd: hugetlbfs: fix new flag usage in error pathMike Kravetz
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags") the use of PagePrivate to indicate a reservation count should be restored at free time was changed to the hugetlb specific flag HPageRestoreReserve. Changes to a userfaultfd error path as well as a VM_BUG_ON() in remove_inode_hugepages() were overlooked. Users could see incorrect hugetlb reserve counts if they experience an error with a UFFDIO_COPY operation. Specifically, this would be the result of an unlikely copy_huge_page_from_user error. There is not an increased chance of hitting the VM_BUG_ON. Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mina Almasry <almasry.mina@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Mina Almasry <almasrymina@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22lib: kunit: suppress a compilation warning of frame sizeZhen Lei
lib/bitfield_kunit.c: In function `test_bitfields_constants': lib/bitfield_kunit.c:93:1: warning: the frame size of 7456 bytes is larger than 2048 bytes [-Wframe-larger-than=] } ^ As the description of BITFIELD_KUNIT in lib/Kconfig.debug, it "Only useful for kernel devs running the KUnit test harness, and not intended for inclusion into a production build". Therefore, it is not worth modifying variable 'test_bitfields_constants' to clear this warning. Just suppress it. Link: https://lkml.kernel.org/r/20210518094533.7652-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Vitor Massaru Iha <vitor@massaru.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22proc: remove Alexey from MAINTAINERSAlexey Dobriyan
People Cc me and I don't have time. Link: https://lkml.kernel.org/r/YKarMxHJBIhMHQIh@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22linux/bits.h: fix compilation error with GENMASKRikard Falkeborn
GENMASK() has an input check which uses __builtin_choose_expr() to enable a compile time sanity check of its inputs if they are known at compile time. However, it turns out that __builtin_constant_p() does not always return a compile time constant [0]. It was thought this problem was fixed with gcc 4.9 [1], but apparently this is not the case [2]. Switch to use __is_constexpr() instead which always returns a compile time constant, regardless of its inputs. Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk [0] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449 [1] Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp [2] Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22watchdog: reliable handling of timestampsPetr Mladek
Commit 9bf3bc949f8a ("watchdog: cleanup handling of false positives") tried to handle a virtual host stopped by the host a more straightforward and cleaner way. But it introduced a risk of false softlockup reports. The virtual host might be stopped at any time, for example between kvm_check_and_clear_guest_paused() and is_softlockup(). As a result, is_softlockup() might read the updated jiffies and detects a softlockup. A solution might be to put back kvm_check_and_clear_guest_paused() after is_softlockup() and detect it. But it would put back the cycle that complicates the logic. In fact, the handling of all the timestamps is not reliable. The code does not guarantee when and how many times the timestamps are read. For example, "period_ts" might be touched anytime also from NMI and re-read in is_softlockup(). It works just by chance. Fix all the problems by making the code even more explicit. 1. Make sure that "now" and "period_ts" timestamps are read only once. They might be changed at anytime by NMI or when the virtual guest is stopped by the host. Note that "now" timestamp does this implicitly because "jiffies" is marked volatile. 2. "now" time must be read first. The state of "period_ts" will decide whether it will be used or the period will get restarted. 3. kvm_check_and_clear_guest_paused() must be called before reading "period_ts". It touches the variable when the guest was stopped. As a result, "now" timestamp is used only when the watchdog was not touched and the guest not stopped in the meantime. "period_ts" is restarted in all other situations. Link: https://lkml.kernel.org/r/YKT55gw+RZfyoFf7@alley Fixes: 9bf3bc949f8aeefeacea4b ("watchdog: cleanup handling of false positives") Signed-off-by: Petr Mladek <pmladek@suse.com> Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22kasan: slab: always reset the tag in get_freepointer_safe()Alexander Potapenko
With CONFIG_DEBUG_PAGEALLOC enabled, the kernel should also untag the object pointer, as done in get_freepointer(). Failing to do so reportedly leads to SLUB freelist corruptions that manifest as boot-time crashes. Link: https://lkml.kernel.org/r/20210514072228.534418-1-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Elliot Berman <eberman@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22tools/testing/selftests/exec: fix link errorYang Yingliang
Fix the link error by adding '-static': gcc -Wall -Wl,-z,max-page-size=0x1000 -pie load_address.c -o /home/yang/linux/tools/testing/selftests/exec/load_address_4096 /usr/bin/ld: /tmp/ccopEGun.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `stderr@@GLIBC_2.17' which may bind externally can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: /tmp/ccopEGun.o(.text+0x158): unresolvable R_AARCH64_ADR_PREL_PG_HI21 relocation against symbol `stderr@@GLIBC_2.17' /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status make: *** [Makefile:25: tools/testing/selftests/exec/load_address_4096] Error 1 Link: https://lkml.kernel.org/r/20210514092422.2367367-1-yangyingliang@huawei.com Fixes: 206e22f01941 ("tools/testing/selftests: add self-test for verifying load alignment") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc: Chris Kennelly <ckennelly@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiryVarad Gautam
do_mq_timedreceive calls wq_sleep with a stack local address. The sender (do_mq_timedsend) uses this address to later call pipelined_send. This leads to a very hard to trigger race where a do_mq_timedreceive call might return and leave do_mq_timedsend to rely on an invalid address, causing the following crash: RIP: 0010:wake_q_add_safe+0x13/0x60 Call Trace: __x64_sys_mq_timedsend+0x2a9/0x490 do_syscall_64+0x80/0x680 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f5928e40343 The race occurs as: 1. do_mq_timedreceive calls wq_sleep with the address of `struct ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it holds a valid `struct ext_wait_queue *` as long as the stack has not been overwritten. 2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call __pipelined_op. 3. Sender calls __pipelined_op::smp_store_release(&this->state, STATE_READY). Here is where the race window begins. (`this` is `ewq_addr`.) 4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it will see `state == STATE_READY` and break. 5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's stack. (Although the address may not get overwritten until another function happens to touch it, which means it can persist around for an indefinite time.) 6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a `struct ext_wait_queue *`, and uses it to find a task_struct to pass to the wake_q_add_safe call. In the lucky case where nothing has overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct. In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a bogus address as the receiver's task_struct causing the crash. do_mq_timedsend::__pipelined_op() should not dereference `this` after setting STATE_READY, as the receiver counterpart is now free to return. Change __pipelined_op to call wake_q_add_safe on the receiver's task_struct returned by get_task_struct, instead of dereferencing `this` which sits on the receiver's stack. As Manfred pointed out, the race potentially also exists in ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix those in the same way. Link: https://lkml.kernel.org/r/20210510102950.12551-1-varad.gautam@suse.com Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers") Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers") Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers") Signed-off-by: Varad Gautam <varad.gautam@suse.com> Reported-by: Matthias von Faber <matthias.vonfaber@aox-tech.de> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22Revert "mm/gup: check page posion status for coredump."Michal Hocko
While reviewing [1] I came across commit d3378e86d182 ("mm/gup: check page posion status for coredump.") and noticed that this patch is broken in two ways. First it doesn't really prevent hwpoison pages from being dumped because hwpoison pages can be marked asynchornously at any time after the check. Secondly, and more importantly, the patch introduces a ref count leak because get_dump_page takes a reference on the page which is not released. It also seems that the patch was merged incorrectly because there were follow up changes not included as well as discussions on how to address the underlying problem [2] Therefore revert the original patch. Link: http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com [1] Link: http://lkml.kernel.org/r/57ac524c-b49a-99ec-c1e4-ef5027bfb61b@redhat.com [2] Link: https://lkml.kernel.org/r/20210505135407.31590-1-mhocko@kernel.org Fixes: d3378e86d182 ("mm/gup: check page posion status for coredump.") Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Aili Yao <yaoaili@kingsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>