summaryrefslogtreecommitdiff
path: root/arch/arm64/include/asm/assembler.h
AgeCommit message (Collapse)Author
2018-01-30Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The main theme of this pull request is security covering variants 2 and 3 for arm64. I expect to send additional patches next week covering an improved firmware interface (requires firmware changes) for variant 2 and way for KPTI to be disabled on unaffected CPUs (Cavium's ThunderX doesn't work properly with KPTI enabled because of a hardware erratum). Summary: - Security mitigations: - variant 2: invalidate the branch predictor with a call to secure firmware - variant 3: implement KPTI for arm64 - 52-bit physical address support for arm64 (ARMv8.2) - arm64 support for RAS (firmware first only) and SDEI (software delegated exception interface; allows firmware to inject a RAS error into the OS) - perf support for the ARM DynamIQ Shared Unit PMU - CPUID and HWCAP bits updated for new floating point multiplication instructions in ARMv8.4 - remove some virtual memory layout printks during boot - fix initial page table creation to cope with larger than 32M kernel images when 16K pages are enabled" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits) arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm arm64: Turn on KPTI only on CPUs that need it arm64: Branch predictor hardening for Cavium ThunderX2 arm64: Run enable method for errata work arounds on late CPUs arm64: Move BP hardening to check_and_switch_context arm64: mm: ignore memory above supported physical address size arm64: kpti: Fix the interaction between ASID switching and software PAN KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA KVM: arm64: Handle RAS SErrors from EL2 on guest exit KVM: arm64: Handle RAS SErrors from EL1 on guest exit KVM: arm64: Save ESR_EL2 on guest SError KVM: arm64: Save/Restore guest DISR_EL1 KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. KVM: arm/arm64: mask/unmask daif around VHE guests arm64: kernel: Prepare for a DISR user arm64: Unconditionally enable IESB on exception entry/return for firmware-first arm64: kernel: Survive corrected RAS errors notified by SError arm64: cpufeature: Detect CPU RAS Extentions arm64: sysreg: Move to use definitions for all the SCTLR bits arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early ...
2018-01-16arm64: kernel: Prepare for a DISR userJames Morse
KVM would like to consume any pending SError (or RAS error) after guest exit. Today it has to unmask SError and use dsb+isb to synchronise the CPU. With the RAS extensions we can use ESB to synchronise any pending SError. Add the necessary macros to allow DISR to be read and converted to an ESR. We clear the DISR register when we enable the RAS cpufeature, and the kernel has not executed any ESB instructions. Any value we find in DISR must have belonged to firmware. Executing an ESB instruction is the only way to update DISR, so we can expect firmware to have handled any deferred SError. By the same logic we clear DISR in the idle path. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-15arm64: fix comment above tcr_compute_pa_sizeKristina Martsenko
The 'pos' argument is used to select where in TCR to write the value: the IPS or PS bitfield. Fixes: 787fd1d019b2 ("arm64: limit PA size to supported range") Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-13arm64: alternatives: use tpidr_el2 on VHE hostsJames Morse
Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 on VHE hosts, and allows future code to blindly access per-cpu variables without triggering world-switch. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-08arm64: Move post_ttbr_update_workaround to C codeMarc Zyngier
We will soon need to invoke a CPU-specific function pointer after changing page tables, so move post_ttbr_update_workaround out into C code to make this possible. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-22Merge branch 'for-next/52-bit-pa' into for-next/coreCatalin Marinas
* for-next/52-bit-pa: arm64: enable 52-bit physical address support arm64: allow ID map to be extended to 52 bits arm64: handle 52-bit physical addresses in page table entries arm64: don't open code page table entry creation arm64: head.S: handle 52-bit PAs in PTEs in early page table setup arm64: handle 52-bit addresses in TTBR arm64: limit PA size to supported range arm64: add kconfig symbol to configure physical address size
2017-12-22arm64: allow ID map to be extended to 52 bitsKristina Martsenko
Currently, when using VA_BITS < 48, if the ID map text happens to be placed in physical memory above VA_BITS, we increase the VA size (up to 48) and create a new table level, in order to map in the ID map text. This is okay because the system always supports 48 bits of VA. This patch extends the code such that if the system supports 52 bits of VA, and the ID map text is placed that high up, then we increase the VA size accordingly, up to 52. One difference from the current implementation is that so far the condition of VA_BITS < 48 has meant that the top level table is always "full", with the maximum number of entries, and an extra table level is always needed. Now, when VA_BITS = 48 (and using 64k pages), the top level table is not full, and we simply need to increase the number of entries in it, instead of creating a new table level. Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [catalin.marinas@arm.com: reduce arguments to __create_hyp_mappings()] [catalin.marinas@arm.com: reworked/renamed __cpu_uses_extended_idmap_level()] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-22arm64: handle 52-bit addresses in TTBRKristina Martsenko
The top 4 bits of a 52-bit physical address are positioned at bits 2..5 in the TTBR registers. Introduce a couple of macros to move the bits there, and change all TTBR writers to use them. Leave TTBR0 PAN code unchanged, to avoid complicating it. A system with 52-bit PA will have PAN anyway (because it's ARMv8.1 or later), and a system without 52-bit PA can only use up to 48-bit PAs. A later patch in this series will add a kconfig dependency to ensure PAN is configured. In addition, when using 52-bit PA there is a special alignment requirement on the top-level table. We don't currently have any VA_BITS configuration that would violate the requirement, but one could be added in the future, so add a compile-time BUG_ON to check for it. Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [catalin.marinas@arm.com: added TTBR_BADD_MASK_52 comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-22arm64: limit PA size to supported rangeKristina Martsenko
We currently copy the physical address size from ID_AA64MMFR0_EL1.PARange directly into TCR.(I)PS. This will not work for 4k and 16k granule kernels on systems that support 52-bit physical addresses, since 52-bit addresses are only permitted with the 64k granule. To fix this, fall back to 48 bits when configuring the PA size when the kernel does not support 52-bit PAs. When it does, fall back to 52, to avoid similar problems in the future if the PA size is ever increased above 52. Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [catalin.marinas@arm.com: tcr_set_pa_size macro renamed to tcr_compute_pa_size] [catalin.marinas@arm.com: comments added to tcr_compute_pa_size] [catalin.marinas@arm.com: definitions added for TCR_*PS_SHIFT] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-12arm64: Add software workaround for Falkor erratum 1041Shanker Donthineni
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter 4K and next 4K. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occur: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disabling its stage-1 translation if its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11arm64: mm: Rename post_ttbr0_update_workaroundWill Deacon
The post_ttbr0_update_workaround hook applies to any change to TTBRx_EL1. Since we're using TTBR1 for the ASID, rename the hook to make it clearer as to what it's doing. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11arm64: mm: Remove pre_ttbr0_update_workaround for Falkor erratum #E1003Will Deacon
The pre_ttbr0_update_workaround hook is called prior to context-switching TTBR0 because Falkor erratum E1003 can cause TLB allocation with the wrong ASID if both the ASID and the base address of the TTBR are updated at the same time. With the ASID sitting safely in TTBR1, we no longer update things atomically, so we can remove the pre_ttbr0_update_workaround macro as it's no longer required. The erratum infrastructure and documentation is left around for #E1003, as it will be required by the entry trampoline code in a future patch. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: entry.S: convert elX_irqJames Morse
Following our 'dai' order, irqs should be processed with debug and serror exceptions unmasked. Add a helper to unmask these two, (and fiq for good measure). Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: entry.S convert el0_syncJames Morse
el0_sync also unmasks exceptions on a case-by-case basis, debug exceptions are enabled, unless this was a debug exception. Irqs are unmasked for some exception types but not for others. el0_dbg should run with everything masked to prevent us taking a debug exception from do_debug_exception. For the other cases we can unmask everything. This changes the behaviour of fpsimd_{acc,exc} and el0_inv which previously ran with irqs masked. This patch removed the last user of enable_dbg_and_irq, remove it. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: entry.S: convert el1_syncJames Morse
el1_sync unmasks exceptions on a case-by-case basis, debug exceptions are unmasked, unless this was a debug exception. IRQs are unmasked for instruction and data aborts only if the interupted context had irqs unmasked. Following our 'dai' order, el1_dbg should run with everything masked. For the other cases we can inherit whatever we interrupted. Add a macro inherit_daif to set daif based on the interrupted pstate. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: entry.S: Remove disable_dbgJames Morse
enable_step_tsk is the only user of disable_dbg, which doesn't respect our 'dai' order for exception masking. enable_step_tsk may enable single-step, so previously needed to mask debug exceptions to prevent us from single-stepping kernel_exit. enable_step_tsk is called at the end of the ret_to_user loop, which has already masked all exceptions so this is no longer needed. Remove disable_dbg, add a comment that enable_step_tsk's caller should have masked debug. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: explicitly mask all exceptionsJames Morse
There are a few places where we want to mask all exceptions. Today we do this in a piecemeal fashion, typically we expect the caller to have masked irqs and the arch code masks debug exceptions, ignoring serror which is probably masked. Make it clear that 'mask all exceptions' is the intention by adding helpers to do exactly that. This will let us unmask SError without having to add 'oh and SError' to these paths. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-25arm64: Use existing defines for mdscrJulien Thierry
Literal values are being used to set single stepping in mdscr from assembly code. There are already existing defines representing those values, use those instead of the literal values. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-15Merge branch 'arm64/vmap-stack' of ↵Catalin Marinas
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core * 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux: arm64: add VMAP_STACK overflow detection arm64: add on_accessible_stack() arm64: add basic VMAP_STACK support arm64: use an irq stack pointer arm64: assembler: allow adr_this_cpu to use the stack pointer arm64: factor out entry stack manipulation efi/arm64: add EFI_KIMG_ALIGN arm64: move SEGMENT_ALIGN to <asm/memory.h> arm64: clean up irq stack definitions arm64: clean up THREAD_* definitions arm64: factor out PAGE_* and CONT_* definitions arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP fork: allow arch-override of VMAP stack alignment arm64: remove __die()'s stack dump
2017-08-15arm64: assembler: allow adr_this_cpu to use the stack pointerArd Biesheuvel
Given that adr_this_cpu already requires a temp register in addition to the destination register, tweak the instruction sequence so that sp may be used as well. This will simplify switching to per-cpu stacks in subsequent patches. While this limits the range of adr_this_cpu, to +/-4GiB, we don't currently use adr_this_cpu in modules, and this is not problematic for the main kernel image. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [Mark: add more commit text] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-08-09Merge branch 'arm64/exception-stack' of ↵Catalin Marinas
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core * 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux: arm64: unwind: remove sp from struct stackframe arm64: unwind: reference pt_regs via embedded stack frame arm64: unwind: disregard frame.sp when validating frame pointer arm64: unwind: avoid percpu indirection for irq stack arm64: move non-entry code out of .entry.text arm64: consistently use bl for C exception entry arm64: Add ASM_BUG()
2017-08-09arm64: Implement pmem API supportRobin Murphy
Add a clean-to-point-of-persistence cache maintenance helper, and wire up the basic architectural support for the pmem driver based on it. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c] [catalin.marinas@arm.com: change dmb(sy) to dmb(osh)] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-08arm64: move non-entry code out of .entry.textMark Rutland
Currently, cpu_switch_to and ret_from_fork both live in .entry.text, though neither form the critical path for an exception entry. In subsequent patches, we will require that code in .entry.text is part of the critical path for exception entry, for which we can assume certain properties (e.g. the presence of exception regs on the stack). Neither cpu_switch_to nor ret_from_fork will meet these requirements, so we must move them out of .entry.text. To ensure that neither are kprobed after being moved out of .entry.text, we must explicitly blacklist them, requiring a new NOKPROBE() asm helper. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-02-22Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - Errata workarounds for Qualcomm's Falkor CPU - Qualcomm L2 Cache PMU driver - Qualcomm SMCCC firmware quirk - Support for DEBUG_VIRTUAL - CPU feature detection for userspace via MRS emulation - Preliminary work for the Statistical Profiling Extension - Misc cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits) arm64/kprobes: consistently handle MRS/MSR with XZR arm64: cpufeature: correctly handle MRS to XZR arm64: traps: correctly handle MRS/MSR with XZR arm64: ptrace: add XZR-safe regs accessors arm64: include asm/assembler.h in entry-ftrace.S arm64: fix warning about swapper_pg_dir overflow arm64: Work around Falkor erratum 1003 arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2 arm64: arch_timer: document Hisilicon erratum 161010101 arm64: use is_vmalloc_addr arm64: use linux/sizes.h for constants arm64: uaccess: consistently check object sizes perf: add qcom l2 cache perf events driver arm64: remove wrong CONFIG_PROC_SYSCTL ifdef ARM: smccc: Update HVC comment to describe new quirk parameter arm64: do not trace atomic operations ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device() ACPI/IORT: Fix iort_node_get_id() mapping entries indexing arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA perf: xgene: Include module.h ...
2017-02-10arm64: Work around Falkor erratum 1003Christopher Covington
The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum is triggered, page table entries using the new translation table base address (BADDR) will be allocated into the TLB using the old ASID. All circumstances leading to the incorrect ASID being cached in the TLB arise when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory operation is in the process of performing a translation using the specific TTBRx_EL1 being written, and the memory operation uses a translation table descriptor designated as non-global. EL2 and EL3 code changing the EL1&0 ASID is not subject to this erratum because hardware is prohibited from performing translations from an out-of-context translation regime. Consider the following pseudo code. write new BADDR and ASID values to TTBRx_EL1 Replacing the above sequence with the one below will ensure that no TLB entries with an incorrect ASID are used by software. write reserved value to TTBRx_EL1[ASID] ISB write new value to TTBRx_EL1[BADDR] ISB write new value to TTBRx_EL1[ASID] ISB When the above sequence is used, page table entries using the new BADDR value may still be incorrectly allocated into the TLB using the reserved ASID. Yet this will not reduce functionality, since TLB entries incorrectly tagged with the reserved ASID will never be hit by a later instruction. Based on work by Shanker Donthineni <shankerd@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-12arm64: assembler: make adr_l work in modules under KASLRArd Biesheuvel
When CONFIG_RANDOMIZE_MODULE_REGION_FULL=y, the offset between loaded modules and the core kernel may exceed 4 GB, putting symbols exported by the core kernel out of the reach of the ordinary adrp/add instruction pairs used to generate relative symbol references. So make the adr_l macro emit a movz/movk sequence instead when executing in module context. While at it, remove the pointless special case for the stack pointer. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-21arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1Catalin Marinas
This patch adds the uaccess macros/functions to disable access to user space by setting TTBR0_EL1 to a reserved zeroed page. Since the value written to TTBR0_EL1 must be a physical address, for simplicity this patch introduces a reserved_ttbr0 page at a constant offset from swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value adjusted by the reserved_ttbr0 offset. Enabling access to user is done by restoring TTBR0_EL1 with the value from the struct thread_info ttbr0 variable. Interrupts must be disabled during the uaccess_ttbr0_enable code to ensure the atomicity of the thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the get_thread_info asm macro from entry.S to assembler.h for reuse in the uaccess_ttbr0_* macros. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-21arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macroCatalin Marinas
This patch takes the errata workaround code out of cpu_do_switch_mm into a dedicated post_ttbr0_update_workaround macro which will be reused in a subsequent patch. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11arm64: assembler: introduce ldr_this_cpuMark Rutland
Shortly we will want to load a percpu variable in the return from userspace path. We can save an instruction by folding the addition of the percpu offset into the load instruction, and this patch adds a new helper to do so. At the same time, we clean up this_cpu_ptr for consistency. As with {adr,ldr,str}_l, we change the template to take the destination register first, and name this dst. Secondly, we rename the macro to adr_this_cpu, following the scheme of adr_l, and matching the newly added ldr_this_cpu. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-09arm64: barriers: introduce nops and __nops macros for NOP sequencesWill Deacon
NOP sequences tend to get used for padding out alternative sections and uarch-specific pipeline flushes in errata workarounds. This patch adds macros for generating these sequences as both inline asm blocks, but also as strings suitable for embedding in other asm blocks directly. Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09arm64: Work around systems with mismatched cache line sizesSuzuki K Poulose
Systems with differing CPU i-cache/d-cache line sizes can cause problems with the cache management by software when the execution is migrated from one to another. Usually, the application reads the cache size on a CPU and then uses that length to perform cache operations. However, if it gets migrated to another CPU with a smaller cache line size, things could go completely wrong. To prevent such cases, always use the smallest cache line size among the CPUs. The kernel CPU feature infrastructure already keeps track of the safe value for all CPUID registers including CTR. This patch works around the problem by : For kernel, dynamically patch the kernel to read the cache size from the system wide copy of CTR_EL0. For applications, trap read accesses to CTR_EL0 (by clearing the SCTLR.UCT) and emulate the mrs instruction to return the system wide safe value of CTR_EL0. For faster access (i.e, avoiding to lookup the system wide value of CTR_EL0 via read_system_reg), we keep track of the pointer to table entry for CTR_EL0 in the CPU feature infrastructure. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09arm64: Introduce raw_{d,i}cache_line_sizeSuzuki K Poulose
On systems with mismatched i/d cache min line sizes, we need to use the smallest size possible across all CPUs. This will be done by fetching the system wide safe value from CPU feature infrastructure. However the some special users(e.g kexec, hibernate) would need the line size on the CPU (rather than the system wide), when either the system wide feature may not be accessible or it is guranteed that the caller executes with a gurantee of no migration. Provide another helper which will fetch cache line size on the current CPU. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: James Morse <james.morse@arm.com> Reviewed-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-01arm64: include alternative handling in dcache_by_line_opAndre Przywara
The newly introduced dcache_by_line_op macro is used at least in one occassion at the moment to issue a "dc cvau" instruction, which is affected by ARM errata 819472, 826319, 827319 and 824069. Change the macro to allow for alternative patching in there to protect affected Cortex-A53 cores. Signed-off-by: Andre Przywara <andre.przywara@arm.com> [catalin.marinas@arm.com: indentation fixups] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-04-28arm64: Add new asm macro copy_pageGeoff Levand
Kexec and hibernate need to copy pages of memory, but may not have all of the kernel mapped, and are unable to call copy_page(). Add a simplistic copy_page() macro, that can be inlined in these situations. lib/copy_page.S provides a bigger better version, but uses more registers. Signed-off-by: Geoff Levand <geoff@infradead.org> [Changed asm label to 9998, added commit message] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: Fold proc-macros.S into assembler.hGeoff Levand
To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to be used outside the mm code move the contents of proc-macros.S to asm/assembler.h. Also, delete proc-macros.S, and fix up all references to proc-macros.S. Signed-off-by: Geoff Levand <geoff@infradead.org> Acked-by: Pavel Machek <pavel@ucw.cz> [rebased, included dcache_by_line_op] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-26arm64: introduce mov_q macro to move a constant into a 64-bit registerArd Biesheuvel
Implement a macro mov_q that can be used to move an immediate constant into a 64-bit register, using between 2 and 4 movz/movk instructions (depending on the operand) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-20arm64: asm: remove unused push/pop macrosMark Rutland
We haven't used the push/pop macros for a while now, as it's typically better to use immediate offsets for batches of accesses to the stack, as we now do in the entry assembly for the kernel and hyp code. Remove the unused macros. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-24arm64: switch to relative exception tablesArd Biesheuvel
Instead of using absolute addresses for both the exception location and the fixup, use offsets relative to the exception table entry values. Not only does this cut the size of the exception table in half, it is also a prerequisite for KASLR, since absolute exception table entries are subject to dynamic relocation, which is incompatible with the sorting of the exception table that occurs at build time. This patch also introduces the _ASM_EXTABLE preprocessor macro (which exists on x86 as well) and its _asm_extable assembly counterpart, as shorthands to emit exception table entries. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24arm64: avoid R_AARCH64_ABS64 relocations for Image header fieldsArd Biesheuvel
Unfortunately, the current way of using the linker to emit build time constants into the Image header will no longer work once we switch to the use of PIE executables. The reason is that such constants are emitted into the binary using R_AARCH64_ABS64 relocations, which are resolved at runtime, not at build time, and the places targeted by those relocations will contain zeroes before that. So refactor the endian swapping linker script constant generation code so that it emits the upper and lower 32-bit words separately. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-12-10arm64: Add this_cpu_ptr() assembler macro for use in entry.SJames Morse
irq_stack is a per_cpu variable, that needs to be access from entry.S. Use an assembler macro instead of the unreadable details. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-12arm64: use ENDPIPROC() to annotate position independent assembler routinesArd Biesheuvel
For more control over which functions are called with the MMU off or with the UEFI 1:1 mapping active, annotate some assembler routines as position independent. This is done by introducing ENDPIPROC(), which replaces the ENDPROC() declaration of those routines. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-27arm64: force CONFIG_SMP=y and remove redundant #ifdefsWill Deacon
Nobody seems to be producing !SMP systems anymore, so this is just becoming a source of kernel bugs, particularly if people want to use coherent DMA with non-shared pages. This patch forces CONFIG_SMP=y for arm64, removing a modest amount of code in the process. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27arm64: Remove unused macros from assembler.hDaniel Thompson
Commit 68234df4ea79 ("arm64: kill flush_cache_all()") removed the only users of these macros. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-03-19arm64: add macros for common adrp usagesArd Biesheuvel
The adrp instruction is mostly used in combination with either an add, a ldr or a str instruction with the low bits of the referenced symbol in the 12-bit immediate of the followup instruction. Introduce the macros adr_l, ldr_l and str_l that encapsulate these common patterns. Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-02-23arm64: guard asm/assembler.h against multiple inclusionsMarc Zyngier
asm/assembler.h lacks the usual guard against multiple inclusion, leading to a compilation failure if it is accidentally included twice. Using the classic #ifndef/#define/#endif construct solves the issue. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-05-12arm64: debug: avoid accessing mdscr_el1 on fault paths where possibleWill Deacon
Since mdscr_el1 is part of the debug register group, it is highly likely to be trapped by a hypervisor to prevent virtual machines from debugging (buggering?) each other. Unfortunately, this absolutely destroys our performance, since we access the register on many of our low-level fault handling paths to keep track of the various debug state machines. This patch removes our dependency on mdscr_el1 in the case that debugging is not being used. More specifically we: - Use TIF_SINGLESTEP to indicate that a task is stepping at EL0 and avoid disabling step in the MDSCR when we don't need to. MDSCR_EL1.SS handling is moved to kernel_entry, when trapping from userspace. - Ensure debug exceptions are re-enabled on *all* exception entry paths, even the debug exception handling path (where we re-enable exceptions after invoking the handler). Since we can now rely on MDSCR_EL1.SS being cleared by the entry code, exception handlers can usually enable debug immediately before enabling interrupts. - Remove all debug exception unmasking from ret_to_user and el1_preempt, since we will never get here with debug exceptions masked. This results in a slight change to kernel debug behaviour, where we now step into interrupt handlers and data aborts from EL1 when debugging the kernel, which is actually a useful thing to do. A side-effect of this is that it *does* potentially prevent stepping off {break,watch}points when there is a high-frequency interrupt source (e.g. a timer), so a debugger would need to use either breakpoints or manually disable interrupts to get around this issue. With this patch applied, guest performance is restored under KVM when debug register accesses are trapped (and we get a measurable performance increase on the host on Cortex-A57 too). Cc: Ian Campbell <ian.campbell@citrix.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-10-25arm64: asm: add CPU_LE & CPU_BE assembler helpersMatthew Leach
Add CPU_LE and CPU_BE to select assembler code in little and big endian configurations respectively. Signed-off-by: Matthew Leach <matthew.leach@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-10-25arm64: compat: correct register concatenation for syscall wrappersMatthew Leach
The arm64 port contains wrappers for arm32 syscalls that pass 64-bit values. These wrappers concatenate the two registers to hold a 64-bit value in a single X register. On BE, however, the lower and higher words are swapped. Create a new assembler macro, regs_to_64, that when on BE systems swaps the registers in the orr instruction. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Matthew Leach <matthew.leach@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-05-17arm64: debug: fix mdscr.ss check when enabling debug exceptionsWill Deacon
When we take an exception at EL1, we only want to enable debug exceptions if we're not currently stepping, otherwise we can easily get stuck in a loop stepping into interrupt handlers. Unfortunately, the current code tests the wrong bit in the mdscr, so fix that. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2012-12-05arm64: move vector entry macro to assembler.hMarc Zyngier
This macro is also useful to other bits defining vectors (hypervisor stub, KVM...). Move it to a common location. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>