path: root/arch/x86/lib
Age    Commit message    Author
2021-02-23  Merge tag 'objtool-core-2021-02-23' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Thomas Gleixner: - Make objtool work for big-endian cross compiles - Make stack tracking via stack pointer memory operations match push/pop semantics to prepare for architectures w/o PUSH/POP instructions. - Add support for analyzing alternatives - Improve retpoline detection and handling - Improve assembly code coverage on x86 - Provide support for inlined stack switching * tag 'objtool-core-2021-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) objtool: Support stack-swizzle objtool,x86: Additionally decode: mov %rsp, (%reg) x86/unwind/orc: Change REG_SP_INDIRECT x86/power: Support objtool validation in hibernate_asm_64.S x86/power: Move restore_registers() to top of the file x86/power: Annotate indirect branches as safe x86/acpi: Support objtool validation in wakeup_64.S x86/acpi: Annotate indirect branch as safe x86/ftrace: Support objtool vmlinux.o validation in ftrace_64.S x86/xen/pvh: Annotate indirect branch as safe x86/xen: Support objtool vmlinux.o validation in xen-head.S x86/xen: Support objtool validation in xen-asm.S objtool: Add xen_start_kernel() to noreturn list objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC objtool: Add asm version of STACK_FRAME_NON_STANDARD objtool: Assume only ELF functions do sibling calls x86/ftrace: Add UNWIND_HINT_FUNC annotation for ftrace_stub objtool: Support retpoline jump detection for vmlinux.o objtool: Fix ".cold" section suffix check for newer versions of GCC objtool: Fix retpoline detection in asm code ...
2021-01-26  objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC  (Josh Poimboeuf)
The ORC metadata generated for UNWIND_HINT_FUNC isn't actually very func-like. With certain usages it can cause stack state mismatches because it doesn't set the return address (CFI_RA). Also, users of UNWIND_HINT_RET_OFFSET no longer need to set a custom return stack offset. Instead they just need to specify a func-like situation, so the current ret_offset code is hacky for no good reason. Solve both problems by simplifying the RET_OFFSET handling and converting it into a more useful UNWIND_HINT_FUNC. If we end up needing the old 'ret_offset' functionality again in the future, we should be able to support it pretty easily with the addition of a custom 'sp_offset' in UNWIND_HINT_FUNC. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/db9d1f5d79dddfbb3725ef6d8ec3477ad199948d.1611263462.git.jpoimboe@redhat.com
2021-01-21  x86/mmx: Use KFPU_387 for MMX string operations  (Andy Lutomirski)
The default kernel_fpu_begin() doesn't work on systems that support XMM but haven't yet enabled CR4.OSFXSR. This causes crashes when _mmx_memcpy() is called too early because LDMXCSR generates #UD when the aforementioned bit is clear. Fix it by using kernel_fpu_begin_mask(KFPU_387) explicitly. Fixes: 7ad816762f9b ("x86/fpu: Reset MXCSR to default in kernel_fpu_begin()") Reported-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Krzysztof Piotr Olędzki <ole@ans.pl> Tested-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/e7bf21855fe99e5f3baa27446e32623358f69e8d.1611205691.git.luto@kernel.org
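For illustration, a minimal sketch of the pattern this fix establishes, using the kernel_fpu_begin_mask()/KFPU_387 API added alongside it; the real user is _mmx_memcpy() in arch/x86/lib/mmx_32.c and the helper name below is made up:

  /* Enter kernel FPU context with only FNINIT, so that no LDMXCSR is
   * executed before CR4.OSFXSR has been enabled during early boot. */
  static void mmx_copy_sketch(void *to, const void *from, size_t len)
  {
          kernel_fpu_begin_mask(KFPU_387);
          /* ... MMX movq-based copy loop would go here ... */
          kernel_fpu_end();
  }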
2021-01-13  x86/insn: Fix vector instruction decoding on big endian cross-compiles  (Vasily Gorbik)
Running the instruction decoder posttest on an s390 host with an x86 target and allyesconfig shows errors: instructions used in a couple of kernel objects could not be correctly decoded on a big endian system.

  insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 5
  insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
  insn_decoder_test: warning: ffffffff831eb4e1: 62 d1 fd 48 7f 04 24 vmovdqa64 %zmm0,(%r12)
  insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6
  insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
  insn_decoder_test: warning: ffffffff831eb4e8: 62 51 fd 48 7f 44 24 01 vmovdqa64 %zmm8,0x40(%r12)
  insn_decoder_test: warning: objdump says 8 bytes, but insn_get_length() says 6

This happens because in a few places instruction field bytes are set directly, with further usage of "value". To address that, introduce and use an insn_set_byte() helper which correctly updates "value" on big endian systems.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
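To make the intent of the helper concrete, here is a purely conceptual sketch (hypothetical names with a _sketch suffix; the real helpers live in arch/x86/include/asm/insn.h and differ in detail): the setter writes the raw byte and keeps the cached little-endian "value" consistent regardless of host endianness.

  struct insn_field_sketch {
          unsigned int value;        /* field interpreted as little-endian */
          unsigned char bytes[4];    /* raw instruction bytes */
  };

  /* Set one raw byte and rebuild "value" so big endian hosts agree. */
  static void insn_set_byte_sketch(struct insn_field_sketch *p,
                                   unsigned char n, unsigned char v)
  {
          p->bytes[n] = v;
          p->value = p->bytes[0] |
                     ((unsigned int)p->bytes[1] << 8) |
                     ((unsigned int)p->bytes[2] << 16) |
                     ((unsigned int)p->bytes[3] << 24);
  }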
2021-01-13  x86/insn: Support big endian cross-compiles  (Martin Schwidefsky)
The x86 instruction decoder code is shared across the kernel source and the tools. Currently objtool seems to be the only build tool that needs it, which breaks x86 cross-compilation on big endian systems. Make the x86 instruction decoder build host-endianness agnostic to support x86 cross-compilation, and enable objtool to implement endianness awareness for big endian architecture support.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-12-14  Merge tag 'sched-core-2020-12-14' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place * tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) sched/fair: Trivial correction of the newidle_balance() comment sched/fair: Clear SMT siblings after determining the core is not idle sched: Fix kernel-doc markup x86: Print ratio freq_max/freq_base used in frequency invariance calculations x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC x86, sched: Calculate frequency invariance for AMD systems irq_work: Optimize irq_work_single() smp: Cleanup smp_call_function*() irq_work: Cleanup sched: Limit the amount of NUMA imbalance that can exist at fork time sched/numa: Allow a floating imbalance between NUMA nodes sched: Avoid unnecessary calculation of load imbalance at clone time sched/numa: Rename nr_running and break out the magic number sched: Make migrate_disable/enable() independent of RT sched/topology: Condition EAS enablement on FIE support arm64: Rebuild sched domains on invariance status changes sched/topology,schedutil: Wrap sched domains rebuild sched/uclamp: Allow to reset a task uclamp constraint value sched/core: Fix typos in comments Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug ...
2020-12-14  Merge tag 'x86_cleanups_for_v5.11' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "Another branch with a nicely negative diffstat, just the way I like 'em: - Remove all uses of TIF_IA32 and TIF_X32 and reclaim the two bits in the end (Gabriel Krisman Bertazi) - All kinds of minor cleanups all over the tree" * tag 'x86_cleanups_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/ia32_signal: Propagate __user annotation properly x86/alternative: Update text_poke_bp() kernel-doc comment x86/PCI: Make a kernel-doc comment a normal one x86/asm: Drop unused RDPID macro x86/boot/compressed/64: Use TEST %reg,%reg instead of CMP $0,%reg x86/head64: Remove duplicate include x86/mm: Declare 'start' variable where it is used x86/head/64: Remove unused GET_CR2_INTO() macro x86/boot: Remove unused finalize_identity_maps() x86/uaccess: Document copy_from_user_nmi() x86/dumpstack: Make show_trace_log_lvl() static x86/mtrr: Fix a kernel-doc markup x86/setup: Remove unused MCA variables x86, libnvdimm/test: Remove COPY_MC_TEST x86: Reclaim TIF_IA32 and TIF_X32 x86/mm: Convert mmu context ia32_compat into a proper flags field x86/elf: Use e_machine to check for x32/ia32 in setup_additional_pages() elf: Expose ELF header on arch_setup_additional_pages() x86/elf: Use e_machine to select start_thread for x32 elf: Expose ELF header in compat_start_thread() ...
2020-12-06  x86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytes  (Masami Hiramatsu)
Since insn.prefixes.nbytes can be bigger than the size of insn.prefixes.bytes[] when a prefix is repeated, the proper check must be insn.prefixes.bytes[i] != 0 and i < 4 instead of using insn.prefixes.nbytes. Use the new for_each_insn_prefix() macro which does it correctly. Debugged by Kees Cook <keescook@chromium.org>. [ bp: Massage commit message. ] Fixes: 32d0b95300db ("x86/insn-eval: Add utility functions to get segment selector") Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/160697104969.3146288.16329307586428270032.stgit@devnote2
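The macro's shape is roughly the following sketch (the authoritative definition is in arch/x86/include/asm/insn.h; the example helper below is illustrative only). The point is that iteration stops at the first unused (zero) prefix slot instead of trusting nbytes.

  #define for_each_insn_prefix(insn, idx, prefix) \
          for (idx = 0; idx < ARRAY_SIZE(insn->prefixes.bytes) && \
               (prefix = insn->prefixes.bytes[idx]) != 0; idx++)

  /* Example use: check for an operand-size override prefix. */
  static bool has_operand_size_override(struct insn *insn)
  {
          insn_byte_t p;
          int i;

          for_each_insn_prefix(insn, i, p) {
                  if (p == 0x66)
                          return true;
          }
          return false;
  }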
2020-11-27  Merge branch 'linus' into sched/core, to resolve semantic conflict  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-24  smp: Cleanup smp_call_function*()  (Peter Zijlstra)
Get rid of the __call_single_node union and cleanup the API a little to avoid external code relying on the structure layout as much. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2020-11-18  x86/uaccess: Document copy_from_user_nmi()  (Thomas Gleixner)
Document the functionality of copy_from_user_nmi() to avoid further confusion. Fix the typo in the existing comment while at it. Requested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201117202753.806376613@linutronix.de
2020-11-04  x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S  (Fangrui Song)
Commit 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") added .weak directives to arch/x86/lib/mem*_64.S instead of changing the existing ENTRY macros to WEAK. This can lead to the assembly snippet .weak memcpy ... .globl memcpy which will produce a STB_WEAK memcpy with GNU as but STB_GLOBAL memcpy with LLVM's integrated assembler before LLVM 12. LLVM 12 (since https://reviews.llvm.org/D90108) will error on such an overridden symbol binding. Commit ef1e03152cb0 ("x86/asm: Make some functions local") changed ENTRY in arch/x86/lib/memcpy_64.S to SYM_FUNC_START_LOCAL, which was ineffective due to the preceding .weak directive. Use the appropriate SYM_FUNC_START_WEAK instead. Fixes: 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") Fixes: ef1e03152cb0 ("x86/asm: Make some functions local") Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201103012358.168682-1-maskray@google.com
2020-10-26  x86, libnvdimm/test: Remove COPY_MC_TEST  (Dan Williams)
The COPY_MC_TEST facility has served its purpose for validating the early termination conditions of the copy_mc_fragile() implementation. Remove it and the EXPORT_SYMBOL_GPL of copy_mc_fragile(). Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/160316688322.3374697.8648308115165836243.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-22  Merge branch 'work.set_fs' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull initial set_fs() removal from Al Viro: "Christoph's set_fs base series + fixups" * 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Allow a NULL pos pointer to __kernel_read fs: Allow a NULL pos pointer to __kernel_write powerpc: remove address space overrides using set_fs() powerpc: use non-set_fs based maccess routines x86: remove address space overrides using set_fs() x86: make TASK_SIZE_MAX usable from assembly code x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h lkdtm: remove set_fs-based tests test_bitmap: remove user bitmap tests uaccess: add infrastructure for kernel builds with set_fs() fs: don't allow splice read/write without explicit ops fs: don't allow kernel reads and writes without iter ops sysctl: Convert to iter interfaces proc: add a read_iter method to proc proc_ops proc: cleanup the compat vs no compat file ops proc: remove a level of indentation in proc_get_inode
2020-10-14  Merge tag 'x86_seves_for_v5.10' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV-ES support from Borislav Petkov: "SEV-ES enhances the current guest memory encryption support called SEV by also encrypting the guest register state, making the registers inaccessible to the hypervisor by en-/decrypting them on world switches. Thus, it adds additional protection to Linux guests against exfiltration, control flow and rollback attacks. With SEV-ES, the guest is in full control of what registers the hypervisor can access. This is provided by a guest-host exchange mechanism based on a new exception vector called VMM Communication Exception (#VC), a new instruction called VMGEXIT and a shared Guest-Host Communication Block which is a decrypted page shared between the guest and the hypervisor. Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest so in order for that exception mechanism to work, the early x86 init code needed to be made able to handle exceptions, which, in itself, brings a bunch of very nice cleanups and improvements to the early boot code like an early page fault handler, allowing for on-demand building of the identity mapping. With that, !KASLR configurations do not use the EFI page table anymore but switch to a kernel-controlled one. The main part of this series adds the support for that new exchange mechanism. The goal has been to keep this as much as possibly separate from the core x86 code by concentrating the machinery in two SEV-ES-specific files: arch/x86/kernel/sev-es-shared.c arch/x86/kernel/sev-es.c Other interaction with core x86 code has been kept at minimum and behind static keys to minimize the performance impact on !SEV-ES setups. Work by Joerg Roedel and Thomas Lendacky and others" * tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits) x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer x86/sev-es: Check required CPU features for SEV-ES x86/efi: Add GHCB mappings when SEV-ES is active x86/sev-es: Handle NMI State x86/sev-es: Support CPU offline/online x86/head/64: Don't call verify_cpu() on starting APs x86/smpboot: Load TSS and getcpu GDT entry before loading IDT x86/realmode: Setup AP jump table x86/realmode: Add SEV-ES specific trampoline entry point x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES x86/sev-es: Handle #DB Events x86/sev-es: Handle #AC Events x86/sev-es: Handle VMMCALL Events x86/sev-es: Handle MWAIT/MWAITX Events x86/sev-es: Handle MONITOR/MONITORX Events x86/sev-es: Handle INVD Events x86/sev-es: Handle RDPMC Events x86/sev-es: Handle RDTSC(P) Events ...
2020-10-12  x86: Make __put_user() generate an out-of-line call  (Linus Torvalds)
Instead of inlining the stac/mov/clac sequence (which also requires individual exception table entries and several asm instruction alternatives entries), just generate "call __put_user_nocheck_X" for the __put_user() cases, the same way we changed __get_user earlier. Unlike the get_user() case, we didn't have the same nice infrastructure to just generate the call with a single case, so this actually has to change some of the infrastructure in order to do this. But that only cleans up the code further. So now, instead of using a case statement for the sizes, we just do the same thing we've done on the get_user() side for a long time: use the size as an immediate constant to the asm, and generate the asm that way directly. In order to handle the special case of 64-bit data on a 32-bit kernel, I needed to change the calling convention slightly: the data is passed in %eax[:%edx], the pointer in %ecx, and the return value is also returned in %ecx. It used to be returned in %eax, but because of how %eax can now be a double register input, we don't want mix that with a single-register output. The actual low-level asm is easier to handle: we'll just share the code between the checking and non-checking case, with the non-checking case jumping into the middle of the function. That may sound a bit too special, but this code is all very very special anyway, so... Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-12  x86: Make __get_user() generate an out-of-line call  (Linus Torvalds)
Instead of inlining the whole stac/lfence/mov/clac sequence (which also requires individual exception table entries and several asm instruction alternatives entries), just generate "call __get_user_nocheck_X" for the __get_user() cases. We can use all the same infrastructure that we already do for the regular "get_user()", and the end result is simpler source code, and much simpler code generation. It also means that when I introduce asm goto with input for "unsafe_get_user()", there are no nasty interactions with the __get_user() code. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-12  Merge branch 'work.csum_and_copy' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull copy_and_csum cleanups from Al Viro: "Saner calling conventions for csum_and_copy_..._user() and friends" [ Removing 800+ lines of code and cleaning stuff up is good - Linus ] * 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ppc: propagate the calling conventions change down to csum_partial_copy_generic() amd64: switch csum_partial_copy_generic() to new calling conventions sparc64: propagate the calling convention changes down to __csum_partial_copy_...() xtensa: propagate the calling conventions change down into csum_partial_copy_generic() mips: propagate the calling convention change down into __csum_partial_copy_..._user() mips: __csum_partial_copy_kernel() has no users left mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() i386: propagate the calling conventions change down to csum_partial_copy_generic() sh: propage the calling conventions change down to csum_partial_copy_generic() m68k: get rid of zeroing destination on error in csum_and_copy_from_user() arm: propagate the calling convention changes down to csum_partial_copy_from_user() alpha: propagate the calling convention changes down to csum_partial_copy.c helpers saner calling conventions for csum_and_copy_..._user() csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum csum_partial_copy_nocheck(): drop the last argument unify generic instances of csum_partial_copy_nocheck() icmp_push_reply(): reorder adding the checksum up skb_copy_and_csum_bits(): don't bother with the last argument
2020-10-12  Merge tag 'ras_updates_for_v5.10' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Extend the recovery from MCE in kernel space also to processes which encounter an MCE in kernel space but while copying from user memory by sending them a SIGBUS on return to user space and umapping the faulty memory, by Tony Luck and Youquan Song. - memcpy_mcsafe() rework by splitting the functionality into copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables support for new hardware which can recover from a machine check encountered during a fast string copy and makes that the default and lets the older hardware which does not support that advance recovery, opt in to use the old, fragile, slow variant, by Dan Williams. - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta. - Do not use MSR-tracing accessors in #MC context and flag any fault while accessing MCA architectural MSRs as an architectural violation with the hope that such hw/fw misdesigns are caught early during the hw eval phase and they don't make it into production. - Misc fixes, improvements and cleanups, as always. * tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Allow for copy_mc_fragile symbol checksum to be generated x86/mce: Decode a kernel instruction to determine if it is copying from user x86/mce: Recover from poison found while copying from user space x86/mce: Avoid tail copy when machine check terminated a copy from user x86/mce: Add _ASM_EXTABLE_CPY for copy user access x86/mce: Provide method to find out the type of an exception handler x86/mce: Pass pointer to saved pt_regs to severity calculation routines x86/copy_mc: Introduce copy_mc_enhanced_fast_string() x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list x86/mce: Add Skylake quirk for patrol scrub reported errors RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE() x86/mce: Annotate mce_rd/wrmsrl() with noinstr x86/mce/dev-mcelog: Do not update kflags on AMD systems x86/mce: Stop mce_reign() from re-computing severity for every CPU x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR x86/mce: Increase maximum number of banks to 64 x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap RAS/CEC: Fix cec_init() prototype
2020-10-07  x86/mce: Avoid tail copy when machine check terminated a copy from user  (Tony Luck)
In the page fault case it is ok to see if a few more unaligned bytes can be copied from the source address. Worst case is that the page fault will be triggered again. Machine checks are more serious. Just give up at the point where the main copy loop triggered the #MC and return from the copy code as if the copy succeeded. The machine check handler will use task_work_add() to make sure that the task is sent a SIGBUS. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-5-tony.luck@intel.com
2020-10-07  x86/mce: Add _ASM_EXTABLE_CPY for copy user access  (Youquan Song)
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup for all exception spots between kernel and user space access. To enable recovery from machine checks while copying data from user addresses, it is necessary to be able to distinguish the places that loop while copying data from those that copy a single byte/word/etc.

Add a new macro _ASM_EXTABLE_CPY and use it in place of _ASM_EXTABLE_UA in the copy functions.

The new fixup routine ex_handler_copy() is almost an exact copy of ex_handler_uaccess(); the difference is that it records the exception reason by setting regs->ax to the trap number, which the machine check code uses to tell that an MCE was triggered. Following patches use this to avoid trying to copy the remaining bytes from the tail of the copy and possibly hitting the poison again.

The new mce.kflags bit MCE_IN_KERNEL_COPYIN will be used by the mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space.

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201006210910.21062-4-tony.luck@intel.com
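A hedged sketch of what such a copy-specific fixup can look like, based only on the description above (simplified signature and _sketch name; the real routine lives in arch/x86/mm/extable.c and takes more arguments):

  /* On a fault in a _ASM_EXTABLE_CPY-annotated copy: jump to the fixup and
   * report which trap happened via regs->ax so callers can tell #MC from #PF. */
  static bool ex_handler_copy_sketch(const struct exception_table_entry *fixup,
                                     struct pt_regs *regs, int trapnr)
  {
          regs->ip = ex_fixup_addr(fixup);
          regs->ax = trapnr;
          return true;
  }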
2020-10-06  x86/copy_mc: Introduce copy_mc_enhanced_fast_string()  (Dan Williams)
The motivations to go rework memcpy_mcsafe() are that the benefit of doing slow and careful copies is obviated on newer CPUs, and that the current opt-in list of CPUs to instrument recovery is broken relative to those CPUs. There is no need to keep an opt-in list up to date on an ongoing basis if pmem/dax operations are instrumented for recovery by default. With recovery enabled by default the old "mcsafe_key" opt-in to careful copying can be made a "fragile" opt-out. Where the "fragile" list takes steps to not consume poison across cachelines. The discussion with Linus made clear that the current "_mcsafe" suffix was imprecise to a fault. The operations that are needed by pmem/dax are to copy from a source address that might throw #MC to a destination that may write-fault, if it is a user page. So copy_to_user_mcsafe() becomes copy_mc_to_user() to indicate the separate precautions taken on source and destination. copy_mc_to_kernel() is introduced as a non-SMAP version that does not expect write-faults on the destination, but is still prepared to abort with an error code upon taking #MC. The original copy_mc_fragile() implementation had negative performance implications since it did not use the fast-string instruction sequence to perform copies. For this reason copy_mc_to_kernel() fell back to plain memcpy() to preserve performance on platforms that did not indicate the capability to recover from machine check exceptions. However, that capability detection was not architectural and now that some platforms can recover from fast-string consumption of memory errors the memcpy() fallback now causes these more capable platforms to fail. Introduce copy_mc_enhanced_fast_string() as the fast default implementation of copy_mc_to_kernel() and finalize the transition of copy_mc_fragile() to be a platform quirk to indicate 'copy-carefully'. With this in place, copy_mc_to_kernel() is fast and recovery-ready by default regardless of hardware capability. Thanks to Vivek for identifying that copy_user_generic() is not suitable as the copy_mc_to_user() backend since the #MC handler explicitly checks ex_has_fault_handler(). Thanks to the 0day robot for catching a performance bug in the x86/copy_mc_to_user implementation. [ bp: Add the "why" for this change from the 0/2th message, massage. ] Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()") Reported-by: Erwin Tsaur <erwin.tsaur@intel.com> Reported-by: 0day robot <lkp@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Tested-by: Erwin Tsaur <erwin.tsaur@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/160195562556.2163339.18063423034951948973.stgit@dwillia2-desk3.amr.corp.intel.com
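A sketch of the resulting dispatch, modelled on the description above (the selector lives in arch/x86/lib/copy_mc.c and the fast-string/fragile backends are assembly, so treat names and details as approximate):

  /* Pick the careful "fragile" copy on quirky platforms, the recoverable
   * fast-string copy on ERMS-capable CPUs, and plain memcpy() otherwise.
   * Returns the number of bytes left uncopied (0 on full success). */
  static unsigned long copy_mc_to_kernel_sketch(void *dst, const void *src,
                                                unsigned int len)
  {
          if (copy_mc_fragile_enabled)
                  return copy_mc_fragile(dst, src, len);
          if (static_cpu_has(X86_FEATURE_ERMS))
                  return copy_mc_enhanced_fast_string(dst, src, len);
          memcpy(dst, src, len);
          return 0;
  }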
2020-10-06  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()  (Dan Williams)
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-09-26  arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback  (Mikulas Patocka)
If we copy less than 8 bytes and if the destination crosses a cache line, __copy_user_flushcache would invalidate only the first cache line. This patch makes it invalidate the second cache line as well. Fixes: 0aed55af88345b ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dan Williams <dan.j.wiilliams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2009161451140.21915@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
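For context, a hedged sketch of the fixed tail handling (paraphrased from the change; the real code in arch/x86/lib/usercopy_64.c may differ in detail). Flushing by the actual size means a destination that straddles a cache line boundary gets both lines written back:

  /* Tail of a short (< 8 byte) copy. */
  if (size < 8) {
          if (!IS_ALIGNED(dest, 4) || size != 4)
                  clean_cache_range(dst, size);  /* previously a 1-byte flush,
                                                    i.e. only the first line */
  }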
2020-09-08  x86: remove address space overrides using set_fs()  (Christoph Hellwig)
Stop providing the possibility to override the address space using set_fs() now that there is no need for that any more. To properly handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on x86 a new alternative is introduced, which just like the one in entry_64.S has to use the hardcoded virtual address bits to escape the fact that TASK_SIZE_MAX isn't actually a constant when 5-level page tables are enabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-07  x86/insn: Add insn_has_rep_prefix() helper  (Joerg Roedel)
Add a function to check whether an instruction has a REP prefix. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200907131613.12703-12-joro@8bytes.org
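A sketch of what such a helper looks like (the in-tree version is in arch/x86/lib/insn-eval.c and may differ); it reuses the prefix iteration discussed in the 2020-12-06 entry above:

  /* Return true if the decoded instruction carries a REP/REPNE prefix. */
  static bool insn_has_rep_prefix_sketch(struct insn *insn)
  {
          insn_byte_t p;
          int i;

          insn_get_prefixes(insn);            /* make sure prefixes are decoded */

          for_each_insn_prefix(insn, i, p) {
                  if (p == 0xf2 || p == 0xf3) /* REPNE / REP */
                          return true;
          }
          return false;
  }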
2020-09-07  x86/insn: Add insn_get_modrm_reg_off()  (Joerg Roedel)
Add a function to the instruction decoder which returns the pt_regs offset of the register specified in the reg field of the modrm byte. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200907131613.12703-11-joro@8bytes.org
2020-09-07  x86/umip: Factor out instruction decoding  (Joerg Roedel)
Factor out the code used to decode an instruction with the correct address and operand sizes to a helper function. No functional changes. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-10-joro@8bytes.org
2020-09-07  x86/umip: Factor out instruction fetch  (Joerg Roedel)
Factor out the code to fetch the instruction from user-space to a helper function. No functional changes. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-9-joro@8bytes.org
2020-09-03  x86/cmdline: Disable jump tables for cmdline.c  (Arvind Sankar)
When CONFIG_RETPOLINE is disabled, Clang uses a jump table for the switch statement in cmdline_find_option (jump tables are disabled when CONFIG_RETPOLINE is enabled). This function is called very early in boot from sme_enable() if CONFIG_AMD_MEM_ENCRYPT is enabled. At this time, the kernel is still executing out of the identity mapping, but the jump table will contain virtual addresses. Fix this by disabling jump tables for cmdline.c when AMD_MEM_ENCRYPT is enabled. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200903023056.3914690-1-nivedita@alum.mit.edu
2020-08-23  treewide: Use fallthrough pseudo-keyword  (Gustavo A. R. Silva)
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
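A minimal before/after example of the conversion (hypothetical function, illustrative only):

  /* Old style: the fall through was documented with a comment that tools
   * and compilers could not reliably check. New style: the fallthrough
   * pseudo-keyword, which expands to a compiler attribute. */
  static int classify_sketch(int reg, int count)
  {
          int total = 0;

          switch (reg) {
          case 0:
                  count++;
                  fallthrough;    /* replaces the old "fall through" comment */
          case 1:
                  total += count;
                  break;
          default:
                  break;
          }
          return total;
  }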
2020-08-20  amd64: switch csum_partial_copy_generic() to new calling conventions  (Al Viro)
... and fold handling of misaligned case into it. Implementation note: we stash the "will we need to rol8 the sum in the end" flag into the MSB of %rcx (the lower 32 bits are used for length); the rest is pretty straightforward. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20  i386: propagate the calling conventions change down to ...  (Al Viro)
csum_partial_copy_generic() ... and don't bother zeroing destination on error Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20  saner calling conventions for csum_and_copy_..._user()  (Al Viro)
All callers of these primitives will
 * discard anything we might've copied in case of error
 * ignore the csum value in case of error
 * always pass 0xffffffff as the initial sum, so the resulting csum value (in case of success, that is) will never be 0.

That suggests the following calling conventions:
 * don't pass err_ptr - just return 0 on error.
 * don't bother with zeroing the destination, etc. in case of error
 * don't pass the initial sum - just use 0xffffffff.

This commit does the minimal conversion in the instances of csum_and_copy_...(); the changes to the actual asm code behind them are done later in the series. Note that this asm code is often shared with csum_partial_copy_nocheck(); the difference is that csum_partial_copy_nocheck() passes 0 for the initial sum while csum_and_copy_..._user() pass 0xffffffff. Fortunately, we are free to pass 0xffffffff in all cases and subsequent patches will use that freedom without any special comments.

A part that could be split off: parisc and uml/i386 claimed to have csum_and_copy_to_user() instances of their own, but those were identical to the generic one, so we simply drop them. Not sure if it's worth a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
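In prototype form, the conversion amounts to roughly the following (a sketch derived from the description above; per-architecture signatures vary):

  /*
   * Old convention (sketch): caller passes an initial sum and an error
   * pointer, and may get the destination zeroed on failure:
   *
   *   __wsum csum_and_copy_from_user(const void __user *src, void *dst,
   *                                  int len, __wsum isum, int *err_ptr);
   */

  /* New convention (sketch): the initial sum is 0xffffffff internally, and
   * a return value of 0 means the copy failed. */
  __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);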
2020-08-20  csum_partial_copy_nocheck(): drop the last argument  (Al Viro)
It's always 0. Note that we theoretically could use ~0U as well - result will be the same modulo 0xffff, _if_ the damn thing did the right thing for any value of initial sum; later we'll make use of that when convenient. However, unlike csum_and_copy_..._user(), there are instances that did not work for arbitrary initial sums; c6x is one such. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-07  kbuild: remove cc-option test of -fno-stack-protector  (Masahiro Yamada)
Some Makefiles already pass -fno-stack-protector unconditionally. For example, arch/arm64/kernel/vdso/Makefile, arch/x86/xen/Makefile. No problem report so far about hard-coding this option. So, we can assume all supported compilers know -fno-stack-protector. GCC 4.8 and Clang support this option (https://godbolt.org/z/_HDGzN) Get rid of cc-option from -fno-stack-protector. Remove CONFIG_CC_HAS_STACKPROTECTOR_NONE, which is always 'y'. Note: arch/mips/vdso/Makefile adds -fno-stack-protector twice, first unconditionally, and second conditionally. I removed the second one. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2020-06-28  Merge tag 'x86_urgent_for_5.8_rc3' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - AMD Memory bandwidth counter width fix, by Babu Moger. - Use the proper length type in the 32-bit truncate() syscall variant, by Jiri Slaby. - Reinit IA32_FEAT_CTL during wakeup to fix the case where after resume, VMXON would #GP due to VMX not being properly enabled, by Sean Christopherson. - Fix a static checker warning in the resctrl code, by Dan Carpenter. - Add a CR4 pinning mask for bits which cannot change after boot, by Kees Cook. - Align the start of the loop of __clear_user() to 16 bytes, to improve performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming. * tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/64: Align start of __clear_user() loop to 16-bytes x86/cpu: Use pinning mask for CR4 bits needing to be 0 x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get() x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup syscalls: Fix offset type of ksys_ftruncate() x86/resctrl: Fix memory bandwidth counter width for AMD
2020-06-25  x86/entry: Fixup bad_iret vs noinstr  (Peter Zijlstra)
  vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section

Worse, with KASAN there is no telling what memcpy() actually is. Force the use of __memcpy(), which is our assembly implementation.

Reported-by: Marco Elver <elver@google.com>
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200618144801.760070502@infradead.org
2020-06-19  x86/asm/64: Align start of __clear_user() loop to 16-bytes  (Matt Fleming)
x86 CPUs can suffer severe performance drops if a tight loop, such as the ones in __clear_user(), straddles a 16-byte instruction fetch window, or worse, a 64-byte cacheline. This issue was discovered in the SUSE kernel with the following commit,

  1153933703d9 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants")

which increased the code object size from 10 bytes to 15 bytes and caused the 8-byte copy loop in __clear_user() to be split across a 64-byte cacheline.

Aligning the start of the loop to 16 bytes makes this fit neatly inside a single instruction fetch window again and restores the performance of __clear_user(), which is used heavily when reading from /dev/zero.

Here are some numbers from running libmicro's read_z* and pread_z* microbenchmarks which read from /dev/zero:

  Zen 1 (Naples), libmicro-file

                            5.7.0-rc6         5.7.0-rc6             5.7.0-rc6
                                              revert-1153933703d9+  align16+
  Time mean95-pread_z100k   9.9195 ( 0.00%)   5.9856 ( 39.66%)      5.9938 ( 39.58%)
  Time mean95-pread_z10k    1.1378 ( 0.00%)   0.7450 ( 34.52%)      0.7467 ( 34.38%)
  Time mean95-pread_z1k     0.2623 ( 0.00%)   0.2251 ( 14.18%)      0.2252 ( 14.15%)
  Time mean95-pread_zw100k  9.9974 ( 0.00%)   6.0648 ( 39.34%)      6.0756 ( 39.23%)
  Time mean95-read_z100k    9.8940 ( 0.00%)   5.9885 ( 39.47%)      5.9994 ( 39.36%)
  Time mean95-read_z10k     1.1394 ( 0.00%)   0.7483 ( 34.33%)      0.7482 ( 34.33%)

Note that this doesn't affect Haswell or Broadwell microarchitectures, which seem to avoid the alignment issue by executing the loop straight out of the Loop Stream Detector (verified using perf events).

Fixes: 1153933703d9 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants")
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.19+
Link: https://lkml.kernel.org/r/20200618102002.30034-1-matt@codeblueprint.co.uk
2020-06-11  Rebase locking/kcsan to locking/urgent  (Thomas Gleixner)
Merge the state of the locking kcsan branch before the read/write_once() and the atomics modifications got merged. Squash the fallout of the rebase on top of the read/write once and atomic fallback work into the merge. The history of the original branch is preserved in tag locking-kcsan-2020-06-02. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2020-06-03  Merge tag 'x86-timers-2020-06-03' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "X86 timer specific updates: - Add TPAUSE based delay which allows the CPU to enter an optimized power state while waiting for the delay to pass. The delay is based on TSC cycles. - Add tsc_early_khz command line parameter to workaround the problem that overclocked CPUs can report the wrong frequency via CPUID.16h which causes the refined calibration to fail because the delta to the initial frequency value is too big. With the parameter users can provide an halfways accurate initial value" * tag 'x86-timers-2020-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Add tsc_early_khz command line parameter x86/delay: Introduce TPAUSE delay x86/delay: Refactor delay_mwaitx() for TPAUSE support x86/delay: Preparatory code cleanup
2020-06-01  Merge branch 'uaccess.csum' of ...  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/csum updates from Al Viro: "Regularize the sitation with uaccess checksum primitives: - fold csum_partial_... into csum_and_copy_..._user() - on x86 collapse several access_ok()/stac()/clac() into user_access_begin()/user_access_end()" * 'uaccess.csum' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: default csum_and_copy_to_user(): don't bother with access_ok() take the dummy csum_and_copy_from_user() into net/checksum.h arm: switch to csum_and_copy_from_user() sh32: convert to csum_and_copy_from_user() m68k: convert to csum_and_copy_from_user() xtensa: switch to providing csum_and_copy_from_user() sparc: switch to providing csum_and_copy_from_user() parisc: turn csum_partial_copy_from_user() into csum_and_copy_from_user() alpha: turn csum_partial_copy_from_user() into csum_and_copy_from_user() ia64: turn csum_partial_copy_from_user() into csum_and_copy_from_user() ia64: csum_partial_copy_nocheck(): don't abuse csum_partial_copy_from_user() x86: switch 32bit csum_and_copy_to_user() to user_access_{begin,end}() x86: switch both 32bit and 64bit to providing csum_and_copy_from_user() x86_64: csum_..._copy_..._user(): switch to unsafe_..._user() get rid of csum_partial_copy_to_user()
2020-05-29  x86: switch both 32bit and 64bit to providing csum_and_copy_from_user()  (Al Viro)
... rather than messing with the wrapper. As a side effect, 32bit variant gets access_ok() into it and can be switched to user_access_begin()/user_access_end() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29  x86_64: csum_..._copy_..._user(): switch to unsafe_..._user()  (Al Viro)
We already have stac/clac pair around the calls of csum_partial_copy_generic(). Stretch that area back, so that it covers the preceding loop (and convert the loop body from __{get,put}_user() to unsafe_{get,put}_user()). That brings the beginning of the areas to the earlier access_ok(), which allows to convert them into user_access_{begin,end}() ones. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-07  x86/delay: Introduce TPAUSE delay  (Kyung Min Park)
TPAUSE instructs the processor to enter an implementation-dependent optimized state. The instruction execution wakes up when the time-stamp counter reaches or exceeds the implicit EDX:EAX 64-bit input value. The instruction execution also wakes up due to the expiration of the operating system time-limit or by an external interrupt or exceptions such as a debug exception or a machine check exception. TPAUSE offers a choice of two lower power states: 1. Light-weight power/performance optimized state C0.1 2. Improved power/performance optimized state C0.2 This way, it can save power with low wake-up latency in comparison to spinloop based delay. The selection between the two is governed by the input register. TPAUSE is available on processors with X86_FEATURE_WAITPKG. Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1587757076-30337-4-git-send-email-kyung.min.park@intel.com
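A hedged sketch of how a TPAUSE-based delay can be structured; the constant and helper names here are assumptions based on the description (the real implementation is in arch/x86/lib/delay.c with the intrinsic in asm/mwait.h):

  /* Wait until the TSC reaches start + cycles, requesting the deeper C0.2
   * state since its wakeup latency is small next to the delay itself. */
  static void delay_tpause_sketch(u64 start, u64 cycles)
  {
          u64 until = start + cycles;

          /* TPAUSE takes the 64-bit TSC deadline in EDX:EAX; ECX selects
           * between the C0.1 and C0.2 power states. */
          __tpause(TPAUSE_C02_STATE, upper_32_bits(until), lower_32_bits(until));
  }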
2020-05-07  x86/delay: Refactor delay_mwaitx() for TPAUSE support  (Kyung Min Park)
Refactor code to make it easier to add a new model specific function to delay for a number of cycles. No functional change. Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1587757076-30337-3-git-send-email-kyung.min.park@intel.com
2020-05-07  x86/delay: Preparatory code cleanup  (Thomas Gleixner)
The naming conventions in the delay code are confusing at best. All delay variants use a loops argument and or variable which originates from the original delay_loop() implementation. But all variants except delay_loop() are based on TSC cycles. Rename the argument to cycles and make it type u64 to avoid these weird expansions to u64 in the functions. Rename MWAITX_MAX_LOOPS to MWAITX_MAX_WAIT_CYCLES for the same reason and fixup the comment of delay_mwaitx() as well. Mark the delay_fn function pointer __ro_after_init and fixup the comment for it. No functional change and preparation for the upcoming TPAUSE based delay variant. [ Kyung Min Park: Added __init to use_tsc_delay() ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1587757076-30337-2-git-send-email-kyung.min.park@intel.com
2020-04-30  x86/retpoline: Fix retpoline unwind  (Peter Zijlstra)
Currently objtool cannot understand retpolines, and thus cannot generate ORC unwind information for them. This means that we cannot unwind from the middle of a retpoline. The recent ANNOTATE_INTRA_FUNCTION_CALL and UNWIND_HINT_RET_OFFSET support in objtool enables it to understand the basic retpoline construct. A further problem is that the ORC unwind information is alternative invariant; IOW. every alternative should have the same ORC, retpolines obviously violate this. This means we need to out-of-line them. Since all GCC generated code already uses out-of-line retpolines, this should not affect performance much, if anything. This will enable objtool to generate valid ORC data for the out-of-line copies, which means we can correctly and reliably unwind through a retpoline. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.210835357@infradead.org
2020-04-30  x86: Change {JMP,CALL}_NOSPEC argument  (Peter Zijlstra)
In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line versions of the retpoline magic, we need to remove the '%' from the argument, such that we can paste it onto symbol names. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.151623523@infradead.org
2020-04-30  x86: Simplify retpoline declaration  (Peter Zijlstra)
Because of how KSYM works, we need one declaration per line. Seeing how we're going to be doubling the amount of retpoline symbols, simplify the machinery in order to avoid having to copy/paste even more. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191700.091696925@infradead.org