summaryrefslogtreecommitdiff
path: root/arch/x86/crypto/sha1_ssse3_glue.c
AgeCommit message (Collapse)Author
2020-05-08crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.hEric Biggers
<linux/cryptohash.h> sounds very generic and important, like it's the header to include if you're doing cryptographic hashing in the kernel. But actually it only includes the library implementation of the SHA-1 compression function (not even the full SHA-1). This should basically never be used anymore; SHA-1 is no longer considered secure, and there are much better ways to do cryptographic hashing in the kernel. Most files that include this header don't actually need it. So in preparation for removing it, remove all these unneeded includes of it. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-04-09x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2Jason A. Donenfeld
Now that the kernel specifies binutils 2.23 as the minimum version, we can remove ifdefs for AVX2 and ADX throughout. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-04-09x86: remove always-defined CONFIG_AS_AVXMasahiro Yamada
CONFIG_AS_AVX was introduced by commit ea4d26ae24e5 ("raid5: add AVX optimized RAID5 checksumming"). We raise the minimal supported binutils version from time to time. The last bump was commit 1fb12b35e5ff ("kbuild: Raise the minimum required binutils version to 2.21"). I confirmed the code in $(call as-instr,...) can be assembled by the binutils 2.21 assembler and also by LLVM integrated assembler. Remove CONFIG_AS_AVX, which is always defined. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2020-01-22crypto: x86/sha - Eliminate casts on asm implementationsKees Cook
In order to avoid CFI function prototype mismatches, this removes the casts on assembly implementations of sha1/256/512 accelerators. The safety checks from BUILD_BUG_ON() remain. Additionally, this renames various arguments for clarity, as suggested by Eric Biggers. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-22crypto: x86 - convert to use crypto_simd_usable()Eric Biggers
Replace all calls to irq_fpu_usable() in the x86 crypto code with crypto_simd_usable(), in order to allow testing the no-SIMD code paths. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-09crypto: shash - remove useless setting of type flagsEric Biggers
Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH. But this is redundant with the C structure type ('struct shash_alg'), and crypto_register_shash() already sets the type flag automatically, clearing any type flag that was already there. Apparently the useless assignment has just been copy+pasted around. So, remove the useless assignment from all the shash algorithms. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-09crypto: x86/sha1 - Fix reads beyond the number of blocks passedmegha.dey@linux.intel.com
It was reported that the sha1 AVX2 function(sha1_transform_avx2) is reading ahead beyond its intended data, and causing a crash if the next block is beyond page boundary: http://marc.info/?l=linux-crypto-vger&m=149373371023377 This patch makes sure that there is no overflow for any buffer length. It passes the tests written by Jan Stancek that revealed this problem: https://github.com/jstancek/sha1-avx2-crash I have re-enabled sha1-avx2 by reverting commit b82ce24426a4071da9529d726057e4e642948667 Cc: <stable@vger.kernel.org> Fixes: b82ce24426a4 ("crypto: sha1-ssse3 - Disable avx2") Originally-by: Ilya Albrekht <ilya.albrekht@intel.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-07-05crypto: sha1-ssse3 - Disable avx2Herbert Xu
It has been reported that sha1-avx2 can cause page faults by reading beyond the end of the input. This patch disables it until it can be fixed. Cc: <stable@vger.kernel.org> Fixes: 7c1da8d0d046 ("crypto: sha - SHA1 transform x86_64 AVX2") Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-05-31crypto: sha-ssse3 - add MODULE_ALIASStephan Mueller
Add the MODULE_ALIAS for the cra_driver_name of the different ciphers to allow an automated loading if a driver name is used. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-13x86/cpufeature: Replace cpu_has_avx with boot_cpu_has() usageBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-crypto@vger.kernel.org Link: http://lkml.kernel.org/r/1459801503-15600-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-04Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Add support for cipher output IVs in testmgr - Add missing crypto_ahash_blocksize helper - Mark authenc and des ciphers as not allowed under FIPS. Algorithms: - Add CRC support to 842 compression - Add keywrap algorithm - A number of changes to the akcipher interface: + Separate functions for setting public/private keys. + Use SG lists. Drivers: - Add Intel SHA Extension optimised SHA1 and SHA256 - Use dma_map_sg instead of custom functions in crypto drivers - Add support for STM32 RNG - Add support for ST RNG - Add Device Tree support to exynos RNG driver - Add support for mxs-dcp crypto device on MX6SL - Add xts(aes) support to caam - Add ctr(aes) and xts(aes) support to qat - A large set of fixes from Russell King for the marvell/cesa driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (115 commits) crypto: asymmetric_keys - Fix unaligned access in x509_get_sig_params() crypto: akcipher - Don't #include crypto/public_key.h as the contents aren't used hwrng: exynos - Add Device Tree support hwrng: exynos - Fix missing configuration after suspend to RAM hwrng: exynos - Add timeout for waiting on init done dt-bindings: rng: Describe Exynos4 PRNG bindings crypto: marvell/cesa - use __le32 for hardware descriptors crypto: marvell/cesa - fix missing cpu_to_le32() in mv_cesa_dma_add_op() crypto: marvell/cesa - use memcpy_fromio()/memcpy_toio() crypto: marvell/cesa - use gfp_t for gfp flags crypto: marvell/cesa - use dma_addr_t for cur_dma crypto: marvell/cesa - use readl_relaxed()/writel_relaxed() crypto: caam - fix indentation of close braces crypto: caam - only export the state we really need to export crypto: caam - fix non-block aligned hash calculation crypto: caam - avoid needlessly saving and restoring caam_hash_ctx crypto: caam - print errno code when hash registration fails crypto: marvell/cesa - fix memory leak crypto: marvell/cesa - fix first-fragment handling in mv_cesa_ahash_dma_last_req() crypto: marvell/cesa - rearrange handling for sw padded hashes ...
2015-09-21crypto: x86/sha - Restructure x86 sha1 glue code to expose all the available ↵tim
sha1 transforms Restructure the x86 sha1 glue code so we will expose sha1 transforms based on SSSE3, AVX, AVX2 or SHA-NI extension as separate individual drivers when cpu provides such support. This will make it easy for alternative algorithms to be used if desired and makes the code cleaner and easier to maintain. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-09-21crypto: x86/sha - glue code for Intel SHA extensions optimized SHA1 & SHA256tim
This patch adds the glue code to detect and utilize the Intel SHA extensions optimized SHA1 and SHA256 update transforms when available. This code has been tested on Broxton for functionality. Originally-by: Chandramouli Narayanan <mouli_7982@yahoo.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-09-14x86/fpu: Rename XSAVE macrosDave Hansen
There are two concepts that have some confusing naming: 1. Extended State Component numbers (currently called XFEATURE_BIT_*) 2. Extended State Component masks (currently called XSTATE_*) The numbers are (currently) from 0-9. State component 3 is the bounds registers for MPX, for instance. But when we want to enable "state component 3", we go set a bit in XCR0. The bit we set is 1<<3. We can check to see if a state component feature is enabled by looking at its bit. The current 'xfeature_bit's are at best xfeature bit _numbers_. Calling them bits is at best inconsistent with ending the enum list with 'XFEATURES_NR_MAX'. This patch renames the enum to be 'xfeature'. These also happen to be what the Intel documentation calls a "state component". We also want to differentiate these from the "XSTATE_*" macros. The "XSTATE_*" macros are a mask, and we rename them to match. These macros are reasonably widely used so this patch is a wee bit big, but this really is just a rename. The only non-mechanical part of this is the s/XSTATE_EXTEND_MASK/XFEATURE_MASK_EXTEND/ We need a better name for it, but that's another patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: dave@sr71.net Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20150902233126.38653250@viggo.jf.intel.com [ Ported to v4.3-rc1. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature ↵Ingo Molnar
checks Use the new 'cpu_has_xfeatures()' function to query AVX CPU support. This has the following advantages to the driver: - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>. - Removes detection complexity from the driver, no more raw XGETBV instruction - Shrinks the code a bit. - Standardizes feature name error message printouts across drivers There are also advantages to the x86 FPU code: once all drivers are decoupled from internals we can move them out of common headers and we'll also be able to remove xcr.h. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Rename fpu/xsave.h to fpu/xstate.hIngo Molnar
'xsave' is an x86 instruction name to most people - but xsave.h is about a lot more than just the XSAVE instruction: it includes definitions and support, both internal and external, related to xstate and xfeatures support. As a first step in cleaning up the various xstate uses rename this header to 'fpu/xstate.h' to better reflect what this header file is about. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Move xsave.h to fpu/xsave.hIngo Molnar
Move the xsave.h header file to the FPU directory as well. Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19x86/fpu: Rename i387.h to fpu/api.hIngo Molnar
We already have fpu/types.h, move i387.h to fpu/api.h. The file name has become a misnomer anyway: it offers generic FPU APIs, but is not limited to i387 functionality. Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-10crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layerArd Biesheuvel
This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-24crypto: prefix module autoloading with "crypto-"Kees Cook
This prefixes all crypto module loading with "crypto-" so we never run the risk of exposing module auto-loading to userspace via a crypto API, as demonstrated by Mathias Krause: https://lkml.org/lkml/2013/3/4/70 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-03-25crypto: x86/sha1 - re-enable the AVX variantMathias Krause
Commit 7c1da8d0d0 "crypto: sha - SHA1 transform x86_64 AVX2" accidentally disabled the AVX variant by making the avx_usable() test not only fail in case the CPU doesn't support AVX or OSXSAVE but also if it doesn't support AVX2. Fix that regression by splitting up the AVX/AVX2 test into two functions. Also test for the BMI1 extension in the avx2_usable() test as the AVX2 implementation not only makes use of BMI2 but also BMI1 instructions. Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-03-21crypto: sha - SHA1 transform x86_64 AVX2chandramouli narayanan
This git patch adds x86_64 AVX2 optimization of SHA1 transform to crypto support. The patch has been tested with 3.14.0-rc1 kernel. On a Haswell desktop, with turbo disabled and all cpus running at maximum frequency, tcrypt shows AVX2 performance improvement from 3% for 256 bytes update to 16% for 1024 bytes update over AVX implementation. This patch adds sha1_avx2_transform(), the glue, build and configuration changes needed for AVX2 optimization of SHA1 transform to crypto support. sha1-ssse3 is one module which adds the necessary optimization support (SSSE3/AVX/AVX2) for the low-level SHA1 transform function. With better optimization support, transform function is overridden as the case may be. In the case of AVX2, due to performance reasons across datablock sizes, the AVX or AVX2 transform function is used at run-time as it suits best. The Makefile change therefore appends the necessary objects to the linkage. Due to this, the patch merely appends AVX2 transform to the existing build mix and Kconfig support and leaves the configuration build support as is. Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by: Marek Vasut <marex@denx.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12crypto: sha1 - use Kbuild supplied flags for AVX testMathias Krause
Commit ea4d26ae ("raid5: add AVX optimized RAID5 checksumming") introduced x86/ arch wide defines for AFLAGS and CFLAGS indicating AVX support in binutils based on the same test we have in x86/crypto/ right now. To minimize duplication drop our implementation in favour to the one in x86/. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-10crypto: sha1 - SSSE3 based SHA1 implementation for x86-64Mathias Krause
This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX). Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25% faster. Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant, if using the SSE/YMM registers is not possible. With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%. Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU intensive userland applications running. For example, meassuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 MBit/s). Using this algorithm on a IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with a EXFO Packet Blazer FTB-8510: frame size sha1-generic sha1-ssse3 delta 64 byte 37.5 MBit/s 37.5 MBit/s 0.0% 128 byte 56.3 MBit/s 62.5 MBit/s +11.0% 256 byte 87.5 MBit/s 100.0 MBit/s +14.3% 512 byte 131.3 MBit/s 150.0 MBit/s +14.2% 1024 byte 162.5 MBit/s 193.8 MBit/s +19.3% 1280 byte 175.0 MBit/s 212.5 MBit/s +21.4% 1420 byte 175.0 MBit/s 218.7 MBit/s +25.0% 1518 byte 150.0 MBit/s 181.2 MBit/s +20.8% The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make there way through the IPsec tunnel. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>