linux-toradex.git/arch/arm64/crypto, branch v5.18

crypto: arm64 - cleanup comments

2022-03-09T03:12:32+00:00

For spdx, use // for *.c files

Replacements
significanty to significantly

Signed-off-by: Tom Rix 
Signed-off-by: Herbert Xu

crypto: arm64/aes-neonbs-xts - use plain NEON for non-power-of-2 input sizes

2022-02-05T04:10:51+00:00

Even though the kernel's implementations of AES-XTS were updated to
implement ciphertext stealing and can operate on inputs of any size
larger than or equal to the AES block size, this feature is rarely used
in practice.

In fact, in the kernel, AES-XTS is only used to operate on 4096- or
512-byte blocks, which means that not only is the ciphertext stealing
effectively dead code, but the logic in the bit-sliced NEON
implementation to deal with fewer than 8 blocks at a time is also never
used.

Since the bit-sliced NEON driver already depends on the plain NEON
version, which is slower but can operate on smaller data quantities more
straightforwardly, let's fall back to the plain NEON implementation of
XTS for any residual inputs that are not multiples of 128 bytes. This
allows us to remove a lot of complicated logic that rarely gets
exercised in practice.
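
The length arithmetic behind this split can be sketched as follows. This is
an illustration only, not the driver code: the struct and helper names are
hypothetical, and the extra care that ciphertext stealing needs (a partial
final block must be handled together with the full block before it) is left
out.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_SIZE  16u
#define BITSLICE_STRIDE (8u * AES_BLOCK_SIZE) /* 8 blocks = 128 bytes */

/* Hypothetical result type: how many bytes each code path would receive. */
struct xts_split {
	size_t bulk;     /* multiple of 128 bytes, for the bit-sliced path */
	size_t residual; /* whatever is left, for the plain NEON path */
};

/* Hypothetical helper: split an XTS request length by 128-byte strides. */
static struct xts_split xts_split_len(size_t nbytes)
{
	struct xts_split s;

	/* XTS is only defined for inputs of at least one AES block. */
	assert(nbytes >= AES_BLOCK_SIZE);

	s.bulk = nbytes - (nbytes % BITSLICE_STRIDE);
	s.residual = nbytes - s.bulk;
	return s;
}
```

For the 4096- and 512-byte inputs mentioned above, the residual is zero, so
only the bit-sliced path would run.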

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu

crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk

2022-02-05T04:10:51+00:00

Instead of processing the entire input with the 8-way bit-sliced
algorithm, which is sub-optimal for inputs that are not a multiple of
128 bytes in size, invoke the plain NEON version of CTR for the
remainder of the input after processing the bulk using 128-byte strides.

This allows us to greatly simplify the asm code that implements CTR, and
get rid of all the branches and special code paths. It also gains us a
couple of percent of performance.
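
Handing the remainder to a second routine implies that the counter must
continue from where the bulk pass stopped. The commit message does not say
how the patch does this, so the sketch below is an assumption-laden
illustration: it assumes the usual CTR layout with a 128-bit big-endian
counter in the IV, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_BLOCK_SIZE 16

/*
 * Hypothetical helper: add 'blocks' to the 128-bit big-endian counter
 * stored in iv[], wrapping silently on overflow past 2^128.
 */
static void ctr_advance(uint8_t iv[AES_BLOCK_SIZE], uint64_t blocks)
{
	int i;

	assert(iv != NULL);

	/* Walk from the least significant byte, propagating the carry. */
	for (i = AES_BLOCK_SIZE - 1; i >= 0 && blocks != 0; i--) {
		uint64_t sum = (uint64_t)iv[i] + (blocks & 0xff);

		iv[i] = (uint8_t)sum;
		blocks = (blocks >> 8) + (sum >> 8);
	}
}
```

After a bulk pass over n bytes in 128-byte strides, a caller would advance
the counter by n / 16 blocks before processing the remainder.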

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu

crypto: arm64/aes-neon-ctr - improve handling of single tail block

2022-02-05T04:10:51+00:00

Instead of falling back to C code to do a memcpy of the output of the
last block, handle this in the asm code directly if possible, which is
the case if the entire input is longer than 16 bytes.

Cc: Nathan Huckleberry 
Cc: Eric Biggers 
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu

crypto: arm64/sm3-ce - make dependent on sm3 library

2022-01-28T05:51:10+00:00

The SM3 generic library is a stand-alone implementation, so sm3-ce can
depend on the SM3 library instead of sm3-generic.

Signed-off-by: Tianjia Zhang 
Signed-off-by: Herbert Xu

arm64: Add macro version of the BTI instruction

2021-12-14T18:12:58+00:00

BTI is only available from v8.5 so we need to encode it using HINT in
generic code and for older toolchains. Add an assembler macro based on
one written by Mark Rutland which lets us use the mnemonic and update
the existing users.
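
To illustrate why HINT works as a stand-in: BTI occupies the HINT encoding
space, so each BTI variant is a HINT with a fixed immediate (32 for plain
BTI, 34 for "bti c", 36 for "bti j", 38 for "bti jc"). The C sketch below
only computes the resulting instruction words. The enum and function names
are hypothetical, and the macro added by the patch is an assembler macro,
not this code.

```c
#include <assert.h>
#include <stdint.h>

/* HINT immediates that correspond to the BTI variants. */
enum bti_target {
	BTI_NONE = 32, /* bti    */
	BTI_C    = 34, /* bti c  */
	BTI_J    = 36, /* bti j  */
	BTI_JC   = 38  /* bti jc */
};

/*
 * A64 "HINT #imm" encoding: 0xd503201f is HINT #0 (NOP), and the 7-bit
 * immediate sits in bits [11:5] of the instruction word.
 */
static uint32_t a64_hint(unsigned int imm)
{
	assert(imm < 128);
	return 0xd503201fu | ((uint32_t)imm << 5);
}
```

Because these words decode as HINT on CPUs without BTI, they execute as
NOPs there, and an assembler that does not know the BTI mnemonic can still
emit them.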

Suggested-by: Mark Rutland 
Acked-by: Ard Biesheuvel 
Acked-by: Will Deacon 
Signed-off-by: Mark Brown 
Acked-by: Mark Rutland 
Link: https://lore.kernel.org/r/20211214152714.2380849-2-broonie@kernel.org
Signed-off-by: Catalin Marinas

crypto: arm64/aes-ccm - avoid by-ref argument for ce_aes_ccm_auth_data

2021-09-17T03:05:11+00:00

With the SIMD code path removed, we can clean up the CCM auth-only path
a bit further, by passing the 'macp' input buffer pointer by value,
rather than by reference, and taking the output value from the
function's return value.

This way, the compiler is no longer forced to allocate macp on the
stack. This is not expected to make any difference in practice; it just
makes for slightly cleaner code.
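
The shape of the change can be sketched in plain C. Both functions below are
hypothetical stand-ins with a toy body (tracking an offset into a 16-byte
MAC block); they are not the signatures or the logic of
ce_aes_ccm_auth_data itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: the caller passes &macp, so macp must live in memory. */
static void auth_data_byref(const uint8_t *in, size_t len, uint32_t *macp)
{
	(void)in; /* toy body: only the offset bookkeeping is modelled */
	*macp = (uint32_t)((*macp + len) % 16);
}

/* After: macp goes in by value and the new value is the return value. */
static uint32_t auth_data_byval(const uint8_t *in, size_t len, uint32_t macp)
{
	(void)in; /* toy body: only the offset bookkeeping is modelled */
	return (uint32_t)((macp + len) % 16);
}
```

A caller of the second form writes macp = auth_data_byval(p, n, macp), which
never takes the address of macp and so lets the compiler keep it in a
register.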

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Eric Biggers 
Signed-off-by: Herbert Xu

crypto: arm64/aes-ccm - reduce NEON begin/end calls for common case

2021-09-17T03:05:11+00:00

AES-CCM (as used in WPA2 CCMP, for instance) typically involves
authenticate-only data, and operates on a single network packet, and so
the common case is for the authenticate, en/decrypt and finalize SIMD
helpers to all be called exactly once in sequence. Since
kernel_neon_end() now involves manipulation of the preemption state as
well as the softirq mask state, let's reduce the number of times we are
forced to call it to only once if we are handling this common case.

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu

crypto: arm64/aes-ccm - remove non-SIMD fallback path

2021-09-17T03:05:11+00:00

AES/CCM on arm64 is implemented as a synchronous AEAD, and so it is
guaranteed by the API that it is only invoked in task or softirq
context. Since softirqs are now only handled when the SIMD is not
being used in the task context that was interrupted to service the
softirq, we no longer need a fallback path. Let's remove it.

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Eric Biggers 
Signed-off-by: Herbert Xu

crypto: arm64/aes-ccm - yield NEON when processing auth-only data

2021-09-17T03:05:10+00:00

In SIMD accelerated crypto drivers, we typically yield the SIMD unit
after processing 4 KiB of input, to avoid scheduling blackouts caused by
the fact that claiming the SIMD unit disables preemption as well as
softirq processing.

The arm64 CCM driver does this implicitly for the ciphertext, due to the
fact that the skcipher API never processes more than a single page at a
time. However, the scatterwalk performed by this driver when processing
the authenticate-only data will keep the SIMD unit occupied until it
completes.

So cap the scatterwalk steps to 4 KiB.
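
A minimal sketch of the capping idea, assuming a simplified walk over one
contiguous length rather than a real scatterlist. The function name is
hypothetical, and the place where the SIMD unit would be claimed and
released is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>

#define SZ_4K 4096u

/*
 * Hypothetical helper: walk 'len' bytes in steps of at most 4 KiB.
 * Returns the number of steps and stores the largest step in *max_step.
 */
static size_t walk_capped(size_t len, size_t *max_step)
{
	size_t steps = 0;

	assert(max_step != NULL);
	*max_step = 0;

	while (len > 0) {
		size_t n = len < SZ_4K ? len : SZ_4K;

		/*
		 * In a driver, this is where the SIMD unit would be held
		 * while n bytes are processed, then released again.
		 */
		if (n > *max_step)
			*max_step = n;

		len -= n;
		steps++;
	}
	return steps;
}
```

With the cap in place, no single step holds the SIMD unit for more than
4 KiB of input, whatever the total length of the authenticate-only data.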

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu