linux-toradex.git/include/asm-generic/bitops, branch v5.15-rc5

Merge tag 'asm-generic-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

2021-09-01T22:13:02+00:00

Pull asm-generic updates from Arnd Bergmann:
 "The main content for 5.15 is a series that cleans up the handling of
  strncpy_from_user() and strnlen_user(), removing a lot of slightly
  incorrect versions of these in favor of the lib/strn*.c helpers that
  implement these correctly and more efficiently.

  The only architectures that retain a private version now are mips,
  ia64, um and parisc. I had offered to convert those at all, but Thomas
  Bogendoerfer wanted to keep the mips version for the moment until he
  had a chance to do regression testing.

  The branch also contains two patches for bitops and for ffs()"

* tag 'asm-generic-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  bitops/non-atomic: make @nr unsigned to avoid any DIV
  asm-generic: ffs: Drop bogus reference to ffz location
  asm-generic: reverse GENERIC_{STRNCPY_FROM,STRNLEN}_USER symbols
  asm-generic: remove extra strn{cpy_from,len}_user declarations
  asm-generic: uaccess: remove inline strncpy_from_user/strnlen_user
  s390: use generic strncpy/strnlen from_user
  microblaze: use generic strncpy/strnlen from_user
  csky: use generic strncpy/strnlen from_user
  arc: use generic strncpy/strnlen from_user
  hexagon: use generic strncpy/strnlen from_user
  h8300: remove stale strncpy_from_user
  asm-generic/uaccess.h: remove __strncpy_from_user/__strnlen_user

bitops/non-atomic: make @nr unsigned to avoid any DIV

2021-08-14T11:07:42+00:00

signed math causes generation of costlier instructions such as DIV when
they could be done by barrerl shifter.

Worse part is this is not caught by things like bloat-o-meter since
instruction length / symbols are typically same size.

e.g.

stock (signed math)
__________________

919b4614 :
919b4614:	div	r2,r0,0x20
                ^^^
919b4618:	add2	r2,0x920f6050,r2
919b4620:	ld_s	r2,[r2,0]
919b4622:	lsr	r0,r2,r0
919b4626:	j_s.d	[blink]
919b4628:	bmsk_s	r0,r0,0
919b462a:	nop_s

(patched) unsigned math
__________________

919b4614 :
919b4614:	lsr	r2,r0,0x5  @nr/32
                ^^^
919b4618:	add2	r2,0x920f6050,r2
919b4620:	ld_s	r2,[r2,0]
919b4622:	lsr	r0,r2,r0     #test_bit()
919b4626:	j_s.d	[blink]
919b4628:	bmsk_s	r0,r0,0
919b462a:	nop_s

Signed-off-by: Vineet Gupta 
Acked-by: Will Deacon 
Signed-off-by: Arnd Bergmann

asm-generic: ffs: Drop bogus reference to ffz location

2021-08-11T09:38:10+00:00

The generic definition of ffz() is not defined in the same header files
as the generic definitions of ffs().

Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Arnd Bergmann

locking/atomic: simplify non-atomic wrappers

2021-08-04T13:16:47+00:00

Since the non-atomic arch_*() bitops use plain accesses, they are
implicitly instrumnted by the compiler, and we work around this in the
instrumented wrappers to avoid double instrumentation.

It's simpler to avoid the wrappers entirely, and use the preprocessor to
alias the arch_*() bitops to their regular versions, removing the need
for checks in the instrumented wrappers.

Suggested-by: Marco Elver 
Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Marco Elver 
Link: https://lore.kernel.org/r/20210721155813.17082-1-mark.rutland@arm.com

locking/atomic: add generic arch_*() bitops

2021-07-16T16:46:45+00:00

Now that all architectures provide arch_atomic_long_*(), we can
implement the generic bitops atop these rather than atop
atomic_long_*(), and provide arch_*() forms of the bitops that are safe
to use in noinstr code.

Now that all architectures provide arch_atomic_long_*(), we can
build the generic arch_*() bitops atop these, which can be safely used
in noinstr code. The regular bitop wrappers are built atop these.

As the generic non-atomic bitops use plain accesses, these will be
implicitly instrumented unless they are inlined into noinstr functions
(which is similar to arch_atomic*_read() when based on READ_ONCE()).
The wrappers are modified so that where the underlying arch_*() function
uses a plain access, no explicit instrumentation is added, as this is
redundant and could result in confusing reports.

Since function prototypes get excessively long with both an `arch_`
prefix and `__always_inline` attribute, the return type and function
attributes have been split onto a separate line, matching the style of
the generated atomic headers.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210713105253.7615-6-mark.rutland@arm.com

lib: add fast path for find_first_*_bit() and find_last_bit()

2021-05-07T02:24:12+00:00

Similarly to bitmap functions, users would benefit if we'll handle a case
of small-size bitmaps that fit into a single word.

While here, move the find_last_bit() declaration to bitops/find.h where
other find_*_bit() functions sit.

Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.com
Signed-off-by: Yury Norov 
Acked-by: Rasmus Villemoes 
Acked-by: Andy Shevchenko 
Cc: Alexey Klimov 
Cc: Arnd Bergmann 
Cc: David Sterba 
Cc: Dennis Zhou 
Cc: Geert Uytterhoeven 
Cc: Jianpeng Ma 
Cc: Joe Perches 
Cc: John Paul Adrian Glaubitz 
Cc: Josh Poimboeuf 
Cc: Rich Felker 
Cc: Stefano Brivio 
Cc: Wei Yang 
Cc: Wolfram Sang 
Cc: Yoshinori Sato 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

lib: add fast path for find_next_*_bit()

2021-05-07T02:24:12+00:00

Similarly to bitmap functions, find_next_*_bit() users will benefit if
we'll handle a case of bitmaps that fit into a single word inline.  In the
very best case, the compiler may replace a function call with a few
instructions.

This is the quite typical find_next_bit() user:

	unsigned int cpumask_next(int n, const struct cpumask *srcp)
	{
		/* -1 is a legal arg here. */
		if (n != -1)
			cpumask_check(n);
		return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
	}
	EXPORT_SYMBOL(cpumask_next);

Currently, on ARM64 the generated code looks like this:
	0000000000000000 :
	   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
	   4:   11000402        add     w2, w0, #0x1
	   8:   aa0103e0        mov     x0, x1
	   c:   d2800401        mov     x1, #0x40                       // #64
	  10:   910003fd        mov     x29, sp
	  14:   93407c42        sxtw    x2, w2
	  18:   94000000        bl      0 
	  1c:   a8c17bfd        ldp     x29, x30, [sp], #16
	  20:   d65f03c0        ret
	  24:   d503201f        nop

After applying this patch:
	0000000000000140 :
	 140:   11000400        add     w0, w0, #0x1
	 144:   93407c00        sxtw    x0, w0
	 148:   f100fc1f        cmp     x0, #0x3f
	 14c:   54000168        b.hi    178   // b.pmore
	 150:   f9400023        ldr     x3, [x1]
	 154:   92800001        mov     x1, #0xffffffffffffffff         // #-1
	 158:   9ac02020        lsl     x0, x1, x0
	 15c:   52800802        mov     w2, #0x40                       // #64
	 160:   8a030001        and     x1, x0, x3
	 164:   dac00020        rbit    x0, x1
	 168:   f100003f        cmp     x1, #0x0
	 16c:   dac01000        clz     x0, x0
	 170:   1a800040        csel    w0, w2, w0, eq  // eq = none
	 174:   d65f03c0        ret
	 178:   52800800        mov     w0, #0x40                       // #64
	 17c:   d65f03c0        ret

find_next_bit() call is replaced with 6 instructions.  find_next_bit()
itself is 41 instructions plus function call overhead.

Despite inlining, the scripts/bloat-o-meter report smaller .text size
after applying the series:
	add/remove: 11/9 grow/shrink: 233/176 up/down: 5780/-6768 (-988)

Link: https://lkml.kernel.org/r/20210401003153.97325-10-yury.norov@gmail.com
Signed-off-by: Yury Norov 
Acked-by: Rasmus Villemoes 
Acked-by: Andy Shevchenko 
Cc: Alexey Klimov 
Cc: Arnd Bergmann 
Cc: David Sterba 
Cc: Dennis Zhou 
Cc: Geert Uytterhoeven 
Cc: Jianpeng Ma 
Cc: Joe Perches 
Cc: John Paul Adrian Glaubitz 
Cc: Josh Poimboeuf 
Cc: Rich Felker 
Cc: Stefano Brivio 
Cc: Wei Yang 
Cc: Wolfram Sang 
Cc: Yoshinori Sato 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

lib: inline _find_next_bit() wrappers

2021-05-07T02:24:12+00:00

lib/find_bit.c declares five single-line wrappers for _find_next_bit().
We may turn those wrappers to inline functions.  It eliminates unneeded
function calls and opens room for compile-time optimizations.

Link: https://lkml.kernel.org/r/20210401003153.97325-8-yury.norov@gmail.com
Signed-off-by: Yury Norov 
Acked-by: Rasmus Villemoes 
Acked-by: Andy Shevchenko 
Cc: Alexey Klimov 
Cc: Arnd Bergmann 
Cc: David Sterba 
Cc: Dennis Zhou 
Cc: Geert Uytterhoeven 
Cc: Jianpeng Ma 
Cc: Joe Perches 
Cc: John Paul Adrian Glaubitz 
Cc: Josh Poimboeuf 
Cc: Rich Felker 
Cc: Stefano Brivio 
Cc: Wei Yang 
Cc: Wolfram Sang 
Cc: Yoshinori Sato 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

arm64: make atomic helpers __always_inline

2021-01-13T15:09:06+00:00

With UBSAN enabled and building with clang, there are occasionally
warnings like

WARNING: modpost: vmlinux.o(.text+0xc533ec): Section mismatch in reference from the function arch_atomic64_or() to the variable .init.data:numa_nodes_parsed
The function arch_atomic64_or() references
the variable __initdata numa_nodes_parsed.
This is often because arch_atomic64_or lacks a __initdata
annotation or the annotation of numa_nodes_parsed is wrong.

for functions that end up not being inlined as intended but operating
on __initdata variables. Mark these as __always_inline, along with
the corresponding asm-generic wrappers.

Signed-off-by: Arnd Bergmann 
Acked-by: Will Deacon 
Link: https://lore.kernel.org/r/20210108092024.4034860-1-arnd@kernel.org
Signed-off-by: Catalin Marinas

asm-generic: fix ffs -Wshadow warning

2020-10-26T16:00:29+00:00

gcc -Wshadow warns about the ffs() definition that has the
same name as the global ffs() built-in:

include/asm-generic/bitops/builtin-ffs.h:13:28: warning: declaration of 'ffs' shadows a built-in function [-Wshadow]

This is annoying because 'make W=2' warns every time this
header gets included.

Change it to use a #define instead, making callers directly
reference the builtin.

Signed-off-by: Arnd Bergmann