linux-toradex.git/scripts/atomic/fallbacks, branch v6.15-rc5

locking/atomic: scripts: simplify raw_atomic*() definitions

2023-06-05T07:57:22+00:00

Currently each ordering variant has several potential definitions,
with a mixture of preprocessor and C definitions, including several
copies of its C prototype, e.g.

| #if defined(arch_atomic_fetch_andnot_acquire)
| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
| #elif defined(arch_atomic_fetch_andnot_relaxed)
| static __always_inline int
| raw_atomic_fetch_andnot_acquire(int i, atomic_t *v)
| {
|       int ret = arch_atomic_fetch_andnot_relaxed(i, v);
|       __atomic_acquire_fence();
|       return ret;
| }
| #elif defined(arch_atomic_fetch_andnot)
| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot
| #else
| static __always_inline int
| raw_atomic_fetch_andnot_acquire(int i, atomic_t *v)
| {
|       return raw_atomic_fetch_and_acquire(~i, v);
| }
| #endif

Make this a bit simpler by defining the C prototype once, and writing
the various potential definitions as plain C code guarded by ifdeffery.
For example, the above becomes:

| static __always_inline int
| raw_atomic_fetch_andnot_acquire(int i, atomic_t *v)
| {
| #if defined(arch_atomic_fetch_andnot_acquire)
|         return arch_atomic_fetch_andnot_acquire(i, v);
| #elif defined(arch_atomic_fetch_andnot_relaxed)
|         int ret = arch_atomic_fetch_andnot_relaxed(i, v);
|         __atomic_acquire_fence();
|         return ret;
| #elif defined(arch_atomic_fetch_andnot)
|         return arch_atomic_fetch_andnot(i, v);
| #else
|         return raw_atomic_fetch_and_acquire(~i, v);
| #endif
| }

Which is far easier to read. As we now always have a single copy of the
C prototype wrapping all the potential definitions, we now have an
obvious single location for kerneldoc comments.

At the same time, the fallbacks for raw_atomic*_xhcg() are made to use
'new' rather than 'i' as the name of the new value. This is what the
existing fallback template used, and is more consistent with the
raw_atomic{_try,}cmpxchg() fallbacks.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Kees Cook 
Link: https://lore.kernel.org/r/20230605070124.3741859-24-mark.rutland@arm.com

locking/atomic: scripts: restructure fallback ifdeffery

2023-06-05T07:57:21+00:00

Currently the various ordering variants of an atomic operation are
defined in groups of full/acquire/release/relaxed ordering variants with
some shared ifdeffery and several potential definitions of each ordering
variant in different branches of the shared ifdeffery.

As an ordering variant can have several potential definitions down
different branches of the shared ifdeffery, it can be painful for a
human to find a relevant definition, and we don't have a good location
to place anything common to all definitions of an ordering variant (e.g.
kerneldoc).

Historically the grouping of full/acquire/release/relaxed ordering
variants was necessary as we filled in the missing atomics in the same
namespace as the architecture used. It would be easy to accidentally
define one ordering fallback in terms of another ordering fallback with
redundant barriers, and avoiding that would otherwise require a lot of
baroque ifdeffery.

With recent changes we no longer need to fill in the missing atomics in
the arch_atomic*_() namespace, and only need to fill in the
raw_atomic*_() namespace. Due to this, there's no risk of a
namespace collision, and we can define each raw_atomic*_ ordering
variant with its own ifdeffery checking for the arch_atomic*_
ordering variants.

Restructure the fallbacks in this way, with each ordering variant having
its own ifdeffery of the form:

| #if defined(arch_atomic_fetch_andnot_acquire)
| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
| #elif defined(arch_atomic_fetch_andnot_relaxed)
| static __always_inline int
| raw_atomic_fetch_andnot_acquire(int i, atomic_t *v)
| {
| 	int ret = arch_atomic_fetch_andnot_relaxed(i, v);
| 	__atomic_acquire_fence();
| 	return ret;
| }
| #elif defined(arch_atomic_fetch_andnot)
| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot
| #else
| static __always_inline int
| raw_atomic_fetch_andnot_acquire(int i, atomic_t *v)
| {
| 	return raw_atomic_fetch_and_acquire(~i, v);
| }
| #endif

Note that where there's no relevant arch_atomic*_() ordering
variant, we'll define the operation in terms of a distinct
raw_atomic*_(), as this itself might have been filled in with a
fallback.

As we now generate the raw_atomic*_() implementations directly, we
no longer need the trivial wrappers, so they are removed.

This makes the ifdeffery easier to follow, and will allow for further
improvements in subsequent patches.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Kees Cook 
Link: https://lore.kernel.org/r/20230605070124.3741859-21-mark.rutland@arm.com

locking/atomic: make atomic*_{cmp,}xchg optional

2023-06-05T07:57:14+00:00

Most architectures define the atomic/atomic64 xchg and cmpxchg
operations in terms of arch_xchg and arch_cmpxchg respectfully.

Add fallbacks for these cases and remove the trivial cases from arch
code. On some architectures the existing definitions are kept as these
are used to build other arch_atomic*() operations.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Kees Cook 
Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com

locking/atomic: remove fallback comments

2023-06-05T07:57:13+00:00

Currently a subset of the fallback templates have kerneldoc comments,
resulting in a haphazard set of generated kerneldoc comments as only
some operations have fallback templates to begin with.

We'd like to generate more consistent kerneldoc comments, and to do so
we'll need to restructure the way the fallback code is generated.

To minimize churn and to make it easier to restructure the fallback
code, this patch removes the existing kerneldoc comments from the
fallback templates. We can add new kerneldoc comments in subsequent
patches.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Kees Cook 
Link: https://lore.kernel.org/r/20230605070124.3741859-3-mark.rutland@arm.com

atomics: Provide atomic_add_negative() variants

2023-03-28T08:39:29+00:00

atomic_add_negative() does not provide the relaxed/acquire/release
variants.

Provide them in preparation for a new scalable reference count algorithm.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Mark Rutland 
Link: https://lore.kernel.org/r/20230323102800.101763813@linutronix.de

atomics: Fix atomic64_{read_acquire,set_release} fallbacks

2022-02-11T11:13:56+00:00

Arnd reports that on 32-bit architectures, the fallbacks for
atomic64_read_acquire() and atomic64_set_release() are broken as they
use smp_load_acquire() and smp_store_release() respectively, which do
not work on types larger than the native word size.

Since those contain compiletime_assert_atomic_type(), any attempt to use
those fallbacks will result in a build-time error. e.g. with the
following added to arch/arm/kernel/setup.c:

| void test_atomic64(atomic64_t *v)
| {
|        atomic64_set_release(v, 5);
|        atomic64_read_acquire(v);
| }

The compiler will complain as follows:

| In file included from :
| In function 'arch_atomic64_set_release',
|     inlined from 'test_atomic64' at ./include/linux/atomic/atomic-instrumented.h:669:2:
| ././include/linux/compiler_types.h:346:38: error: call to '__compiletime_assert_9' declared with attribute error: Need native word sized stores/loads for atomicity.
|   346 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
|       |                                      ^
| ././include/linux/compiler_types.h:327:4: note: in definition of macro '__compiletime_assert'
|   327 |    prefix ## suffix();    \
|       |    ^~~~~~
| ././include/linux/compiler_types.h:346:2: note: in expansion of macro '_compiletime_assert'
|   346 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
|       |  ^~~~~~~~~~~~~~~~~~~
| ././include/linux/compiler_types.h:349:2: note: in expansion of macro 'compiletime_assert'
|   349 |  compiletime_assert(__native_word(t),    \
|       |  ^~~~~~~~~~~~~~~~~~
| ./include/asm-generic/barrier.h:133:2: note: in expansion of macro 'compiletime_assert_atomic_type'
|   133 |  compiletime_assert_atomic_type(*p);    \
|       |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ./include/asm-generic/barrier.h:164:55: note: in expansion of macro '__smp_store_release'
|   164 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
|       |                                                       ^~~~~~~~~~~~~~~~~~~
| ./include/linux/atomic/atomic-arch-fallback.h:1270:2: note: in expansion of macro 'smp_store_release'
|  1270 |  smp_store_release(&(v)->counter, i);
|       |  ^~~~~~~~~~~~~~~~~
| make[2]: *** [scripts/Makefile.build:288: arch/arm/kernel/setup.o] Error 1
| make[1]: *** [scripts/Makefile.build:550: arch/arm/kernel] Error 2
| make: *** [Makefile:1831: arch/arm] Error 2

Fix this by only using smp_load_acquire() and smp_store_release() for
native atomic types, and otherwise falling back to the regular barriers
necessary for acquire/release semantics, as we do in the more generic
acquire and release fallbacks.

Since the fallback templates are used to generate the atomic64_*() and
atomic_*() operations, the __native_word() check is added to both. For
the atomic_*() operations, which are always 32-bit, the __native_word()
check is redundant but not harmful, as it is always true.

For the example above this works as expected on 32-bit, e.g. for arm
multi_v7_defconfig:

| :
|         push    {r4, r5}
|         dmb     ish
|         pldw    [r0]
|         mov     r2, #5
|         mov     r3, #0
|         ldrexd  r4, [r0]
|         strexd  r4, r2, [r0]
|         teq     r4, #0
|         bne     484 
|         ldrexd  r2, [r0]
|         dmb     ish
|         pop     {r4, r5}
|         bx      lr

... and also on 64-bit, e.g. for arm64 defconfig:

| :
|         bti     c
|         paciasp
|         mov     x1, #0x5
|         stlr    x1, [x0]
|         ldar    x0, [x0]
|         autiasp
|         ret

Reported-by: Arnd Bergmann 
Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Ard Biesheuvel 
Reviewed-by: Boqun Feng 
Link: https://lore.kernel.org/r/20220207101943.439825-1-mark.rutland@arm.com

locking/atomic: remove ARCH_ATOMIC remanants

2021-07-16T16:46:44+00:00

Now that gen-atomic-fallback.sh is only used to generate the arch_*
fallbacks, we don't need to also generate the non-arch_* forms, and can
removethe infrastructure this needed.

There is no change to any of the generated headers as a result of this
patch.

Signed-off-by: Mark Rutland 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210713105253.7615-3-mark.rutland@arm.com

locking/atomics: Flip fallbacks and instrumentation

2020-06-11T06:03:24+00:00

Currently instrumentation of atomic primitives is done at the architecture
level, while composites or fallbacks are provided at the generic level.

The result is that there are no uninstrumented variants of the
fallbacks. Since there is now need of such variants to isolate text poke
from any form of instrumentation invert this ordering.

Doing this means moving the instrumentation into the generic code as
well as having (for now) two variants of the fallbacks.

Notes:

 - the various *cond_read* primitives are not proper fallbacks
   and got moved into linux/atomic.c. No arch_ variants are
   generated because the base primitives smp_cond_load*()
   are instrumented.

 - once all architectures are moved over to arch_atomic_ one of the
   fallback variants can be removed and some 2300 lines reclaimed.

 - atomic_{read,set}*() are no longer double-instrumented

Reported-by: Thomas Gleixner 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Thomas Gleixner 
Acked-by: Mark Rutland 
Link: https://lkml.kernel.org/r/20200505134058.769149955@linutronix.de

asm-generic/atomic: Use __always_inline for fallback wrappers

2020-06-11T06:03:24+00:00

Use __always_inline for atomic fallback wrappers. When building for size
(CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to
inline even relatively small static inline functions that are assumed to
be inlinable such as atomic ops. This can cause problems, for example in
UACCESS regions.

While the fallback wrappers aren't pure wrappers, they are trivial
nonetheless, and the function they wrap should determine the final
inlining policy.

For x86 tinyconfig we observe:
- vmlinux baseline: 1315988
- vmlinux with patch: 1315928 (-60 bytes)

[ tglx: Cherry-picked from KCSAN ]

Suggested-by: Mark Rutland 
Signed-off-by: Marco Elver 
Acked-by: Mark Rutland 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Thomas Gleixner

locking/atomics: Fix scripts/atomic/ script permissions

2018-11-01T11:45:46+00:00

Mark all these scripts executable.

Cc: Mark Rutland 
Cc: Peter Zijlstra (Intel) 
Cc: Will Deacon 
Cc: linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas 
Cc: linuxdrivers@attotech.com
Cc: dvyukov@google.com
Cc: boqun.feng@gmail.com
Cc: arnd@arndb.de
Cc: aryabinin@virtuozzo.com
Cc: glider@google.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar