<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/s390/kernel, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>s390/idle: Add missing EXPORT_SYMBOL_GPL()</title>
<updated>2026-06-18T12:44:44+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-06-17T14:53:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=69388468721151e15d34657a2e4a654741f32f1b'/>
<id>69388468721151e15d34657a2e4a654741f32f1b</id>
<content type='text'>
Uwe Kleine-König reported this build breakage caused by a recent commit
which provides arch specific kcpustat_field_idle()/kcpustat_field_iowait()
functions:

ERROR: modpost: "arch_kcpustat_field_idle" [drivers/leds/trigger/ledtrig-activity.ko] undefined!
ERROR: modpost: "arch_kcpustat_field_iowait" [drivers/leds/trigger/ledtrig-activity.ko] undefined!

Fix this by adding the missing EXPORT_SYMBOL_GPL().

Fixes: 670e057744e0 ("s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()")
Reported-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Closes: https://lore.kernel.org/r/ajKsG0JP6qTssQBX@monoceros
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Uwe Kleine-König reported this build breakage caused by a recent commit
which provides arch specific kcpustat_field_idle()/kcpustat_field_iowait()
functions:

ERROR: modpost: "arch_kcpustat_field_idle" [drivers/leds/trigger/ledtrig-activity.ko] undefined!
ERROR: modpost: "arch_kcpustat_field_iowait" [drivers/leds/trigger/ledtrig-activity.ko] undefined!

Fix this by adding the missing EXPORT_SYMBOL_GPL().

Fixes: 670e057744e0 ("s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()")
Reported-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Closes: https://lore.kernel.org/r/ajKsG0JP6qTssQBX@monoceros
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'idle-time-acc' into features</title>
<updated>2026-06-16T14:21:46+00:00</updated>
<author>
<name>Alexander Gordeev</name>
<email>agordeev@linux.ibm.com</email>
</author>
<published>2026-06-16T14:21:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9fb519794ecb2af57eff63ad82fdfcaf12e7b4b4'/>
<id>9fb519794ecb2af57eff63ad82fdfcaf12e7b4b4</id>
<content type='text'>
Heiko Carstens says:

===================
This is supposed to improve s390 idle time accounting, and brings it
back to the state it was before arch_cpu_idle_time() was removed from
s390 [3].

In result all cpu time accounting is done by the s390 architecture backend
again, instead of having a mix of architecure specific and common code
accounting (common code: idle, s390 architecture: everything else).
===================

Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Heiko Carstens says:

===================
This is supposed to improve s390 idle time accounting, and brings it
back to the state it was before arch_cpu_idle_time() was removed from
s390 [3].

In result all cpu time accounting is done by the s390 architecture backend
again, instead of having a mix of architecure specific and common code
accounting (common code: idle, s390 architecture: everything else).
===================

Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 's390-7.2-1' of gitolite.kernel.org:pub/scm/linux/kernel/git/s390/linux</title>
<updated>2026-06-15T23:38:13+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-15T23:38:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25a01b5155d207e72bdd31b138406f37788403cb'/>
<id>25a01b5155d207e72bdd31b138406f37788403cb</id>
<content type='text'>
Pull s390 updates from Alexander Gordeev:

 - Use CIO device online variable instead of the internal FSM state to
   determine device availability during purge operations

 - Remove extra check of task_stack_page() because try_get_task_stack()
   already takes care of that when reading /proc/&lt;pid&gt;/wchan

 - Allow user-space to use the new SCLP action qualifier 4 for to
   provide NVMe SMART log data to the platform.

 - Send AP CHANGE uevents on successful bind and successful association
   to notify user-space about SE operations on AP queue devices

 - Add an s390dbf kernel parameter to configure debug log levels and
   area sizes during early boot

 - On arm64 the empty zero page is going to be mapped read-only. Do the
   same for s390 with an explicit set_memory_ro() call

 - Improve s390-specific bcr_serialize() and cpu_relax() implementations

 - Remove all unused variables to avoid allmodconfig W=1 build fails
   with latest clang-23

 - Cleanup default Kconfig values for s390 selftests

 - Add a s390-tod trace clock to allow comparing trace timestamps
   between different systems or virtual machines on s390

 - Remove the s390 implementation of strlcat() in favor of the generic
   variant

 - Make consistent the calling order between
   page_table_check_pte_clear() and secure page conversion across all
   code paths

 - Rearrange some fields within AP and zcrypt structs to reduce memory
   consumption and unused holes

 - Shorten GR_NUM and VX_NUM macros and move them to a separate header

 - Replace __get_free_page() with kmalloc() in few sources

 - Introduce an infrastructure for more efficient this_cpu operations.
   Eliminate conditional branches when PREEMPT_NONE is removed

 - Enable Rust support

 - Use z10 as minimum architecture level, similar to the boot code, to
   enforce a defined architecture level set

 - Improve and convert various mem*() helper functions to C. For that
   add .noinstr.text section to avoid orphaned warnings from the linker

 - Fix the function pointer type in __ret_from_fork() to correct the
   indirect call to match kernel thread return type of int

 - Revert support for DCACHE_WORD_ACCESS to avoid an endless exception
   loop on read from donated Ultravisor pages at unaligned addresses

* tag 's390-7.2-1' of gitolite.kernel.org:pub/scm/linux/kernel/git/s390/linux: (52 commits)
  s390: Revert support for DCACHE_WORD_ACCESS
  s390/process: Fix kernel thread function pointer type
  s390/tishift: Convert __ashlti3(), __ashrti3(), __lshrti3() to C
  s390/memmove: Optimize backward copy case
  s390/string: Convert memset(16|32|64)() to C
  s390/string: Convert memcpy() to C
  s390/string: Convert memset() to C
  s390/string: Convert memmove() to C
  s390/string: Add -ffreestanding compile option to string.o
  s390: Add .noinstr.text to boot and purgatory linker scripts
  s390/purgatory: Enforce z10 minimum architecture level
  s390: Enable Rust support
  s390/cmpxchg: Fix KASAN stack-out-of-bounds in atomic helpers
  rust: helpers: Add memchr wrapper for string operations
  rust/bindgen_parameters: Mark s390 types as opaque to prevent repr conflicts
  s390/jump_label: Implement ARCH_STATIC_BRANCH_JUMP_ASM and ARCH_STATIC_BRANCH_ASM macros
  s390/bug: Provide ARCH_WARN_ASM for Rust WARN/BUG support
  s390/ap: Fix locking issue in SE bind and associate sysfs functions
  s390/percpu: Provide arch_this_cpu_write() implementation
  s390/percpu: Provide arch_this_cpu_read() implementation
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull s390 updates from Alexander Gordeev:

 - Use CIO device online variable instead of the internal FSM state to
   determine device availability during purge operations

 - Remove extra check of task_stack_page() because try_get_task_stack()
   already takes care of that when reading /proc/&lt;pid&gt;/wchan

 - Allow user-space to use the new SCLP action qualifier 4 for to
   provide NVMe SMART log data to the platform.

 - Send AP CHANGE uevents on successful bind and successful association
   to notify user-space about SE operations on AP queue devices

 - Add an s390dbf kernel parameter to configure debug log levels and
   area sizes during early boot

 - On arm64 the empty zero page is going to be mapped read-only. Do the
   same for s390 with an explicit set_memory_ro() call

 - Improve s390-specific bcr_serialize() and cpu_relax() implementations

 - Remove all unused variables to avoid allmodconfig W=1 build fails
   with latest clang-23

 - Cleanup default Kconfig values for s390 selftests

 - Add a s390-tod trace clock to allow comparing trace timestamps
   between different systems or virtual machines on s390

 - Remove the s390 implementation of strlcat() in favor of the generic
   variant

 - Make consistent the calling order between
   page_table_check_pte_clear() and secure page conversion across all
   code paths

 - Rearrange some fields within AP and zcrypt structs to reduce memory
   consumption and unused holes

 - Shorten GR_NUM and VX_NUM macros and move them to a separate header

 - Replace __get_free_page() with kmalloc() in few sources

 - Introduce an infrastructure for more efficient this_cpu operations.
   Eliminate conditional branches when PREEMPT_NONE is removed

 - Enable Rust support

 - Use z10 as minimum architecture level, similar to the boot code, to
   enforce a defined architecture level set

 - Improve and convert various mem*() helper functions to C. For that
   add .noinstr.text section to avoid orphaned warnings from the linker

 - Fix the function pointer type in __ret_from_fork() to correct the
   indirect call to match kernel thread return type of int

 - Revert support for DCACHE_WORD_ACCESS to avoid an endless exception
   loop on read from donated Ultravisor pages at unaligned addresses

* tag 's390-7.2-1' of gitolite.kernel.org:pub/scm/linux/kernel/git/s390/linux: (52 commits)
  s390: Revert support for DCACHE_WORD_ACCESS
  s390/process: Fix kernel thread function pointer type
  s390/tishift: Convert __ashlti3(), __ashrti3(), __lshrti3() to C
  s390/memmove: Optimize backward copy case
  s390/string: Convert memset(16|32|64)() to C
  s390/string: Convert memcpy() to C
  s390/string: Convert memset() to C
  s390/string: Convert memmove() to C
  s390/string: Add -ffreestanding compile option to string.o
  s390: Add .noinstr.text to boot and purgatory linker scripts
  s390/purgatory: Enforce z10 minimum architecture level
  s390: Enable Rust support
  s390/cmpxchg: Fix KASAN stack-out-of-bounds in atomic helpers
  rust: helpers: Add memchr wrapper for string operations
  rust/bindgen_parameters: Mark s390 types as opaque to prevent repr conflicts
  s390/jump_label: Implement ARCH_STATIC_BRANCH_JUMP_ASM and ARCH_STATIC_BRANCH_ASM macros
  s390/bug: Provide ARCH_WARN_ASM for Rust WARN/BUG support
  s390/ap: Fix locking issue in SE bind and associate sysfs functions
  s390/percpu: Provide arch_this_cpu_write() implementation
  s390/percpu: Provide arch_this_cpu_read() implementation
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/idle: Remove idle time and count sysfs files</title>
<updated>2026-06-15T14:33:40+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-05-13T14:01:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=968ec3f960d013fc74d3a9bed3ae9ab9a44cb951'/>
<id>968ec3f960d013fc74d3a9bed3ae9ab9a44cb951</id>
<content type='text'>
Remove the s390 specific idle_time_us and idle_count per cpu sysfs
files. They do not provide any additional value. The risk that there
are existing applications which rely on these architecture specific
files should be very low.

However if it turns out such applications exist, this can be easily
reverted.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the s390 specific idle_time_us and idle_count per cpu sysfs
files. They do not provide any additional value. The risk that there
are existing applications which rely on these architecture specific
files should be very low.

However if it turns out such applications exist, this can be easily
reverted.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()</title>
<updated>2026-06-15T14:33:40+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-05-13T14:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=670e057744e0cc8047bf96d15d18c46e16ae2e93'/>
<id>670e057744e0cc8047bf96d15d18c46e16ae2e93</id>
<content type='text'>
The former s390 specific arch_cpu_idle_time() implementation was
removed, since its implementation was racy and reported idle time
could go backwards [1].

However this removal was not necessary, since independently of the s390
architecture specific races there exists the iowait counter update race,
which can also lead to reported idle time going backwards [2].

With Frederic Weisbecker's recent cpu idle time accounting refactoring
kernel_cpustat got a sequence counter. Use this to implement s390 specific
variants of kcpustat_field_idle() and kcpustat_field_iowait(). This is
logically a revert of [1] and moves cpu idle time accounting back into s390
architecture code, which is also more precise than the dyntick idle time
accounting by nohz/scheduler.

For comparing cross cpu time stamps it is necessary to use the stcke
instead of the stckf instruction in irq entry path. Furthermore this
open-codes a sequence lock in assembler and C code, which is required to
update the irq entry time stamp to the per cpu idle_data structure in a
race free manner.

[1] commit be76ea614460 ("s390/idle: remove arch_cpu_idle_time() and corresponding code")
[2] commit ead70b752373 ("timers/nohz: Add a comment about broken iowait counter update race")

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The former s390 specific arch_cpu_idle_time() implementation was
removed, since its implementation was racy and reported idle time
could go backwards [1].

However this removal was not necessary, since independently of the s390
architecture specific races there exists the iowait counter update race,
which can also lead to reported idle time going backwards [2].

With Frederic Weisbecker's recent cpu idle time accounting refactoring
kernel_cpustat got a sequence counter. Use this to implement s390 specific
variants of kcpustat_field_idle() and kcpustat_field_iowait(). This is
logically a revert of [1] and moves cpu idle time accounting back into s390
architecture code, which is also more precise than the dyntick idle time
accounting by nohz/scheduler.

For comparing cross cpu time stamps it is necessary to use the stcke
instead of the stckf instruction in irq entry path. Furthermore this
open-codes a sequence lock in assembler and C code, which is required to
update the irq entry time stamp to the per cpu idle_data structure in a
race free manner.

[1] commit be76ea614460 ("s390/idle: remove arch_cpu_idle_time() and corresponding code")
[2] commit ead70b752373 ("timers/nohz: Add a comment about broken iowait counter update race")

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/irq/idle: Use stcke instead of stckf for time stamps</title>
<updated>2026-06-15T14:33:40+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-05-13T14:01:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4fe307f0f5c1fa6afff9175f59930dcd39ee99fd'/>
<id>4fe307f0f5c1fa6afff9175f59930dcd39ee99fd</id>
<content type='text'>
The upcoming cpu idle time accounting rework involves comparing and
subtracting cross cpu time stamps. Time stamps created with the stckf
instruction monotonic with respect to the local cpu. For cross cpu
monotonic time stamps the slightly slower stcke instruction has to
be used [1].

Convert the idle time accounting relevant usages of stckf to stcke.

[1] Principles of Operation - Setting and Inspecting the Clock

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The upcoming cpu idle time accounting rework involves comparing and
subtracting cross cpu time stamps. Time stamps created with the stckf
instruction monotonic with respect to the local cpu. For cross cpu
monotonic time stamps the slightly slower stcke instruction has to
be used [1].

Convert the idle time accounting relevant usages of stckf to stcke.

[1] Principles of Operation - Setting and Inspecting the Clock

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/process: Fix kernel thread function pointer type</title>
<updated>2026-06-11T15:45:35+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-06-08T14:19:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d0478f5d3cba1095bfdeb43a9b063c10cdebef14'/>
<id>d0478f5d3cba1095bfdeb43a9b063c10cdebef14</id>
<content type='text'>
In case of a kernel thread __ret_from_fork() calls the specified function
indirectly. Fix the kernel thread function pointer, since kernel threads
return an int instead of void.

Fixes: 56e62a737028 ("s390: convert to generic entry")
Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In case of a kernel thread __ret_from_fork() calls the specified function
indirectly. Fix the kernel thread function pointer, since kernel threads
return an int instead of void.

Fixes: 56e62a737028 ("s390: convert to generic entry")
Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/percpu: Infrastructure for more efficient this_cpu operations</title>
<updated>2026-06-03T13:32:46+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2026-05-26T05:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a737737cdb9c94e40a9926cdc2320f874c05d709'/>
<id>a737737cdb9c94e40a9926cdc2320f874c05d709</id>
<content type='text'>
With the intended removal of PREEMPT_NONE this_cpu operations based on
atomic instructions, guarded with preempt_disable()/preempt_enable() pairs
become more expensive: the preempt_disable() / preempt_enable() pairs are
not optimized away anymore during compile time.

In particular the conditional call to preempt_schedule_notrace() after
preempt_enable() adds additional code and register pressure.

E.g. this simple C code sequence

DEFINE_PER_CPU(long, foo);
long bar(long a) { return this_cpu_add_return(foo, a); }

generates this code:

  11a976:       eb af f0 68 00 24       stmg    %r10,%r15,104(%r15)
  11a97c:       b9 04 00 ef             lgr     %r14,%r15
  11a980:       b9 04 00 b2             lgr     %r11,%r2
  11a984:       e3 f0 ff c8 ff 71       lay     %r15,-56(%r15)
  11a98a:       e3 e0 f0 98 00 24       stg     %r14,152(%r15)
  11a990:       eb 01 03 a8 00 6a       asi     936,1            &lt;- __preempt_count_add(1)
  11a996:       c0 10 00 d2 ac b5       larl    %r1,1b70300      &lt;- address of percpu var
  11a9a0:       e3 10 23 b8 00 08       ag      %r1,952          &lt;- add percpu offset
  11a9a6:       eb ab 10 00 00 e8       laag    %r10,%r11,0(%r1) &lt;- atomic op
  11a9ac:       eb ff 03 a8 00 6e       alsi    936,-1           &lt;- __preempt_count_dec_and_test()
  11a9b2:       a7 54 00 05             jnhe    11a9bc &lt;bar+0x4c&gt;
  11a9b6:       c0 e5 00 76 d1 bd       brasl   %r14,ff4d30 &lt;preempt_schedule_notrace&gt;
  11a9bc:       b9 e8 b0 2a             agrk    %r2,%r10,%r11
  11a9c0:       eb af f0 a0 00 04       lmg     %r10,%r15,160(%r15)
  11a9c6        07 fe                   br      %r14

Even though the above example is more or less the worst case, since the
branch to preempt_schedule_notrace() requires a stackframe, which
otherwise wouldn't be necessary, there is also the conditional jnhe branch
instruction.

Get rid of the conditional branch with the following code sequence:

  11a8e6:       c0 30 00 d0 c5 0d       larl    %r3,1b33300
  11a8ec:       b9 04 00 43             lgr     %r4,%r3
  11a8f0:       eb 00 43 c0 00 52       mviy    960,4
  11a8f6:       e3 40 03 b8 00 08       ag      %r4,952
  11a8fc:       eb 52 40 00 00 e8       laag    %r5,%r2,0(%r4)
  11a902:       eb 00 03 c0 00 52       mviy    960,0
  11a908:       b9 08 00 25             agr     %r2,%r5
  11a90c        07 fe                   br      %r14

The general idea is that this_cpu operations based on atomic instructions
are guarded with mviy instructions:

- The first mviy instruction writes the register number, which contains
  the percpu address variable to lowcore. This also indicates that a
  percpu code section is executed.

- The first instruction following the mviy instruction must be the ag
  instruction which adds the percpu offset to the percpu address register.

- Afterwards the atomic percpu operation follows.

- Then a second mviy instruction writes a zero to lowcore, which indicates
  the end of the percpu code section.

- In case of an interrupt/exception/nmi the register number which was
  written to lowcore is copied to the exception frame (pt_regs), and a zero
  is written to lowcore.

- On return to the previous context it is checked if a percpu code section
  was executed (saved register number not zero), and if the process was
  migrated to a different cpu. If the percpu offset was already added to
  the percpu address register (instruction address does _not_ point to the
  ag instruction) the content of the percpu address register is adjusted so
  it points to percpu variable of the new cpu.

Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the intended removal of PREEMPT_NONE this_cpu operations based on
atomic instructions, guarded with preempt_disable()/preempt_enable() pairs
become more expensive: the preempt_disable() / preempt_enable() pairs are
not optimized away anymore during compile time.

In particular the conditional call to preempt_schedule_notrace() after
preempt_enable() adds additional code and register pressure.

E.g. this simple C code sequence

DEFINE_PER_CPU(long, foo);
long bar(long a) { return this_cpu_add_return(foo, a); }

generates this code:

  11a976:       eb af f0 68 00 24       stmg    %r10,%r15,104(%r15)
  11a97c:       b9 04 00 ef             lgr     %r14,%r15
  11a980:       b9 04 00 b2             lgr     %r11,%r2
  11a984:       e3 f0 ff c8 ff 71       lay     %r15,-56(%r15)
  11a98a:       e3 e0 f0 98 00 24       stg     %r14,152(%r15)
  11a990:       eb 01 03 a8 00 6a       asi     936,1            &lt;- __preempt_count_add(1)
  11a996:       c0 10 00 d2 ac b5       larl    %r1,1b70300      &lt;- address of percpu var
  11a9a0:       e3 10 23 b8 00 08       ag      %r1,952          &lt;- add percpu offset
  11a9a6:       eb ab 10 00 00 e8       laag    %r10,%r11,0(%r1) &lt;- atomic op
  11a9ac:       eb ff 03 a8 00 6e       alsi    936,-1           &lt;- __preempt_count_dec_and_test()
  11a9b2:       a7 54 00 05             jnhe    11a9bc &lt;bar+0x4c&gt;
  11a9b6:       c0 e5 00 76 d1 bd       brasl   %r14,ff4d30 &lt;preempt_schedule_notrace&gt;
  11a9bc:       b9 e8 b0 2a             agrk    %r2,%r10,%r11
  11a9c0:       eb af f0 a0 00 04       lmg     %r10,%r15,160(%r15)
  11a9c6        07 fe                   br      %r14

Even though the above example is more or less the worst case, since the
branch to preempt_schedule_notrace() requires a stackframe, which
otherwise wouldn't be necessary, there is also the conditional jnhe branch
instruction.

Get rid of the conditional branch with the following code sequence:

  11a8e6:       c0 30 00 d0 c5 0d       larl    %r3,1b33300
  11a8ec:       b9 04 00 43             lgr     %r4,%r3
  11a8f0:       eb 00 43 c0 00 52       mviy    960,4
  11a8f6:       e3 40 03 b8 00 08       ag      %r4,952
  11a8fc:       eb 52 40 00 00 e8       laag    %r5,%r2,0(%r4)
  11a902:       eb 00 03 c0 00 52       mviy    960,0
  11a908:       b9 08 00 25             agr     %r2,%r5
  11a90c        07 fe                   br      %r14

The general idea is that this_cpu operations based on atomic instructions
are guarded with mviy instructions:

- The first mviy instruction writes the register number, which contains
  the percpu address variable to lowcore. This also indicates that a
  percpu code section is executed.

- The first instruction following the mviy instruction must be the ag
  instruction which adds the percpu offset to the percpu address register.

- Afterwards the atomic percpu operation follows.

- Then a second mviy instruction writes a zero to lowcore, which indicates
  the end of the percpu code section.

- In case of an interrupt/exception/nmi the register number which was
  written to lowcore is copied to the exception frame (pt_regs), and a zero
  is written to lowcore.

- On return to the previous context it is checked if a percpu code section
  was executed (saved register number not zero), and if the process was
  migrated to a different cpu. If the percpu offset was already added to
  the percpu address register (instruction address does _not_ point to the
  ag instruction) the content of the percpu address register is adjusted so
  it points to percpu variable of the new cpu.

Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/time: Prepare to stop elapsing in dynticks-idle</title>
<updated>2026-06-02T19:27:25+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2026-05-08T13:16:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ad5a9e14ec8b4a868fea13a9dfa1fb38b2c35354'/>
<id>ad5a9e14ec8b4a868fea13a9dfa1fb38b2c35354</id>
<content type='text'>
Currently the tick subsystem stores the idle cputime accounting in private
fields, allowing cohabitation with architecture idle vtime accounting. The
former is fetched on online CPUs, the latter on offline CPUs.

For consolidation purposes, architecture vtime accounting will continue to
account the cputime but will make a break when the idle tick is
stopped. The dyntick cputime accounting will then be relayed by the tick
subsystem so that the idle cputime is still seen advancing coherently even
when the tick isn't there to flush the idle vtime.

Prepare for that and introduce three new APIs which will be used in
subsequent patches:

  - vtime_dynticks_start() is deemed to be called when idle enters in
    dyntick mode. The idle cputime that elapsed so far is accumulated
    and accounted. Also idle time accounting is ignored.

  - vtime_dynticks_stop() is deemed to be called when idle exits from
    dyntick mode. The vtime entry clocks are fast-forward to current time
    so that idle accounting restarts elapsing from now. Also idle time
    accounting is resumed.

  - vtime_reset() is deemed to be called from dynticks idle IRQ entry to
    fast-forward the clock to current time so that the IRQ time is still
    accounted by vtime while nohz cputime is paused.

Also accumulated vtime won't be flushed from dyntick-idle ticks to avoid
accounting twice the idle cputime, along with nohz accounting.

Co-developed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260508131647.43868-7-frederic@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the tick subsystem stores the idle cputime accounting in private
fields, allowing cohabitation with architecture idle vtime accounting. The
former is fetched on online CPUs, the latter on offline CPUs.

For consolidation purposes, architecture vtime accounting will continue to
account the cputime but will make a break when the idle tick is
stopped. The dyntick cputime accounting will then be relayed by the tick
subsystem so that the idle cputime is still seen advancing coherently even
when the tick isn't there to flush the idle vtime.

Prepare for that and introduce three new APIs which will be used in
subsequent patches:

  - vtime_dynticks_start() is deemed to be called when idle enters in
    dyntick mode. The idle cputime that elapsed so far is accumulated
    and accounted. Also idle time accounting is ignored.

  - vtime_dynticks_stop() is deemed to be called when idle exits from
    dyntick mode. The vtime entry clocks are fast-forward to current time
    so that idle accounting restarts elapsing from now. Also idle time
    accounting is resumed.

  - vtime_reset() is deemed to be called from dynticks idle IRQ entry to
    fast-forward the clock to current time so that the IRQ time is still
    accounted by vtime while nohz cputime is paused.

Also accumulated vtime won't be flushed from dyntick-idle ticks to avoid
accounting twice the idle cputime, along with nohz accounting.

Co-developed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260508131647.43868-7-frederic@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/tracing: Add s390-tod clock</title>
<updated>2026-05-26T06:15:41+00:00</updated>
<author>
<name>Sven Schnelle</name>
<email>svens@linux.ibm.com</email>
</author>
<published>2026-05-21T10:41:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=29eb8a7b6e3bfca74a9077304b08d42274cbb4d6'/>
<id>29eb8a7b6e3bfca74a9077304b08d42274cbb4d6</id>
<content type='text'>
In order to allow comparing trace timestamps between different
systems or virtual machines on s390, add a s390-tod trace clock.
This clock just uses the returned TOD clock value from stcke
directly.

Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Reviewed-by: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to allow comparing trace timestamps between different
systems or virtual machines on s390, add a s390-tod trace clock.
This clock just uses the returned TOD clock value from stcke
directly.

Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Reviewed-by: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
