<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/rcu/tree_exp.h, branch v6.3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>rcu: Allow expedited RCU CPU stall warnings to dump task stacks</title>
<updated>2023-01-04T01:47:44+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-12-20T02:02:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=92987fe8bdd1cbec61919a394bb11316c5d860f4'/>
<id>92987fe8bdd1cbec61919a394bb11316c5d860f4</id>
<content type='text'>
This commit introduces the rcupdate.rcu_exp_stall_task_details kernel
boot parameter, which cause expedited RCU CPU stall warnings to dump
the stacks of any tasks blocking the current expedited grace period.

Reported-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit introduces the rcupdate.rcu_exp_stall_task_details kernel
boot parameter, which cause expedited RCU CPU stall warnings to dump
the stacks of any tasks blocking the current expedited grace period.

Reported-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Suppress smp_processor_id() complaint in synchronize_rcu_expedited_wait()</title>
<updated>2023-01-04T01:28:34+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-12-16T23:55:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2d7f00b2f01301d6e41fd4a28030dab0442265be'/>
<id>2d7f00b2f01301d6e41fd4a28030dab0442265be</id>
<content type='text'>
The normal grace period's RCU CPU stall warnings are invoked from the
scheduling-clock interrupt handler, and can thus invoke smp_processor_id()
with impunity, which allows them to directly invoke dump_cpu_task().
In contrast, the expedited grace period's RCU CPU stall warnings are
invoked from process context, which causes the dump_cpu_task() function's
calls to smp_processor_id() to complain bitterly in debug kernels.

This commit therefore causes synchronize_rcu_expedited_wait() to disable
preemption around its call to dump_cpu_task().

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The normal grace period's RCU CPU stall warnings are invoked from the
scheduling-clock interrupt handler, and can thus invoke smp_processor_id()
with impunity, which allows them to directly invoke dump_cpu_task().
In contrast, the expedited grace period's RCU CPU stall warnings are
invoked from process context, which causes the dump_cpu_task() function's
calls to smp_processor_id() to complain bitterly in debug kernels.

This commit therefore causes synchronize_rcu_expedited_wait() to disable
preemption around its call to dump_cpu_task().

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Make call_rcu() lazy to save power</title>
<updated>2022-11-29T22:02:23+00:00</updated>
<author>
<name>Joel Fernandes (Google)</name>
<email>joel@joelfernandes.org</email>
</author>
<published>2022-10-16T16:22:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3cb278e73be58bfb780ecd55129296d2f74c1fb7'/>
<id>3cb278e73be58bfb780ecd55129296d2f74c1fb7</id>
<content type='text'>
Implement timer-based RCU callback batching (also known as lazy
callbacks). With this we save about 5-10% of power consumed due
to RCU requests that happen when system is lightly loaded or idle.

By default, all async callbacks (queued via call_rcu) are marked
lazy. An alternate API call_rcu_hurry() is provided for the few users,
for example synchronize_rcu(), that need the old behavior.

The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big. Also memory pressure will
flush it in a future patch.

To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use bypass lists which were originally introduced to
address lock contention, to handle lazy CBs as well. The bypass list
length has the lazy CB length included in it. A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.

[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]
[ paulmck: Apply Zqiang feedback. ]
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Suggested-by: Paul McKenney &lt;paulmck@kernel.org&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement timer-based RCU callback batching (also known as lazy
callbacks). With this we save about 5-10% of power consumed due
to RCU requests that happen when system is lightly loaded or idle.

By default, all async callbacks (queued via call_rcu) are marked
lazy. An alternate API call_rcu_hurry() is provided for the few users,
for example synchronize_rcu(), that need the old behavior.

The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big. Also memory pressure will
flush it in a future patch.

To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use bypass lists which were originally introduced to
address lock contention, to handle lazy CBs as well. The bypass list
length has the lazy CB length included in it. A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.

[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]
[ paulmck: Apply Zqiang feedback. ]
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Suggested-by: Paul McKenney &lt;paulmck@kernel.org&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branches 'doc.2022.08.31b', 'fixes.2022.08.31b', 'kvfree.2022.08.31b', 'nocb.2022.09.01a', 'poll.2022.08.31b', 'poll-srcu.2022.08.31b' and 'tasks.2022.08.31b' into HEAD</title>
<updated>2022-09-01T17:55:57+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-09-01T17:55:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5c0ec4900497f7c9cc12f393c329a52e67bc6b8b'/>
<id>5c0ec4900497f7c9cc12f393c329a52e67bc6b8b</id>
<content type='text'>
doc.2022.08.31b: Documentation updates
fixes.2022.08.31b: Miscellaneous fixes
kvfree.2022.08.31b: kvfree_rcu() updates
nocb.2022.09.01a: NOCB CPU updates
poll.2022.08.31b: Full-oldstate RCU polling grace-period API
poll-srcu.2022.08.31b: Polled SRCU grace-period updates
tasks.2022.08.31b: Tasks RCU updates
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
doc.2022.08.31b: Documentation updates
fixes.2022.08.31b: Miscellaneous fixes
kvfree.2022.08.31b: kvfree_rcu() updates
nocb.2022.09.01a: NOCB CPU updates
poll.2022.08.31b: Full-oldstate RCU polling grace-period API
poll-srcu.2022.08.31b: Polled SRCU grace-period updates
tasks.2022.08.31b: Tasks RCU updates
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Make synchronize_rcu_expedited() fast path update .expedited_sequence</title>
<updated>2022-08-31T12:09:21+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-08-05T00:43:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=43ff97cc997f5641127152f97e1fd0fc9fa060f6'/>
<id>43ff97cc997f5641127152f97e1fd0fc9fa060f6</id>
<content type='text'>
This commit causes the early boot single-CPU synchronize_rcu_expedited()
fastpath to update the rcu_state structure's -&gt;expedited_sequence
counter.  This will allow the full-state polled grace-period APIs to
detect all expedited grace periods without the need to track the special
combined polling-only counter, which is another step towards removing
the -&gt;rgos_polled field from the rcu_gp_oldstate, thereby reducing its
size by one third.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit causes the early boot single-CPU synchronize_rcu_expedited()
fastpath to update the rcu_state structure's -&gt;expedited_sequence
counter.  This will allow the full-state polled grace-period APIs to
detect all expedited grace periods without the need to track the special
combined polling-only counter, which is another step towards removing
the -&gt;rgos_polled field from the rcu_gp_oldstate, thereby reducing its
size by one third.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Remove expedited grace-period fast-path forward-progress helper</title>
<updated>2022-08-31T12:09:21+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-08-05T00:34:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e8755d2bde7c296026fe5d47f1b75c7b19ba46fd'/>
<id>e8755d2bde7c296026fe5d47f1b75c7b19ba46fd</id>
<content type='text'>
Now that the expedited grace-period fast path can only happen during
the pre-scheduler portion of early boot, this fast path can no longer
block run-time RCU Trace grace periods.  This commit therefore removes
the conditional cond_resched() invocation.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the expedited grace-period fast path can only happen during
the pre-scheduler portion of early boot, this fast path can no longer
block run-time RCU Trace grace periods.  This commit therefore removes
the conditional cond_resched() invocation.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Add full-sized polling for cond_sync_exp_full()</title>
<updated>2022-08-31T12:08:08+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-08-04T22:23:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8df13f01608ea48712956c0b1afce35bdba5a1c5'/>
<id>8df13f01608ea48712956c0b1afce35bdba5a1c5</id>
<content type='text'>
The cond_synchronize_rcu_expedited() API compresses the combined expedited and
normal grace-period states into a single unsigned long, which conserves
storage, but can miss grace periods in certain cases involving overlapping
normal and expedited grace periods.  Missing the occasional grace period
is usually not a problem, but there are use cases that care about each
and every grace period.

This commit therefore adds yet another member of the full-state RCU
grace-period polling API, which is the cond_synchronize_rcu_exp_full()
function.  This uses up to three times the storage (rcu_gp_oldstate
structure instead of unsigned long), but is guaranteed not to miss
grace periods.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cond_synchronize_rcu_expedited() API compresses the combined expedited and
normal grace-period states into a single unsigned long, which conserves
storage, but can miss grace periods in certain cases involving overlapping
normal and expedited grace periods.  Missing the occasional grace period
is usually not a problem, but there are use cases that care about each
and every grace period.

This commit therefore adds yet another member of the full-state RCU
grace-period polling API, which is the cond_synchronize_rcu_exp_full()
function.  This uses up to three times the storage (rcu_gp_oldstate
structure instead of unsigned long), but is guaranteed not to miss
grace periods.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Add full-sized polling for start_poll_expedited()</title>
<updated>2022-08-31T12:08:08+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-08-03T19:38:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6c502b14ba66da0670a59e20354469fa56eab26c'/>
<id>6c502b14ba66da0670a59e20354469fa56eab26c</id>
<content type='text'>
The start_poll_synchronize_rcu_expedited() API compresses the combined
expedited and normal grace-period states into a single unsigned long,
which conserves storage, but can miss grace periods in certain cases
involving overlapping normal and expedited grace periods.  Missing the
occasional grace period is usually not a problem, but there are use
cases that care about each and every grace period.

This commit therefore adds yet another member of the
full-state RCU grace-period polling API, which is the
start_poll_synchronize_rcu_expedited_full() function.  This uses up to
three times the storage (rcu_gp_oldstate structure instead of unsigned
long), but is guaranteed not to miss grace periods.

[ paulmck: Apply feedback from kernel test robot and Julia Lawall. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The start_poll_synchronize_rcu_expedited() API compresses the combined
expedited and normal grace-period states into a single unsigned long,
which conserves storage, but can miss grace periods in certain cases
involving overlapping normal and expedited grace periods.  Missing the
occasional grace period is usually not a problem, but there are use
cases that care about each and every grace period.

This commit therefore adds yet another member of the
full-state RCU grace-period polling API, which is the
start_poll_synchronize_rcu_expedited_full() function.  This uses up to
three times the storage (rcu_gp_oldstate structure instead of unsigned
long), but is guaranteed not to miss grace periods.

[ paulmck: Apply feedback from kernel test robot and Julia Lawall. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Add QS check in rcu_exp_handler() for non-preemptible kernels</title>
<updated>2022-08-31T12:03:14+00:00</updated>
<author>
<name>Zqiang</name>
<email>qiang1.zhang@intel.com</email>
</author>
<published>2022-07-05T19:09:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fcb42c9a77d490ed0974e4d394519481aa06e585'/>
<id>fcb42c9a77d490ed0974e4d394519481aa06e585</id>
<content type='text'>
Kernels built with CONFIG_PREEMPTION=n and CONFIG_PREEMPT_COUNT=y maintain
preempt_count() state.  Because such kernels map __rcu_read_lock()
and __rcu_read_unlock() to preempt_disable() and preempt_enable(),
respectively, this allows the expedited grace period's !CONFIG_PREEMPT_RCU
version of the rcu_exp_handler() IPI handler function to use
preempt_count() to detect quiescent states.

This preempt_count() usage might seem to risk failures due to
use of implicit RCU readers in portions of the kernel under #ifndef
CONFIG_PREEMPTION, except that rcu_core() already disallows such implicit
RCU readers.  The moral of this story is that you must use explicit
read-side markings such as rcu_read_lock() or preempt_disable() even if
the code knows that this kernel does not support preemption.

This commit therefore adds a preempt_count()-based check for a quiescent
state in the !CONFIG_PREEMPT_RCU version of the rcu_exp_handler()
function for kernels built with CONFIG_PREEMPT_COUNT=y, reporting an
immediate quiescent state when the interrupted code had both preemption
and softirqs enabled.

This change results in about a 2% reduction in expedited grace-period
latency in kernels built with both CONFIG_PREEMPT_RCU=n and
CONFIG_PREEMPT_COUNT=y.

Signed-off-by: Zqiang &lt;qiang1.zhang@intel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lore.kernel.org/all/20220622103549.2840087-1-qiang1.zhang@intel.com/
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kernels built with CONFIG_PREEMPTION=n and CONFIG_PREEMPT_COUNT=y maintain
preempt_count() state.  Because such kernels map __rcu_read_lock()
and __rcu_read_unlock() to preempt_disable() and preempt_enable(),
respectively, this allows the expedited grace period's !CONFIG_PREEMPT_RCU
version of the rcu_exp_handler() IPI handler function to use
preempt_count() to detect quiescent states.

This preempt_count() usage might seem to risk failures due to
use of implicit RCU readers in portions of the kernel under #ifndef
CONFIG_PREEMPTION, except that rcu_core() already disallows such implicit
RCU readers.  The moral of this story is that you must use explicit
read-side markings such as rcu_read_lock() or preempt_disable() even if
the code knows that this kernel does not support preemption.

This commit therefore adds a preempt_count()-based check for a quiescent
state in the !CONFIG_PREEMPT_RCU version of the rcu_exp_handler()
function for kernels built with CONFIG_PREEMPT_COUNT=y, reporting an
immediate quiescent state when the interrupted code had both preemption
and softirqs enabled.

This change results in about a 2% reduction in expedited grace-period
latency in kernels built with both CONFIG_PREEMPT_RCU=n and
CONFIG_PREEMPT_COUNT=y.

Signed-off-by: Zqiang &lt;qiang1.zhang@intel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lore.kernel.org/all/20220622103549.2840087-1-qiang1.zhang@intel.com/
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ctxt.2022.07.05a' into HEAD</title>
<updated>2022-07-22T00:46:18+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-07-22T00:46:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=34bc7b454dc31f75a0be7ee8ab378135523d7c51'/>
<id>34bc7b454dc31f75a0be7ee8ab378135523d7c51</id>
<content type='text'>
ctxt.2022.07.05a: Linux-kernel memory model development branch.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ctxt.2022.07.05a: Linux-kernel memory model development branch.
</pre>
</div>
</content>
</entry>
</feed>
