linux-toradex.git/kernel/rcupdate.c, branch v2.6.16.53

[PATCH] rcu batch tuning

2006-03-08T22:14:01+00:00

This patch adds new tunables for RCU queue and finished batches.  There are
two types of controls - number of completed RCU updates invoked in a batch
(blimit) and monitoring for high rate of incoming RCUs on a cpu (qhimark,
qlowmark).

By default, the per-cpu batch limit is set to a small value.  If the input
RCU rate exceeds the high watermark, we do two things - force quiescent
state on all cpus and set the batch limit of the CPU to INT_MAX.  Setting
batch limit to INT_MAX forces all finished RCUs to be processed in one shot.
If we have more than INT_MAX RCUs queued up, then we have bigger problems
anyway.  Once the incoming queued RCUs fall below the low watermark, the
batch limit is set to the default.
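
The watermark logic described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: the struct, the function names, and the three constant values are hypothetical, and "force quiescent state" is reduced to a counter.

```c
#include <assert.h>
#include <limits.h>

/* Assumed values for illustration; in the patch these are tunables. */
enum { BLIMIT_DEFAULT = 10, QHIMARK = 10000, QLOWMARK = 100 };

/* Hypothetical per-CPU state; field names follow the commit text. */
struct rcu_cpu_sketch {
    long qlen;             /* callbacks currently queued on this CPU */
    int blimit;            /* max callbacks invoked in one batch */
    int force_qs_requests; /* stand-in for "force quiescent state" */
};

/* Queue one callback; crossing the high watermark lifts the batch limit. */
static void sketch_enqueue(struct rcu_cpu_sketch *c)
{
    if (++c->qlen > QHIMARK) {
        c->blimit = INT_MAX;
        c->force_qs_requests++;
    }
}

/*
 * Invoke up to blimit of the 'done' finished callbacks and return how
 * many were invoked. Once the queue falls to the low watermark, the
 * batch limit goes back to its default.
 */
static long sketch_do_batch(struct rcu_cpu_sketch *c, long done)
{
    assert(done >= 0 && done <= c->qlen);

    long n = done < c->blimit ? done : c->blimit;
    c->qlen -= n;
    if (c->blimit == INT_MAX && c->qlen <= QLOWMARK)
        c->blimit = BLIMIT_DEFAULT;
    return n;
}
```

With the default limit, a CPU holding 25 finished callbacks invokes only 10 per batch. After the high watermark is crossed, a single batch drains everything that has finished.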

Signed-off-by: Dipankar Sarma 
Cc: "Paul E. McKenney" 
Cc: "David S. Miller" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] rcu: fix hotplug-cpu ->donelist leak

2006-01-10T16:49:47+00:00

Pointed out by Srivatsa Vaddagiri.

rcu_do_batch() stops after processing maxbatch callbacks
on ->donelist, leaving rcu_tasklet in the TASKLET_STATE_SCHED
state.

If a CPU_DEAD event then happens, the remaining ->donelist entries
are lost, because rcu_offline_cpu() kills this tasklet.

With this patch ->donelist migrates along with ->curlist
and ->nxtlist to the current cpu.
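
The migration can be pictured with a small user-space sketch. The types and function names below are hypothetical, and requeueing everything on the surviving CPU's ->nxtlist is an assumption made for illustration; the point is only that ->donelist is spliced over together with the other two lists instead of being dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal callback node and a head/tail list, as a stand-in for rcu_head. */
struct cb {
    struct cb *next;
};

struct cb_list {
    struct cb *head;
    struct cb **tail; /* points at the terminating NULL pointer */
};

static void list_init(struct cb_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
}

static void list_add_tail(struct cb_list *l, struct cb *c)
{
    c->next = NULL;
    *l->tail = c;
    l->tail = &c->next;
}

/* Splice every entry of src onto the end of dst, leaving src empty. */
static void list_move_all(struct cb_list *dst, struct cb_list *src)
{
    if (src->head == NULL)
        return;
    *dst->tail = src->head;
    dst->tail = src->tail;
    list_init(src);
}

/* Hypothetical per-CPU data with the three lists named in the commit. */
struct cpu_sketch {
    struct cb_list nxtlist, curlist, donelist;
};

/* Move all callbacks of a dead CPU to the CPU running this code. */
static void sketch_offline_cpu(struct cpu_sketch *dead, struct cpu_sketch *self)
{
    assert(dead != self);
    list_move_all(&self->nxtlist, &dead->curlist);
    list_move_all(&self->nxtlist, &dead->nxtlist);
    list_move_all(&self->nxtlist, &dead->donelist); /* previously leaked */
}
```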

Compile tested.

Signed-off-by: Oleg Nesterov 
Acked-by: Paul E. McKenney 
Cc: Srivatsa Vaddagiri 
Cc: Dipankar Sarma 
Signed-off-by: Linus Torvalds

[PATCH] rcu: join rcu_ctrlblk and rcu_state

2006-01-10T16:42:50+00:00

This patch moves rcu_state into the rcu_ctrlblk. I think there
are no reasons why we should have 2 different variables to control
rcu state. Every user of rcu_state has also "rcu_ctrlblk *rcp" in
the parameter list.

Signed-off-by: Oleg Nesterov 
Acked-by: Paul E. McKenney 
Signed-off-by: Linus Torvalds

[PATCH] rcu: don't set ->next_pending in rcu_start_batch()

2006-01-10T01:01:39+00:00

I think it is better to set ->next_pending in the caller, when
it is needed. This saves one parameter, and this coincides with
cpu_quiet() behaviour, which sets ->completed = ->cur itself.

Signed-off-by: Oleg Nesterov 
Acked-by: Paul E. McKenney 
Signed-off-by: Linus Torvalds

[PATCH] rcu: uninline __rcu_pending()

2006-01-09T17:35:44+00:00

__rcu_pending() is rather fat and called twice from rcu_pending().

rcu_pending() has multiple callers and is not that small either.

This patch uninlines both of them.

Signed-off-by: Oleg Nesterov 
Acked-by: Paul E. McKenney 
Signed-off-by: Linus Torvalds

[PATCH] rcu file: use atomic primitives

2006-01-09T04:13:48+00:00

Use atomic_inc_not_zero for rcu files instead of special case rcuref.
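
For readers unfamiliar with the primitive, the semantics of an increment-unless-zero operation can be sketched with C11 atomics. This is a user-space illustration, not the kernel's atomic_inc_not_zero(); the sketch_ function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count has not already dropped to zero.
 * A reader that found the object under RCU protection uses this to
 * avoid resurrecting an object that is already being freed.
 */
static bool sketch_inc_not_zero(atomic_int *refcount)
{
    int old = atomic_load(refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the caller released the last one. */
static bool sketch_put(atomic_int *refcount)
{
    int old = atomic_fetch_sub(refcount, 1);

    assert(old > 0);
    return old == 1;
}
```

A lookup that gets false back treats the object as already gone and behaves as if it had not been found.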

Signed-off-by: Nick Piggin 
Cc: "Paul E. McKenney" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] RCU signal handling

2006-01-09T04:13:40+00:00

RCU tasklist_lock and RCU signal handling: send signals RCU-read-locked
instead of tasklist_lock read-locked.  This is a scalability improvement on
SMP and a preemption-latency improvement under PREEMPT_RCU.

Signed-off-by: Paul E. McKenney 
Signed-off-by: Ingo Molnar 
Acked-by: William Irwin 
Cc: Roland McGrath 
Cc: Oleg Nesterov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] Change maxaligned_in_smp alignment macros to internodealigned_in_smp macros

2006-01-09T04:13:38+00:00

____cacheline_maxaligned_in_smp is currently used to align critical structures
and avoid false sharing.  It uses per-arch L1_CACHE_SHIFT_MAX and people find
L1_CACHE_SHIFT_MAX useless.

However, we have been using ____cacheline_maxaligned_in_smp to align
structures on the internode cacheline size.  As per Andi's suggestion,
following patch kills ____cacheline_maxaligned_in_smp and introduces
INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches.
Arches needing L3/Internode cacheline alignment can define
INTERNODE_CACHE_SHIFT in the arch asm/cache.h.  Patch replaces
____cacheline_maxaligned_in_smp with ____cacheline_internodealigned_in_smp.

With this patch, L1_CACHE_SHIFT_MAX can be killed.
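
The defaulting scheme can be sketched in portable C11. The L1_CACHE_SHIFT value of 6, the INTERNODE_CACHE_BYTES helper, and the struct are assumptions for illustration; the kernel macro itself is built on a compiler alignment attribute rather than alignas.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas, alignof */

/* Assumed L1 line size of 64 bytes for this sketch. */
#define L1_CACHE_SHIFT 6

/* An arch header may define a larger value; otherwise fall back to L1. */
#ifndef INTERNODE_CACHE_SHIFT
#define INTERNODE_CACHE_SHIFT L1_CACHE_SHIFT
#endif

#define INTERNODE_CACHE_BYTES (1 << INTERNODE_CACHE_SHIFT)

/* Hypothetical structure padded so neighbours never share a line. */
struct sketch_counter {
    alignas(INTERNODE_CACHE_BYTES) long value;
};

/* Two adjacent counters land on separate internode cache lines. */
static struct sketch_counter per_node[2];

static_assert(sizeof(struct sketch_counter) % INTERNODE_CACHE_BYTES == 0,
              "counter must occupy whole internode cache lines");
```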

Signed-off-by: Ravikiran Thirumalai 
Signed-off-by: Shai Fultheim 
Signed-off-by: Andi Kleen 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] Fix RCU race in access of nohz_cpu_mask

2005-12-12T16:57:42+00:00

Accessing nohz_cpu_mask before incrementing rcp->cur is racy.  It can cause
tickless idle CPUs to be included in rsp->cpumask, which will extend
grace periods unnecessarily.

Fix this race.  It has been tested using extensions to RCU torture module
that forces various CPUs to become idle.

Signed-off-by: Srivatsa Vaddagiri 
Cc: Dipankar Sarma 
Cc: "Paul E. McKenney" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] add rcu_barrier() synchronization point

2005-12-12T16:57:42+00:00

This introduces a new interface, rcu_barrier(), which waits until all
the RCU callbacks queued before this call have completed.

Reiser4 needs this because we do more than just free a memory object
in our RCU callback: we also remove it from the list hanging off the
super-block.  This means that before freeing the reiser4-specific portion
of the super-block (during umount) we have to wait until all pending RCU
callbacks are executed.

The only change reiser4 made to the original patch is exporting
rcu_barrier().

Cc: Hans Reiser 
Cc: Vladimir V. Saveliev 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds