linux-toradex.git/include/linux/percpu-rwsem.h, branch v4.11-rc3

locking/percpu-rwsem: Replace waitqueue with rcuwait

2017-01-14T10:14:35+00:00

The use of any kind of wait queue is an overkill for pcpu-rwsems.
While one option would be to use the less heavy simple (swait)
flavor, this is still too much for what pcpu-rwsems needs. For one,
we do not care about any sort of queuing in that the only (rare) time
writers (and readers, for that matter) are queued is when trying to
acquire the regular contended rw_sem. There cannot be any further
queuing as writers are serialized by the rw_sem in the first place.

Given that percpu_down_write() must not be called after exit_notify(),
we can replace the bulky waitqueue with rcuwait such that a writer
can wait for its turn to take the lock. As such, we can avoid the
queue handling and locking overhead.

Signed-off-by: Davidlohr Bueso 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Oleg Nesterov 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1484148146-14210-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

locking/percpu-rwsem: Add down_read_preempt_disable()

2016-09-22T13:25:54+00:00

Provide a down_read()/up_read() variant that keeps preemption disabled
over the whole thing, when possible.

This avoids a needless preemption point for constructs such as:

	percpu_down_read(&global_rwsem);
	spin_lock(&lock);
	...
	spin_unlock(&lock);
	percpu_up_read(&global_rwsem);

Which perturbs timings. In particular it was found to cure a
performance regression in a follow up patch in fs/locks.c

Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

locking/percpu-rwsem: Add DEFINE_STATIC_PERCPU_RWSEMand percpu_rwsem_assert_held()

2016-09-22T13:25:52+00:00

Provide a static init and a standard locking assertion method.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: oleg@redhat.com
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: tj@kernel.org
Cc: viro@ZenIV.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-10T12:34:01+00:00

Currently the percpu-rwsem switches to (global) atomic ops while a
writer is waiting; which could be quite a while and slows down
releasing the readers.

This patch cures this problem by ordering the reader-state vs
reader-count (see the comments in __percpu_down_read() and
percpu_down_write()). This changes a global atomic op into a full
memory barrier, which doesn't have the global cacheline contention.

This also enables using the percpu-rwsem with rcu_sync disabled in order
to bias the implementation differently, reducing the writer latency by
adding some cost to readers.

Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Oleg Nesterov 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Paul McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-kernel@vger.kernel.org
[ Fixed modular build. ]
Signed-off-by: Ingo Molnar

locking/percpu-rwsem: Make use of the rcu_sync infrastructure

2015-10-06T18:25:31+00:00

Currently down_write/up_write calls synchronize_sched_expedited()
twice, which is evil.  Change this code to rely on rcu-sync primitives.
This avoids the _expedited "big hammer", and this can be faster in
the contended case or even in the case when a single thread does
down_write/up_write in a loop.

Of course, a single down_write() will take more time, but otoh it
will be much more friendly to the whole system.

To simplify the review this patch doesn't update the comments, fixed
by the next change.

Signed-off-by: Oleg Nesterov 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Josh Triplett

percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()

2015-08-15T11:52:10+00:00

Add percpu_rwsem_release() and percpu_rwsem_acquire() for the users
which need to return to userspace with percpu-rwsem lock held and/or
pass the ownership to another thread.

TODO: change percpu_rwsem_release() to use rwsem_clear_owner(). We can
either fold kernel/locking/rwsem.h into include/linux/rwsem.h, or add
the non-inline percpu_rwsem_clear_owner().

Signed-off-by: Oleg Nesterov

percpu-rwsem: introduce percpu_down_read_trylock()

2015-08-15T11:52:10+00:00

Add percpu_down_read_trylock(), it will have the user soon.

Signed-off-by: Oleg Nesterov

percpu_rw_semaphore: add lockdep annotations

2012-12-18T01:15:18+00:00

Add lockdep annotations.  Not only this can help to find the potential
problems, we do not want the false warnings if, say, the task takes two
different percpu_rw_semaphore's for reading.  IOW, at least ->rw_sem
should not use a single class.

This patch exposes this internal lock to lockdep so that it represents the
whole percpu_rw_semaphore.  This way we do not need to add another "fake"
->lockdep_map and lock_class_key.  More importantly, this also makes the
output from lockdep much more understandable if it finds the problem.

In short, with this patch from lockdep pov percpu_down_read() and
percpu_up_read() acquire/release ->rw_sem for reading, this matches the
actual semantics.  This abuses __up_read() but I hope this is fine and in
fact I'd like to have down_read_no_lockdep() as well,
percpu_down_read_recursive_readers() will need it.

Signed-off-by: Oleg Nesterov 
Cc: Anton Arapov 
Cc: Ingo Molnar 
Cc: Linus Torvalds 
Cc: Michal Marek 
Cc: Mikulas Patocka 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Srikar Dronamraju 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

percpu_rw_semaphore: kill ->writer_mutex, add ->write_ctr

2012-12-18T01:15:18+00:00

percpu_rw_semaphore->writer_mutex was only added to simplify the initial
rewrite, the only thing it protects is clear_fast_ctr() which otherwise
could be called by multiple writers.  ->rw_sem is enough to serialize the
writers.

Kill this mutex and add "atomic_t write_ctr" instead.  The writers
increment/decrement this counter, the readers check it is zero instead of
mutex_is_locked().

Move atomic_add(clear_fast_ctr(), slow_read_ctr) under down_write() to
avoid the race with other writers.  This is a bit sub-optimal, only the
first writer needs this and we do not need to exclude the readers at this
stage.  But this is simple, we do not want another internal lock until we
add more features.

And this speeds up the write-contended case.  Before this patch the racing
writers sleep in synchronize_sched_expedited() sequentially, with this
patch multiple synchronize_sched_expedited's can "overlap" with each
other.  Note: we can do more optimizations, this is only the first step.

Signed-off-by: Oleg Nesterov 
Cc: Anton Arapov 
Cc: Ingo Molnar 
Cc: Linus Torvalds 
Cc: Michal Marek 
Cc: Mikulas Patocka 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Srikar Dronamraju 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

percpu_rw_semaphore: reimplement to not block the readers unnecessarily

2012-12-18T01:15:18+00:00

Currently the writer does msleep() plus synchronize_sched() 3 times to
acquire/release the semaphore, and during this time the readers are
blocked completely.  Even if the "write" section was not actually started
or if it was already finished.

With this patch down_write/up_write does synchronize_sched() twice and
down_read/up_read are still possible during this time, just they use the
slow path.

percpu_down_write() first forces the readers to use rw_semaphore and
increment the "slow" counter to take the lock for reading, then it
takes that rw_semaphore for writing and blocks the readers.

Also.  With this patch the code relies on the documented behaviour of
synchronize_sched(), it doesn't try to pair synchronize_sched() with
barrier.

Signed-off-by: Oleg Nesterov 
Reviewed-by: Paul E. McKenney 
Cc: Linus Torvalds 
Cc: Mikulas Patocka 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Srikar Dronamraju 
Cc: Ananth N Mavinakayanahalli 
Cc: Anton Arapov 
Cc: Jens Axboe 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds