linux-toradex.git/kernel/locking/rwsem.h, branch v5.10-rc3

locking/percpu-rwsem: Remove the embedded rwsem

2020-02-11T12:10:56+00:00

The filesystem freezer uses percpu-rwsem in a way that is effectively
write_non_owner() and achieves this with a few horrible hacks that
rely on the rwsem (!percpu) implementation.

When PREEMPT_RT replaces the rwsem implementation with a PI aware
variant this comes apart.

Remove the embedded rwsem and implement it using a waitqueue and an
atomic_t.

 - make readers_block an atomic, and use it, with the waitqueue
   for a blocking test-and-set write-side.

 - have the read-side wait for the 'lock' state to clear.

Have the waiters use FIFO queueing and mark them (reader/writer) with
a new WQ_FLAG. Use a custom wake_function to wake either a single
writer or all readers until a writer.
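
The write-side test-and-set and the wake policy described above can be
sketched in user-space C11. This is only an illustration under stated
assumptions: the names writer_trylock(), writer_unlock(), wake_fifo() and
struct waiter are hypothetical, the FIFO is a plain array instead of a
kernel waitqueue, and no task is actually put to sleep or woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
enum waiter_kind { WAITER_READER, WAITER_WRITER };

struct waiter {
	enum waiter_kind kind;	/* plays the role of the reader/writer mark */
	bool woken;
};

/* Write-side test-and-set: true if the caller set the 'block' flag. */
static bool writer_trylock(atomic_int *readers_block)
{
	return atomic_exchange(readers_block, 1) == 0;
}

/* Clear the 'block' flag; the caller must currently hold it. */
static void writer_unlock(atomic_int *readers_block)
{
	assert(atomic_load(readers_block) == 1);
	atomic_store(readers_block, 0);
}

/*
 * Walk the queue in FIFO order. A writer at the head is woken alone;
 * otherwise readers are woken until the first writer is reached.
 * Returns the number of waiters marked as woken.
 */
static size_t wake_fifo(struct waiter *q, size_t n)
{
	size_t woken = 0;

	for (size_t i = 0; i < n; i++) {
		if (q[i].kind == WAITER_WRITER) {
			if (woken == 0) {
				q[i].woken = true;
				woken = 1;
			}
			break;
		}
		q[i].woken = true;
		woken++;
	}
	return woken;
}
```

With a queue of reader, reader, writer, reader, the sketch marks only the
first two readers as woken; with a writer at the head it marks only that
writer.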

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Davidlohr Bueso 
Acked-by: Will Deacon 
Acked-by: Waiman Long 
Tested-by: Juri Lelli 
Link: https://lkml.kernel.org/r/20200204092403.GB14879@hirez.programming.kicks-ass.net

locking/percpu-rwsem, lockdep: Make percpu-rwsem use its own lockdep_map

2020-02-11T12:10:53+00:00

As preparation for replacing the embedded rwsem, give percpu-rwsem its
own lockdep_map.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Davidlohr Bueso 
Acked-by: Will Deacon 
Acked-by: Waiman Long 
Tested-by: Juri Lelli 
Link: https://lkml.kernel.org/r/20200131151539.927625541@infradead.org

locking/rwsem: Merge rwsem.h and rwsem-xadd.c into rwsem.c

2019-06-17T10:27:57+00:00

Now we only have one implementation of rwsem. Even though we still use
xadd to handle reader locking, we use cmpxchg for the writer instead. So
the filename rwsem-xadd.c is not strictly correct. Also, no one outside
of the rwsem code needs to know the internal implementation other than
the function prototypes for two internal functions that are called
directly from percpu-rwsem.c.

So the rwsem-xadd.c and rwsem.h files are now merged into rwsem.c in
the following order:

  
  
  
  

The rwsem.h file now contains only 2 function declarations for
__up_read() and __down_read().

This is a code relocation patch with no code change at all except
making __up_read() and __down_read() non-static functions so they
can be used by percpu-rwsem.c.

Suggested-by: Peter Zijlstra 
Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Cc: huang ying 
Link: https://lkml.kernel.org/r/20190520205918.22251-5-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Implement a new locking scheme

2019-06-17T10:27:56+00:00

The current way of using various reader, writer and waiting biases
in the rwsem code is confusing and hard to understand. I have to
reread the rwsem count guide in the rwsem-xadd.c file from time to
time to remind myself how this whole thing works. It also makes the
rwsem code harder to optimize.

To make rwsem more sane, a new locking scheme similar to the one in
qrwlock is now being used.  The atomic long count has the following
bit definitions:

  Bit  0   - writer locked bit
  Bit  1   - waiters present bit
  Bits 2-7 - reserved for future extension
  Bits 8-X - reader count (24/56 bits)

The cmpxchg instruction is now used to acquire the write lock. The read
lock is still acquired with xadd instruction, so there is no change here.
This scheme will allow up to 16M/64P active readers which should be
more than enough. We can always use some more reserved bits if necessary.

With that change, we can deterministically know if a rwsem has been
write-locked. Looking at the count alone, however, one cannot determine
for certain if a rwsem is owned by readers or not as the readers that
set the reader count bits may be in the process of backing out. So we
still need the reader-owned bit in the owner field to be sure.
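
The bit layout above, with cmpxchg for the write lock and xadd for the
read lock, can be sketched in user-space C11 as follows. The macro and
function names are illustrative assumptions, not the kernel's, and the
sketch omits the waiters bit handling, the owner field and the slowpaths.
It does show why the reader count alone is not conclusive: a reader that
finds the writer bit set has already added its bias and must back out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit layout taken from the commit message; names are hypothetical. */
#define SKETCH_WRITER_LOCKED	(1L << 0)	/* bit 0 */
#define SKETCH_FLAG_WAITERS	(1L << 1)	/* bit 1, unused in this sketch */
#define SKETCH_READER_SHIFT	8		/* reader count starts at bit 8 */
#define SKETCH_READER_BIAS	(1L << SKETCH_READER_SHIFT)

struct sketch_rwsem {
	atomic_long count;
};

/* Write lock: cmpxchg from "completely free" (0) to writer-locked. */
static bool sketch_write_trylock(struct sketch_rwsem *sem)
{
	long expected = 0;

	return atomic_compare_exchange_strong(&sem->count, &expected,
					      SKETCH_WRITER_LOCKED);
}

static void sketch_write_unlock(struct sketch_rwsem *sem)
{
	long old = atomic_fetch_and(&sem->count, ~SKETCH_WRITER_LOCKED);

	assert(old & SKETCH_WRITER_LOCKED);
	(void)old;
}

/* Read lock: xadd one reader bias, then back out if a writer holds it. */
static bool sketch_read_trylock(struct sketch_rwsem *sem)
{
	long old = atomic_fetch_add(&sem->count, SKETCH_READER_BIAS);

	if (old & SKETCH_WRITER_LOCKED) {
		/* The bias was briefly visible to others; undo it. */
		atomic_fetch_sub(&sem->count, SKETCH_READER_BIAS);
		return false;
	}
	return true;
}

static void sketch_read_unlock(struct sketch_rwsem *sem)
{
	atomic_fetch_sub(&sem->count, SKETCH_READER_BIAS);
}

/* The writer bit is authoritative ... */
static bool sketch_is_write_locked(struct sketch_rwsem *sem)
{
	return (atomic_load(&sem->count) & SKETCH_WRITER_LOCKED) != 0;
}

/* ... but this value may include readers that are about to back out. */
static long sketch_reader_count(struct sketch_rwsem *sem)
{
	return atomic_load(&sem->count) >> SKETCH_READER_SHIFT;
}
```

With the reader count starting at bit 8, a 32-bit long leaves 24 bits
(2^24 = 16M readers) and a 64-bit long leaves 56 bits (2^56 = 64P readers),
which matches the 16M/64P figures quoted above.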

With a locking microbenchmark running on a 5.1-based kernel, the total
locking rates (in kops/s) of the benchmark on an 8-socket 120-core
IvyBridge-EX system before and after the patch were as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        30,659   31,341   31,055   31,283
        2         8,909   16,457    9,884   17,659
        4         9,028   15,823    8,933   20,233
        8         8,410   14,212    7,230   17,140
       16         8,217   25,240    7,479   24,607

The locking rates of the benchmark on a Power8 system were as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        12,963   13,647   13,275   13,601
        2         7,570   11,569    7,902   10,829
        4         5,232    5,516    5,466    5,435
        8         5,233    3,386    5,467    3,168

The locking rates of the benchmark on a 2-socket ARM64 system were
as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        21,495   21,046   21,524   21,074
        2         5,293   10,502    5,333   10,504
        4         5,325   11,463    5,358   11,631
        8         5,391   11,712    5,470   11,680

The performance is roughly the same before and after the patch. There
are run-to-run variations in performance. Runs with higher variances
usually have higher throughput.

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Cc: huang ying 
Link: https://lkml.kernel.org/r/20190520205918.22251-4-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Make owner available even if !CONFIG_RWSEM_SPIN_ON_OWNER

2019-06-17T10:27:54+00:00

The owner field in the rw_semaphore structure is used primarily for
optimistic spinning. However, identifying the rwsem owner can also be
helpful in debugging as well as tracing locking related issues when
analyzing a crash dump. The owner field may also store state information
that can be important to the operation of the rwsem.

So the owner field is now made a permanent member of the rw_semaphore
structure irrespective of CONFIG_RWSEM_SPIN_ON_OWNER.

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Cc: huang ying 
Link: https://lkml.kernel.org/r/20190520205918.22251-2-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Prevent unneeded warning during locking selftest

2019-04-14T09:09:35+00:00

Disable the DEBUG_RWSEMS check when the locking selftest is running with
the debug_locks_silent flag set.

Signed-off-by: Waiman Long 
Cc: Davidlohr Bueso 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Cc: huang ying 
Link: http://lkml.kernel.org/r/20190413172259.2740-2-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Enable lock event counting

2019-04-10T08:56:06+00:00

Add lock event counting calls so that we can track the number of lock
events happening in the rwsem code.

With CONFIG_LOCK_EVENT_COUNTS on and booting a 4-socket 112-thread x86-64
system, the rwsem counts after system bootup were as follows:

  rwsem_opt_fail=261
  rwsem_opt_wlock=50636
  rwsem_rlock=445
  rwsem_rlock_fail=0
  rwsem_rlock_fast=22
  rwsem_rtrylock=810144
  rwsem_sleep_reader=441
  rwsem_sleep_writer=310
  rwsem_wake_reader=355
  rwsem_wake_writer=2335
  rwsem_wlock=261
  rwsem_wlock_fail=0
  rwsem_wtrylock=20583

It can be seen that most of the lock acquisitions in the slowpath were
write-locks in the optimistic spinning code path with no sleeping at
all. For this system, over 97% of the locks are acquired via optimistic
spinning. It illustrates the importance of optimistic spinning in
improving the performance of rwsem.
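
One way to check the "over 97%" figure against the counts above is
sketched below. It assumes that rwsem_opt_wlock, rwsem_rlock and
rwsem_wlock together account for the slowpath acquisitions; that reading
of the counters, and the struct and function names, are assumptions of
this sketch rather than something the commit states.

```c
#include <assert.h>

/* Counts copied from the listing above; the grouping is an assumption. */
struct rwsem_boot_counts {
	unsigned long opt_wlock;	/* rwsem_opt_wlock */
	unsigned long rlock;		/* rwsem_rlock */
	unsigned long wlock;		/* rwsem_wlock */
};

static const struct rwsem_boot_counts boot_counts = {
	.opt_wlock = 50636,
	.rlock = 445,
	.wlock = 261,
};

/* Fraction of the counted acquisitions taken by optimistic spinning. */
static double opt_spin_share(const struct rwsem_boot_counts *c)
{
	unsigned long total = c->opt_wlock + c->rlock + c->wlock;

	assert(total > 0);
	return (double)c->opt_wlock / (double)total;
}
```

Under that assumption, opt_spin_share(&boot_counts) is 50636 / 51342,
which is above 0.97 and so consistent with the statement above.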

Signed-off-by: Waiman Long 
Acked-by: Peter Zijlstra 
Acked-by: Davidlohr Bueso 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Link: http://lkml.kernel.org/r/20190404174320.22416-11-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro

2019-04-10T08:56:03+00:00

Currently, the DEBUG_RWSEMS_WARN_ON() macro just dumps a stack trace
when the rwsem isn't in the right state. It does not show the actual
states of the rwsem. This may not be that helpful in the debugging
process.

Enhance the DEBUG_RWSEMS_WARN_ON() macro to also show the current
content of the rwsem count and owner fields to give more information
about what is wrong with the rwsem. The debug_locks_off() function is
called as is done inside DEBUG_LOCKS_WARN_ON().
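
The idea of a warning that also reports the lock state can be sketched in
user-space C as follows. Everything here is a hypothetical stand-in:
struct dbg_rwsem, sketch_debug_locks, sketch_warn() and
SKETCH_RWSEM_WARN_ON() are not kernel names, the message format is
invented, and the warn-once policy is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two rw_semaphore fields being reported. */
struct dbg_rwsem {
	long count;
	unsigned long owner;
};

/* Stand-in for the kernel's debug_locks flag. */
static bool sketch_debug_locks = true;

/*
 * If the condition holds, print the failed expression together with the
 * count and owner values (only once, by assumption), then turn lock
 * debugging off. Returns whether the condition was true.
 */
static bool sketch_warn(bool cond, const char *expr,
			const struct dbg_rwsem *sem)
{
	static bool warned;

	assert(sem != NULL);
	if (!cond)
		return false;

	if (!warned) {
		warned = true;
		fprintf(stderr,
			"rwsem check failed: %s, count = 0x%lx, owner = 0x%lx\n",
			expr, (unsigned long)sem->count, sem->owner);
	}
	sketch_debug_locks = false;	/* mirrors calling debug_locks_off() */
	return true;
}

/* Stringify the condition so the message names the failed check. */
#define SKETCH_RWSEM_WARN_ON(cond, sem) sketch_warn((cond), #cond, (sem))
```

A caller would write something like SKETCH_RWSEM_WARN_ON(sem.owner == 0,
&sem) at the point where the lock is expected to have an owner.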

Signed-off-by: Waiman Long 
Acked-by: Peter Zijlstra 
Acked-by: Davidlohr Bueso 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Link: http://lkml.kernel.org/r/20190404174320.22416-7-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Add debug check for __down_read*()

2019-04-10T08:56:02+00:00

When rwsem_down_read_failed*() returns, the read lock has been acquired
indirectly by others. So debug checks are added in __down_read() and
__down_read_killable() to make sure the rwsem is really reader-owned.

The other debug check calls in kernel/locking/rwsem.c except the
one in up_read_non_owner() are also moved over to rwsem-xadd.h.

Signed-off-by: Waiman Long 
Acked-by: Peter Zijlstra 
Acked-by: Davidlohr Bueso 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Cc: Will Deacon 
Link: http://lkml.kernel.org/r/20190404174320.22416-6-longman@redhat.com
Signed-off-by: Ingo Molnar

locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h

2019-04-10T08:56:00+00:00

We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.

Signed-off-by: Waiman Long 
Acked-by: Peter Zijlstra 
Acked-by: Will Deacon 
Acked-by: Davidlohr Bueso 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tim Chen 
Link: http://lkml.kernel.org/r/20190404174320.22416-4-longman@redhat.com
Signed-off-by: Ingo Molnar