linux-toradex.git/kernel/locking/qspinlock_stat.h, branch v4.9.16

locking/pvstat: Separate wait_again and spurious wakeup stats

2016-08-10T12:16:02+00:00

Currently there is overlap in the pvqspinlock wait_again and
spurious_wakeup stat counters. Because of lock stealing, it is
no longer possible to accurately determine if a spurious wakeup has
happened in the queue head.  As they track both the queue node and
queue head status, it is also hard to tell how many of those come
from the queue head and how many from the queue node.

This patch changes the accounting rules so that spurious wakeup is
only tracked in the queue node. The wait_again count, however, is
only tracked in the queue head when the vCPU failed to acquire the
lock after a vCPU kick. This should give a much better indication of
the wait-kick dynamics in the queue node and the queue head.
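
The accounting rules described above can be sketched in user-space C as
follows. This is only an illustration, not the kernel code: struct pv_stats
and both helper functions are hypothetical names, and the condition used to
call a wakeup "spurious" (the node has not yet become the queue head) is an
assumption made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical counters; the kernel keeps its qstat values per CPU. */
struct pv_stats {
    unsigned long spurious_wakeup; /* tracked in the queue node only */
    unsigned long wait_again;      /* tracked in the queue head only */
};

/*
 * Queue node: called when a waiting vCPU wakes up.
 * Assumption: the wakeup is spurious if the node has not
 * become the queue head yet.
 */
static void account_node_wakeup(struct pv_stats *s, bool became_head)
{
    if (!became_head)
        s->spurious_wakeup++;
}

/*
 * Queue head: called after the vCPU has been kicked.
 * wait_again is counted only when the lock could not be
 * acquired after the kick, so the vCPU must wait once more.
 */
static void account_head_kick(struct pv_stats *s, bool got_lock)
{
    if (!got_lock)
        s->wait_again++;
}
```

With this split, each counter describes exactly one place in the queue, so
the two numbers can be read independently.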

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Boqun Feng 
Cc: Douglas Hatch 
Cc: Linus Torvalds 
Cc: Pan Xinhui 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1464713631-1066-2-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Fix a bug in qstat_read()

2016-08-10T12:13:29+00:00

It's obviously wrong to set stat to NULL. So let's remove it.
Otherwise it is always zero when we check the latency of kick/wake.

Signed-off-by: Pan Xinhui 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Waiman Long 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1468405414-3700-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Robustify init_qspinlock_stat()

2016-05-05T07:58:51+00:00

Specifically around the debugfs file creation calls,
I have no idea if they could ever possibly fail, but
this is core code (debug aside) so let's at least
check the return value and report anything fishy.

Signed-off-by: Davidlohr Bueso 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Waiman Long 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20160420041725.GC3472@linux-uzut.site
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Avoid double resetting of stats

2016-05-05T07:58:49+00:00

... remove the redundant second iteration, this is most
likely a copy/paste buglet.

Signed-off-by: Davidlohr Bueso 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: dave@stgolabs.net
Cc: waiman.long@hpe.com
Link: http://lkml.kernel.org/r/1460961103-24953-2-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Fix division by zero in qstat_read()

2016-04-19T08:49:19+00:00

While playing with the qstat statistics (in /qlockstat/) I ran into
the following splat on a VM when opening pv_hash_hops:

  divide error: 0000 [#1] SMP
  ...
  RIP: 0010:[]  [] qstat_read+0x12e/0x1e0
  ...
  Call Trace:
    [] ? mem_cgroup_commit_charge+0x6c/0xd0
    [] ? page_add_new_anon_rmap+0x8c/0xd0
    [] ? handle_mm_fault+0x1439/0x1b40
    [] ? do_mmap+0x449/0x550
    [] ? __vfs_read+0x23/0xd0
    [] ? rw_verify_area+0x52/0xd0
    [] ? vfs_read+0x81/0x120
    [] ? SyS_read+0x42/0xa0
    [] ? entry_SYSCALL_64_fastpath+0x1e/0xa8

Fix this by verifying that qstat_pv_kick_unlock is in fact non-zero,
similarly to what the qstat_pv_latency_wake case does. If nothing
else, a zero count can come from resetting the statistics, so having
0 kicks is quite valid in this context.
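
A minimal user-space sketch of such a guard is shown below. It is not the
actual qstat_read() code: the function name avg_hops_x100() and the
fixed-point scaling by 100 are assumptions chosen so the example can run
without floating point.

```c
#include <stdint.h>

/*
 * Average number of hash hops per unlock kick, scaled by 100 so
 * that two decimal places can be printed without floating point.
 * Returns 0 when there have been no kicks (for example right after
 * the statistics were reset) instead of dividing by zero.
 */
static uint64_t avg_hops_x100(uint64_t hops, uint64_t kicks)
{
    if (kicks == 0)
        return 0;
    return (hops * 100 + kicks / 2) / kicks; /* round to nearest */
}
```

For example, avg_hops_x100(3, 2) yields 150, which would be printed as 1.50,
and avg_hops_x100(0, 0) yields 0.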

Signed-off-by: Davidlohr Bueso 
Reviewed-by: Waiman Long 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: dave@stgolabs.net
Cc: waiman.long@hpe.com
Link: http://lkml.kernel.org/r/1460961103-24953-1-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Enable slowpath locking count tracking

2016-02-29T09:02:42+00:00

This patch enables the tracking of the number of slowpath locking
operations performed. This can be used to compare against the number
of lock stealing operations to see what percentage of locks are stolen
versus acquired via the regular slowpath.
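
As a rough illustration of that comparison, the helper below computes the
share of stolen locks. The function name steal_percent() is hypothetical, and
the sketch assumes the two counters are disjoint, i.e. a stolen lock is not
also counted as a slowpath acquisition.

```c
/*
 * Percentage of lock acquisitions that were steals, assuming
 * "steals" and "slowpaths" count disjoint events.
 * Returns 0 when both counters are zero.
 */
static unsigned int steal_percent(unsigned long steals,
                                  unsigned long slowpaths)
{
    unsigned long total = steals + slowpaths;

    if (total == 0)
        return 0;
    return (unsigned int)(steals * 100 / total); /* truncated */
}
```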

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Douglas Hatch 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1449778666-13593-2-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()

2016-02-29T09:02:41+00:00

This patch moves the lock stealing count tracking code into
pv_queued_spin_steal_lock() instead of going through a jacket function,
which simplifies the code.

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Douglas Hatch 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1449778666-13593-3-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Queue node adaptive spinning

2015-12-04T10:39:51+00:00

In an overcommitted guest where some vCPUs have to be halted to make
forward progress in other areas, it is highly likely that a vCPU later
in the spinlock queue will be spinning while the ones earlier in the
queue would have been halted. The spinning in the later vCPUs is then
just a waste of precious CPU cycles because they are not going to
get the lock soon as the earlier ones have to be woken up and take
their turn to get the lock.

This patch implements an adaptive spinning mechanism where the vCPU
will call pv_wait() if the previous vCPU is not running.
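
The decision a queue node makes on each spin iteration might be sketched as
below. This is a simplified user-space model, not the kernel implementation:
the types, the node_wait_decision() helper and the spin budget parameter are
all hypothetical names introduced for illustration.

```c
#include <stdbool.h>

enum vcpu_state { VCPU_RUNNING, VCPU_HALTED };

struct pv_node {
    bool locked;            /* true once this node is the queue head */
    enum vcpu_state state;  /* whether the node's vCPU is running */
};

enum wait_action { KEEP_SPINNING, CALL_PV_WAIT, STOP_WAITING };

/*
 * Decide what a queue node should do next.
 * - If it has become the queue head, it stops waiting.
 * - If the previous vCPU in the queue is not running, spinning is
 *   pointless, so the node should call pv_wait() right away.
 * - If the (assumed) spin budget is used up, it also calls pv_wait().
 * - Otherwise it keeps spinning.
 */
static enum wait_action node_wait_decision(const struct pv_node *self,
                                           const struct pv_node *prev,
                                           int spins_left)
{
    if (self->locked)
        return STOP_WAITING;
    if (prev->state != VCPU_RUNNING)
        return CALL_PV_WAIT;
    if (spins_left <= 0)
        return CALL_PV_WAIT;
    return KEEP_SPINNING;
}
```

The point of the sketch is the second check: a halted predecessor short-cuts
the spin loop instead of letting it run to the end of its budget.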

Linux kernel builds were run in a KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                    Westmere                    Haswell
  Patch         32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----         --------    --------    --------    --------
  Before patch   3m02.3s     5m00.2s     1m43.7s     3m03.5s
  After patch    3m03.0s     4m37.5s     1m43.0s     2m47.2s

For 32 vCPUs, this patch doesn't cause any noticeable change in
performance. For 48 vCPUs (over-committed), there is about 8%
performance improvement.

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Davidlohr Bueso 
Cc: Douglas Hatch 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1447114167-47185-8-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Allow limited lock stealing

2015-12-04T10:39:51+00:00

This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath. To prevent lock starvation, the pending
bit will be set by the queue head vCPU when it is in the active lock
spinning loop to disable any lock stealing attempt.  This helps to
reduce the performance penalty caused by lock waiter preemption while
not having much of the downsides of a real unfair lock.
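
The interaction between a single steal attempt and the pending bit can be
modelled with C11 atomics as below. This is a sketch under stated
assumptions, not the kernel's lock word layout: the two-bit encoding and the
function names try_steal_lock(), head_set_pending() and head_try_lock() are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCKED_BIT  0x1u  /* hypothetical encoding for this sketch */
#define PENDING_BIT 0x2u

/* One steal attempt: succeeds only if neither bit is set. */
static bool try_steal_lock(atomic_uint *lock)
{
    unsigned int old = atomic_load_explicit(lock, memory_order_relaxed);

    if (old & (LOCKED_BIT | PENDING_BIT))
        return false;
    return atomic_compare_exchange_strong_explicit(
        lock, &old, old | LOCKED_BIT,
        memory_order_acquire, memory_order_relaxed);
}

/* Queue head enters its active spinning loop: block further stealing. */
static void head_set_pending(atomic_uint *lock)
{
    atomic_fetch_or_explicit(lock, PENDING_BIT, memory_order_relaxed);
}

/* Queue head takes the lock once it is free, clearing the pending bit. */
static bool head_try_lock(atomic_uint *lock)
{
    unsigned int old = atomic_load_explicit(lock, memory_order_relaxed);

    if (old & LOCKED_BIT)
        return false;
    return atomic_compare_exchange_strong_explicit(
        lock, &old, (old & ~PENDING_BIT) | LOCKED_BIT,
        memory_order_acquire, memory_order_relaxed);
}
```

Once head_set_pending() has run, try_steal_lock() fails even though the lock
is free, which is what keeps the queue head from being starved.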

The pv_wait_head() function was renamed as pv_wait_head_or_lock()
as it was modified to acquire the lock before returning. This is
necessary because of possible lock stealing attempts from other tasks.

Linux kernel builds were run in a KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                    Westmere                    Haswell
  Patch         32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----         --------    --------    --------    --------
  Before patch   3m15.6s    10m56.1s     1m44.1s     5m29.1s
  After patch    3m02.3s     5m00.2s     1m43.7s     3m03.5s

For the overcommitted case (48 vCPUs), this patch is able to reduce
kernel build time by more than 54% for Westmere and 44% for Haswell.

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Davidlohr Bueso 
Cc: Douglas Hatch 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1447190336-53317-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

locking/pvqspinlock: Collect slowpath lock statistics

2015-12-04T10:39:50+00:00

This patch enables the accumulation of kicking and waiting related
PV qspinlock statistics when the new QUEUED_LOCK_STAT configuration
option is selected. It also enables the collection of data which
enables us to calculate the kicking and wakeup latencies, which
depend heavily on the CPUs being used.

The statistical counters are per-cpu variables to minimize the
performance overhead in their updates. These counters are exported
via the debugfs filesystem under the qlockstat directory.  When the
corresponding debugfs files are read, summation and computing of the
required data are then performed.
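
The idea of cheap per-CPU updates with summation deferred to read time can
be sketched in plain C as below. This is a user-space model only: the fixed
CPU count, the array layout and the names qstat_inc_sketch() and
qstat_sum_sketch() are assumptions, and real per-cpu variables in the kernel
are accessed through its own per-CPU API rather than an array index.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4  /* hypothetical fixed CPU count */

enum qstat_id { QSTAT_KICK, QSTAT_WAIT, QSTAT_NUM };

/* One row of counters per CPU, so an update touches only local data. */
static uint64_t qstats[NR_CPUS_SKETCH][QSTAT_NUM];

/* Update path: increment the counter belonging to the given CPU. */
static void qstat_inc_sketch(int cpu, enum qstat_id id)
{
    qstats[cpu][id]++;
}

/* Read path: sum the counter over all CPUs only when it is read. */
static uint64_t qstat_sum_sketch(enum qstat_id id)
{
    uint64_t sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += qstats[cpu][id];
    return sum;
}
```

The cost of the summation is paid by the reader of the debugfs file, not by
the locking slowpath that updates the counters.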

The measured latencies for different CPUs are:

	CPU		Wakeup		Kicking
	---		------		-------
	Haswell-EX	63.6us		 7.4us
	Westmere-EX	67.6us		 9.3us

The measured latencies varied a bit from run-to-run. The wakeup
latency is much higher than the kicking latency.

A sample of statistical counters after system bootup (with vCPU
overcommit) was:

	pv_hash_hops=1.00
	pv_kick_unlock=1148
	pv_kick_wake=1146
	pv_latency_kick=11040
	pv_latency_wake=194840
	pv_spurious_wakeup=7
	pv_wait_again=4
	pv_wait_head=23
	pv_wait_node=1129

Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Davidlohr Bueso 
Cc: Douglas Hatch 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1447114167-47185-6-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar