<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/arc/include/asm/spinlock.h, branch v4.9.44</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>locking/spinlock, arch: Update and fix spin_unlock_wait() implementations</title>
<updated>2016-06-14T09:55:15+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-05-26T08:35:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=726328d92a42b6d4b76078e2659f43067f82c4e8'/>
<id>726328d92a42b6d4b76078e2659f43067f82c4e8</id>
<content type='text'>
This patch updates/fixes all spin_unlock_wait() implementations.

The update is in semantics; where it previously was only a control
dependency, we now upgrade to a full load-acquire to match the
store-release from the spin_unlock() we waited on. This ensures that
when spin_unlock_wait() returns, we're guaranteed to observe the full
critical section we waited on.

This fixes a number of spin_unlock_wait() users that (not
unreasonably) rely on this.

I also fixed a number of ticket lock versions to only wait on the
current lock holder, instead of for a full unlock, as this is
sufficient.

Furthermore; again for ticket locks; I added an smp_rmb() in between
the initial ticket load and the spin loop testing the current value
because I could not convince myself the address dependency is
sufficient, esp. if the loads are of different sizes.

I'm more than happy to remove this smp_rmb() again if people are
certain the address dependency does indeed work as expected.

Note: PPC32 will be fixed independently

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: chris@zankel.net
Cc: cmetcalf@mellanox.com
Cc: davem@davemloft.net
Cc: dhowells@redhat.com
Cc: james.hogan@imgtec.com
Cc: jejb@parisc-linux.org
Cc: linux@armlinux.org.uk
Cc: mpe@ellerman.id.au
Cc: ralf@linux-mips.org
Cc: realmz6@gmail.com
Cc: rkuo@codeaurora.org
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Cc: vgupta@synopsys.com
Cc: ysato@users.sourceforge.jp
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch updates/fixes all spin_unlock_wait() implementations.

The update is in semantics; where it previously was only a control
dependency, we now upgrade to a full load-acquire to match the
store-release from the spin_unlock() we waited on. This ensures that
when spin_unlock_wait() returns, we're guaranteed to observe the full
critical section we waited on.

This fixes a number of spin_unlock_wait() users that (not
unreasonably) rely on this.

I also fixed a number of ticket lock versions to only wait on the
current lock holder, instead of for a full unlock, as this is
sufficient.

Furthermore; again for ticket locks; I added an smp_rmb() in between
the initial ticket load and the spin loop testing the current value
because I could not convince myself the address dependency is
sufficient, esp. if the loads are of different sizes.

I'm more than happy to remove this smp_rmb() again if people are
certain the address dependency does indeed work as expected.

Note: PPC32 will be fixed independently

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: chris@zankel.net
Cc: cmetcalf@mellanox.com
Cc: davem@davemloft.net
Cc: dhowells@redhat.com
Cc: james.hogan@imgtec.com
Cc: jejb@parisc-linux.org
Cc: linux@armlinux.org.uk
Cc: mpe@ellerman.id.au
Cc: ralf@linux-mips.org
Cc: realmz6@gmail.com
Cc: rkuo@codeaurora.org
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Cc: vgupta@synopsys.com
Cc: ysato@users.sourceforge.jp
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff"</title>
<updated>2016-06-02T05:29:23+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2016-05-31T11:05:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ed6aefed726a305bd36344e230d2a9e9301226fc'/>
<id>ed6aefed726a305bd36344e230d2a9e9301226fc</id>
<content type='text'>
This reverts commit e78fdfef84be13a5c2b8276e12203cdf24778596.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit e78fdfef84be13a5c2b8276e12203cdf24778596.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle"</title>
<updated>2016-06-02T05:29:23+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2016-05-31T11:03:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=819f3602dcbd6b021cd50e18f5d05da30bca5b07'/>
<id>819f3602dcbd6b021cd50e18f5d05da30bca5b07</id>
<content type='text'>
This reverts commit b89aa12c177477e34caa722818536fb5d0bffd76.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit b89aa12c177477e34caa722818536fb5d0bffd76.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff"</title>
<updated>2016-06-02T05:29:22+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2016-05-31T11:01:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=42316a201a60be38b07db1ebc3a1633107ed7209'/>
<id>42316a201a60be38b07db1ebc3a1633107ed7209</id>
<content type='text'>
This reverts commit 10971638701dedadb58c88ce4d31c9375b224ed6.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL - so revert thw whole delayed retry
series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 10971638701dedadb58c88ce4d31c9375b224ed6.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL - so revert thw whole delayed retry
series !

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: rwlock: disable interrupts in !LLSC variant</title>
<updated>2016-05-09T04:02:32+00:00</updated>
<author>
<name>Noam Camus</name>
<email>noamc@ezchip.com</email>
</author>
<published>2015-06-09T11:05:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2a1021fce85cb9867f3655c58a9c826a3612fae9'/>
<id>2a1021fce85cb9867f3655c58a9c826a3612fae9</id>
<content type='text'>
If we hold rwlock and interrupt occures we may
end up spinning on it for ever during softirq.
Note that this lock is an internal lock
and since the lock is free to be used from any context,
the lock needs to be IRQ-safe.

Below you may see an example for interrupt we get while
nl_table_lock is holding its rw-&gt;lock_mutex and we spinned
on it for ever.

The concept for the fix was taken from SPARC.

[2015-05-12 19:16:12] Stack Trace:
[2015-05-12 19:16:12]   arc_unwind_core+0xb8/0x11c
[2015-05-12 19:16:12]   dump_stack+0x68/0xac
[2015-05-12 19:16:12]   _raw_read_lock+0xa8/0xac
[2015-05-12 19:16:12]   netlink_broadcast_filtered+0x56/0x35c
[2015-05-12 19:16:12]   nlmsg_notify+0x42/0xa4
[2015-05-12 19:16:13]   neigh_update+0x1fe/0x44c
[2015-05-12 19:16:13]   neigh_event_ns+0x40/0xa4
[2015-05-12 19:16:13]   arp_process+0x46e/0x5a8
[2015-05-12 19:16:13]   __netif_receive_skb_core+0x358/0x500
[2015-05-12 19:16:13]   process_backlog+0x92/0x154
[2015-05-12 19:16:13]   net_rx_action+0xb8/0x188
[2015-05-12 19:16:13]   __do_softirq+0xda/0x1d8
[2015-05-12 19:16:14]   irq_exit+0x8a/0x8c
[2015-05-12 19:16:14]   arch_do_IRQ+0x6c/0xa8
[2015-05-12 19:16:14]   handle_interrupt_level1+0xe4/0xf0

Signed-off-by: Noam Camus &lt;noamc@ezchip.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we hold rwlock and interrupt occures we may
end up spinning on it for ever during softirq.
Note that this lock is an internal lock
and since the lock is free to be used from any context,
the lock needs to be IRQ-safe.

Below you may see an example for interrupt we get while
nl_table_lock is holding its rw-&gt;lock_mutex and we spinned
on it for ever.

The concept for the fix was taken from SPARC.

[2015-05-12 19:16:12] Stack Trace:
[2015-05-12 19:16:12]   arc_unwind_core+0xb8/0x11c
[2015-05-12 19:16:12]   dump_stack+0x68/0xac
[2015-05-12 19:16:12]   _raw_read_lock+0xa8/0xac
[2015-05-12 19:16:12]   netlink_broadcast_filtered+0x56/0x35c
[2015-05-12 19:16:12]   nlmsg_notify+0x42/0xa4
[2015-05-12 19:16:13]   neigh_update+0x1fe/0x44c
[2015-05-12 19:16:13]   neigh_event_ns+0x40/0xa4
[2015-05-12 19:16:13]   arp_process+0x46e/0x5a8
[2015-05-12 19:16:13]   __netif_receive_skb_core+0x358/0x500
[2015-05-12 19:16:13]   process_backlog+0x92/0x154
[2015-05-12 19:16:13]   net_rx_action+0xb8/0x188
[2015-05-12 19:16:13]   __do_softirq+0xda/0x1d8
[2015-05-12 19:16:14]   irq_exit+0x8a/0x8c
[2015-05-12 19:16:14]   arch_do_IRQ+0x6c/0xa8
[2015-05-12 19:16:14]   handle_interrupt_level1+0xe4/0xf0

Signed-off-by: Noam Camus &lt;noamc@ezchip.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff</title>
<updated>2015-08-07T08:26:16+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-08-07T07:31:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10971638701dedadb58c88ce4d31c9375b224ed6'/>
<id>10971638701dedadb58c88ce4d31c9375b224ed6</id>
<content type='text'>
The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)

Suggested-by: Nigel Topham &lt;ntopham@synopsys.com&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)

Suggested-by: Nigel Topham &lt;ntopham@synopsys.com&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle</title>
<updated>2015-08-04T03:56:35+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-07-21T15:46:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b89aa12c177477e34caa722818536fb5d0bffd76'/>
<id>b89aa12c177477e34caa722818536fb5d0bffd76</id>
<content type='text'>
The previous commit for delayed retry of SCOND needs some fine tuning
for spin locks.

The backoff from delayed retry in conjunction with spin looping of lock
itself can potentially cause the delay counter to reach high values.
So to provide fairness to any lock operation, after a lock "seems"
available (i.e. just before first SCOND try0, reset the delay counter
back to starting value of 1

Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The previous commit for delayed retry of SCOND needs some fine tuning
for spin locks.

The backoff from delayed retry in conjunction with spin looping of lock
itself can potentially cause the delay counter to reach high values.
So to provide fairness to any lock operation, after a lock "seems"
available (i.e. just before first SCOND try0, reset the delay counter
back to starting value of 1

Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff</title>
<updated>2015-08-04T03:56:34+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-07-14T14:20:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e78fdfef84be13a5c2b8276e12203cdf24778596'/>
<id>e78fdfef84be13a5c2b8276e12203cdf24778596</id>
<content type='text'>
This is to workaround the llock/scond livelock

HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contenting cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.

Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is to workaround the llock/scond livelock

HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contenting cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.

Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: LLOCK/SCOND based rwlock</title>
<updated>2015-08-04T03:56:33+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-07-16T05:01:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=69cbe630f54ec02efe47fdb9e257e617161da370'/>
<id>69cbe630f54ec02efe47fdb9e257e617161da370</id>
<content type='text'>
With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need
for a guarding spin lock.

This in turn elides the EXchange instruction based spinning which causes
the cacheline transition to exclusive state and concurrent spinning
across cores would cause the line to keep bouncing around.
LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps
the cacheline in shared state.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need
for a guarding spin lock.

This in turn elides the EXchange instruction based spinning which causes
the cacheline transition to exclusive state and concurrent spinning
across cores would cause the line to keep bouncing around.
LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps
the cacheline in shared state.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: LLOCK/SCOND based spin_lock</title>
<updated>2015-08-04T03:56:33+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-07-14T12:25:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ae7eae9e031206bde8645d038c76ad89e7d3fbcb'/>
<id>ae7eae9e031206bde8645d038c76ad89e7d3fbcb</id>
<content type='text'>
Current spin_lock uses EXchange instruction to implement the atomic test
and set of lock location (reads orig value and ST 1). This however forces
the cacheline into exclusive state (because of the ST) and concurrent
loops in multiple cores will bounce the line around between cores.

Instead, use LLOCK/SCOND to implement the atomic test and set which is
better as line is in shared state while lock is spinning on LLOCK

The real motivation of this change however is to make way for future
changes in atomics to implement delayed retry (with backoff).
Initial experiment with delayed retry in atomics combined with orig
EX based spinlock was a total disaster (broke even LMBench) as
struct sock has a cache line sharing an atomic_t and spinlock. The
tight spinning on lock, caused the atomic retry to keep backing off
such that it would never finish.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current spin_lock uses EXchange instruction to implement the atomic test
and set of lock location (reads orig value and ST 1). This however forces
the cacheline into exclusive state (because of the ST) and concurrent
loops in multiple cores will bounce the line around between cores.

Instead, use LLOCK/SCOND to implement the atomic test and set which is
better as line is in shared state while lock is spinning on LLOCK

The real motivation of this change however is to make way for future
changes in atomics to implement delayed retry (with backoff).
Initial experiment with delayed retry in atomics combined with orig
EX based spinlock was a total disaster (broke even LMBench) as
struct sock has a cache line sharing an atomic_t and spinlock. The
tight spinning on lock, caused the atomic retry to keep backing off
such that it would never finish.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
