<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/mips/include/asm/barrier.h, branch v5.18</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3</title>
<updated>2019-10-07T16:43:08+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ae4cd0b1a4756344cb99c0004d156b585cf9e907'/>
<id>ae4cd0b1a4756344cb99c0004d156b585cf9e907</id>
<content type='text'>
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
emit a full completion barrier as part of the inline assembly containing
LL/SC loops for atomic operations. As such the barrier emitted by
__smp_mb__before_atomic() is redundant, and we can remove it.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
emit a full completion barrier as part of the inline assembly containing
LL/SC loops for atomic operations. As such the barrier emitted by
__smp_mb__before_atomic() is redundant, and we can remove it.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Remove loongson_llsc_mb()</title>
<updated>2019-10-07T16:43:06+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7f56b123548142fd48b2c6891977e8fda695a838'/>
<id>7f56b123548142fd48b2c6891977e8fda695a838</id>
<content type='text'>
The loongson_llsc_mb() macro is no longer used - instead barriers are
emitted as part of inline asm using the __SYNC() macro. Remove the
now-defunct loongson_llsc_mb() macro.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The loongson_llsc_mb() macro is no longer used - instead barriers are
emitted as part of inline asm using the __SYNC() macro. Remove the
now-defunct loongson_llsc_mb() macro.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: futex: Emit Loongson3 sync workarounds within asm</title>
<updated>2019-10-07T16:43:03+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3c1d3f0979721a39dd2980c97466127ce65aa130'/>
<id>3c1d3f0979721a39dd2980c97466127ce65aa130</id>
<content type='text'>
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync &amp; ll instructions respectively.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync &amp; ll instructions respectively.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Clean up sync_ginv()</title>
<updated>2019-10-07T16:42:24+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=185d7d7a58194e3784e8dc2898756065f974090a'/>
<id>185d7d7a58194e3784e8dc2898756065f974090a</id>
<content type='text'>
Use the new __SYNC() infrastructure to implement sync_ginv(), for
consistency with much of the rest of the asm/barrier.h.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the new __SYNC() infrastructure to implement sync_ginv(), for
consistency with much of the rest of the asm/barrier.h.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Clean up __sync() definition</title>
<updated>2019-10-07T16:42:22+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe0065e56227a2f6a6ad717c6d8d871263e482a8'/>
<id>fe0065e56227a2f6a6ad717c6d8d871263e482a8</id>
<content type='text'>
Implement __sync() using the new __SYNC() infrastructure, which will
take care of not emitting an instruction for old R3k CPUs that don't
support it. The only behavioral difference is that __sync() will now
provide a compiler barrier on these old CPUs, but that seems like
reasonable behavior anyway.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement __sync() using the new __SYNC() infrastructure, which will
take care of not emitting an instruction for old R3k CPUs that don't
support it. The only behavioral difference is that __sync() will now
provide a compiler barrier on these old CPUs, but that seems like
reasonable behavior anyway.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery</title>
<updated>2019-10-07T16:42:21+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5c12a6eff6ae3ed32f1c4d6458e58e6c4e9b2352'/>
<id>5c12a6eff6ae3ed32f1c4d6458e58e6c4e9b2352</id>
<content type='text'>
The definition of fast_mb() is the same in both the Octeon &amp; non-Octeon
cases, so remove the duplication &amp; define it only once.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The definition of fast_mb() is the same in both the Octeon &amp; non-Octeon
cases, so remove the duplication &amp; define it only once.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Clean up __smp_mb() definition</title>
<updated>2019-10-07T16:42:20+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=05e6da742b5b708057e84487576655e4d7238dd1'/>
<id>05e6da742b5b708057e84487576655e4d7238dd1</id>
<content type='text'>
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
both cases. Remove the #ifdef &amp; simply expand to the __sync() macro.

Whilst here indent the strong ordering case definitions to match the
indentation of the weak ordering ones, helping readability.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
both cases. Remove the #ifdef &amp; simply expand to the __sync() macro.

Whilst here indent the strong ordering case definitions to match the
indentation of the weak ordering ones, helping readability.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Clean up rmb() &amp; wmb() definitions</title>
<updated>2019-10-07T16:42:18+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=21e3134b3ec09e722cbcda69788f206adc8db1f4'/>
<id>21e3134b3ec09e722cbcda69788f206adc8db1f4</id>
<content type='text'>
Simplify our definitions of rmb() &amp; wmb() using the new __SYNC()
infrastructure.

The fast_rmb() &amp; fast_wmb() macros are removed, since they only provided
a level of indirection that made the code less readable &amp; weren't
directly used anywhere in the kernel tree.

The Octeon #ifdef'ery is removed, since the "syncw" instruction
previously used is merely an alias for "sync 4" which __SYNC() will emit
for the wmb sync type when the kernel is configured for an Octeon CPU.
Similarly __SYNC() will emit nothing for the rmb sync type in Octeon
configurations.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Simplify our definitions of rmb() &amp; wmb() using the new __SYNC()
infrastructure.

The fast_rmb() &amp; fast_wmb() macros are removed, since they only provided
a level of indirection that made the code less readable &amp; weren't
directly used anywhere in the kernel tree.

The Octeon #ifdef'ery is removed, since the "syncw" instruction
previously used is merely an alias for "sync 4" which __SYNC() will emit
for the wmb sync type when the kernel is configured for an Octeon CPU.
Similarly __SYNC() will emit nothing for the rmb sync type in Octeon
configurations.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: barrier: Add __SYNC() infrastructure</title>
<updated>2019-10-07T16:42:17+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2019-10-01T21:53:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf92927251b3642c10f8562d4f884a785cdd1855'/>
<id>bf92927251b3642c10f8562d4f884a785cdd1855</id>
<content type='text'>
Introduce an asm/sync.h header which provides infrastructure that can be
used to generate sync instructions of various types, and for various
reasons. For example if we need a sync instruction that provides a full
completion barrier but only on systems which have weak memory ordering,
we can generate the appropriate assembly code using:

  __SYNC(full, weak_ordering)

When the kernel is configured to run on systems with weak memory
ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync
instruction. When the kernel is configured to run on systems with strong
memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit
nothing. The caller doesn't need to know which happened - it simply says
what it needs &amp; when, with no concern for checking the kernel
configuration.

There are some scenarios in which we may want to emit code only when we
*didn't* emit a sync instruction. For example, some Loongson3 CPUs
suffer from a bug that requires us to emit a sync instruction prior to
each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In
cases where this bug workaround is enabled, it's wasteful to then have
more generic code emit another sync instruction to provide barriers we
need in general. A __SYNC_ELSE() macro allows for this, providing an
extra argument that contains code to be assembled only in cases where
the sync instruction was not emitted. For example if we have a scenario
in which we generally want to emit a release barrier but for affected
Loongson3 configurations upgrade that to a full completion barrier, we
can do that like so:

  __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always))

The assembly generated by these macros can be used either as inline
assembly or in assembly source files.

Differing types of sync as provided by MIPSr6 are defined, but currently
they all generate a full completion barrier except in kernels configured
for Cavium Octeon systems. There the wmb sync-type is used, and rmb
syncs are omitted, as has been the case since commit 6b07d38aaa52
("MIPS: Octeon: Use optimized memory barrier primitives."). Using
__SYNC() with the wmb or rmb types will abstract away the Octeon
specific behavior and allow us to later clean up asm/barrier.h code that
currently includes a plethora of #ifdef's.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce an asm/sync.h header which provides infrastructure that can be
used to generate sync instructions of various types, and for various
reasons. For example if we need a sync instruction that provides a full
completion barrier but only on systems which have weak memory ordering,
we can generate the appropriate assembly code using:

  __SYNC(full, weak_ordering)

When the kernel is configured to run on systems with weak memory
ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync
instruction. When the kernel is configured to run on systems with strong
memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit
nothing. The caller doesn't need to know which happened - it simply says
what it needs &amp; when, with no concern for checking the kernel
configuration.

There are some scenarios in which we may want to emit code only when we
*didn't* emit a sync instruction. For example, some Loongson3 CPUs
suffer from a bug that requires us to emit a sync instruction prior to
each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In
cases where this bug workaround is enabled, it's wasteful to then have
more generic code emit another sync instruction to provide barriers we
need in general. A __SYNC_ELSE() macro allows for this, providing an
extra argument that contains code to be assembled only in cases where
the sync instruction was not emitted. For example if we have a scenario
in which we generally want to emit a release barrier but for affected
Loongson3 configurations upgrade that to a full completion barrier, we
can do that like so:

  __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always))

The assembly generated by these macros can be used either as inline
assembly or in assembly source files.

Differing types of sync as provided by MIPSr6 are defined, but currently
they all generate a full completion barrier except in kernels configured
for Cavium Octeon systems. There the wmb sync-type is used, and rmb
syncs are omitted, as has been the case since commit 6b07d38aaa52
("MIPS: Octeon: Use optimized memory barrier primitives."). Using
__SYNC() with the wmb or rmb types will abstract away the Octeon
specific behavior and allow us to later clean up asm/barrier.h code that
currently includes a plethora of #ifdef's.

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen &lt;chenhc@lemote.com&gt;
Cc: Jiaxun Yang &lt;jiaxun.yang@flygoat.com&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>mips/atomic: Fix smp_mb__{before,after}_atomic()</title>
<updated>2019-08-31T10:06:02+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-06-13T13:43:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=42344113ba7a1ed7b5654cd5270af0d5698d8521'/>
<id>42344113ba7a1ed7b5654cd5270af0d5698d8521</id>
<content type='text'>
Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as MIPS
without WEAK_REORDERING_BEYOND_LLSC) fail for:

	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;

Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:

	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;

Which the CPU is then allowed to re-order (under TSO rules) like:

	atomic_inc(u);
	r0 = *y;
	*x = 1;

And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.

Reported-by: Andrea Parri &lt;andrea.parri@amarulasolutions.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as MIPS
without WEAK_REORDERING_BEYOND_LLSC) fail for:

	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;

Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:

	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;

Which the CPU is then allowed to re-order (under TSO rules) like:

	atomic_inc(u);
	r0 = *y;
	*x = 1;

And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.

Reported-by: Andrea Parri &lt;andrea.parri@amarulasolutions.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
