<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/arm/include/asm/percpu.h, branch v4.9.51</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ARM: 8174/1: Use global stack register variable for percpu</title>
<updated>2014-11-13T23:58:06+00:00</updated>
<author>
<name>Mark Charlebois</name>
<email>charlebm@gmail.com</email>
</author>
<published>2014-09-26T23:31:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ccbd2da50db2111f6d055e3201c39a37e542ceb6'/>
<id>ccbd2da50db2111f6d055e3201c39a37e542ceb6</id>
<content type='text'>
Using global current_stack_pointer works on both clang and gcc.
current_stack_pointer is an unsigned long and needs to be cast
as a pointer to dereference.

Signed-off-by: Mark Charlebois &lt;charlebm@gmail.com&gt;
Signed-off-by: Behan Webster &lt;behanw@converseincode.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using global current_stack_pointer works on both clang and gcc.
current_stack_pointer is an unsigned long and needs to be cast
as a pointer to dereference.

Signed-off-by: Mark Charlebois &lt;charlebm@gmail.com&gt;
Signed-off-by: Behan Webster &lt;behanw@converseincode.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier()</title>
<updated>2013-06-05T22:35:56+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2013-06-05T10:20:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=509eb76ebf9771abc9fe51859382df2571f11447'/>
<id>509eb76ebf9771abc9fe51859382df2571f11447</id>
<content type='text'>
__my_cpu_offset is non-volatile, since we want its value to be cached
when we access several per-cpu variables in a row with preemption
disabled. This means that we rely on preempt_{en,dis}able to hazard
with the operation via the barrier() macro, so that we can't end up
migrating CPUs without reloading the per-cpu offset.

Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile
asm block as a side-effect, and will happily re-order it before other
memory clobbers (including those in prempt_disable()) and cache the
value. This has been observed to break the cmpxchg logic in the slub
allocator, leading to livelock in kmem_cache_alloc in mainline kernels.

This patch adds a dummy memory input operand to __my_cpu_offset,
forcing it to be ordered with respect to the barrier() macro.

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Rob Herring &lt;rob.herring@calxeda.com&gt;
Reviewed-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__my_cpu_offset is non-volatile, since we want its value to be cached
when we access several per-cpu variables in a row with preemption
disabled. This means that we rely on preempt_{en,dis}able to hazard
with the operation via the barrier() macro, so that we can't end up
migrating CPUs without reloading the per-cpu offset.

Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile
asm block as a side-effect, and will happily re-order it before other
memory clobbers (including those in prempt_disable()) and cache the
value. This has been observed to break the cmpxchg logic in the slub
allocator, leading to livelock in kmem_cache_alloc in mainline kernels.

This patch adds a dummy memory input operand to __my_cpu_offset,
forcing it to be ordered with respect to the barrier() macro.

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Rob Herring &lt;rob.herring@calxeda.com&gt;
Reviewed-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7587/1: implement optimized percpu variable access</title>
<updated>2012-12-03T11:16:36+00:00</updated>
<author>
<name>Rob Herring</name>
<email>rob.herring@calxeda.com</email>
</author>
<published>2012-11-29T19:39:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=14318efb322e2fe1a034c69463d725209eb9d548'/>
<id>14318efb322e2fe1a034c69463d725209eb9d548</id>
<content type='text'>
Use the previously unused TPIDRPRW register to store percpu offsets.
TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.

This replaces 2 loads with a mrc instruction for each percpu variable
access. With hackbench, the performance improvement is 1.4% on Cortex-A9
(highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:

Before: 6.2191
After: 6.1348

Will Deacon reported similar delta on v6 with 11MPCore.

The asm "memory clobber" are needed here to ensure the percpu offset
gets reloaded. Testing by Will found that this would not happen in
__schedule() which is a bit of a special case as preemption is disabled
but the execution can move cores.

Signed-off-by: Rob Herring &lt;rob.herring@calxeda.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the previously unused TPIDRPRW register to store percpu offsets.
TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.

This replaces 2 loads with a mrc instruction for each percpu variable
access. With hackbench, the performance improvement is 1.4% on Cortex-A9
(highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:

Before: 6.2191
After: 6.1348

Will Deacon reported similar delta on v6 with 11MPCore.

The asm "memory clobber" are needed here to ensure the percpu offset
gets reloaded. Testing by Will found that this would not happen in
__schedule() which is a bit of a special case as preemption is disabled
but the execution can move cores.

Signed-off-by: Rob Herring &lt;rob.herring@calxeda.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7006/1: Migrate to asm-generic wrapper support</title>
<updated>2011-10-17T08:12:40+00:00</updated>
<author>
<name>Stephen Boyd</name>
<email>sboyd@codeaurora.org</email>
</author>
<published>2011-07-26T17:45:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3f8e288033ec7f52b570efad7c2eb42741f6d710'/>
<id>3f8e288033ec7f52b570efad7c2eb42741f6d710</id>
<content type='text'>
With d8ecc5c (kbuild: asm-generic support, 2011-04-27) we can
remove a handful of asm-generic wrappers in ARM code. Since the
generic version of sizes.h doesn't contain SZ_48M, we replace
the 4 users of SZ_48M with the equivalent SZ_32M + SZ_16M.

Signed-off-by: Stephen Boyd &lt;sboyd@codeaurora.org&gt;
Cc: Imre Kaloz &lt;kaloz@openwrt.org&gt;
Acked-by: Krzysztof Halasa &lt;khc@pm.waw.pl&gt;
Cc: Eric Miao &lt;eric.y.miao@gmail.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With d8ecc5c (kbuild: asm-generic support, 2011-04-27) we can
remove a handful of asm-generic wrappers in ARM code. Since the
generic version of sizes.h doesn't contain SZ_48M, we replace
the 4 users of SZ_48M with the equivalent SZ_32M + SZ_16M.

Signed-off-by: Stephen Boyd &lt;sboyd@codeaurora.org&gt;
Cc: Imre Kaloz &lt;kaloz@openwrt.org&gt;
Acked-by: Krzysztof Halasa &lt;khc@pm.waw.pl&gt;
Cc: Eric Miao &lt;eric.y.miao@gmail.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[ARM] move include/asm-arm to arch/arm/include/asm</title>
<updated>2008-08-02T20:32:35+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk@dyn-67.arm.linux.org.uk</email>
</author>
<published>2008-08-02T09:55:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4baa9922430662431231ac637adedddbb0cfb2d7'/>
<id>4baa9922430662431231ac637adedddbb0cfb2d7</id>
<content type='text'>
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
