<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/arm/lib/bitops.h, branch v4.2.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+</title>
<updated>2014-07-18T11:29:04+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@arm.linux.org.uk</email>
</author>
<published>2014-06-30T15:29:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6ebbf2ce437b33022d30badd49dc94d33ecfa498'/>
<id>6ebbf2ce437b33022d30badd49dc94d33ecfa498</id>
<content type='text'>
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls.  Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code.  This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Stephen Warren &lt;swarren@nvidia.com&gt; # Tegra Jetson TK1
Tested-by: Robert Jarzmik &lt;robert.jarzmik@free.fr&gt; # mioa701_bootresume.S
Tested-by: Andrew Lunn &lt;andrew@lunn.ch&gt; # Kirkwood
Tested-by: Shawn Guo &lt;shawn.guo@freescale.com&gt;
Tested-by: Tony Lindgren &lt;tony@atomide.com&gt; # OMAPs
Tested-by: Gregory CLEMENT &lt;gregory.clement@free-electrons.com&gt; # Armada XP, 375, 385
Acked-by: Sekhar Nori &lt;nsekhar@ti.com&gt; # DaVinci
Acked-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt; # kvm/hyp
Acked-by: Haojian Zhuang &lt;haojian.zhuang@gmail.com&gt; # PXA3xx
Acked-by: Stefano Stabellini &lt;stefano.stabellini@eu.citrix.com&gt; # Xen
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt; # ARMv7M
Tested-by: Simon Horman &lt;horms+renesas@verge.net.au&gt; # Shmobile
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls.  Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code.  This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Stephen Warren &lt;swarren@nvidia.com&gt; # Tegra Jetson TK1
Tested-by: Robert Jarzmik &lt;robert.jarzmik@free.fr&gt; # mioa701_bootresume.S
Tested-by: Andrew Lunn &lt;andrew@lunn.ch&gt; # Kirkwood
Tested-by: Shawn Guo &lt;shawn.guo@freescale.com&gt;
Tested-by: Tony Lindgren &lt;tony@atomide.com&gt; # OMAPs
Tested-by: Gregory CLEMENT &lt;gregory.clement@free-electrons.com&gt; # Armada XP, 375, 385
Acked-by: Sekhar Nori &lt;nsekhar@ti.com&gt; # DaVinci
Acked-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt; # kvm/hyp
Acked-by: Haojian Zhuang &lt;haojian.zhuang@gmail.com&gt; # PXA3xx
Acked-by: Stefano Stabellini &lt;stefano.stabellini@eu.citrix.com&gt; # Xen
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt; # ARMv7M
Tested-by: Simon Horman &lt;horms+renesas@verge.net.au&gt; # Shmobile
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics</title>
<updated>2014-02-25T11:30:20+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2014-02-21T16:01:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c32ffce0f66e5d1d4856254516e24f5ef275cd00'/>
<id>c32ffce0f66e5d1d4856254516e24f5ef275cd00</id>
<content type='text'>
After a bunch of benchmarking on the interaction between dmb and pldw,
it turns out that issuing the pldw *after* the dmb instruction can
give modest performance gains (~3% atomic_add_return improvement on a
dual A15).

This patch adds prefetchw invocations to our barriered atomic operations
including cmpxchg, test_and_xxx and futexes.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After a bunch of benchmarking on the interaction between dmb and pldw,
it turns out that issuing the pldw *after* the dmb instruction can
give modest performance gains (~3% atomic_add_return improvement on a
dual A15).

This patch adds prefetchw invocations to our barriered atomic operations
including cmpxchg, test_and_xxx and futexes.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP</title>
<updated>2013-11-20T23:05:53+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2013-11-19T14:46:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7ec699405f55667caeb46d96229d75bf33a83ad'/>
<id>b7ec699405f55667caeb46d96229d75bf33a83ad</id>
<content type='text'>
Uwe reported a build failure when targetting a NOMMU platform with my
recent prefetch changes:

  arch/arm/lib/changebit.S: Assembler messages:
  arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is
			not allowed for the current base architecture

This is due to use of the .arch_extension mp directive immediately prior
to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to
nothing if !CONFIG_SMP, gas will still choke on the directive.

This patch fixes the issue by only emitting the sequence (including the
directive) if CONFIG_SMP=y.

Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Uwe reported a build failure when targetting a NOMMU platform with my
recent prefetch changes:

  arch/arm/lib/changebit.S: Assembler messages:
  arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is
			not allowed for the current base architecture

This is due to use of the .arch_extension mp directive immediately prior
to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to
nothing if !CONFIG_SMP, gas will still choke on the directive.

This patch fixes the issue by only emitting the sequence (including the
directive) if CONFIG_SMP=y.

Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: bitops: prefetch the destination word for write prior to strex</title>
<updated>2013-09-30T15:42:56+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2013-06-27T11:01:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d779c07dd72098a7416d907494f958213b7726f3'/>
<id>d779c07dd72098a7416d907494f958213b7726f3</id>
<content type='text'>
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.

This patch prefixes our atomic bitops implementation with prefetchw,
to try and grab the line in exclusive state from the start. The testop
macro is left alone, since the barrier semantics limit the usefulness
of prefetching data.

Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.

This patch prefixes our atomic bitops implementation with prefetchw,
to try and grab the line in exclusive state from the start. The testop
macro is left alone, since the barrier semantics limit the usefulness
of prefetching data.

Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7171/1: unwind: add unwind directives to bitops assembly macros</title>
<updated>2011-11-26T21:58:53+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2011-11-23T10:28:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c36ef4b1762302a493c6cb754073bded084700e2'/>
<id>c36ef4b1762302a493c6cb754073bded084700e2</id>
<content type='text'>
The bitops functions (e.g. _test_and_set_bit) on ARM do not have unwind
annotations and therefore the kernel cannot backtrace out of them on a
fatal error (for example, NULL pointer dereference).

This patch annotates the bitops assembly macros with UNWIND annotations
so that we can produce a meaningful backtrace on error. Callers of the
macros are modified to pass their function name as a macro parameter,
enforcing that the macros are used as standalone function implementations.

Acked-by: Dave Martin &lt;dave.martin@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bitops functions (e.g. _test_and_set_bit) on ARM do not have unwind
annotations and therefore the kernel cannot backtrace out of them on a
fatal error (for example, NULL pointer dereference).

This patch annotates the bitops assembly macros with UNWIND annotations
so that we can produce a meaningful backtrace on error. Callers of the
macros are modified to pass their function name as a macro parameter,
enforcing that the macros are used as standalone function implementations.

Acked-by: Dave Martin &lt;dave.martin@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 6653/1: bitops: Use BX instead of MOV PC,LR</title>
<updated>2011-02-19T16:07:21+00:00</updated>
<author>
<name>Dave Martin</name>
<email>dave.martin@linaro.org</email>
</author>
<published>2011-02-08T11:09:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3ba6e69ad887f8a814267ed36fd4bfbddf8855a9'/>
<id>3ba6e69ad887f8a814267ed36fd4bfbddf8855a9</id>
<content type='text'>
The kernel doesn't officially need to interwork, but using BX
wherever appropriate will help educate people into good assembler
coding habits.

BX is appropriate here because this code is predicated on
__LINUX_ARM_ARCH__ &gt;= 6

Signed-off-by: Dave Martin &lt;dave.martin@linaro.org&gt;
Acked-by: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel doesn't officially need to interwork, but using BX
wherever appropriate will help educate people into good assembler
coding habits.

BX is appropriate here because this code is predicated on
__LINUX_ARM_ARCH__ &gt;= 6

Signed-off-by: Dave Martin &lt;dave.martin@linaro.org&gt;
Acked-by: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: bitops: switch set/clear/change bitops to use ldrex/strex</title>
<updated>2011-02-02T21:23:25+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@arm.linux.org.uk</email>
</author>
<published>2011-01-16T18:02:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6323f0ccedf756dfe5f46549cec69a2d6d97937b'/>
<id>6323f0ccedf756dfe5f46549cec69a2d6d97937b</id>
<content type='text'>
Switch the set/clear/change bitops to use the word-based exclusive
operations, which are only present in a wider range of ARM architectures
than the byte-based exclusive operations.

Tested record:
- Nicolas Pitre: ext3,rw,le
- Sourav Poddar: nfs,le
- Will Deacon: ext3,rw,le
- Tony Lindgren: ext3+nfs,le

Reviewed-by: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Tested-by: Sourav Poddar &lt;sourav.poddar@ti.com&gt;
Tested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Switch the set/clear/change bitops to use the word-based exclusive
operations, which are only present in a wider range of ARM architectures
than the byte-based exclusive operations.

Tested record:
- Nicolas Pitre: ext3,rw,le
- Sourav Poddar: nfs,le
- Will Deacon: ext3,rw,le
- Tony Lindgren: ext3+nfs,le

Reviewed-by: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Tested-by: Sourav Poddar &lt;sourav.poddar@ti.com&gt;
Tested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: bitops: ensure set/clear/change bitops take a word-aligned pointer</title>
<updated>2011-02-02T21:21:53+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@arm.linux.org.uk</email>
</author>
<published>2011-01-16T17:59:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a16ede35a2659170c855c5d267776666c0630f1f'/>
<id>a16ede35a2659170c855c5d267776666c0630f1f</id>
<content type='text'>
Add additional instructions to our assembly bitops functions to ensure
that they only operate on word-aligned pointers.  This will be necessary
when we switch these operations to use the word-based exclusive
operations.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add additional instructions to our assembly bitops functions to ensure
that they only operate on word-aligned pointers.  This will be necessary
when we switch these operations to use the word-based exclusive
operations.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Complete irq tracing support for ARM</title>
<updated>2009-08-13T18:34:37+00:00</updated>
<author>
<name>Uwe Kleine-König</name>
<email>u.kleine-koenig@pengutronix.de</email>
</author>
<published>2009-08-13T18:38:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0d928b0b616d1c5c5fe76019a87cba171ca91633'/>
<id>0d928b0b616d1c5c5fe76019a87cba171ca91633</id>
<content type='text'>
Before this patch enabling and disabling irqs in assembler code and by
the hardware wasn't tracked completly.

I had to transpose two instructions in arch/arm/lib/bitops.h because
restore_irqs doesn't preserve the flags with CONFIG_TRACE_IRQFLAGS=y

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this patch enabling and disabling irqs in assembler code and by
the hardware wasn't tracked completly.

I had to transpose two instructions in arch/arm/lib/bitops.h because
restore_irqs doesn't preserve the flags with CONFIG_TRACE_IRQFLAGS=y

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[ARM] barriers: improve xchg, bitops and atomic SMP barriers</title>
<updated>2009-05-28T18:39:27+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk@dyn-67.arm.linux.org.uk</email>
</author>
<published>2009-05-25T19:58:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bac4e960b5ce2453d862beaf20e59aa68af3b43a'/>
<id>bac4e960b5ce2453d862beaf20e59aa68af3b43a</id>
<content type='text'>
Mathieu Desnoyers pointed out that the ARM barriers were lacking:

- cmpxchg, xchg and atomic add return need memory barriers on
  architectures which can reorder the relative order in which memory
  read/writes can be seen between CPUs, which seems to include recent
  ARM architectures. Those barriers are currently missing on ARM.

- test_and_xxx_bit were missing SMP barriers.

So put these barriers in.  Provide separate atomic_add/atomic_sub
operations which do not require barriers.

Reported-Reviewed-and-Acked-by: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mathieu Desnoyers pointed out that the ARM barriers were lacking:

- cmpxchg, xchg and atomic add return need memory barriers on
  architectures which can reorder the relative order in which memory
  read/writes can be seen between CPUs, which seems to include recent
  ARM architectures. Those barriers are currently missing on ARM.

- test_and_xxx_bit were missing SMP barriers.

So put these barriers in.  Provide separate atomic_add/atomic_sub
operations which do not require barriers.

Reported-Reviewed-and-Acked-by: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
