<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/parisc/kernel/syscall.S, branch v4.4.151</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>parisc: Define mb() and add memory barriers to assembler unlock sequences</title>
<updated>2018-08-15T15:42:05+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2018-08-05T17:30:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=277b161b1a1d339985b4c24e796e86eae9511382'/>
<id>277b161b1a1d339985b4c24e796e86eae9511382</id>
<content type='text'>
commit fedb8da96355f5f64353625bf96dc69423ad1826 upstream.

For years I thought all parisc machines executed loads and stores in
order. However, Jeff Law recently indicated on gcc-patches that this is
not correct. There are various degrees of out-of-order execution all the
way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
series has full out-of-order execution for both integer operations, and
loads and stores.

This is described in the following article:
http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml

For this reason, we need to define mb() and to insert a memory barrier
before the store unlocking spinlocks. This ensures that all memory
accesses are complete prior to unlocking. The ldcw instruction performs
the same function on entry.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fedb8da96355f5f64353625bf96dc69423ad1826 upstream.

For years I thought all parisc machines executed loads and stores in
order. However, Jeff Law recently indicated on gcc-patches that this is
not correct. There are various degrees of out-of-order execution all the
way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
series has full out-of-order execution for both integer operations, and
loads and stores.

This is described in the following article:
http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml

For this reason, we need to define mb() and to insert a memory barrier
before the store unlocking spinlocks. This ensures that all memory
accesses are complete prior to unlocking. The ldcw instruction performs
the same function on entry.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Fix validity check of pointer size argument in new CAS implementation</title>
<updated>2017-11-30T08:37:24+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2017-11-11T22:11:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=937a91cd399220c11f513e899218b26b226ee334'/>
<id>937a91cd399220c11f513e899218b26b226ee334</id>
<content type='text'>
commit 05f016d2ca7a4fab99d5d5472168506ddf95e74f upstream.

As noted by Christoph Biedl, passing a pointer size of 4 in the new CAS
implementation causes a kernel crash.  The attached patch corrects the
off by one error in the argument validity check.

In reviewing the code, I noticed that we only perform word operations
with the pointer size argument.  The subi instruction intentionally uses
a word condition on 64-bit kernels.  Nullification was used instead of a
cmpib instruction as the branch should never be taken.  The shlw
pseudo-operation generates a depw,z instruction and it clears the target
before doing a shift left word deposit.  Thus, we don't need to clip the
upper 32 bits of this argument on 64-bit kernels.

Tested with a gcc testsuite run with a 64-bit kernel.  The gcc atomic
code in libgcc is the only direct user of the new CAS implementation
that I am aware of.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 05f016d2ca7a4fab99d5d5472168506ddf95e74f upstream.

As noted by Christoph Biedl, passing a pointer size of 4 in the new CAS
implementation causes a kernel crash.  The attached patch corrects the
off by one error in the argument validity check.

In reviewing the code, I noticed that we only perform word operations
with the pointer size argument.  The subi instruction intentionally uses
a word condition on 64-bit kernels.  Nullification was used instead of a
cmpib instruction as the branch should never be taken.  The shlw
pseudo-operation generates a depw,z instruction and it clears the target
before doing a shift left word deposit.  Thus, we don't need to clip the
upper 32 bits of this argument on 64-bit kernels.

Tested with a gcc testsuite run with a 64-bit kernel.  The gcc atomic
code in libgcc is the only direct user of the new CAS implementation
that I am aware of.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Fix double-word compare and exchange in LWS code on 32-bit kernels</title>
<updated>2017-10-27T08:23:17+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2017-09-30T21:24:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fcc65ab173ebf797472b046f2d84663fbbe443a7'/>
<id>fcc65ab173ebf797472b046f2d84663fbbe443a7</id>
<content type='text'>
commit 374b3bf8e8b519f61eb9775888074c6e46b3bf0c upstream.

As discussed on the debian-hppa list, double-wordcompare and exchange
operations fail on 32-bit kernels.  Looking at the code, I realized that
the ",ma" completer does the wrong thing in the  "ldw,ma  4(%r26), %r29"
instruction.  This increments %r26 and causes the following store to
write to the wrong location.

Note by Helge Deller:
The patch applies cleanly to stable kernel series if this upstream
commit is merged in advance:
f4125cfdb300 ("parisc: Avoid trashing sr2 and sr3 in LWS code").

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Tested-by: Christoph Biedl &lt;debian.axhn@manchmal.in-ulm.de&gt;
Fixes: 89206491201c ("parisc: Implement new LWS CAS supporting 64 bit operations.")
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 374b3bf8e8b519f61eb9775888074c6e46b3bf0c upstream.

As discussed on the debian-hppa list, double-wordcompare and exchange
operations fail on 32-bit kernels.  Looking at the code, I realized that
the ",ma" completer does the wrong thing in the  "ldw,ma  4(%r26), %r29"
instruction.  This increments %r26 and causes the following store to
write to the wrong location.

Note by Helge Deller:
The patch applies cleanly to stable kernel series if this upstream
commit is merged in advance:
f4125cfdb300 ("parisc: Avoid trashing sr2 and sr3 in LWS code").

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Tested-by: Christoph Biedl &lt;debian.axhn@manchmal.in-ulm.de&gt;
Fixes: 89206491201c ("parisc: Implement new LWS CAS supporting 64 bit operations.")
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Avoid trashing sr2 and sr3 in LWS code</title>
<updated>2017-10-27T08:23:17+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2016-10-28T20:13:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=558ca24dc296a859af75edf495a0972a00e9200d'/>
<id>558ca24dc296a859af75edf495a0972a00e9200d</id>
<content type='text'>
commit f4125cfdb3008363137f744c101e5d76ead760ba upstream.

There is no need to trash sr2 and sr3 in the Light-weight syscall (LWS).  sr2
already points to kernel space (it's zero in userspace, otherwise syscalls
wouldn't work), and since the LWS code is executed in userspace, we can simply
ignore to preload sr3.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f4125cfdb3008363137f744c101e5d76ead760ba upstream.

There is no need to trash sr2 and sr3 in the Light-weight syscall (LWS).  sr2
already points to kernel space (it's zero in userspace, otherwise syscalls
wouldn't work), and since the LWS code is executed in userspace, we can simply
ignore to preload sr3.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Ensure consistent state when switching to kernel stack at syscall entry</title>
<updated>2016-11-10T15:36:34+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2016-10-29T03:00:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f2d9107bd0a0f86c4b3fb539a2d980ada926c276'/>
<id>f2d9107bd0a0f86c4b3fb539a2d980ada926c276</id>
<content type='text'>
commit 6ed518328d0189e0fdf1bb7c73290d546143ea66 upstream.

We have one critical section in the syscall entry path in which we switch from
the userspace stack to kernel stack. In the event of an external interrupt, the
interrupt code distinguishes between those two states by analyzing the value of
sr7. If sr7 is zero, it uses the kernel stack. Therefore it's important, that
the value of sr7 is in sync with the currently enabled stack.

This patch now disables interrupts while executing the critical section.  This
prevents the interrupt handler to possibly see an inconsistent state which in
the worst case can lead to crashes.

Interestingly, in the syscall exit path interrupts were already disabled in the
critical section which switches back to the userspace stack.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6ed518328d0189e0fdf1bb7c73290d546143ea66 upstream.

We have one critical section in the syscall entry path in which we switch from
the userspace stack to kernel stack. In the event of an external interrupt, the
interrupt code distinguishes between those two states by analyzing the value of
sr7. If sr7 is zero, it uses the kernel stack. Therefore it's important, that
the value of sr7 is in sync with the currently enabled stack.

This patch now disables interrupts while executing the critical section.  This
prevents the interrupt handler to possibly see an inconsistent state which in
the worst case can lead to crashes.

Interestingly, in the syscall exit path interrupts were already disabled in the
critical section which switches back to the userspace stack.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Fix ptrace syscall number and return value modification</title>
<updated>2016-03-09T23:34:50+00:00</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2016-01-19T15:08:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=59dc7ab2e8ab1a0fd0ff104a67f7827436435a46'/>
<id>59dc7ab2e8ab1a0fd0ff104a67f7827436435a46</id>
<content type='text'>
commit 98e8b6c9ac9d1b1e9d1122dfa6783d5d566bb8f7 upstream.

Mike Frysinger reported that his ptrace testcase showed strange
behaviour on parisc: It was not possible to avoid a syscall and the
return value of a syscall couldn't be changed.

To modify a syscall number, we were missing to save the new syscall
number to gr20 which is then picked up later in assembly again.

The effect that the return value couldn't be changed is a side-effect of
another bug in the assembly code. When a process is ptraced, userspace
expects each syscall to report entrance and exit of a syscall.  If a
syscall number was given which doesn't exist, we jumped to the normal
syscall exit code instead of informing userspace that the (non-existant)
syscall exits. This unexpected behaviour confuses userspace and thus the
bug was misinterpreted as if we can't change the return value.

This patch fixes both problems and was tested on 64bit kernel with
32bit userspace.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Cc: Mike Frysinger &lt;vapier@gentoo.org&gt;
Tested-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 98e8b6c9ac9d1b1e9d1122dfa6783d5d566bb8f7 upstream.

Mike Frysinger reported that his ptrace testcase showed strange
behaviour on parisc: It was not possible to avoid a syscall and the
return value of a syscall couldn't be changed.

To modify a syscall number, we were missing to save the new syscall
number to gr20 which is then picked up later in assembly again.

The effect that the return value couldn't be changed is a side-effect of
another bug in the assembly code. When a process is ptraced, userspace
expects each syscall to report entrance and exit of a syscall.  If a
syscall number was given which doesn't exist, we jumped to the normal
syscall exit code instead of informing userspace that the (non-existant)
syscall exits. This unexpected behaviour confuses userspace and thus the
bug was misinterpreted as if we can't change the return value.

This patch fixes both problems and was tested on 64bit kernel with
32bit userspace.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Cc: Mike Frysinger &lt;vapier@gentoo.org&gt;
Tested-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Use long branch to do_syscall_trace_exit</title>
<updated>2015-11-22T11:23:02+00:00</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2015-11-20T10:22:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=337685e556c6f080bf4775950e3b9493852715f8'/>
<id>337685e556c6f080bf4775950e3b9493852715f8</id>
<content type='text'>
Use the 22bit instead of the 17bit branch instruction on a 64bit kernel
to reach the do_syscall_trace_exit function from the gateway page.
A huge page enabled kernel may need the additional branch distance bits.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the 22bit instead of the 17bit branch instruction on a 64bit kernel
to reach the do_syscall_trace_exit function from the gateway page.
A huge page enabled kernel may need the additional branch distance bits.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Use double word condition in 64bit CAS operation</title>
<updated>2015-09-08T14:15:54+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2015-09-08T00:13:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843'/>
<id>1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843</id>
<content type='text'>
The attached change fixes the condition used in the "sub" instruction.
A double word comparison is needed.  This fixes the 64-bit LWS CAS
operation on 64-bit kernels.

I can now enable 64-bit atomic support in GCC.

Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: John David Anglin &lt;dave.anglin&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The attached change fixes the condition used in the "sub" instruction.
A double word comparison is needed.  This fixes the 64-bit LWS CAS
operation on 64-bit kernels.

I can now enable 64-bit atomic support in GCC.

Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: John David Anglin &lt;dave.anglin&gt;
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Implement new LWS CAS supporting 64 bit operations.</title>
<updated>2014-09-13T20:40:48+00:00</updated>
<author>
<name>Guy Martin</name>
<email>gmsoft@tuxicoman.be</email>
</author>
<published>2014-09-12T16:02:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=89206491201cbd1571009b36292af781cef74c1b'/>
<id>89206491201cbd1571009b36292af781cef74c1b</id>
<content type='text'>
The current LWS cas only works correctly for 32bit. The new LWS allows
for CAS operations of variable size.

Signed-off-by: Guy Martin &lt;gmsoft@tuxicoman.be&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 3.13+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current LWS cas only works correctly for 32bit. The new LWS allows
for CAS operations of variable size.

Signed-off-by: Guy Martin &lt;gmsoft@tuxicoman.be&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 3.13+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: Improve LWS-CAS performance</title>
<updated>2014-05-15T19:12:26+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2014-05-11T22:40:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c776cd89fc705fc8b5c2e5ad906bf5d791620fed'/>
<id>c776cd89fc705fc8b5c2e5ad906bf5d791620fed</id>
<content type='text'>
The attached change significantly improves the performance of the LWS-CAS code
in syscall.S.
This allows a number of packages to build (e.g., zeromq3, gtest and libxs)
that previously failed because slow LWS-CAS performance under contention. In
particular, interrupts taken while the lock was taken degraded performance
significantly.

The change does the following:

1) Disables interrupts around the CAS operation, and
2) Changes the loads and stores to use the ordered completer, "o", on
PA 2.0. "o" and "ma" with a zero offset are equivalent. The latter is
accepted on both PA 1.X and 2.0.

The use of ordered loads and stores probably makes no difference on all
existing hardware, but it seemed pedantically correct. In particular, the CAS
operation must complete before LDCW lock is released. As written before, a
processor could reorder the operations.

I don't believe the period interrupts are disabled is long enough to
significantly increase interrupt latency. For example, the TLB insert code is
longer. Worst case is a memory fault in the CAS operation.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The attached change significantly improves the performance of the LWS-CAS code
in syscall.S.
This allows a number of packages to build (e.g., zeromq3, gtest and libxs)
that previously failed because slow LWS-CAS performance under contention. In
particular, interrupts taken while the lock was taken degraded performance
significantly.

The change does the following:

1) Disables interrupts around the CAS operation, and
2) Changes the loads and stores to use the ordered completer, "o", on
PA 2.0. "o" and "ma" with a zero offset are equivalent. The latter is
accepted on both PA 1.X and 2.0.

The use of ordered loads and stores probably makes no difference on all
existing hardware, but it seemed pedantically correct. In particular, the CAS
operation must complete before LDCW lock is released. As written before, a
processor could reorder the operations.

I don't believe the period interrupts are disabled is long enough to
significantly increase interrupt latency. For example, the TLB insert code is
longer. Worst case is a memory fault in the CAS operation.

Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
