<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/lib, branch v6.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc/code-patching: introduce patch_instructions()</title>
<updated>2023-10-23T09:33:19+00:00</updated>
<author>
<name>Hari Bathini</name>
<email>hbathini@linux.ibm.com</email>
</author>
<published>2023-10-20T14:13:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=465cabc97b42405eb89380ea6ba8d8b03e4ae1a2'/>
<id>465cabc97b42405eb89380ea6ba8d8b03e4ae1a2</id>
<content type='text'>
patch_instruction() entails setting up pte, patching the instruction,
clearing the pte and flushing the tlb. If multiple instructions need
to be patched, every instruction would have to go through the above
drill unnecessarily. Instead, introduce patch_instructions() function
that sets up the pte, clears the pte and flushes the tlb only once
per page range of instructions to be patched. Duplicate most of the
patch_instruction() code instead of merging with it, to avoid the
performance degradation observed on ppc32, for patch_instruction(),
with the code path merged. Also, setup poking_init() always as BPF
expects poking_init() to be setup even when STRICT_KERNEL_RWX is off.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231020141358.643575-2-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
patch_instruction() entails setting up pte, patching the instruction,
clearing the pte and flushing the tlb. If multiple instructions need
to be patched, every instruction would have to go through the above
drill unnecessarily. Instead, introduce patch_instructions() function
that sets up the pte, clears the pte and flushes the tlb only once
per page range of instructions to be patched. Duplicate most of the
patch_instruction() code instead of merging with it, to avoid the
performance degradation observed on ppc32, for patch_instruction(),
with the code path merged. Also, setup poking_init() always as BPF
expects poking_init() to be setup even when STRICT_KERNEL_RWX is off.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231020141358.643575-2-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure</title>
<updated>2023-10-20T12:19:13+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2023-10-07T10:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=74726fda9fe306f848088ef73ec266cae0470d5b'/>
<id>74726fda9fe306f848088ef73ec266cae0470d5b</id>
<content type='text'>
Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for
Radix MMU") added a hwsync for when __patch_instruction() fails,
we results in a quite odd unbalanced logic.

Instead of calling mb() when __patch_instruction() returns an error,
call mb() in the __patch_instruction()'s error path directly.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/e88b154eaf2efd9ff177d472d3411dcdec8ff4f5.1696675567.git.christophe.leroy@csgroup.eu

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for
Radix MMU") added a hwsync for when __patch_instruction() fails,
we results in a quite odd unbalanced logic.

Instead of calling mb() when __patch_instruction() returns an error,
call mb() in the __patch_instruction()'s error path directly.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/e88b154eaf2efd9ff177d472d3411dcdec8ff4f5.1696675567.git.christophe.leroy@csgroup.eu

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: Rename yield_propagate_owner tunable</title>
<updated>2023-10-20T11:43:34+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b629b541702b082d161c307c9d7d9867911fc933'/>
<id>b629b541702b082d161c307c9d7d9867911fc933</id>
<content type='text'>
Rename yield_propagate_owner to yield_sleepy_owner, which better
describes what it does (what, not how).

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-7-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename yield_propagate_owner to yield_sleepy_owner, which better
describes what it does (what, not how).

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-7-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: Propagate sleepy if previous waiter is preempted</title>
<updated>2023-10-20T11:43:34+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1e6d5f725738fff83e1c3cbdb8547142eb8820c2'/>
<id>1e6d5f725738fff83e1c3cbdb8547142eb8820c2</id>
<content type='text'>
The sleepy (aka lock-owner-is-preempted) condition is propagated down
the queue by each waiter. If a waiter becomes preempted, it can no
longer propagate sleepy. To allow subsequent waiters to yield to the
lock owner, also check the lock owner in this case.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-6-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The sleepy (aka lock-owner-is-preempted) condition is propagated down
the queue by each waiter. If a waiter becomes preempted, it can no
longer propagate sleepy. To allow subsequent waiters to yield to the
lock owner, also check the lock owner in this case.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-6-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: don't propagate the not-sleepy state</title>
<updated>2023-10-20T11:43:34+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fcf77d44274b96a55cc74043561a4b3151b9ad24'/>
<id>fcf77d44274b96a55cc74043561a4b3151b9ad24</id>
<content type='text'>
To simplify things, don't propagate the not-sleepy condition back down
the queue. Instead, have the waiters clear their own node-&gt;sleepy when
finding the lock owner is not preempted.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-5-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To simplify things, don't propagate the not-sleepy condition back down
the queue. Instead, have the waiters clear their own node-&gt;sleepy when
finding the lock owner is not preempted.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-5-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: propagate owner preemptedness rather than CPU number</title>
<updated>2023-10-20T11:43:34+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fd8fae50c9c6117c4e05954deab2cc515508666b'/>
<id>fd8fae50c9c6117c4e05954deab2cc515508666b</id>
<content type='text'>
Rather than propagating the CPU number of the preempted lock owner,
just propagate whether the owner was preempted. Waiters must read the
lock value when yielding to it to prevent races anyway, so might as
well always load the owner CPU from the lock.

To further simplify the code, also don't propagate the -1 (or
sleepy=false in the new scheme) down the queue. Instead, have the
waiters clear it themselves when finding the lock owner is not
preempted.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-4-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than propagating the CPU number of the preempted lock owner,
just propagate whether the owner was preempted. Waiters must read the
lock value when yielding to it to prevent races anyway, so might as
well always load the owner CPU from the lock.

To further simplify the code, also don't propagate the -1 (or
sleepy=false in the new scheme) down the queue. Instead, have the
waiters clear it themselves when finding the lock owner is not
preempted.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-4-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: stop queued waiters trying to set lock sleepy</title>
<updated>2023-10-20T11:43:34+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f6568647382c667912245c8d07aa26c9c6d4d0c8'/>
<id>f6568647382c667912245c8d07aa26c9c6d4d0c8</id>
<content type='text'>
If a queued waiter notices the lock owner or the previous waiter has
been preempted, it attempts to mark the lock sleepy, but it does this
as a try-set operation using the original lock value it got when
queueing, which will become stale as the queue progresses, and the
try-set will fail. Drop this and just set the sleepy seen clock.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-3-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a queued waiter notices the lock owner or the previous waiter has
been preempted, it attempts to mark the lock sleepy, but it does this
as a try-set operation using the original lock value it got when
queueing, which will become stale as the queue progresses, and the
try-set will fail. Drop this and just set the sleepy seen clock.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Reviewed-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-3-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/qspinlock: Fix stale propagated yield_cpu</title>
<updated>2023-10-18T10:07:21+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-10-16T12:43:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f9bc9bbe8afdf83412728f0b464979a72a3b9ec2'/>
<id>f9bc9bbe8afdf83412728f0b464979a72a3b9ec2</id>
<content type='text'>
yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.

This can result in yielding to a CPU that no longer holds the lock,
which is bad, but particularly if it is currently in H_CEDE (idle),
then it appears to be preempted and some hypervisors (PowerVM) can
cause very long H_CONFER latencies waiting for H_CEDE wakeup. This
results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out of it is not.

Thanks to a bunch of people who reported this and tracked down the
exact problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Reported-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Reported-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Debugged-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.

This can result in yielding to a CPU that no longer holds the lock,
which is bad, but particularly if it is currently in H_CEDE (idle),
then it appears to be preempted and some hypervisors (PowerVM) can
cause very long H_CONFER latencies waiting for H_CEDE wakeup. This
results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out of it is not.

Thanks to a bunch of people who reported this and tracked down the
exact problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Reported-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Reported-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Debugged-by: "Nysal Jan K.A" &lt;nysal@linux.ibm.com&gt;
Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Drop zalloc_maybe_bootmem()</title>
<updated>2023-08-24T12:33:16+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2023-08-23T05:54:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fabdb27da78afb93b0a83c0579025cb8d05c0d2d'/>
<id>fabdb27da78afb93b0a83c0579025cb8d05c0d2d</id>
<content type='text'>
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
used to be called early during boot before slab setup, and also during
runtime due to hotplug.

But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()")
moved the boot-time calls later, after slab setup, meaning there's no
longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all
cases.

Reviewed-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
used to be called early during boot before slab setup, and also during
runtime due to hotplug.

But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()")
moved the boot-time calls later, after slab setup, meaning there's no
longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all
cases.

Reviewed-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;</title>
<updated>2023-08-16T13:54:48+00:00</updated>
<author>
<name>Masahiro Yamada</name>
<email>masahiroy@kernel.org</email>
</author>
<published>2023-08-06T15:09:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=393261828740c3ed95fc810c3f4c1018b86af7e5'/>
<id>393261828740c3ed95fc810c3f4c1018b86af7e5</id>
<content type='text'>
Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated &lt;asm/export.h&gt;, which is now a wrapper of &lt;linux/export.h&gt;.

Replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;.

After all the &lt;asm/export.h&gt; lines are converted, &lt;asm/export.h&gt; and
&lt;asm-generic/export.h&gt; will be removed.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated &lt;asm/export.h&gt;, which is now a wrapper of &lt;linux/export.h&gt;.

Replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;.

After all the &lt;asm/export.h&gt; lines are converted, &lt;asm/export.h&gt; and
&lt;asm-generic/export.h&gt; will be removed.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org

</pre>
</div>
</content>
</entry>
</feed>
