<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc, branch v3.4.67</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>KVM: PPC: Book3S HV: Fix typo in saving DSCR</title>
<updated>2013-10-22T08:02:25+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2013-09-20T23:53:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2eec4f8e8f10e97104649c0a8fa0fbf6f3f51792'/>
<id>2eec4f8e8f10e97104649c0a8fa0fbf6f3f51792</id>
<content type='text'>
commit cfc860253abd73e1681696c08ea268d33285a2c4 upstream.

This fixes a typo in the code that saves the guest DSCR (Data Stream
Control Register) into the kvm_vcpu_arch struct on guest exit.  The
effect of the typo was that the DSCR value was saved in the wrong place,
so changes to the DSCR by the guest didn't persist across guest exit
and entry, and some host kernel memory got corrupted.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: Alexander Graf &lt;agraf@suse.de&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit cfc860253abd73e1681696c08ea268d33285a2c4 upstream.

This fixes a typo in the code that saves the guest DSCR (Data Stream
Control Register) into the kvm_vcpu_arch struct on guest exit.  The
effect of the typo was that the DSCR value was saved in the wrong place,
so changes to the DSCR by the guest didn't persist across guest exit
and entry, and some host kernel memory got corrupted.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: Alexander Graf &lt;agraf@suse.de&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix parameter clobber in csum_partial_copy_generic()</title>
<updated>2013-10-13T22:42:48+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.vnet.ibm.com</email>
</author>
<published>2013-10-01T06:54:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=77ddd59bbeb05825ee5126d128a31a0c6b461255'/>
<id>77ddd59bbeb05825ee5126d128a31a0c6b461255</id>
<content type='text'>
commit d9813c3681a36774b254c0cdc9cce53c9e22c756 upstream.

The csum_partial_copy_generic() uses register r7 to adjust the remaining
bytes to process.  Unfortunately, r7 also holds a parameter, namely the
address of the flag to set in case of access exceptions while reading
the source buffer.  Lacking a quantum implementation of PowerPC, this
commit instead uses register r9 to do the adjusting, leaving r7's
pointer uncorrupted.

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d9813c3681a36774b254c0cdc9cce53c9e22c756 upstream.

The csum_partial_copy_generic() uses register r7 to adjust the remaining
bytes to process.  Unfortunately, r7 also holds a parameter, namely the
address of the flag to set in case of access exceptions while reading
the source buffer.  Lacking a quantum implementation of PowerPC, this
commit instead uses register r9 to do the adjusting, leaving r7's
pointer uncorrupted.

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/vio: Fix modalias_show return values</title>
<updated>2013-10-13T22:42:48+00:00</updated>
<author>
<name>Prarit Bhargava</name>
<email>prarit@redhat.com</email>
</author>
<published>2013-09-23T13:33:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ff9c3c0a26b973735037843a7446d5fcc5e52ef0'/>
<id>ff9c3c0a26b973735037843a7446d5fcc5e52ef0</id>
<content type='text'>
commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 upstream.

modalias_show() should return an empty string on error, not -ENODEV.

This causes the following false and annoying error:

&gt; find /sys/devices -name modalias -print0 | xargs -0 cat &gt;/dev/null
cat: /sys/devices/vio/4000/modalias: No such device
cat: /sys/devices/vio/4001/modalias: No such device
cat: /sys/devices/vio/4002/modalias: No such device
cat: /sys/devices/vio/4004/modalias: No such device
cat: /sys/devices/vio/modalias: No such device

Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 upstream.

modalias_show() should return an empty string on error, not -ENODEV.

This causes the following false and annoying error:

&gt; find /sys/devices -name modalias -print0 | xargs -0 cat &gt;/dev/null
cat: /sys/devices/vio/4000/modalias: No such device
cat: /sys/devices/vio/4001/modalias: No such device
cat: /sys/devices/vio/4002/modalias: No such device
cat: /sys/devices/vio/4004/modalias: No such device
cat: /sys/devices/vio/modalias: No such device

Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/iommu: Use GFP_KERNEL instead of GFP_ATOMIC in iommu_init_table()</title>
<updated>2013-10-13T22:42:48+00:00</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@linux.vnet.ibm.com</email>
</author>
<published>2013-10-01T21:04:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e6629f15623d31e21775a1e5c28b8d5bd6559506'/>
<id>e6629f15623d31e21775a1e5c28b8d5bd6559506</id>
<content type='text'>
commit 1cf389df090194a0976dc867b7fffe99d9d490cb upstream.

Under heavy (DLPAR?) stress, we tripped this panic() in
arch/powerpc/kernel/iommu.c::iommu_init_table():

	page = alloc_pages_node(nid, GFP_ATOMIC, get_order(sz));
	if (!page)
		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);

Before the panic() we got a page allocation failure for an order-2
allocation. There appears to be memory free, but perhaps not in the
ATOMIC context. I looked through all the call-sites of
iommu_init_table() and didn't see any obvious reason to need an ATOMIC
allocation. Most call-sites in fact have an explicit GFP_KERNEL
allocation shortly before the call to iommu_init_table(), indicating we
are not in an atomic context. There is some indirection for some paths,
but I didn't see any locks indicating that GFP_KERNEL is inappropriate.

With this change under the same conditions, we have not been able to
reproduce the panic.

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 1cf389df090194a0976dc867b7fffe99d9d490cb upstream.

Under heavy (DLPAR?) stress, we tripped this panic() in
arch/powerpc/kernel/iommu.c::iommu_init_table():

	page = alloc_pages_node(nid, GFP_ATOMIC, get_order(sz));
	if (!page)
		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);

Before the panic() we got a page allocation failure for an order-2
allocation. There appears to be memory free, but perhaps not in the
ATOMIC context. I looked through all the call-sites of
iommu_init_table() and didn't see any obvious reason to need an ATOMIC
allocation. Most call-sites in fact have an explicit GFP_KERNEL
allocation shortly before the call to iommu_init_table(), indicating we
are not in an atomic context. There is some indirection for some paths,
but I didn't see any locks indicating that GFP_KERNEL is inappropriate.

With this change under the same conditions, we have not been able to
reproduce the panic.

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Handle unaligned ldbrx/stdbrx</title>
<updated>2013-09-27T00:15:30+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2013-08-06T16:01:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=774620ba0d5c80f92d9949a54a029cf41b116c57'/>
<id>774620ba0d5c80f92d9949a54a029cf41b116c57</id>
<content type='text'>
commit 230aef7a6a23b6166bd4003bfff5af23c9bd381f upstream.

Normally when we haven't implemented an alignment handler for
a load or store instruction the process will be terminated.

The alignment handler uses the DSISR (or a pseudo one) to locate
the right handler. Unfortunately ldbrx and stdbrx overlap lfs and
stfs so we incorrectly think ldbrx is an lfs and stdbrx is an
stfs.

This bug is particularly nasty - instead of terminating the
process we apply an incorrect fixup and continue on.

With more and more overlapping instructions we should stop
creating a pseudo DSISR and index using the instruction directly,
but for now add a special case to catch ldbrx/stdbrx.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 230aef7a6a23b6166bd4003bfff5af23c9bd381f upstream.

Normally when we haven't implemented an alignment handler for
a load or store instruction the process will be terminated.

The alignment handler uses the DSISR (or a pseudo one) to locate
the right handler. Unfortunately ldbrx and stdbrx overlap lfs and
stfs so we incorrectly think ldbrx is an lfs and stdbrx is an
stfs.

This bug is particularly nasty - instead of terminating the
process we apply an incorrect fixup and continue on.

With more and more overlapping instructions we should stop
creating a pseudo DSISR and index using the instruction directly,
but for now add a special case to catch ldbrx/stdbrx.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Work around gcc miscompilation of __pa() on 64-bit</title>
<updated>2013-09-08T04:58:14+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2013-08-27T06:07:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7a72233b3de7a35cd106db00f46c38960f9698f4'/>
<id>7a72233b3de7a35cd106db00f46c38960f9698f4</id>
<content type='text'>
commit bdbc29c19b2633b1d9c52638fb732bcde7a2031a upstream.

On 64-bit, __pa(&amp;static_var) gets miscompiled by recent versions of
gcc as something like:

        addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
        addi 3,3,.LANCHOR1+4611686018427387904@toc@l

This ends up effectively ignoring the offset, since its bottom 32 bits
are zero, and means that the result of __pa() still has 0xC in the top
nibble.  This happens with gcc 4.8.1, at least.

To work around this, for 64-bit we make __pa() use an AND operator,
and for symmetry, we make __va() use an OR operator.  Using an AND
operator rather than a subtraction ends up with slightly shorter code
since it can be done with a single clrldi instruction, whereas it
takes three instructions to form the constant (-PAGE_OFFSET) and add
it on.  (Note that MEMORY_START is always 0 on 64-bit.)

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bdbc29c19b2633b1d9c52638fb732bcde7a2031a upstream.

On 64-bit, __pa(&amp;static_var) gets miscompiled by recent versions of
gcc as something like:

        addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
        addi 3,3,.LANCHOR1+4611686018427387904@toc@l

This ends up effectively ignoring the offset, since its bottom 32 bits
are zero, and means that the result of __pa() still has 0xC in the top
nibble.  This happens with gcc 4.8.1, at least.

To work around this, for 64-bit we make __pa() use an AND operator,
and for symmetry, we make __va() use an OR operator.  Using an AND
operator rather than a subtraction ends up with slightly shorter code
since it can be done with a single clrldi instruction, whereas it
takes three instructions to form the constant (-PAGE_OFFSET) and add
it on.  (Note that MEMORY_START is always 0 on 64-bit.)

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/numa: Avoid stupid uninitialized warning from gcc</title>
<updated>2013-08-20T15:26:28+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-07-05T16:30:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=30d03769e318ea195cbc30333eab29a653c9bdf6'/>
<id>30d03769e318ea195cbc30333eab29a653c9bdf6</id>
<content type='text'>
commit aa709f3bc92c6daaf177cd7e3446da2ef64426c6 upstream.

Newer gcc are being a bit blind here (it's pretty obvious we don't
reach the code path using the array if we haven't initialized the
pointer) but none of that is performance critical so let's just
silence it.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit aa709f3bc92c6daaf177cd7e3446da2ef64426c6 upstream.

Newer gcc are being a bit blind here (it's pretty obvious we don't
reach the code path using the array if we haven't initialized the
pointer) but none of that is performance critical so let's just
silence it.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/modules: Module CRC relocation fix causes perf issues</title>
<updated>2013-08-04T08:25:55+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2013-07-15T04:04:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ef445a321016a464e32b1b4d581efa61706b12cf'/>
<id>ef445a321016a464e32b1b4d581efa61706b12cf</id>
<content type='text'>
commit 0e0ed6406e61434d3f38fb58aa8464ec4722b77e upstream.

Module CRCs are implemented as absolute symbols that get resolved by
a linker script. We build an intermediate .o that contains an
unresolved symbol for each CRC. genksysms parses this .o, calculates
the CRCs and writes a linker script that "resolves" the symbols to
the calculated CRC.

Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols
that need relocating and relocates them at boot. Commit d4703aef
(module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y)
added a hook to reverse the bogus relocations. Part of this patch
created a symbol at 0x0:

# head -2 /proc/kallsyms
0000000000000000 T reloc_start
c000000000000000 T .__start

This reloc_start symbol is causing lots of confusion to perf. It
thinks reloc_start is a massive function that stretches from 0x0 to
0xc000000000000000 and we get various cryptic errors out of perf,
including:

problem incrementing symbol count, skipping event

This patch removes the  reloc_start linker script label and instead
defines it as PHYSICAL_START. We also need to wrap it with
CONFIG_PPC64 because the ppc32 kernel can set a non zero
PHYSICAL_START at compile time and we wouldn't want to subtract
it from the CRCs in that case.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Acked-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0e0ed6406e61434d3f38fb58aa8464ec4722b77e upstream.

Module CRCs are implemented as absolute symbols that get resolved by
a linker script. We build an intermediate .o that contains an
unresolved symbol for each CRC. genksysms parses this .o, calculates
the CRCs and writes a linker script that "resolves" the symbols to
the calculated CRC.

Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols
that need relocating and relocates them at boot. Commit d4703aef
(module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y)
added a hook to reverse the bogus relocations. Part of this patch
created a symbol at 0x0:

# head -2 /proc/kallsyms
0000000000000000 T reloc_start
c000000000000000 T .__start

This reloc_start symbol is causing lots of confusion to perf. It
thinks reloc_start is a massive function that stretches from 0x0 to
0xc000000000000000 and we get various cryptic errors out of perf,
including:

problem incrementing symbol count, skipping event

This patch removes the  reloc_start linker script label and instead
defines it as PHYSICAL_START. We also need to wrap it with
CONFIG_PPC64 because the ppc32 kernel can set a non zero
PHYSICAL_START at compile time and we wouldn't want to subtract
it from the CRCs in that case.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Acked-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix missing/delayed calls to irq_work</title>
<updated>2013-06-20T18:58:47+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2013-06-15T02:13:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=05266fa348c3a30a3fef2c2993ca73e28b17232f'/>
<id>05266fa348c3a30a3fef2c2993ca73e28b17232f</id>
<content type='text'>
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream.

When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.

This change the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream.

When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.

This change the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix stack overflow crash in resume_kernel when ftracing</title>
<updated>2013-06-20T18:58:47+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2013-06-13T11:04:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0031e5e3e3cd8a21fe1224e0e51d646ca4099eab'/>
<id>0031e5e3e3cd8a21fe1224e0e51d646ca4099eab</id>
<content type='text'>
commit 0e37739b1c96d65e6433998454985de994383019 upstream.

It's possible for us to crash when running with ftrace enabled, eg:

  Bad kernel stack pointer bffffd12 at c00000000000a454
  cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
      pc: c00000000000a454: resume_kernel+0x34/0x60
      lr: c00000000000335c: performance_monitor_common+0x15c/0x180
      sp: bffffd12
     msr: 8000000000001032
     dar: bffffd12
   dsisr: 42000000

If we look at current's stack (paca-&gt;__current-&gt;stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca-&gt;kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info-&gt;flags and set
_TIF_EMULATE_STACK_STORE.

Dumping the stack we see:

  3:mon&gt; t c0000002ecab0000
  [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
  [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
  [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
  [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
  [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
  [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
  [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
  [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
  [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
  [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
  [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0

  ... and so on

__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.

This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.

The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().

Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.

[ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
  "powerpc: Rework runlatch code" by myself --BenH
]

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0e37739b1c96d65e6433998454985de994383019 upstream.

It's possible for us to crash when running with ftrace enabled, eg:

  Bad kernel stack pointer bffffd12 at c00000000000a454
  cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
      pc: c00000000000a454: resume_kernel+0x34/0x60
      lr: c00000000000335c: performance_monitor_common+0x15c/0x180
      sp: bffffd12
     msr: 8000000000001032
     dar: bffffd12
   dsisr: 42000000

If we look at current's stack (paca-&gt;__current-&gt;stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca-&gt;kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info-&gt;flags and set
_TIF_EMULATE_STACK_STORE.

Dumping the stack we see:

  3:mon&gt; t c0000002ecab0000
  [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
  [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
  [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
  [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
  [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
  [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
  [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
  [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
  [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
  [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
  [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0

  ... and so on

__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.

This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.

The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().

Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.

[ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
  "powerpc: Rework runlatch code" by myself --BenH
]

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
