<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch, branch v4.1.21</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>arm64: replace read_lock to rcu lock in call_step_hook</title>
<updated>2016-03-24T12:40:20+00:00</updated>
<author>
<name>Yang Shi</name>
<email>yang.shi@linaro.org</email>
</author>
<published>2016-03-24T11:14:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=143cf26c48278bd438a97a8bd3e18b6460192981'/>
<id>143cf26c48278bd438a97a8bd3e18b6460192981</id>
<content type='text'>
[ Upstream commit cf0a25436f05753aca5151891aea4fd130556e2a ]

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[&lt;ffff800000124c18&gt;] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[&lt;ffff8000000885e8&gt;] dump_backtrace+0x0/0x128
[&lt;ffff800000088734&gt;] show_stack+0x24/0x30
[&lt;ffff80000079a7c4&gt;] dump_stack+0x80/0xa0
[&lt;ffff8000000bd324&gt;] ___might_sleep+0x18c/0x1a0
[&lt;ffff8000007a20ac&gt;] __rt_spin_lock+0x2c/0x40
[&lt;ffff8000007a2268&gt;] rt_read_lock+0x40/0x58
[&lt;ffff800000085328&gt;] single_step_handler+0x38/0xd8
[&lt;ffff800000082368&gt;] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[&lt;ffff8000000833f4&gt;] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Signed-off-by: Yang Shi &lt;yang.shi@linaro.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: He Kuang &lt;hekuang@huawei.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cf0a25436f05753aca5151891aea4fd130556e2a ]

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[&lt;ffff800000124c18&gt;] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[&lt;ffff8000000885e8&gt;] dump_backtrace+0x0/0x128
[&lt;ffff800000088734&gt;] show_stack+0x24/0x30
[&lt;ffff80000079a7c4&gt;] dump_stack+0x80/0xa0
[&lt;ffff8000000bd324&gt;] ___might_sleep+0x18c/0x1a0
[&lt;ffff8000007a20ac&gt;] __rt_spin_lock+0x2c/0x40
[&lt;ffff8000007a2268&gt;] rt_read_lock+0x40/0x58
[&lt;ffff800000085328&gt;] single_step_handler+0x38/0xd8
[&lt;ffff800000082368&gt;] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[&lt;ffff8000000833f4&gt;] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Signed-off-by: Yang Shi &lt;yang.shi@linaro.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: He Kuang &lt;hekuang@huawei.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: replace read_lock to rcu lock in call_break_hook</title>
<updated>2016-03-24T12:40:20+00:00</updated>
<author>
<name>Yang Shi</name>
<email>yang.shi@linaro.org</email>
</author>
<published>2016-03-24T11:14:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1a138f3e487026aede3642cbe09aee0f64c2f66b'/>
<id>1a138f3e487026aede3642cbe09aee0f64c2f66b</id>
<content type='text'>
[ Upstream commit 62c6c61adbc623cdacf74b8f29c278e539060c48 ]

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf
1 lock held by perf/342:
 #0:  (break_hook_lock){+.+...}, at: [&lt;ffffffc0000851ac&gt;] call_break_hook+0x34/0xd0
irq event stamp: 62224
hardirqs last  enabled at (62223): [&lt;ffffffc00010b7bc&gt;] __call_rcu.constprop.59+0x104/0x270
hardirqs last disabled at (62224): [&lt;ffffffc0000fbe20&gt;] vprintk_emit+0x68/0x640
softirqs last  enabled at (0): [&lt;ffffffc000097928&gt;] copy_process.part.8+0x428/0x17f8
softirqs last disabled at (0): [&lt;          (null)&gt;]           (null)
CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4
Hardware name: linux,dummy-virt (DT)
Call trace:
[&lt;ffffffc000089968&gt;] dump_backtrace+0x0/0x128
[&lt;ffffffc000089ab0&gt;] show_stack+0x20/0x30
[&lt;ffffffc0007030d0&gt;] dump_stack+0x7c/0xa0
[&lt;ffffffc0000c878c&gt;] ___might_sleep+0x174/0x260
[&lt;ffffffc000708ac8&gt;] __rt_spin_lock+0x28/0x40
[&lt;ffffffc000708db0&gt;] rt_read_lock+0x60/0x80
[&lt;ffffffc0000851a8&gt;] call_break_hook+0x30/0xd0
[&lt;ffffffc000085a70&gt;] brk_handler+0x30/0x98
[&lt;ffffffc000082248&gt;] do_debug_exception+0x50/0xb8
Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50)
fe20:                                     00000000 00000000 c1594680 0000007f
fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000
fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0
fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f
fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0
fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000
fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80
ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f
ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000
ff40: 928e4cc0 0000007f 91ff11e8 0000007f

call_break_hook is called in atomic context (hard irq disabled), so replace
the sleepable lock to rcu lock, replace relevant list operations to rcu
version and call synchronize_rcu() in unregister_break_hook().

And, replace write lock to spinlock in {un}register_break_hook.

Signed-off-by: Yang Shi &lt;yang.shi@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: He Kuang &lt;hekuang@huawei.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 62c6c61adbc623cdacf74b8f29c278e539060c48 ]

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf
1 lock held by perf/342:
 #0:  (break_hook_lock){+.+...}, at: [&lt;ffffffc0000851ac&gt;] call_break_hook+0x34/0xd0
irq event stamp: 62224
hardirqs last  enabled at (62223): [&lt;ffffffc00010b7bc&gt;] __call_rcu.constprop.59+0x104/0x270
hardirqs last disabled at (62224): [&lt;ffffffc0000fbe20&gt;] vprintk_emit+0x68/0x640
softirqs last  enabled at (0): [&lt;ffffffc000097928&gt;] copy_process.part.8+0x428/0x17f8
softirqs last disabled at (0): [&lt;          (null)&gt;]           (null)
CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4
Hardware name: linux,dummy-virt (DT)
Call trace:
[&lt;ffffffc000089968&gt;] dump_backtrace+0x0/0x128
[&lt;ffffffc000089ab0&gt;] show_stack+0x20/0x30
[&lt;ffffffc0007030d0&gt;] dump_stack+0x7c/0xa0
[&lt;ffffffc0000c878c&gt;] ___might_sleep+0x174/0x260
[&lt;ffffffc000708ac8&gt;] __rt_spin_lock+0x28/0x40
[&lt;ffffffc000708db0&gt;] rt_read_lock+0x60/0x80
[&lt;ffffffc0000851a8&gt;] call_break_hook+0x30/0xd0
[&lt;ffffffc000085a70&gt;] brk_handler+0x30/0x98
[&lt;ffffffc000082248&gt;] do_debug_exception+0x50/0xb8
Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50)
fe20:                                     00000000 00000000 c1594680 0000007f
fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000
fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0
fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f
fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0
fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000
fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80
ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f
ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000
ff40: 928e4cc0 0000007f 91ff11e8 0000007f

call_break_hook is called in atomic context (hard irq disabled), so replace
the sleepable lock to rcu lock, replace relevant list operations to rcu
version and call synchronize_rcu() in unregister_break_hook().

And, replace write lock to spinlock in {un}register_break_hook.

Signed-off-by: Yang Shi &lt;yang.shi@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: He Kuang &lt;hekuang@huawei.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: Fix build error when SMP is used without GIC</title>
<updated>2016-03-22T15:10:36+00:00</updated>
<author>
<name>Hauke Mehrtens</name>
<email>hauke@hauke-m.de</email>
</author>
<published>2016-03-06T21:28:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=14b4d1419ee6e71e672926d90a5bd87f698014d3'/>
<id>14b4d1419ee6e71e672926d90a5bd87f698014d3</id>
<content type='text'>
[ Upstream commit 588bad2ef32cae7abad24d5ca2f4611a7a7fb2a2 ]

commit 7a50e4688dabb8005df39b2b992d76629b8af8aa upstream.

The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs

The first part of this was introduced in commit 72e20142b2bf ("MIPS:
Move GIC IPI functions out of smp-cmp.c")

Signed-off-by: Hauke Mehrtens &lt;hauke@hauke-m.de&gt;
Cc: Paul Burton &lt;paul.burton@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 588bad2ef32cae7abad24d5ca2f4611a7a7fb2a2 ]

commit 7a50e4688dabb8005df39b2b992d76629b8af8aa upstream.

The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs

The first part of this was introduced in commit 72e20142b2bf ("MIPS:
Move GIC IPI functions out of smp-cmp.c")

Signed-off-by: Hauke Mehrtens &lt;hauke@hauke-m.de&gt;
Cc: Paul Burton &lt;paul.burton@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: Kconfig: Disable MIPS MT and SMP implementations for R6</title>
<updated>2016-03-22T15:10:36+00:00</updated>
<author>
<name>Markos Chandras</name>
<email>markos.chandras@imgtec.com</email>
</author>
<published>2015-07-09T09:40:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7c196e5a2e90d172ce2bca85bb368f70a016b02c'/>
<id>7c196e5a2e90d172ce2bca85bb368f70a016b02c</id>
<content type='text'>
[ Upstream commit 5676319c91c8d668635ac0b9b6d9145c4fa418ac ]

R6 does not support the MIPS MT ASE and the CMP/SMP options so
restrict them in order to prevent users from selecting incompatible
SMP configuration for R6 cores. We also disable the CPS/SMP option
because its support hasn't been added to the CPS code yet.

Signed-off-by: Markos Chandras &lt;markos.chandras@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10637/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5676319c91c8d668635ac0b9b6d9145c4fa418ac ]

R6 does not support the MIPS MT ASE and the CMP/SMP options so
restrict them in order to prevent users from selecting incompatible
SMP configuration for R6 cores. We also disable the CPS/SMP option
because its support hasn't been added to the CPS code yet.

Signed-off-by: Markos Chandras &lt;markos.chandras@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10637/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "MIPS: Kconfig: Disable SMP/CPS for 64-bit"</title>
<updated>2016-03-22T15:10:35+00:00</updated>
<author>
<name>Markos Chandras</name>
<email>markos.chandras@imgtec.com</email>
</author>
<published>2015-07-01T08:31:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e652be4b177954875b8d2d842abfda3626cb1d6a'/>
<id>e652be4b177954875b8d2d842abfda3626cb1d6a</id>
<content type='text'>
[ Upstream commit 1c885357da2d3cf62132e611c0beaf4cdf607dd9 ]

This reverts commit 6ca716f2e5571d25a3899c6c5c91ff72ea6d6f5e.

SMP/CPS is now supported on 64bit cores.

Cc: &lt;stable@vger.kernel.org&gt; # 4.1
Reviewed-by: Paul Burton &lt;paul.burton@imgtec.com&gt;
Signed-off-by: Markos Chandras &lt;markos.chandras@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10592/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 1c885357da2d3cf62132e611c0beaf4cdf607dd9 ]

This reverts commit 6ca716f2e5571d25a3899c6c5c91ff72ea6d6f5e.

SMP/CPS is now supported on 64bit cores.

Cc: &lt;stable@vger.kernel.org&gt; # 4.1
Reviewed-by: Paul Burton &lt;paul.burton@imgtec.com&gt;
Signed-off-by: Markos Chandras &lt;markos.chandras@imgtec.com&gt;
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10592/
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo</title>
<updated>2016-03-22T15:10:34+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2016-03-08T11:13:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eac525506a083a389ba173880979a6291401af2d'/>
<id>eac525506a083a389ba173880979a6291401af2d</id>
<content type='text'>
[ Upstream commit 844a5fe219cf472060315971e15cbf97674a3324 ]

Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.

KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution.  This will still cause a user write to fault, while
supervisor writes will succeed.  User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.

When SMEP is in effect, however, U=0 will enable kernel execution of
this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0.  If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.

The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).

There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.

Cc: stable@vger.kernel.org
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Reviewed-by: Xiao Guangrong &lt;guangrong.xiao@linux.intel.com&gt;
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 844a5fe219cf472060315971e15cbf97674a3324 ]

Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.

KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution.  This will still cause a user write to fault, while
supervisor writes will succeed.  User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.

When SMEP is in effect, however, U=0 will enable kernel execution of
this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0.  If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.

The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).

There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.

Cc: stable@vger.kernel.org
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Reviewed-by: Xiao Guangrong &lt;guangrong.xiao@linux.intel.com&gt;
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/mm: four page table levels vs. fork</title>
<updated>2016-03-22T15:10:34+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2016-02-15T13:46:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=09b4fd2014b1ef7d46df8df553f94254ba2a0497'/>
<id>09b4fd2014b1ef7d46df8df553f94254ba2a0497</id>
<content type='text'>
[ Upstream commit 3446c13b268af86391d06611327006b059b8bab1 ]

The fork of a process with four page table levels is broken since
git commit 6252d702c5311ce9 "[S390] dynamic page tables."

All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm-&gt;pgd pointer without a pgd_deref
in between.

The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm-&gt;context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.

This fixes CVE-2016-2143.

Reported-by: Marcin Kościelnicki &lt;koriakin@0x04.net&gt;
Reviewed-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 3446c13b268af86391d06611327006b059b8bab1 ]

The fork of a process with four page table levels is broken since
git commit 6252d702c5311ce9 "[S390] dynamic page tables."

All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm-&gt;pgd pointer without a pgd_deref
in between.

The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm-&gt;context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.

This fixes CVE-2016-2143.

Reported-by: Marcin Kościelnicki &lt;koriakin@0x04.net&gt;
Reviewed-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: VMX: disable PEBS before a guest entry</title>
<updated>2016-03-22T15:10:31+00:00</updated>
<author>
<name>Radim Krčmář</name>
<email>rkrcmar@redhat.com</email>
</author>
<published>2016-03-04T14:08:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eb34a645aee3906baac9cad7defdabf61ac40bfd'/>
<id>eb34a645aee3906baac9cad7defdabf61ac40bfd</id>
<content type='text'>
[ Upstream commit 7099e2e1f4d9051f31bbfa5803adf954bb5d76ef ]

Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
  perf record -e 'cpu/mem-stores/pp' -a

This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
  When software needs to reconfigure PEBS facilities, it should allow a
  quiescent period between stopping the prior event counting and setting
  up a new PEBS event. The quiescent period is to allow any latent
  residual PEBS records to complete its capture at their previously
  specified buffer address (provided by IA32_DS_AREA).

There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory.  (Or MSR switching is just buggy on some models.)

The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.

After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.

This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).

We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.

(If you are implementing PEBS for guests, be sure to handle the case
 where both host and guest enable PEBS, because this patch doesn't.)

Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.")
Cc: &lt;stable@vger.kernel.org&gt;
Reported-by: Jiří Olša &lt;jolsa@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7099e2e1f4d9051f31bbfa5803adf954bb5d76ef ]

Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
  perf record -e 'cpu/mem-stores/pp' -a

This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
  When software needs to reconfigure PEBS facilities, it should allow a
  quiescent period between stopping the prior event counting and setting
  up a new PEBS event. The quiescent period is to allow any latent
  residual PEBS records to complete its capture at their previously
  specified buffer address (provided by IA32_DS_AREA).

There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory.  (Or MSR switching is just buggy on some models.)

The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.

After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.

This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).

We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.

(If you are implementing PEBS for guests, be sure to handle the case
 where both host and guest enable PEBS, because this patch doesn't.)

Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.")
Cc: &lt;stable@vger.kernel.org&gt;
Reported-by: Jiří Olša &lt;jolsa@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit</title>
<updated>2016-03-22T15:10:30+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2016-03-05T08:34:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d44ac3f884b220573b2d46c691127fb6fee0707'/>
<id>6d44ac3f884b220573b2d46c691127fb6fee0707</id>
<content type='text'>
[ Upstream commit ccec44563b18a0ce90e2d4f332784b3cb25c8e9c ]

Thomas Huth discovered that a guest could cause a hard hang of a
host CPU by setting the Instruction Authority Mask Register (IAMR)
to a suitable value.  It turns out that this is because when the
code was added to context-switch the new special-purpose registers
(SPRs) that were added in POWER8, we forgot to add code to ensure
that they were restored to a sane value on guest exit.

This adds code to set those registers where a bad value could
compromise the execution of the host kernel to a suitable neutral
value on guest exit.

Cc: stable@vger.kernel.org # v3.14+
Fixes: b005255e12a3
Reported-by: Thomas Huth &lt;thuth@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ccec44563b18a0ce90e2d4f332784b3cb25c8e9c ]

Thomas Huth discovered that a guest could cause a hard hang of a
host CPU by setting the Instruction Authority Mask Register (IAMR)
to a suitable value.  It turns out that this is because when the
code was added to context-switch the new special-purpose registers
(SPRs) that were added in POWER8, we forgot to add code to ensure
that they were restored to a sane value on guest exit.

This adds code to set those registers where a bad value could
compromise the execution of the host kernel to a suitable neutral
value on guest exit.

Cc: stable@vger.kernel.org # v3.14+
Fixes: b005255e12a3
Reported-by: Thomas Huth &lt;thuth@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: dts: dra7: do not gate cpsw clock due to errata i877</title>
<updated>2016-03-22T15:10:29+00:00</updated>
<author>
<name>Mugunthan V N</name>
<email>mugunthanvnm@ti.com</email>
</author>
<published>2016-03-07T08:41:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0d00dbe120f15fd798e7b0e2c69d67da887a9bfb'/>
<id>0d00dbe120f15fd798e7b0e2c69d67da887a9bfb</id>
<content type='text'>
[ Upstream commit 0f514e690740e54815441a87708c3326f8aa8709 ]

Errata id: i877

Description:
------------
The RGMII 1000 Mbps Transmit timing is based on the output clock
(rgmiin_txc) being driven relative to the rising edge of an internal
clock and the output control/data (rgmiin_txctl/txd) being driven relative
to the falling edge of an internal clock source. If the internal clock
source is allowed to be static low (i.e., disabled) for an extended period
of time then when the clock is actually enabled the timing delta between
the rising edge and falling edge can change over the lifetime of the
device. This can result in the device switching characteristics degrading
over time, and eventually failing to meet the Data Manual Delay Time/Skew
specs.
To maintain RGMII 1000 Mbps IO Timings, SW should minimize the
duration that the Ethernet internal clock source is disabled. Note that
the device reset state for the Ethernet clock is "disabled".
Other RGMII modes (10 Mbps, 100Mbps) are not affected

Workaround:
-----------
If the SoC Ethernet interface(s) are used in RGMII mode at 1000 Mbps,
SW should minimize the time the Ethernet internal clock source is disabled
to a maximum of 200 hours in a device life cycle. This is done by enabling
the clock as early as possible in IPL (QNX) or SPL/u-boot (Linux/Android)
by setting the register CM_GMAC_CLKSTCTRL[1:0]CLKTRCTRL = 0x2:SW_WKUP.

So, do not allow to gate the cpsw clocks using ti,no-idle property in
cpsw node assuming 1000 Mbps is being used all the time. If someone does
not need 1000 Mbps and wants to gate clocks to cpsw, this property needs
to be deleted in their respective board files.

Signed-off-by: Mugunthan V N &lt;mugunthanvnm@ti.com&gt;
Signed-off-by: Grygorii Strashko &lt;grygorii.strashko@ti.com&gt;
Signed-off-by: Lokesh Vutla &lt;lokeshvutla@ti.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Paul Walmsley &lt;paul@pwsan.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0f514e690740e54815441a87708c3326f8aa8709 ]

Errata id: i877

Description:
------------
The RGMII 1000 Mbps Transmit timing is based on the output clock
(rgmiin_txc) being driven relative to the rising edge of an internal
clock and the output control/data (rgmiin_txctl/txd) being driven relative
to the falling edge of an internal clock source. If the internal clock
source is allowed to be static low (i.e., disabled) for an extended period
of time then when the clock is actually enabled the timing delta between
the rising edge and falling edge can change over the lifetime of the
device. This can result in the device switching characteristics degrading
over time, and eventually failing to meet the Data Manual Delay Time/Skew
specs.
To maintain RGMII 1000 Mbps IO Timings, SW should minimize the
duration that the Ethernet internal clock source is disabled. Note that
the device reset state for the Ethernet clock is "disabled".
Other RGMII modes (10 Mbps, 100Mbps) are not affected

Workaround:
-----------
If the SoC Ethernet interface(s) are used in RGMII mode at 1000 Mbps,
SW should minimize the time the Ethernet internal clock source is disabled
to a maximum of 200 hours in a device life cycle. This is done by enabling
the clock as early as possible in IPL (QNX) or SPL/u-boot (Linux/Android)
by setting the register CM_GMAC_CLKSTCTRL[1:0]CLKTRCTRL = 0x2:SW_WKUP.

So, do not allow to gate the cpsw clocks using ti,no-idle property in
cpsw node assuming 1000 Mbps is being used all the time. If someone does
not need 1000 Mbps and wants to gate clocks to cpsw, this property needs
to be deleted in their respective board files.

Signed-off-by: Mugunthan V N &lt;mugunthanvnm@ti.com&gt;
Signed-off-by: Grygorii Strashko &lt;grygorii.strashko@ti.com&gt;
Signed-off-by: Lokesh Vutla &lt;lokeshvutla@ti.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Paul Walmsley &lt;paul@pwsan.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
