<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch, branch v4.4.105</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ARM: OMAP1: DMA: Correct the number of logical channels</title>
<updated>2017-12-09T17:42:41+00:00</updated>
<author>
<name>Peter Ujfalusi</name>
<email>peter.ujfalusi@ti.com</email>
</author>
<published>2017-01-03T11:22:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e6533243f74fd2ca85387e1d332eecdf961931e4'/>
<id>e6533243f74fd2ca85387e1d332eecdf961931e4</id>
<content type='text'>
[ Upstream commit 657279778af54f35e54b07b6687918f254a2992c ]

OMAP1510, OMAP5910 and OMAP310 have only 9 logical channels.
OMAP1610, OMAP5912, OMAP1710, OMAP730, and OMAP850 have 16 logical channels
available.

The wired 17 for the lch_count must have been used to cover the 16 + 1
dedicated LCD channel, in reality we can only use 9 or 16 channels.

The d-&gt;chan_count is not used by the omap-dma stack, so we can skip the
setup. chan_count was configured to the number of logical channels and not
the actual number of physical channels anyways.

Signed-off-by: Peter Ujfalusi &lt;peter.ujfalusi@ti.com&gt;
Acked-by: Aaro Koskinen &lt;aaro.koskinen@iki.fi&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 657279778af54f35e54b07b6687918f254a2992c ]

OMAP1510, OMAP5910 and OMAP310 have only 9 logical channels.
OMAP1610, OMAP5912, OMAP1710, OMAP730, and OMAP850 have 16 logical channels
available.

The wired 17 for the lch_count must have been used to cover the 16 + 1
dedicated LCD channel, in reality we can only use 9 or 16 channels.

The d-&gt;chan_count is not used by the omap-dma stack, so we can skip the
setup. chan_count was configured to the number of logical channels and not
the actual number of physical channels anyways.

Signed-off-by: Peter Ujfalusi &lt;peter.ujfalusi@ti.com&gt;
Acked-by: Aaro Koskinen &lt;aaro.koskinen@iki.fi&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kprobes/x86: Disable preemption in ftrace-based jprobes</title>
<updated>2017-12-09T17:42:41+00:00</updated>
<author>
<name>Masami Hiramatsu</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2017-09-19T10:01:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8633eec0ee08d6dd4148941bba4ac2b4d918826b'/>
<id>8633eec0ee08d6dd4148941bba4ac2b4d918826b</id>
<content type='text'>
[ Upstream commit 5bb4fc2d8641219732eb2bb654206775a4219aca ]

Disable preemption in ftrace-based jprobe handlers as
described in Documentation/kprobes.txt:

  "Probe handlers are run with preemption disabled."

This will fix jprobes behavior when CONFIG_PREEMPT=y.

Signed-off-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Alexei Starovoitov &lt;ast@fb.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Ananth N Mavinakayanahalli &lt;ananth@linux.vnet.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E . McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/150581530024.32348.9863783558598926771.stgit@devbox
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5bb4fc2d8641219732eb2bb654206775a4219aca ]

Disable preemption in ftrace-based jprobe handlers as
described in Documentation/kprobes.txt:

  "Probe handlers are run with preemption disabled."

This will fix jprobes behavior when CONFIG_PREEMPT=y.

Signed-off-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Alexei Starovoitov &lt;ast@fb.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Ananth N Mavinakayanahalli &lt;ananth@linux.vnet.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E . McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/150581530024.32348.9863783558598926771.stgit@devbox
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt()</title>
<updated>2017-12-09T17:42:40+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave.hansen@linux.intel.com</email>
</author>
<published>2017-10-18T17:21:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c2e1be144805266fbd75fa29ad896b9671fdae8e'/>
<id>c2e1be144805266fbd75fa29ad896b9671fdae8e</id>
<content type='text'>
[ Upstream commit da20ab35180780e4a6eadc804544f1fa967f3567 ]

We do not have tracepoints for sys_modify_ldt() because we define
it directly instead of using the normal SYSCALL_DEFINEx() macros.

However, there is a reason sys_modify_ldt() does not use the macros:
it has an 'int' return type instead of 'unsigned long'.  This is
a bug, but it's a bug cemented in the ABI.

What does this mean?  If we return -EINVAL from a function that
returns 'int', we have 0x00000000ffffffea in %rax.  But, if we
return -EINVAL from a function returning 'unsigned long', we end
up with 0xffffffffffffffea in %rax, which is wrong.

To work around this and maintain the 'int' behavior while using
the SYSCALL_DEFINEx() macros, so we add a cast to 'unsigned int'
in both implementations of sys_modify_ldt().

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Reviewed-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20171018172107.1A79C532@viggo.jf.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit da20ab35180780e4a6eadc804544f1fa967f3567 ]

We do not have tracepoints for sys_modify_ldt() because we define
it directly instead of using the normal SYSCALL_DEFINEx() macros.

However, there is a reason sys_modify_ldt() does not use the macros:
it has an 'int' return type instead of 'unsigned long'.  This is
a bug, but it's a bug cemented in the ABI.

What does this mean?  If we return -EINVAL from a function that
returns 'int', we have 0x00000000ffffffea in %rax.  But, if we
return -EINVAL from a function returning 'unsigned long', we end
up with 0xffffffffffffffea in %rax, which is wrong.

To work around this and maintain the 'int' behavior while using
the SYSCALL_DEFINEx() macros, so we add a cast to 'unsigned int'
in both implementations of sys_modify_ldt().

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Reviewed-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20171018172107.1A79C532@viggo.jf.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/pci: do not require AIS facility</title>
<updated>2017-12-09T17:42:39+00:00</updated>
<author>
<name>Christian Borntraeger</name>
<email>borntraeger@de.ibm.com</email>
</author>
<published>2017-10-30T13:38:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=93f45d8c67dc914ed0d83b02549474759bdfc1ac'/>
<id>93f45d8c67dc914ed0d83b02549474759bdfc1ac</id>
<content type='text'>
[ Upstream commit 48070c73058be6de9c0d754d441ed7092dfc8f12 ]

As of today QEMU does not provide the AIS facility to its guest.  This
prevents Linux guests from using PCI devices as the ais facility is
checked during init. As this is just a performance optimization, we can
move the ais check into the code where we need it (calling the SIC
instruction). This is used at initialization and on interrupt. Both
places do not require any serialization, so we can simply skip the
instruction.

Since we will now get all interrupts, we can also avoid the 2nd scan.
As we can have multiple interrupts in parallel we might trigger spurious
irqs more often for the non-AIS case but the core code can handle that.

Signed-off-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Reviewed-by: Pierre Morel &lt;pmorel@linux.vnet.ibm.com&gt;
Reviewed-by: Halil Pasic &lt;pasic@linux.vnet.ibm.com&gt;
Acked-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 48070c73058be6de9c0d754d441ed7092dfc8f12 ]

As of today QEMU does not provide the AIS facility to its guest.  This
prevents Linux guests from using PCI devices as the ais facility is
checked during init. As this is just a performance optimization, we can
move the ais check into the code where we need it (calling the SIC
instruction). This is used at initialization and on interrupt. Both
places do not require any serialization, so we can simply skip the
instruction.

Since we will now get all interrupts, we can also avoid the 2nd scan.
As we can have multiple interrupts in parallel we might trigger spurious
irqs more often for the non-AIS case but the core code can handle that.

Signed-off-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Reviewed-by: Pierre Morel &lt;pmorel@linux.vnet.ibm.com&gt;
Reviewed-by: Halil Pasic &lt;pasic@linux.vnet.ibm.com&gt;
Acked-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/runtime instrumentation: simplify task exit handling</title>
<updated>2017-12-09T17:42:38+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2017-09-11T09:24:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9e51ee1b76efc7b5e9404010793a39fde0e03cb7'/>
<id>9e51ee1b76efc7b5e9404010793a39fde0e03cb7</id>
<content type='text'>
commit 8d9047f8b967ce6181fd824ae922978e1b055cc0 upstream.

Free data structures required for runtime instrumentation from
arch_release_task_struct(). This allows to simplify the code a bit,
and also makes the semantics a bit easier: arch_release_task_struct()
is never called from the task that is being removed.

In addition this allows to get rid of exit_thread() in a later patch.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8d9047f8b967ce6181fd824ae922978e1b055cc0 upstream.

Free data structures required for runtime instrumentation from
arch_release_task_struct(). This allows to simplify the code a bit,
and also makes the semantics a bit easier: arch_release_task_struct()
is never called from the task that is being removed.

In addition this allows to get rid of exit_thread() in a later patch.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: x86: inject exceptions produced by x86_decode_insn</title>
<updated>2017-12-05T10:22:51+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2017-11-10T09:49:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a6493ad6fc893431705637d7efa3035235d9e135'/>
<id>a6493ad6fc893431705637d7efa3035235d9e135</id>
<content type='text'>
commit 6ea6e84309ca7e0e850b3083e6b09344ee15c290 upstream.

Sometimes, a processor might execute an instruction while another
processor is updating the page tables for that instruction's code page,
but before the TLB shootdown completes.  The interesting case happens
if the page is in the TLB.

In general, the processor will succeed in executing the instruction and
nothing bad happens.  However, what if the instruction is an MMIO access?
If *that* happens, KVM invokes the emulator, and the emulator gets the
updated page tables.  If the update side had marked the code page as non
present, the page table walk then will fail and so will x86_decode_insn.

Unfortunately, even though kvm_fetch_guest_virt is correctly returning
X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
a fatal error if the instruction cannot simply be reexecuted (as is the
case for MMIO).  And this in fact happened sometimes when rebooting
Windows 2012r2 guests.  Just checking ctxt-&gt;have_exception and injecting
the exception if true is enough to fix the case.

Thanks to Eduardo Habkost for helping in the debugging of this issue.

Reported-by: Yanan Fu &lt;yfu@redhat.com&gt;
Cc: Eduardo Habkost &lt;ehabkost@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6ea6e84309ca7e0e850b3083e6b09344ee15c290 upstream.

Sometimes, a processor might execute an instruction while another
processor is updating the page tables for that instruction's code page,
but before the TLB shootdown completes.  The interesting case happens
if the page is in the TLB.

In general, the processor will succeed in executing the instruction and
nothing bad happens.  However, what if the instruction is an MMIO access?
If *that* happens, KVM invokes the emulator, and the emulator gets the
updated page tables.  If the update side had marked the code page as non
present, the page table walk then will fail and so will x86_decode_insn.

Unfortunately, even though kvm_fetch_guest_virt is correctly returning
X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
a fatal error if the instruction cannot simply be reexecuted (as is the
case for MMIO).  And this in fact happened sometimes when rebooting
Windows 2012r2 guests.  Just checking ctxt-&gt;have_exception and injecting
the exception if true is enough to fix the case.

Thanks to Eduardo Habkost for helping in the debugging of this issue.

Reported-by: Yanan Fu &lt;yfu@redhat.com&gt;
Cc: Eduardo Habkost &lt;ehabkost@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: x86: Exit to user-mode on #UD intercept when emulator requires</title>
<updated>2017-12-05T10:22:50+00:00</updated>
<author>
<name>Liran Alon</name>
<email>liran.alon@oracle.com</email>
</author>
<published>2017-11-05T14:56:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1e9e6bdccb80d9dca2a9dfd6cf2d34ce8f226d47'/>
<id>1e9e6bdccb80d9dca2a9dfd6cf2d34ce8f226d47</id>
<content type='text'>
commit 61cb57c9ed631c95b54f8e9090c89d18b3695b3c upstream.

Instruction emulation after trapping a #UD exception can result in an
MMIO access, for example when emulating a MOVBE on a processor that
doesn't support the instruction.  In this case, the #UD vmexit handler
must exit to user mode, but there wasn't any code to do so.  Add it for
both VMX and SVM.

Signed-off-by: Liran Alon &lt;liran.alon@oracle.com&gt;
Reviewed-by: Nikita Leshenko &lt;nikita.leshchenko@oracle.com&gt;
Reviewed-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Reviewed-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 61cb57c9ed631c95b54f8e9090c89d18b3695b3c upstream.

Instruction emulation after trapping a #UD exception can result in an
MMIO access, for example when emulating a MOVBE on a processor that
doesn't support the instruction.  In this case, the #UD vmexit handler
must exit to user mode, but there wasn't any code to do so.  Add it for
both VMX and SVM.

Signed-off-by: Liran Alon &lt;liran.alon@oracle.com&gt;
Reviewed-by: Nikita Leshenko &lt;nikita.leshchenko@oracle.com&gt;
Reviewed-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Reviewed-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk</title>
<updated>2017-12-05T10:22:50+00:00</updated>
<author>
<name>Liran Alon</name>
<email>liran.alon@oracle.com</email>
</author>
<published>2017-11-05T14:11:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ab29b6b818aa1cc8c36fc510464c732d392f5df2'/>
<id>ab29b6b818aa1cc8c36fc510464c732d392f5df2</id>
<content type='text'>
commit 51c4b8bba674cfd2260d173602c4dac08e4c3a99 upstream.

When guest passes KVM it's pvclock-page GPA via WRMSR to
MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize
pvclock-page to some start-values. It just requests a clock-update which
will happen before entering to guest.

The clock-update logic will call kvm_setup_pvclock_page() to update the
pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly*
assumes that the version-field is initialized to an even number. This is
wrong because at first-time write, field could be any-value.

Fix simply makes sure that if first-time version-field is odd, increment
it once more to make it even and only then start standard logic.
This follows same logic as done in other pvclock shared-pages (See
kvm_write_wall_clock() and record_steal_time()).

Signed-off-by: Liran Alon &lt;liran.alon@oracle.com&gt;
Reviewed-by: Nikita Leshenko &lt;nikita.leshchenko@oracle.com&gt;
Reviewed-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 51c4b8bba674cfd2260d173602c4dac08e4c3a99 upstream.

When guest passes KVM it's pvclock-page GPA via WRMSR to
MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize
pvclock-page to some start-values. It just requests a clock-update which
will happen before entering to guest.

The clock-update logic will call kvm_setup_pvclock_page() to update the
pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly*
assumes that the version-field is initialized to an even number. This is
wrong because at first-time write, field could be any-value.

Fix simply makes sure that if first-time version-field is odd, increment
it once more to make it even and only then start standard logic.
This follows same logic as done in other pvclock shared-pages (See
kvm_write_wall_clock() and record_steal_time()).

Signed-off-by: Liran Alon &lt;liran.alon@oracle.com&gt;
Reviewed-by: Nikita Leshenko &lt;nikita.leshchenko@oracle.com&gt;
Reviewed-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/efi-bgrt: Replace early_memremap() with memremap()</title>
<updated>2017-12-05T10:22:50+00:00</updated>
<author>
<name>Matt Fleming</name>
<email>matt@codeblueprint.co.uk</email>
</author>
<published>2015-12-21T14:12:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f97fc9ab1ce23a8666cfe460264a6a45b7e0b335'/>
<id>f97fc9ab1ce23a8666cfe460264a6a45b7e0b335</id>
<content type='text'>
commit e2c90dd7e11e3025b46719a79fb4bb1e7a5cef9f upstream.

Môshe reported the following warning triggered on his machine since
commit 50a0cb565246 ("x86/efi-bgrt: Fix kernel panic when mapping BGRT
data"),

  [    0.026936] ------------[ cut here ]------------
  [    0.026941] WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:137 __early_ioremap+0x102/0x1bb()
  [    0.026941] Modules linked in:
  [    0.026944] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1 #2
  [    0.026945] Hardware name: Dell Inc. XPS 13 9343/09K8G1, BIOS A05 07/14/2015
  [    0.026946]  0000000000000000 900f03d5a116524d ffffffff81c03e60 ffffffff813a3fff
  [    0.026948]  0000000000000000 ffffffff81c03e98 ffffffff810a0852 00000000d7b76000
  [    0.026949]  0000000000000000 0000000000000001 0000000000000001 000000000000017c
  [    0.026951] Call Trace:
  [    0.026955]  [&lt;ffffffff813a3fff&gt;] dump_stack+0x44/0x55
  [    0.026958]  [&lt;ffffffff810a0852&gt;] warn_slowpath_common+0x82/0xc0
  [    0.026959]  [&lt;ffffffff810a099a&gt;] warn_slowpath_null+0x1a/0x20
  [    0.026961]  [&lt;ffffffff81d8c395&gt;] __early_ioremap+0x102/0x1bb
  [    0.026962]  [&lt;ffffffff81d8c602&gt;] early_memremap+0x13/0x15
  [    0.026964]  [&lt;ffffffff81d78361&gt;] efi_bgrt_init+0x162/0x1ad
  [    0.026966]  [&lt;ffffffff81d778ec&gt;] efi_late_init+0x9/0xb
  [    0.026968]  [&lt;ffffffff81d58ff5&gt;] start_kernel+0x46f/0x49f
  [    0.026970]  [&lt;ffffffff81d58120&gt;] ? early_idt_handler_array+0x120/0x120
  [    0.026972]  [&lt;ffffffff81d58339&gt;] x86_64_start_reservations+0x2a/0x2c
  [    0.026974]  [&lt;ffffffff81d58485&gt;] x86_64_start_kernel+0x14a/0x16d
  [    0.026977] ---[ end trace f9b3812eb8e24c58 ]---
  [    0.026978] efi_bgrt: Ignoring BGRT: failed to map image memory

early_memremap() has an upper limit on the size of mapping it can
handle which is ~200KB. Clearly the BGRT image on Môshe's machine is
much larger than that.

There's actually no reason to restrict ourselves to using the early_*
version of memremap() - the ACPI BGRT driver is invoked late enough in
boot that we can use the standard version, with the benefit that the
late version allows mappings of arbitrary size.

Reported-by: Môshe van der Sterre &lt;me@moshe.nl&gt;
Tested-by: Môshe van der Sterre &lt;me@moshe.nl&gt;
Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Sai Praneeth Prakhya &lt;sai.praneeth.prakhya@intel.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Link: http://lkml.kernel.org/r/1450707172-12561-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Ghannam, Yazen" &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e2c90dd7e11e3025b46719a79fb4bb1e7a5cef9f upstream.

Môshe reported the following warning triggered on his machine since
commit 50a0cb565246 ("x86/efi-bgrt: Fix kernel panic when mapping BGRT
data"),

  [    0.026936] ------------[ cut here ]------------
  [    0.026941] WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:137 __early_ioremap+0x102/0x1bb()
  [    0.026941] Modules linked in:
  [    0.026944] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1 #2
  [    0.026945] Hardware name: Dell Inc. XPS 13 9343/09K8G1, BIOS A05 07/14/2015
  [    0.026946]  0000000000000000 900f03d5a116524d ffffffff81c03e60 ffffffff813a3fff
  [    0.026948]  0000000000000000 ffffffff81c03e98 ffffffff810a0852 00000000d7b76000
  [    0.026949]  0000000000000000 0000000000000001 0000000000000001 000000000000017c
  [    0.026951] Call Trace:
  [    0.026955]  [&lt;ffffffff813a3fff&gt;] dump_stack+0x44/0x55
  [    0.026958]  [&lt;ffffffff810a0852&gt;] warn_slowpath_common+0x82/0xc0
  [    0.026959]  [&lt;ffffffff810a099a&gt;] warn_slowpath_null+0x1a/0x20
  [    0.026961]  [&lt;ffffffff81d8c395&gt;] __early_ioremap+0x102/0x1bb
  [    0.026962]  [&lt;ffffffff81d8c602&gt;] early_memremap+0x13/0x15
  [    0.026964]  [&lt;ffffffff81d78361&gt;] efi_bgrt_init+0x162/0x1ad
  [    0.026966]  [&lt;ffffffff81d778ec&gt;] efi_late_init+0x9/0xb
  [    0.026968]  [&lt;ffffffff81d58ff5&gt;] start_kernel+0x46f/0x49f
  [    0.026970]  [&lt;ffffffff81d58120&gt;] ? early_idt_handler_array+0x120/0x120
  [    0.026972]  [&lt;ffffffff81d58339&gt;] x86_64_start_reservations+0x2a/0x2c
  [    0.026974]  [&lt;ffffffff81d58485&gt;] x86_64_start_kernel+0x14a/0x16d
  [    0.026977] ---[ end trace f9b3812eb8e24c58 ]---
  [    0.026978] efi_bgrt: Ignoring BGRT: failed to map image memory

early_memremap() has an upper limit on the size of mapping it can
handle which is ~200KB. Clearly the BGRT image on Môshe's machine is
much larger than that.

There's actually no reason to restrict ourselves to using the early_*
version of memremap() - the ACPI BGRT driver is invoked late enough in
boot that we can use the standard version, with the benefit that the
late version allows mappings of arbitrary size.

Reported-by: Môshe van der Sterre &lt;me@moshe.nl&gt;
Tested-by: Môshe van der Sterre &lt;me@moshe.nl&gt;
Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Sai Praneeth Prakhya &lt;sai.praneeth.prakhya@intel.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Link: http://lkml.kernel.org/r/1450707172-12561-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Ghannam, Yazen" &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/efi-bgrt: Fix kernel panic when mapping BGRT data</title>
<updated>2017-12-05T10:22:50+00:00</updated>
<author>
<name>Sai Praneeth</name>
<email>sai.praneeth.prakhya@intel.com</email>
</author>
<published>2015-12-09T23:41:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e85c6907b2b46b516950a801ea0648fec8f59ba9'/>
<id>e85c6907b2b46b516950a801ea0648fec8f59ba9</id>
<content type='text'>
commit 50a0cb565246f20d59cdb161778531e4b19d35ac upstream.

Starting with this commit 35eb8b81edd4 ("x86/efi: Build our own page
table structures") efi regions have a separate page directory called
"efi_pgd". In order to access any efi region we have to first shift %cr3
to this page table. In the bgrt code we are trying to copy bgrt_header
and image, but these regions fall under "EFI_BOOT_SERVICES_DATA"
and to access these regions we have to shift %cr3 to efi_pgd and not
doing so will cause page fault as shown below.

[    0.251599] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[    0.259126] Freeing SMP alternatives memory: 32K (ffffffff8230e000 - ffffffff82316000)
[    0.271803] BUG: unable to handle kernel paging request at fffffffefce35002
[    0.279740] IP: [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.286383] PGD 300f067 PUD 0
[    0.289879] Oops: 0000 [#1] SMP
[    0.293566] Modules linked in:
[    0.297039] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1-eywa-eywa-built-in-47041+ #2
[    0.306619] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B104.B01.1511110114 11/11/2015
[    0.320925] task: ffffffff820134c0 ti: ffffffff82000000 task.ti: ffffffff82000000
[    0.329420] RIP: 0010:[&lt;ffffffff821bca49&gt;]  [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.338821] RSP: 0000:ffffffff82003f18  EFLAGS: 00010246
[    0.344852] RAX: fffffffefce35000 RBX: fffffffefce35000 RCX: fffffffefce2b000
[    0.352952] RDX: 000000008a82b000 RSI: ffffffff8235bb80 RDI: 000000008a835000
[    0.361050] RBP: ffffffff82003f30 R08: 000000008a865000 R09: ffffffffff202850
[    0.369149] R10: ffffffff811ad62f R11: 0000000000000000 R12: 0000000000000000
[    0.377248] R13: ffff88016dbaea40 R14: ffffffff822622c0 R15: ffffffff82003fb0
[    0.385348] FS:  0000000000000000(0000) GS:ffff88016d800000(0000) knlGS:0000000000000000
[    0.394533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.401054] CR2: fffffffefce35002 CR3: 000000000300c000 CR4: 00000000003406f0
[    0.409153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.417252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.425350] Stack:
[    0.427638]  ffffffffffffffff ffffffff82256900 ffff88016dbaea40 ffffffff82003f40
[    0.436086]  ffffffff821bbce0 ffffffff82003f88 ffffffff8219c0c2 0000000000000000
[    0.444533]  ffffffff8219ba4a ffffffff822622c0 0000000000083000 00000000ffffffff
[    0.452978] Call Trace:
[    0.455763]  [&lt;ffffffff821bbce0&gt;] efi_late_init+0x9/0xb
[    0.461697]  [&lt;ffffffff8219c0c2&gt;] start_kernel+0x463/0x47f
[    0.467928]  [&lt;ffffffff8219ba4a&gt;] ? set_init_arg+0x55/0x55
[    0.474159]  [&lt;ffffffff8219b120&gt;] ? early_idt_handler_array+0x120/0x120
[    0.481669]  [&lt;ffffffff8219b5ee&gt;] x86_64_start_reservations+0x2a/0x2c
[    0.488982]  [&lt;ffffffff8219b72d&gt;] x86_64_start_kernel+0x13d/0x14c
[    0.495897] Code: 00 41 b4 01 48 8b 78 28 e8 09 36 01 00 48 85 c0 48 89 c3 75 13 48 c7 c7 f8 ac d3 81 31 c0 e8 d7 3b fb fe e9 b5 00 00 00 45 84 e4 &lt;44&gt; 8b 6b 02 74 0d be 06 00 00 00 48 89 df e8 ae 34 0$
[    0.518151] RIP  [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.524888]  RSP &lt;ffffffff82003f18&gt;
[    0.528851] CR2: fffffffefce35002
[    0.532615] ---[ end trace 7b06521e6ebf2aea ]---
[    0.537852] Kernel panic - not syncing: Attempted to kill the idle task!

As said above one way to fix this bug is to shift %cr3 to efi_pgd but we
are not doing that way because it leaks inner details of how we switch
to EFI page tables into a new call site and it also adds duplicate code.
Instead, we remove the call to efi_lookup_mapped_addr() and always
perform early_mem*() instead of early_io*() because we want to remap RAM
regions and not I/O regions. We also delete efi_lookup_mapped_addr()
because we are no longer using it.

Signed-off-by: Sai Praneeth Prakhya &lt;sai.praneeth.prakhya@intel.com&gt;
Reported-by: Wendy Wang &lt;wendy.wang@intel.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Ricardo Neri &lt;ricardo.neri@intel.com&gt;
Cc: Ravi Shankar &lt;ravi.v.shankar@intel.com&gt;
Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: "Ghannam, Yazen" &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 50a0cb565246f20d59cdb161778531e4b19d35ac upstream.

Starting with this commit 35eb8b81edd4 ("x86/efi: Build our own page
table structures") efi regions have a separate page directory called
"efi_pgd". In order to access any efi region we have to first shift %cr3
to this page table. In the bgrt code we are trying to copy bgrt_header
and image, but these regions fall under "EFI_BOOT_SERVICES_DATA"
and to access these regions we have to shift %cr3 to efi_pgd and not
doing so will cause page fault as shown below.

[    0.251599] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[    0.259126] Freeing SMP alternatives memory: 32K (ffffffff8230e000 - ffffffff82316000)
[    0.271803] BUG: unable to handle kernel paging request at fffffffefce35002
[    0.279740] IP: [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.286383] PGD 300f067 PUD 0
[    0.289879] Oops: 0000 [#1] SMP
[    0.293566] Modules linked in:
[    0.297039] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1-eywa-eywa-built-in-47041+ #2
[    0.306619] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B104.B01.1511110114 11/11/2015
[    0.320925] task: ffffffff820134c0 ti: ffffffff82000000 task.ti: ffffffff82000000
[    0.329420] RIP: 0010:[&lt;ffffffff821bca49&gt;]  [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.338821] RSP: 0000:ffffffff82003f18  EFLAGS: 00010246
[    0.344852] RAX: fffffffefce35000 RBX: fffffffefce35000 RCX: fffffffefce2b000
[    0.352952] RDX: 000000008a82b000 RSI: ffffffff8235bb80 RDI: 000000008a835000
[    0.361050] RBP: ffffffff82003f30 R08: 000000008a865000 R09: ffffffffff202850
[    0.369149] R10: ffffffff811ad62f R11: 0000000000000000 R12: 0000000000000000
[    0.377248] R13: ffff88016dbaea40 R14: ffffffff822622c0 R15: ffffffff82003fb0
[    0.385348] FS:  0000000000000000(0000) GS:ffff88016d800000(0000) knlGS:0000000000000000
[    0.394533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.401054] CR2: fffffffefce35002 CR3: 000000000300c000 CR4: 00000000003406f0
[    0.409153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.417252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.425350] Stack:
[    0.427638]  ffffffffffffffff ffffffff82256900 ffff88016dbaea40 ffffffff82003f40
[    0.436086]  ffffffff821bbce0 ffffffff82003f88 ffffffff8219c0c2 0000000000000000
[    0.444533]  ffffffff8219ba4a ffffffff822622c0 0000000000083000 00000000ffffffff
[    0.452978] Call Trace:
[    0.455763]  [&lt;ffffffff821bbce0&gt;] efi_late_init+0x9/0xb
[    0.461697]  [&lt;ffffffff8219c0c2&gt;] start_kernel+0x463/0x47f
[    0.467928]  [&lt;ffffffff8219ba4a&gt;] ? set_init_arg+0x55/0x55
[    0.474159]  [&lt;ffffffff8219b120&gt;] ? early_idt_handler_array+0x120/0x120
[    0.481669]  [&lt;ffffffff8219b5ee&gt;] x86_64_start_reservations+0x2a/0x2c
[    0.488982]  [&lt;ffffffff8219b72d&gt;] x86_64_start_kernel+0x13d/0x14c
[    0.495897] Code: 00 41 b4 01 48 8b 78 28 e8 09 36 01 00 48 85 c0 48 89 c3 75 13 48 c7 c7 f8 ac d3 81 31 c0 e8 d7 3b fb fe e9 b5 00 00 00 45 84 e4 &lt;44&gt; 8b 6b 02 74 0d be 06 00 00 00 48 89 df e8 ae 34 0$
[    0.518151] RIP  [&lt;ffffffff821bca49&gt;] efi_bgrt_init+0x144/0x1fd
[    0.524888]  RSP &lt;ffffffff82003f18&gt;
[    0.528851] CR2: fffffffefce35002
[    0.532615] ---[ end trace 7b06521e6ebf2aea ]---
[    0.537852] Kernel panic - not syncing: Attempted to kill the idle task!

As said above one way to fix this bug is to shift %cr3 to efi_pgd but we
are not doing that way because it leaks inner details of how we switch
to EFI page tables into a new call site and it also adds duplicate code.
Instead, we remove the call to efi_lookup_mapped_addr() and always
perform early_mem*() instead of early_io*() because we want to remap RAM
regions and not I/O regions. We also delete efi_lookup_mapped_addr()
because we are no longer using it.

Signed-off-by: Sai Praneeth Prakhya &lt;sai.praneeth.prakhya@intel.com&gt;
Reported-by: Wendy Wang &lt;wendy.wang@intel.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Ricardo Neri &lt;ricardo.neri@intel.com&gt;
Cc: Ravi Shankar &lt;ravi.v.shankar@intel.com&gt;
Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: "Ghannam, Yazen" &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
