<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/Documentation, branch v3.12.42</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option</title>
<updated>2015-04-30T09:15:12+00:00</updated>
<author>
<name>Christoffer Dall</name>
<email>christoffer.dall@linaro.org</email>
</author>
<published>2014-10-16T14:14:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=96c7d3a6b93b12868acd5432e408f42e36afe8d5'/>
<id>96c7d3a6b93b12868acd5432e408f42e36afe8d5</id>
<content type='text'>
commit 3ad8b3de526a76fbe9466b366059e4958957b88f upstream.

The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

Implement the expected functionality and clarify the ABI.

Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Shannon Zhao &lt;shannon.zhao@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3ad8b3de526a76fbe9466b366059e4958957b88f upstream.

The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

Implement the expected functionality and clarify the ABI.

Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Shannon Zhao &lt;shannon.zhao@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>stable_kernel_rules: Add pointer to netdev-FAQ for network patches</title>
<updated>2015-04-09T12:13:30+00:00</updated>
<author>
<name>Dave Chiluk</name>
<email>chiluk@canonical.com</email>
</author>
<published>2014-06-24T15:11:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1b39e9323cfb25aa4e0f2f3aaa8edbdba72556fa'/>
<id>1b39e9323cfb25aa4e0f2f3aaa8edbdba72556fa</id>
<content type='text'>
commit b76fc285337b6b256e9ba20a40cfd043f70c27af upstream.

Stable_kernel_rules should point submitters of network stable patches to the
netdev_FAQ.txt as requests for stable network patches should go to netdev
first.

Signed-off-by: Dave Chiluk &lt;chiluk@canonical.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b76fc285337b6b256e9ba20a40cfd043f70c27af upstream.

Stable_kernel_rules should point submitters of network stable patches to the
netdev_FAQ.txt as requests for stable network patches should go to netdev
first.

Signed-off-by: Dave Chiluk &lt;chiluk@canonical.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Input: i8042 - reset keyboard to fix Elantech touchpad detection</title>
<updated>2015-01-29T14:44:42+00:00</updated>
<author>
<name>Srihari Vijayaraghavan</name>
<email>linux.bug.reporting@gmail.com</email>
</author>
<published>2015-01-08T00:25:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=78fa0d5560b6ffb6f67a8cdc2e88804e1c3c2ea1'/>
<id>78fa0d5560b6ffb6f67a8cdc2e88804e1c3c2ea1</id>
<content type='text'>
commit 148e9a711e034e06310a8c36b64957934ebe30f2 upstream.

On some laptops, keyboard needs to be reset in order to successfully detect
touchpad (e.g., some Gigabyte laptop models with Elantech touchpads).
Without resettin keyboard touchpad pretends to be completely dead.

Based on the original patch by Mateusz Jończyk this version has been
expanded to include DMI based detection &amp; application of the fix
automatically on the affected models of laptops. This has been confirmed to
fix problem by three users already on three different models of laptops.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81331
Signed-off-by: Srihari Vijayaraghavan &lt;linux.bug.reporting@gmail.com&gt;
Acked-by: Mateusz Jończyk &lt;mat.jonczyk@o2.pl&gt;
Tested-by: Srihari Vijayaraghavan &lt;linux.bug.reporting@gmail.com&gt;
Tested by: Zakariya Dehlawi &lt;zdehlawi@gmail.com&gt;
Tested-by: Guillaum Bouchard &lt;guillaum.bouchard@gmail.com&gt;
Signed-off-by: Dmitry Torokhov &lt;dmitry.torokhov@gmail.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 148e9a711e034e06310a8c36b64957934ebe30f2 upstream.

On some laptops, keyboard needs to be reset in order to successfully detect
touchpad (e.g., some Gigabyte laptop models with Elantech touchpads).
Without resettin keyboard touchpad pretends to be completely dead.

Based on the original patch by Mateusz Jończyk this version has been
expanded to include DMI based detection &amp; application of the fix
automatically on the affected models of laptops. This has been confirmed to
fix problem by three users already on three different models of laptops.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81331
Signed-off-by: Srihari Vijayaraghavan &lt;linux.bug.reporting@gmail.com&gt;
Acked-by: Mateusz Jończyk &lt;mat.jonczyk@o2.pl&gt;
Tested-by: Srihari Vijayaraghavan &lt;linux.bug.reporting@gmail.com&gt;
Tested by: Zakariya Dehlawi &lt;zdehlawi@gmail.com&gt;
Tested-by: Guillaum Bouchard &lt;guillaum.bouchard@gmail.com&gt;
Signed-off-by: Dmitry Torokhov &lt;dmitry.torokhov@gmail.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>pstore-ram: Allow optional mapping with pgprot_noncached</title>
<updated>2015-01-26T13:38:46+00:00</updated>
<author>
<name>Tony Lindgren</name>
<email>tony@atomide.com</email>
</author>
<published>2014-09-16T20:50:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca6a4563d4052e1da58439a60be113964c77388c'/>
<id>ca6a4563d4052e1da58439a60be113964c77388c</id>
<content type='text'>
commit 027bc8b08242c59e19356b4b2c189f2d849ab660 upstream.

On some ARMs the memory can be mapped pgprot_noncached() and still
be working for atomic operations. As pointed out by Colin Cross
&lt;ccross@android.com&gt;, in some cases you do want to use
pgprot_noncached() if the SoC supports it to see a debug printk
just before a write hanging the system.

On ARMs, the atomic operations on strongly ordered memory are
implementation defined. So let's provide an optional kernel parameter
for configuring pgprot_noncached(), and use pgprot_writecombine() by
default.

Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Rob Herring &lt;robherring2@gmail.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Anton Vorontsov &lt;anton@enomsg.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Olof Johansson &lt;olof@lixom.net&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 027bc8b08242c59e19356b4b2c189f2d849ab660 upstream.

On some ARMs the memory can be mapped pgprot_noncached() and still
be working for atomic operations. As pointed out by Colin Cross
&lt;ccross@android.com&gt;, in some cases you do want to use
pgprot_noncached() if the SoC supports it to see a debug printk
just before a write hanging the system.

On ARMs, the atomic operations on strongly ordered memory are
implementation defined. So let's provide an optional kernel parameter
for configuring pgprot_noncached(), and use pgprot_writecombine() by
default.

Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Rob Herring &lt;robherring2@gmail.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Anton Vorontsov &lt;anton@enomsg.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Olof Johansson &lt;olof@lixom.net&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Documentation: lzo: document part of the encoding</title>
<updated>2014-10-31T14:11:17+00:00</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2014-09-27T10:31:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=46f35744a3befcd6ee7c4897443ba1278affc68d'/>
<id>46f35744a3befcd6ee7c4897443ba1278affc68d</id>
<content type='text'>
commit d98a0526434d27e261f622cf9d2e0028b5ff1a00 upstream.

Add a complete description of the LZO format as processed by the
decompressor. I have not found a public specification of this format
hence this analysis, which will be used to better understand the code.

Cc: Willem Pinckaers &lt;willem@lekkertech.net&gt;
Cc: "Don A. Bailey" &lt;donb@securitymouse.com&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d98a0526434d27e261f622cf9d2e0028b5ff1a00 upstream.

Add a complete description of the LZO format as processed by the
decompressor. I have not found a public specification of this format
hence this analysis, which will be used to better understand the code.

Cc: Willem Pinckaers &lt;willem@lekkertech.net&gt;
Cc: "Don A. Bailey" &lt;donb@securitymouse.com&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kvm: fix potentially corrupt mmio cache</title>
<updated>2014-10-31T14:11:08+00:00</updated>
<author>
<name>David Matlack</name>
<email>dmatlack@google.com</email>
</author>
<published>2014-08-18T22:46:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8d09d4afe2735d152447903f521623dc54ddafa4'/>
<id>8d09d4afe2735d152447903f521623dc54ddafa4</id>
<content type='text'>
commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream.

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&amp;kvm-&gt;srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
...                                         |
   &lt;action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots&gt;                    |
...                                         |
   &lt;action *continues* to occur until next  |
    memslot generation change, which may    |
    be never&gt;                               |
                                            |

By incrementing the generation after synchronizing with kvm-&gt;srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots.  It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE.  This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited.  Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Signed-off-by: David Matlack &lt;dmatlack@google.com&gt;
Reviewed-by: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Reviewed-by: David Matlack &lt;dmatlack@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream.

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&amp;kvm-&gt;srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&amp;kvm-&gt;srcu)             |
...                                         |
   &lt;action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots&gt;                    |
...                                         |
   &lt;action *continues* to occur until next  |
    memslot generation change, which may    |
    be never&gt;                               |
                                            |

By incrementing the generation after synchronizing with kvm-&gt;srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots.  It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE.  This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited.  Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Signed-off-by: David Matlack &lt;dmatlack@google.com&gt;
Reviewed-by: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Reviewed-by: David Matlack &lt;dmatlack@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ALSA: virtuoso: add Xonar Essence STX II support</title>
<updated>2014-09-03T19:31:16+00:00</updated>
<author>
<name>Clemens Ladisch</name>
<email>clemens@ladisch.de</email>
</author>
<published>2014-08-04T13:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b5f979ef90afe42cdebf64c8c69ac3fdc8da18d8'/>
<id>b5f979ef90afe42cdebf64c8c69ac3fdc8da18d8</id>
<content type='text'>
commit f42bb22243d2ae264d721b055f836059fe35321f upstream.

Just add the PCI ID for the STX II.  It appears to work the same as the
STX, except for the addition of the not-yet-supported daughterboard.

Tested-by: Mario &lt;fugazzi99@gmail.com&gt;
Tested-by: corubba &lt;corubba@gmx.de&gt;
Signed-off-by: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Signed-off-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f42bb22243d2ae264d721b055f836059fe35321f upstream.

Just add the PCI ID for the STX II.  It appears to work the same as the
STX, except for the addition of the not-yet-supported daughterboard.

Tested-by: Mario &lt;fugazzi99@gmail.com&gt;
Tested-by: corubba &lt;corubba@gmx.de&gt;
Signed-off-by: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Signed-off-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>DMA-API: provide a helper to set both DMA and coherent DMA masks</title>
<updated>2014-08-26T12:12:07+00:00</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@arm.linux.org.uk</email>
</author>
<published>2013-06-26T12:49:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8bac7a35e60ca70c8d12ddbfdf28a8df5a976b2b'/>
<id>8bac7a35e60ca70c8d12ddbfdf28a8df5a976b2b</id>
<content type='text'>
commit 4aa806b771d16b810771d86ce23c4c3160888db3 upstream.

Provide a helper to set both the DMA and coherent DMA masks to the
same value - this avoids duplicated code in a number of drivers,
sometimes with buggy error handling, and also allows us identify
which drivers do things differently.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4aa806b771d16b810771d86ce23c4c3160888db3 upstream.

Provide a helper to set both the DMA and coherent DMA masks to the
same value - this avoids duplicated code in a number of drivers,
sometimes with buggy error handling, and also allows us identify
which drivers do things differently.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack</title>
<updated>2014-08-19T12:23:37+00:00</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@linux.intel.com</email>
</author>
<published>2014-04-29T23:46:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7e329475de1f97df56e1cfa412e5b3479994c202'/>
<id>7e329475de1f97df56e1cfa412e5b3479994c202</id>
<content type='text'>
commit 3891a04aafd668686239349ea58f3314ea2af86b upstream.

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer.  This
causes some 16-bit software to break, but it also leaks kernel state
to user space.  We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

    b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart.  When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace.  The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
  and copy (as opposed to map) the IRET frame there, and for the
  suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Andrew Lutomriski &lt;amluto@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Dirk Hohndel &lt;dirk@hohndel.org&gt;
Cc: Arjan van de Ven &lt;arjan.van.de.ven@intel.com&gt;
Cc: comex &lt;comexk@gmail.com&gt;
Cc: Alexander van Heukelum &lt;heukelum@fastmail.fm&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # consider after upstream merge
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3891a04aafd668686239349ea58f3314ea2af86b upstream.

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer.  This
causes some 16-bit software to break, but it also leaks kernel state
to user space.  We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

    b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart.  When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace.  The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
  and copy (as opposed to map) the IRET frame there, and for the
  suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Andrew Lutomriski &lt;amluto@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Dirk Hohndel &lt;dirk@hohndel.org&gt;
Cc: Arjan van de Ven &lt;arjan.van.de.ven@intel.com&gt;
Cc: comex &lt;comexk@gmail.com&gt;
Cc: Alexander van Heukelum &lt;heukelum@fastmail.fm&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # consider after upstream merge
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm, pcp: allow restoring percpu_pagelist_fraction default</title>
<updated>2014-07-18T13:51:01+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2014-06-23T20:22:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6b5002673d4de9144db56cfe17046b3873cc6d9e'/>
<id>6b5002673d4de9144db56cfe17046b3873cc6d9e</id>
<content type='text'>
commit 7cd2b0a34ab8e4db971920eef8982f985441adfb upstream.

Oleg reports a division by zero error on zero-length write() to the
percpu_pagelist_fraction sysctl:

    divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
    CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
    RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
    RSP: 0018:ffff8800d87a3e78  EFLAGS: 00010246
    RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
    RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
    R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
    R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
    FS:  00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
    Call Trace:
      proc_sys_call_handler+0xb3/0xc0
      proc_sys_write+0x14/0x20
      vfs_write+0xba/0x1e0
      SyS_write+0x46/0xb0
      tracesys+0xe1/0xe6

However, if the percpu_pagelist_fraction sysctl is set by the user, it
is also impossible to restore it to the kernel default since the user
cannot write 0 to the sysctl.

This patch allows the user to write 0 to restore the default behavior.
It still requires a fraction equal to or larger than 8, however, as
stated by the documentation for sanity.  If a value in the range [1, 7]
is written, the sysctl will return EINVAL.

This successfully solves the divide by zero issue at the same time.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reported-by: Oleg Drokin &lt;green@linuxhacker.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7cd2b0a34ab8e4db971920eef8982f985441adfb upstream.

Oleg reports a division by zero error on zero-length write() to the
percpu_pagelist_fraction sysctl:

    divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
    CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
    RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
    RSP: 0018:ffff8800d87a3e78  EFLAGS: 00010246
    RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
    RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
    R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
    R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
    FS:  00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
    Call Trace:
      proc_sys_call_handler+0xb3/0xc0
      proc_sys_write+0x14/0x20
      vfs_write+0xba/0x1e0
      SyS_write+0x46/0xb0
      tracesys+0xe1/0xe6

However, if the percpu_pagelist_fraction sysctl is set by the user, it
is also impossible to restore it to the kernel default since the user
cannot write 0 to the sysctl.

This patch allows the user to write 0 to restore the default behavior.
It still requires a fraction equal to or larger than 8, however, as
stated by the documentation for sanity.  If a value in the range [1, 7]
is written, the sysctl will return EINVAL.

This successfully solves the divide by zero issue at the same time.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reported-by: Oleg Drokin &lt;green@linuxhacker.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
</feed>
