<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch, branch v2.6.27.31</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>x86: enable GART-IOMMU only after setting up protection methods</title>
<updated>2009-08-16T21:27:04+00:00</updated>
<author>
<name>Mark Langsdorf</name>
<email>mark.langsdorf@amd.com</email>
</author>
<published>2009-07-05T20:50:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=941afb2843070a0af16fb56d09f3e2364ff94cf8'/>
<id>941afb2843070a0af16fb56d09f3e2364ff94cf8</id>
<content type='text'>
commit fe2245c905631a3a353504fc04388ce3dfaf9d9e upstream.

The current code to set up the GART as an IOMMU enables GART
translations before it removes the aperture from the kernel memory
map, sets the GART PTEs to UC, sets up the guard and scratch
pages, or does a wbinvd().  This leaves the possibility of cache
aliasing open and can cause system crashes.

Re-order the code so as to enable the GART translations only
after all safeguards are in place and the tlb has been flushed.

AMD has tested this patch on both Istanbul systems and 1st
generation Opteron systems with APG enabled and seen no adverse
effects.  Istanbul systems with HT Assist enabled sometimes
see MCE errors due to cache artifacts with the unmodified
code.

Signed-off-by: Mark Langsdorf &lt;mark.langsdorf@amd.com&gt;
Cc: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Cc: akpm@linux-foundation.org
Cc: jbarnes@virtuousgeek.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fe2245c905631a3a353504fc04388ce3dfaf9d9e upstream.

The current code to set up the GART as an IOMMU enables GART
translations before it removes the aperture from the kernel memory
map, sets the GART PTEs to UC, sets up the guard and scratch
pages, or does a wbinvd().  This leaves the possibility of cache
aliasing open and can cause system crashes.

Re-order the code so as to enable the GART translations only
after all safeguards are in place and the tlb has been flushed.

AMD has tested this patch on both Istanbul systems and 1st
generation Opteron systems with APG enabled and seen no adverse
effects.  Istanbul systems with HT Assist enabled sometimes
see MCE errors due to cache artifacts with the unmodified
code.

Signed-off-by: Mark Langsdorf &lt;mark.langsdorf@amd.com&gt;
Cc: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Cc: akpm@linux-foundation.org
Cc: jbarnes@virtuousgeek.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>parisc: ensure broadcast tlb purge runs single threaded</title>
<updated>2009-08-16T21:26:56+00:00</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2009-07-29T21:17:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2614443191474e4c8619bf59665f9ef820e5276c'/>
<id>2614443191474e4c8619bf59665f9ef820e5276c</id>
<content type='text'>
commit e82a3b75127188f20c7780bec580e148beb29da7 upstream

parisc: ensure broadcast tlb purge runs single threaded
The TLB flushing functions on hppa, which causes PxTLB broadcasts on the system
bus, needs to be protected by irq-safe spinlocks to avoid irq handlers to deadlock
the kernel. The deadlocks only happened during I/O intensive loads and triggered
pretty seldom, which is why this bug went so long unnoticed.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
[edited to use spin_lock_irqsave on UP as well since we'd been locking there
  all this time anyway, --kyle]
Signed-off-by: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e82a3b75127188f20c7780bec580e148beb29da7 upstream

parisc: ensure broadcast tlb purge runs single threaded
The TLB flushing functions on hppa, which causes PxTLB broadcasts on the system
bus, needs to be protected by irq-safe spinlocks to avoid irq handlers to deadlock
the kernel. The deadlocks only happened during I/O intensive loads and triggered
pretty seldom, which is why this bug went so long unnoticed.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
[edited to use spin_lock_irqsave on UP as well since we'd been locking there
  all this time anyway, --kyle]
Signed-off-by: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: don't use 'access_ok()' as a range check in get_user_pages_fast()</title>
<updated>2009-07-30T23:06:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-06-22T17:25:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=034b03f44cba675abf2550324c4b2ebc1a4a85b3'/>
<id>034b03f44cba675abf2550324c4b2ebc1a4a85b3</id>
<content type='text'>
[ Upstream commit 7f8189068726492950bf1a2dcfd9b51314560abf - modified
  for stable to not use the sloppy __VIRTUAL_MASK_SHIFT ]

It's really not right to use 'access_ok()', since that is meant for the
normal "get_user()" and "copy_from/to_user()" accesses, which are done
through the TLB, rather than through the page tables.

Why? access_ok() does both too few, and too many checks.  Too many,
because it is meant for regular kernel accesses that will not honor the
'user' bit in the page tables, and because it honors the USER_DS vs
KERNEL_DS distinction that we shouldn't care about in GUP.  And too few,
because it doesn't do the 'canonical' check on the address on x86-64,
since the TLB will do that for us.

So instead of using a function that isn't meant for this, and does
something else and much more complicated, just do the real rules: we
don't want the range to overflow, and on x86-64, we want it to be a
canonical low address (on 32-bit, all addresses are canonical).

Acked-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7f8189068726492950bf1a2dcfd9b51314560abf - modified
  for stable to not use the sloppy __VIRTUAL_MASK_SHIFT ]

It's really not right to use 'access_ok()', since that is meant for the
normal "get_user()" and "copy_from/to_user()" accesses, which are done
through the TLB, rather than through the page tables.

Why? access_ok() does both too few, and too many checks.  Too many,
because it is meant for regular kernel accesses that will not honor the
'user' bit in the page tables, and because it honors the USER_DS vs
KERNEL_DS distinction that we shouldn't care about in GUP.  And too few,
because it doesn't do the 'canonical' check on the address on x86-64,
since the TLB will do that for us.

So instead of using a function that isn't meant for this, and does
something else and much more complicated, just do the real rules: we
don't want the range to overflow, and on x86-64, we want it to be a
canonical low address (on 32-bit, all addresses are canonical).

Acked-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Fix bad_srat() to clear all state</title>
<updated>2009-07-30T23:06:08+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>andi@firstfloor.org</email>
</author>
<published>2009-07-18T06:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=30629ea3f895f3329530598d6b048e116a9bc593'/>
<id>30629ea3f895f3329530598d6b048e116a9bc593</id>
<content type='text'>
commit 429b2b319af3987e808c18f6b81313104caf782c upstream.

Need to clear both nodes and nodes_add state for start/end.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
LKML-Reference: &lt;20090718065657.GA2898@basil.fritz.box&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 429b2b319af3987e808c18f6b81313104caf782c upstream.

Need to clear both nodes and nodes_add state for start/end.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
LKML-Reference: &lt;20090718065657.GA2898@basil.fritz.box&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: handle initrd that extends into unusable memory</title>
<updated>2009-07-02T23:31:49+00:00</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2009-06-05T02:14:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=16171659aff37ec9e4a16f2881b16d07df6a3111'/>
<id>16171659aff37ec9e4a16f2881b16d07df6a3111</id>
<content type='text'>
commit 8c5dd8f43367f4f266dd616f11658005bc2d20ef upstream.

On a system where system memory (according e820) is not covered by
mtrr, mtrr_trim_memory converts a portion of memory to reserved, but
bootloader has already put the initrd in that range.

Thus, we need to have 64bit to use relocate_initrd too.

[ Impact: fix using initrd when mtrr_trim_memory happen ]

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8c5dd8f43367f4f266dd616f11658005bc2d20ef upstream.

On a system where system memory (according e820) is not covered by
mtrr, mtrr_trim_memory converts a portion of memory to reserved, but
bootloader has already put the initrd in that range.

Thus, we need to have 64bit to use relocate_initrd too.

[ Impact: fix using initrd when mtrr_trim_memory happen ]

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: quirk for reboot stalls on a Dell Optiplex 330</title>
<updated>2009-07-02T23:31:41+00:00</updated>
<author>
<name>Steve Conklin</name>
<email>sconklin@canonical.com</email>
</author>
<published>2008-11-14T06:55:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=02be1c88182e1a3fc2170301e3398ecab99d1c96'/>
<id>02be1c88182e1a3fc2170301e3398ecab99d1c96</id>
<content type='text'>
commit 093bac154c142fa1fb31a3ac69ae1bc08930231b upstream.

Dell Optiplex 330 appears to hang on reboot. This is resolved by adding
a quirk to set bios reboot.

Signed-off-by: Leann Ogasawara &lt;leann.ogasawara@canonical.com&gt;
Signed-off-by: Steve Conklin &lt;steve.conklin@canonical.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jean Delvare &lt;khali@linux-fr.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 093bac154c142fa1fb31a3ac69ae1bc08930231b upstream.

Dell Optiplex 330 appears to hang on reboot. This is resolved by adding
a quirk to set bios reboot.

Signed-off-by: Leann Ogasawara &lt;leann.ogasawara@canonical.com&gt;
Signed-off-by: Steve Conklin &lt;steve.conklin@canonical.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jean Delvare &lt;khali@linux-fr.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Add quirk for reboot stalls on a Dell Optiplex 360</title>
<updated>2009-07-02T23:31:40+00:00</updated>
<author>
<name>Jean Delvare</name>
<email>jdelvare@suse.de</email>
</author>
<published>2009-06-05T10:02:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dce33cf16cec5f4577850b34dcfc875d52bc6a80'/>
<id>dce33cf16cec5f4577850b34dcfc875d52bc6a80</id>
<content type='text'>
commit 4a4aca641bc4598e77b866804f47c651ec4a764d upstream.

The Dell Optiplex 360 hangs on reboot, just like the Optiplex 330, so
the same quirk is needed.

Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;
Cc: Steve Conklin &lt;steve.conklin@canonical.com&gt;
Cc: Leann Ogasawara &lt;leann.ogasawara@canonical.com&gt;
LKML-Reference: &lt;200906051202.38311.jdelvare@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4a4aca641bc4598e77b866804f47c651ec4a764d upstream.

The Dell Optiplex 360 hangs on reboot, just like the Optiplex 330, so
the same quirk is needed.

Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;
Cc: Steve Conklin &lt;steve.conklin@canonical.com&gt;
Cc: Leann Ogasawara &lt;leann.ogasawara@canonical.com&gt;
LKML-Reference: &lt;200906051202.38311.jdelvare@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: fix DMI on EFI</title>
<updated>2009-06-12T03:01:35+00:00</updated>
<author>
<name>Brian Maly</name>
<email>bmaly@redhat.com</email>
</author>
<published>2009-03-04T02:55:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=edafc7a273c17f4806fd785c2fb0574161e8dcf5'/>
<id>edafc7a273c17f4806fd785c2fb0574161e8dcf5</id>
<content type='text'>
commit ff0c0874905fb312ca1491bbdac2653b0b48c20b upstream.

Impact: reactivate DMI quirks on EFI hardware

DMI tables are loaded by EFI, so the dmi calls must happen after
efi_init() and not before.

Currently Apple hardware uses DMI to determine the framebuffer mappings
for efifb. Without DMI working you also have no video on MacBook Pro.

This patch resolves the DMI issue for EFI hardware (DMI is now properly
detected at boot), and additionally efifb now loads on Apple hardware
(i.e. video works).

Signed-off-by: Brian Maly &lt;bmaly@redhat&gt;
Acked-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: ying.huang@intel.com
LKML-Reference: &lt;49ADEDA3.1030406@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ff0c0874905fb312ca1491bbdac2653b0b48c20b upstream.

Impact: reactivate DMI quirks on EFI hardware

DMI tables are loaded by EFI, so the dmi calls must happen after
efi_init() and not before.

Currently Apple hardware uses DMI to determine the framebuffer mappings
for efifb. Without DMI working you also have no video on MacBook Pro.

This patch resolves the DMI issue for EFI hardware (DMI is now properly
detected at boot), and additionally efifb now loads on Apple hardware
(i.e. video works).

Signed-off-by: Brian Maly &lt;bmaly@redhat&gt;
Acked-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: ying.huang@intel.com
LKML-Reference: &lt;49ADEDA3.1030406@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/pci: fix mmconfig detection with 32bit near 4g</title>
<updated>2009-06-12T03:01:32+00:00</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2009-06-03T07:13:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c45dabcb311b0bba1c028d2ca5d6e94de65bee0a'/>
<id>c45dabcb311b0bba1c028d2ca5d6e94de65bee0a</id>
<content type='text'>
commit 75e613cdc7bb2ba3795b1bc3ddf19476c767ba68 upstream.

Pascal reported and bisected a commit:
|	x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case

which broke one system system.

ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 255
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG for extended config space

it didn't have
PCI: updated MCFG configuration 0: base f0000000 segment 0 buses 0 - 63
anymore, and try to use 0xf000000 - 0xffffffff for mmconfig

For 32bit, mcfg_res-&gt;end could be 32bit only (if 64 resources aren't used)
So use end - 1 to pass the value in mcfg-&gt;end to avoid overflow.

We don't need to worry about the e820 path, they are always 64 bit.

Reported-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Bisected-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Tested-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Jesse Barnes &lt;jbarnes@virtuousgeek.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 75e613cdc7bb2ba3795b1bc3ddf19476c767ba68 upstream.

Pascal reported and bisected a commit:
|	x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case

which broke one system system.

ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 255
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG for extended config space

it didn't have
PCI: updated MCFG configuration 0: base f0000000 segment 0 buses 0 - 63
anymore, and try to use 0xf000000 - 0xffffffff for mmconfig

For 32bit, mcfg_res-&gt;end could be 32bit only (if 64 resources aren't used)
So use end - 1 to pass the value in mcfg-&gt;end to avoid overflow.

We don't need to worry about the e820 path, they are always 64 bit.

Reported-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Bisected-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Tested-by: Pascal Terjan &lt;pterjan@mandriva.com&gt;
Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Jesse Barnes &lt;jbarnes@virtuousgeek.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not</title>
<updated>2009-06-12T03:01:31+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2009-05-28T21:34:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a1f36c3727c032c6190d04098928d7c95962a82e'/>
<id>a1f36c3727c032c6190d04098928d7c95962a82e</id>
<content type='text'>
commit 32b154c0b0bae2879bf4e549d861caf1759a3546 upstream.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302

On x86 and x86-64, it is possible that page tables are shared beween
shared mappings backed by hugetlbfs.  As part of this,
page_table_shareable() checks a pair of vma-&gt;vm_flags and they must match
if they are to be shared.  All VMA flags are taken into account, including
VM_LOCKED.

The problem is that VM_LOCKED is cleared on fork().  When a process with a
shared memory segment forks() to exec() a helper, there will be shared
VMAs with different flags.  The impact is that the shared segment is
sometimes considered shareable and other times not, depending on what
process is checking.

What happens is that the segment page tables are being shared but the
count is inaccurate depending on the ordering of events.  As the page
tables are freed with put_page(), bad pmd's are found when some of the
children exit.  The hugepage counters also get corrupted and the Total and
Free count will no longer match even when all the hugepage-backed regions
are freed.  This requires a reboot of the machine to "fix".

This patch addresses the problem by comparing all flags except VM_LOCKED
when deciding if pagetables should be shared or not for hugetlbfs-backed
mapping.

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Acked-by: Hugh Dickins &lt;hugh.dickins@tiscali.co.uk&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: &lt;starlight@binnacle.cx&gt;
Cc: Eric B Munson &lt;ebmunson@us.ibm.com&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Andy Whitcroft &lt;apw@canonical.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 32b154c0b0bae2879bf4e549d861caf1759a3546 upstream.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302

On x86 and x86-64, it is possible that page tables are shared beween
shared mappings backed by hugetlbfs.  As part of this,
page_table_shareable() checks a pair of vma-&gt;vm_flags and they must match
if they are to be shared.  All VMA flags are taken into account, including
VM_LOCKED.

The problem is that VM_LOCKED is cleared on fork().  When a process with a
shared memory segment forks() to exec() a helper, there will be shared
VMAs with different flags.  The impact is that the shared segment is
sometimes considered shareable and other times not, depending on what
process is checking.

What happens is that the segment page tables are being shared but the
count is inaccurate depending on the ordering of events.  As the page
tables are freed with put_page(), bad pmd's are found when some of the
children exit.  The hugepage counters also get corrupted and the Total and
Free count will no longer match even when all the hugepage-backed regions
are freed.  This requires a reboot of the machine to "fix".

This patch addresses the problem by comparing all flags except VM_LOCKED
when deciding if pagetables should be shared or not for hugetlbfs-backed
mapping.

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Acked-by: Hugh Dickins &lt;hugh.dickins@tiscali.co.uk&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: &lt;starlight@binnacle.cx&gt;
Cc: Eric B Munson &lt;ebmunson@us.ibm.com&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Andy Whitcroft &lt;apw@canonical.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
</feed>
