<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/x86/include/asm/pgtable_64.h, branch v4.4.6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>mm: remove remaining references to NUMA hinting bits and helpers</title>
<updated>2015-02-13T02:54:08+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-02-12T22:58:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=21d9ee3eda7792c45880b2f11bff8e95c9a061fb'/>
<id>21d9ee3eda7792c45880b2f11bff8e95c9a061fb</id>
<content type='text'>
This patch removes the NUMA PTE bits and associated helpers.  As a
side-effect it increases the maximum possible swap space on x86-64.

One potential source of problems is races between the marking of PTEs
PROT_NONE, NUMA hinting faults and migration.  It must be guaranteed that
a PTE being protected is not faulted in parallel, seen as a pte_none and
corrupting memory.  The base case is safe but transhuge has problems in
the past due to an different migration mechanism and a dependance on page
lock to serialise migrations and warrants a closer look.

task_work hinting update			parallel fault
------------------------			--------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						pmd_none
						  do_huge_pmd_anonymous_page
						  read? pmd_lock blocks until hinting complete, fail !pmd_none test
						  write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
      pmd_modify
      set_pmd_at

task_work hinting update			parallel migration
------------------------			------------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						  do_huge_pmd_numa_page
						    migrate_misplaced_transhuge_page
						    pmd_lock waits for updates to complete, recheck pmd_same
      pmd_modify
      set_pmd_at

Both of those are safe and the case where a transhuge page is inserted
during a protection update is unchanged.  The case where two processes try
migrating at the same time is unchanged by this series so should still be
ok.  I could not find a case where we are accidentally depending on the
PTE not being cleared and flushed.  If one is missed, it'll manifest as
corruption problems that start triggering shortly after this series is
merged and only happen when NUMA balancing is enabled.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Kirill Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes the NUMA PTE bits and associated helpers.  As a
side-effect it increases the maximum possible swap space on x86-64.

One potential source of problems is races between the marking of PTEs
PROT_NONE, NUMA hinting faults and migration.  It must be guaranteed that
a PTE being protected is not faulted in parallel, seen as a pte_none and
corrupting memory.  The base case is safe but transhuge has problems in
the past due to an different migration mechanism and a dependance on page
lock to serialise migrations and warrants a closer look.

task_work hinting update			parallel fault
------------------------			--------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						pmd_none
						  do_huge_pmd_anonymous_page
						  read? pmd_lock blocks until hinting complete, fail !pmd_none test
						  write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
      pmd_modify
      set_pmd_at

task_work hinting update			parallel migration
------------------------			------------------
change_pmd_range
  change_huge_pmd
    __pmd_trans_huge_lock
      pmdp_get_and_clear
						__handle_mm_fault
						  do_huge_pmd_numa_page
						    migrate_misplaced_transhuge_page
						    pmd_lock waits for updates to complete, recheck pmd_same
      pmd_modify
      set_pmd_at

Both of those are safe and the case where a transhuge page is inserted
during a protection update is unchanged.  The case where two processes try
migrating at the same time is unchanged by this series so should still be
ok.  I could not find a case where we are accidentally depending on the
PTE not being cleared and flushed.  If one is missed, it'll manifest as
corruption problems that start triggering shortly after this series is
merged and only happen when NUMA balancing is enabled.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Kirill Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: drop _PAGE_FILE and pte_file()-related helpers</title>
<updated>2015-02-10T22:30:33+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-02-10T22:11:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a191362058391878cc2a4d4ccddcd8223eb4f79'/>
<id>0a191362058391878cc2a4d4ccddcd8223eb4f79</id>
<content type='text'>
We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2014-10-14T00:22:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-14T00:22:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=df133e8fa8e1d4afa57c84953bf80eaed2b145e0'/>
<id>df133e8fa8e1d4afa57c84953bf80eaed2b145e0</id>
<content type='text'>
Pull x86 mm updates from Ingo Molnar:
 "This tree includes the following changes:

   - fix memory hotplug
   - fix hibernation bootup memory layout assumptions
   - fix hyperv numa guest kernel messages
   - remove dead code
   - update documentation"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Update memory map description to list hypervisor-reserved area
  x86/mm, hibernate: Do not assume the first e820 area to be RAM
  x86/mm/numa: Drop dead code and rename setup_node_data() to setup_alloc_data()
  x86/mm/hotplug: Modify PGD entry when removing memory
  x86/mm/hotplug: Pass sync_global_pgds() a correct argument in remove_pagetable()
  x86: Remove set_pmd_pfn
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 mm updates from Ingo Molnar:
 "This tree includes the following changes:

   - fix memory hotplug
   - fix hibernation bootup memory layout assumptions
   - fix hyperv numa guest kernel messages
   - remove dead code
   - update documentation"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Update memory map description to list hypervisor-reserved area
  x86/mm, hibernate: Do not assume the first e820 area to be RAM
  x86/mm/numa: Drop dead code and rename setup_node_data() to setup_alloc_data()
  x86/mm/hotplug: Modify PGD entry when removing memory
  x86/mm/hotplug: Pass sync_global_pgds() a correct argument in remove_pagetable()
  x86: Remove set_pmd_pfn
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mm/hotplug: Modify PGD entry when removing memory</title>
<updated>2014-09-16T06:55:09+00:00</updated>
<author>
<name>Yasuaki Ishimatsu</name>
<email>isimatu.yasuaki@jp.fujitsu.com</email>
</author>
<published>2014-08-22T20:27:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9661d5bcd058fe15b4138a00d96bd36516134543'/>
<id>9661d5bcd058fe15b4138a00d96bd36516134543</id>
<content type='text'>
When hot-adding/removing memory, sync_global_pgds() is called
for synchronizing PGD to PGD entries of all processes MM.  But
when hot-removing memory, sync_global_pgds() does not work
correctly.

At first, sync_global_pgds() checks whether target PGD is none
or not.  And if PGD is none, the PGD is skipped.  But when
hot-removing memory, PGD may be none since PGD may be cleared by
free_pud_table().  So when sync_global_pgds() is called after
hot-removing memory, sync_global_pgds() should not skip PGD even
if the PGD is none.  And sync_global_pgds() must clear PGD
entries of all processes MM.

Currently sync_global_pgds() does not clear PGD entries of all
processes MM when hot-removing memory.  So when hot adding
memory which is same memory range as removed memory after
hot-removing memory, following call traces are shown:

 kernel BUG at arch/x86/mm/init_64.c:206!
 ...
 [&lt;ffffffff815e0c80&gt;] kernel_physical_mapping_init+0x1b2/0x1d2
 [&lt;ffffffff815ced94&gt;] init_memory_mapping+0x1d4/0x380
 [&lt;ffffffff8104aebd&gt;] arch_add_memory+0x3d/0xd0
 [&lt;ffffffff815d03d9&gt;] add_memory+0xb9/0x1b0
 [&lt;ffffffff81352415&gt;] acpi_memory_device_add+0x1af/0x28e
 [&lt;ffffffff81325dc4&gt;] acpi_bus_device_attach+0x8c/0xf0
 [&lt;ffffffff813413b9&gt;] acpi_ns_walk_namespace+0xc8/0x17f
 [&lt;ffffffff81325d38&gt;] ? acpi_bus_type_and_status+0xb7/0xb7
 [&lt;ffffffff81325d38&gt;] ? acpi_bus_type_and_status+0xb7/0xb7
 [&lt;ffffffff813418ed&gt;] acpi_walk_namespace+0x95/0xc5
 [&lt;ffffffff81326b4c&gt;] acpi_bus_scan+0x9a/0xc2
 [&lt;ffffffff81326bff&gt;] acpi_scan_bus_device_check+0x8b/0x12e
 [&lt;ffffffff81326cb5&gt;] acpi_scan_device_check+0x13/0x15
 [&lt;ffffffff81320122&gt;] acpi_os_execute_deferred+0x25/0x32
 [&lt;ffffffff8107e02b&gt;] process_one_work+0x17b/0x460
 [&lt;ffffffff8107edfb&gt;] worker_thread+0x11b/0x400
 [&lt;ffffffff8107ece0&gt;] ? rescuer_thread+0x400/0x400
 [&lt;ffffffff81085aef&gt;] kthread+0xcf/0xe0
 [&lt;ffffffff81085a20&gt;] ? kthread_create_on_node+0x140/0x140
 [&lt;ffffffff815fc76c&gt;] ret_from_fork+0x7c/0xb0
 [&lt;ffffffff81085a20&gt;] ? kthread_create_on_node+0x140/0x140

This patch clears PGD entries of all processes MM when
sync_global_pgds() is called after hot-removing memory

Signed-off-by: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Acked-by: Toshi Kani &lt;toshi.kani@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Cc: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When hot-adding/removing memory, sync_global_pgds() is called
for synchronizing PGD to PGD entries of all processes MM.  But
when hot-removing memory, sync_global_pgds() does not work
correctly.

At first, sync_global_pgds() checks whether target PGD is none
or not.  And if PGD is none, the PGD is skipped.  But when
hot-removing memory, PGD may be none since PGD may be cleared by
free_pud_table().  So when sync_global_pgds() is called after
hot-removing memory, sync_global_pgds() should not skip PGD even
if the PGD is none.  And sync_global_pgds() must clear PGD
entries of all processes MM.

Currently sync_global_pgds() does not clear PGD entries of all
processes MM when hot-removing memory.  So when hot adding
memory which is same memory range as removed memory after
hot-removing memory, following call traces are shown:

 kernel BUG at arch/x86/mm/init_64.c:206!
 ...
 [&lt;ffffffff815e0c80&gt;] kernel_physical_mapping_init+0x1b2/0x1d2
 [&lt;ffffffff815ced94&gt;] init_memory_mapping+0x1d4/0x380
 [&lt;ffffffff8104aebd&gt;] arch_add_memory+0x3d/0xd0
 [&lt;ffffffff815d03d9&gt;] add_memory+0xb9/0x1b0
 [&lt;ffffffff81352415&gt;] acpi_memory_device_add+0x1af/0x28e
 [&lt;ffffffff81325dc4&gt;] acpi_bus_device_attach+0x8c/0xf0
 [&lt;ffffffff813413b9&gt;] acpi_ns_walk_namespace+0xc8/0x17f
 [&lt;ffffffff81325d38&gt;] ? acpi_bus_type_and_status+0xb7/0xb7
 [&lt;ffffffff81325d38&gt;] ? acpi_bus_type_and_status+0xb7/0xb7
 [&lt;ffffffff813418ed&gt;] acpi_walk_namespace+0x95/0xc5
 [&lt;ffffffff81326b4c&gt;] acpi_bus_scan+0x9a/0xc2
 [&lt;ffffffff81326bff&gt;] acpi_scan_bus_device_check+0x8b/0x12e
 [&lt;ffffffff81326cb5&gt;] acpi_scan_device_check+0x13/0x15
 [&lt;ffffffff81320122&gt;] acpi_os_execute_deferred+0x25/0x32
 [&lt;ffffffff8107e02b&gt;] process_one_work+0x17b/0x460
 [&lt;ffffffff8107edfb&gt;] worker_thread+0x11b/0x400
 [&lt;ffffffff8107ece0&gt;] ? rescuer_thread+0x400/0x400
 [&lt;ffffffff81085aef&gt;] kthread+0xcf/0xe0
 [&lt;ffffffff81085a20&gt;] ? kthread_create_on_node+0x140/0x140
 [&lt;ffffffff815fc76c&gt;] ret_from_fork+0x7c/0xb0
 [&lt;ffffffff81085a20&gt;] ? kthread_create_on_node+0x140/0x140

This patch clears PGD entries of all processes MM when
sync_global_pgds() is called after hot-removing memory

Signed-off-by: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Acked-by: Toshi Kani &lt;toshi.kani@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Cc: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/xen: don't copy bogus duplicate entries into kernel page tables</title>
<updated>2014-09-10T14:23:42+00:00</updated>
<author>
<name>Stefan Bader</name>
<email>stefan.bader@canonical.com</email>
</author>
<published>2014-09-02T10:16:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0b5a50635fc916cf46e3de0b819a61fc3f17e7ee'/>
<id>0b5a50635fc916cf46e3de0b819a61fc3f17e7ee</id>
<content type='text'>
When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded
modules exceeds 512 MiB, then loading modules fails with a warning
(and hence a vmalloc allocation failure) because the PTEs for the
newly-allocated vmalloc address space are not zero.

  WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
           vmap_page_range_noflush+0x2a1/0x360()

This is caused by xen_setup_kernel_pagetables() copying
level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
entries.

Without KASLR, the normal kernel image size only covers the first half
of level2_kernel_pgt and module space starts after that.

L4[511]-&gt;level3_kernel_pgt[510]-&gt;level2_kernel_pgt[  0..255]-&gt;kernel
                                                  [256..511]-&gt;module
                          [511]-&gt;level2_fixmap_pgt[  0..505]-&gt;module

This allows 512 MiB of of module vmalloc space to be used before
having to use the corrupted level2_fixmap_pgt entries.

With KASLR enabled, the kernel image uses the full PUD range of 1G and
module space starts in the level2_fixmap_pgt. So basically:

L4[511]-&gt;level3_kernel_pgt[510]-&gt;level2_kernel_pgt[0..511]-&gt;kernel
                          [511]-&gt;level2_fixmap_pgt[0..505]-&gt;module

And now no module vmalloc space can be used without using the corrupt
level2_fixmap_pgt entries.

Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
and setting level1_fixmap_pgt as read-only.

A number of comments were also using the the wrong L3 offset for
level2_kernel_pgt.  These have been corrected.

Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: David Vrabel &lt;david.vrabel@citrix.com&gt;
Reviewed-by: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded
modules exceeds 512 MiB, then loading modules fails with a warning
(and hence a vmalloc allocation failure) because the PTEs for the
newly-allocated vmalloc address space are not zero.

  WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
           vmap_page_range_noflush+0x2a1/0x360()

This is caused by xen_setup_kernel_pagetables() copying
level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
entries.

Without KASLR, the normal kernel image size only covers the first half
of level2_kernel_pgt and module space starts after that.

L4[511]-&gt;level3_kernel_pgt[510]-&gt;level2_kernel_pgt[  0..255]-&gt;kernel
                                                  [256..511]-&gt;module
                          [511]-&gt;level2_fixmap_pgt[  0..505]-&gt;module

This allows 512 MiB of of module vmalloc space to be used before
having to use the corrupted level2_fixmap_pgt entries.

With KASLR enabled, the kernel image uses the full PUD range of 1G and
module space starts in the level2_fixmap_pgt. So basically:

L4[511]-&gt;level3_kernel_pgt[510]-&gt;level2_kernel_pgt[0..511]-&gt;kernel
                          [511]-&gt;level2_fixmap_pgt[0..505]-&gt;module

And now no module vmalloc space can be used without using the corrupt
level2_fixmap_pgt entries.

Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
and setting level1_fixmap_pgt as read-only.

A number of comments were also using the the wrong L3 offset for
level2_kernel_pgt.  These have been corrected.

Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: David Vrabel &lt;david.vrabel@citrix.com&gt;
Reviewed-by: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: x86 pgtable: drop unneeded preprocessor ifdef</title>
<updated>2014-06-04T23:54:05+00:00</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@openvz.org</email>
</author>
<published>2014-06-04T23:08:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2373eaecff33db5972bde9418f92d6401b4a945c'/>
<id>2373eaecff33db5972bde9418f92d6401b4a945c</id>
<content type='text'>
_PAGE_BIT_FILE (bit 6) is always less than _PAGE_BIT_PROTNONE (bit 8), so
drop redundant #ifdef.

Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
_PAGE_BIT_FILE (bit 6) is always less than _PAGE_BIT_PROTNONE (bit 8), so
drop redundant #ifdef.

Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels</title>
<updated>2014-06-04T23:53:55+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-06-04T23:06:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c46a7c817e662a820373bb76b88d0ad67d6abe5d'/>
<id>c46a7c817e662a820373bb76b88d0ad67d6abe5d</id>
<content type='text'>
_PAGE_NUMA is currently an alias of _PROT_PROTNONE to trap NUMA hinting
faults on x86.  Care is taken such that _PAGE_NUMA is used only in
situations where the VMA flags distinguish between NUMA hinting faults
and prot_none faults.  This decision was x86-specific and conceptually
it is difficult requiring special casing to distinguish between PROTNONE
and NUMA ptes based on context.

Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
between an entry that is really unmapped and a page that is protected
for NUMA hinting faults as if the PTE is not present then a fault will
be trapped.

Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
This patch shrinks the maximum possible swap size and uses the bit to
uniquely distinguish between NUMA hinting ptes and swap ptes.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
_PAGE_NUMA is currently an alias of _PROT_PROTNONE to trap NUMA hinting
faults on x86.  Care is taken such that _PAGE_NUMA is used only in
situations where the VMA flags distinguish between NUMA hinting faults
and prot_none faults.  This decision was x86-specific and conceptually
it is difficult requiring special casing to distinguish between PROTNONE
and NUMA ptes based on context.

Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
between an entry that is really unmapped and a page that is protected
for NUMA hinting faults as if the PTE is not present then a fault will
be trapped.

Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
This patch shrinks the maximum possible swap size and uses the bit to
uniquely distinguish between NUMA hinting ptes and swap ptes.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2013-02-22T02:06:55+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-22T02:06:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2ef14f465b9e096531343f5b734cffc5f759f4a6'/>
<id>2ef14f465b9e096531343f5b734cffc5f759f4a6</id>
<content type='text'>
Pull x86 mm changes from Peter Anvin:
 "This is a huge set of several partly interrelated (and concurrently
  developed) changes, which is why the branch history is messier than
  one would like.

  The *really* big items are two humonguous patchsets mostly developed
  by Yinghai Lu at my request, which completely revamps the way we
  create initial page tables.  In particular, rather than estimating how
  much memory we will need for page tables and then build them into that
  memory -- a calculation that has shown to be incredibly fragile -- we
  now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
  a #PF handler which creates temporary page tables on demand.

  This has several advantages:

  1. It makes it much easier to support things that need access to data
     very early (a followon patchset uses this to load microcode way
     early in the kernel startup).

  2. It allows the kernel and all the kernel data objects to be invoked
     from above the 4 GB limit.  This allows kdump to work on very large
     systems.

  3. It greatly reduces the difference between Xen and native (Xen's
     equivalent of the #PF handler are the temporary page tables created
     by the domain builder), eliminating a bunch of fragile hooks.

  The patch series also gets us a bit closer to W^X.

  Additional work in this pull is the 64-bit get_user() work which you
  were also involved with, and a bunch of cleanups/speedups to
  __phys_addr()/__pa()."

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits)
  x86, mm: Move reserving low memory later in initialization
  x86, doc: Clarify the use of asm("%edx") in uaccess.h
  x86, mm: Redesign get_user with a __builtin_choose_expr hack
  x86: Be consistent with data size in getuser.S
  x86, mm: Use a bitfield to mask nuisance get_user() warnings
  x86/kvm: Fix compile warning in kvm_register_steal_time()
  x86-32: Add support for 64bit get_user()
  x86-32, mm: Remove reference to alloc_remap()
  x86-32, mm: Remove reference to resume_map_numa_kva()
  x86-32, mm: Rip out x86_32 NUMA remapping code
  x86/numa: Use __pa_nodebug() instead
  x86: Don't panic if can not alloc buffer for swiotlb
  mm: Add alloc_bootmem_low_pages_nopanic()
  x86, 64bit, mm: hibernate use generic mapping_init
  x86, 64bit, mm: Mark data/bss/brk to nx
  x86: Merge early kernel reserve for 32bit and 64bit
  x86: Add Crash kernel low reservation
  x86, kdump: Remove crashkernel range find limit for 64bit
  memblock: Add memblock_mem_size()
  x86, boot: Not need to check setup_header version for setup_data
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 mm changes from Peter Anvin:
 "This is a huge set of several partly interrelated (and concurrently
  developed) changes, which is why the branch history is messier than
  one would like.

  The *really* big items are two humonguous patchsets mostly developed
  by Yinghai Lu at my request, which completely revamps the way we
  create initial page tables.  In particular, rather than estimating how
  much memory we will need for page tables and then build them into that
  memory -- a calculation that has shown to be incredibly fragile -- we
  now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
  a #PF handler which creates temporary page tables on demand.

  This has several advantages:

  1. It makes it much easier to support things that need access to data
     very early (a followon patchset uses this to load microcode way
     early in the kernel startup).

  2. It allows the kernel and all the kernel data objects to be invoked
     from above the 4 GB limit.  This allows kdump to work on very large
     systems.

  3. It greatly reduces the difference between Xen and native (Xen's
     equivalent of the #PF handler are the temporary page tables created
     by the domain builder), eliminating a bunch of fragile hooks.

  The patch series also gets us a bit closer to W^X.

  Additional work in this pull is the 64-bit get_user() work which you
  were also involved with, and a bunch of cleanups/speedups to
  __phys_addr()/__pa()."

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits)
  x86, mm: Move reserving low memory later in initialization
  x86, doc: Clarify the use of asm("%edx") in uaccess.h
  x86, mm: Redesign get_user with a __builtin_choose_expr hack
  x86: Be consistent with data size in getuser.S
  x86, mm: Use a bitfield to mask nuisance get_user() warnings
  x86/kvm: Fix compile warning in kvm_register_steal_time()
  x86-32: Add support for 64bit get_user()
  x86-32, mm: Remove reference to alloc_remap()
  x86-32, mm: Remove reference to resume_map_numa_kva()
  x86-32, mm: Rip out x86_32 NUMA remapping code
  x86/numa: Use __pa_nodebug() instead
  x86: Don't panic if can not alloc buffer for swiotlb
  mm: Add alloc_bootmem_low_pages_nopanic()
  x86, 64bit, mm: hibernate use generic mapping_init
  x86, 64bit, mm: Mark data/bss/brk to nx
  x86: Merge early kernel reserve for 32bit and 64bit
  x86: Add Crash kernel low reservation
  x86, kdump: Remove crashkernel range find limit for 64bit
  memblock: Add memblock_mem_size()
  x86, boot: Not need to check setup_header version for setup_data
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mm: Convert update_mmu_cache() and update_mmu_cache_pmd() to functions</title>
<updated>2013-01-24T15:12:13+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2012-12-18T20:22:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=602e018607ba5c92922c0ffae40e346e1b95fa84'/>
<id>602e018607ba5c92922c0ffae40e346e1b95fa84</id>
<content type='text'>
Converting macros to functions unhide type problems before
changes will be integrated and trigger problems on other
architectures.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Converting macros to functions unhide type problems before
changes will be integrated and trigger problems on other
architectures.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h</title>
<updated>2012-11-17T00:40:34+00:00</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@intel.com</email>
</author>
<published>2012-11-16T21:53:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fb50b020c5331c8c4bee0eb875865f5f8be6c03a'/>
<id>fb50b020c5331c8c4bee0eb875865f5f8be6c03a</id>
<content type='text'>
This patch is meant to clean-up the fact that we have several functions in
page_64_types.h which really don't belong there.  I found this issue when I
had tried to replace __phys_addr with an inline function.  It resulted in the
realmode bits generating compile warnings about types.  In order to resolve
that I am relocating the address translation to page_64.h since this is in
keeping with where these functions are located in 32 bit.

In addtion I have relocated several functions defined in init_64.c to
pgtable_64.h as this seems to be where most of the functions related to
memory initialization were already located.

[ hpa: added missing #include &lt;asm/pgtable.h&gt; to apic_numachip.c,
  as reported by Yinghai Lu. ]

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
Link: http://lkml.kernel.org/r/20121116215244.8521.31505.stgit@ahduyck-cp1.jf.intel.com
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Daniel J Blueman &lt;daniel@numascale-asia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch is meant to clean-up the fact that we have several functions in
page_64_types.h which really don't belong there.  I found this issue when I
had tried to replace __phys_addr with an inline function.  It resulted in the
realmode bits generating compile warnings about types.  In order to resolve
that I am relocating the address translation to page_64.h since this is in
keeping with where these functions are located in 32 bit.

In addtion I have relocated several functions defined in init_64.c to
pgtable_64.h as this seems to be where most of the functions related to
memory initialization were already located.

[ hpa: added missing #include &lt;asm/pgtable.h&gt; to apic_numachip.c,
  as reported by Yinghai Lu. ]

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
Link: http://lkml.kernel.org/r/20121116215244.8521.31505.stgit@ahduyck-cp1.jf.intel.com
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Daniel J Blueman &lt;daniel@numascale-asia.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
