<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/Documentation/sysctl/vm.txt, branch v3.17-rc2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>mm, pcp: allow restoring percpu_pagelist_fraction default</title>
<updated>2014-06-23T23:47:43+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2014-06-23T20:22:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7cd2b0a34ab8e4db971920eef8982f985441adfb'/>
<id>7cd2b0a34ab8e4db971920eef8982f985441adfb</id>
<content type='text'>
Oleg reports a division by zero error on zero-length write() to the
percpu_pagelist_fraction sysctl:

    divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
    CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
    RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
    RSP: 0018:ffff8800d87a3e78  EFLAGS: 00010246
    RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
    RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
    R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
    R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
    FS:  00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
    Call Trace:
      proc_sys_call_handler+0xb3/0xc0
      proc_sys_write+0x14/0x20
      vfs_write+0xba/0x1e0
      SyS_write+0x46/0xb0
      tracesys+0xe1/0xe6

However, if the percpu_pagelist_fraction sysctl is set by the user, it
is also impossible to restore it to the kernel default since the user
cannot write 0 to the sysctl.

This patch allows the user to write 0 to restore the default behavior.
It still requires a fraction equal to or larger than 8, however, as
stated by the documentation for sanity.  If a value in the range [1, 7]
is written, the sysctl will return EINVAL.

This successfully solves the divide by zero issue at the same time.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reported-by: Oleg Drokin &lt;green@linuxhacker.ru&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Oleg reports a division by zero error on zero-length write() to the
percpu_pagelist_fraction sysctl:

    divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
    CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
    RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
    RSP: 0018:ffff8800d87a3e78  EFLAGS: 00010246
    RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
    RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
    R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
    R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
    FS:  00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
    Call Trace:
      proc_sys_call_handler+0xb3/0xc0
      proc_sys_write+0x14/0x20
      vfs_write+0xba/0x1e0
      SyS_write+0x46/0xb0
      tracesys+0xe1/0xe6

However, if the percpu_pagelist_fraction sysctl is set by the user, it
is also impossible to restore it to the kernel default since the user
cannot write 0 to the sysctl.

This patch allows the user to write 0 to restore the default behavior.
It still requires a fraction equal to or larger than 8, however, as
stated by the documentation for sanity.  If a value in the range [1, 7]
is written, the sysctl will return EINVAL.

This successfully solves the divide by zero issue at the same time.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reported-by: Oleg Drokin &lt;green@linuxhacker.ru&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Documentation/sysctl/vm.txt: clarify vfs_cache_pressure description</title>
<updated>2014-06-04T23:54:13+00:00</updated>
<author>
<name>Denys Vlasenko</name>
<email>dvlasenk@redhat.com</email>
</author>
<published>2014-06-04T23:11:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4a0da71b96b9d4080c0820e9e7d02470ebe62dc6'/>
<id>4a0da71b96b9d4080c0820e9e7d02470ebe62dc6</id>
<content type='text'>
Existing description is worded in a way which almost encourages setting of
vfs_cache_pressure above 100, possibly way above it.

Users are left in a dark what this numeric value is - an int?  a
percentage?  what the scale is?

As a result, we are getting reports about noticeable performance
degradation from users who have set vfs_cache_pressure to ridiculously
high values - because they thought there is no downside to it.

Via code inspection it's obvious that this value is treated as a
percentage.  This patch changes text to reflect this fact, and adds a
cautionary paragraph advising against setting vfs_cache_pressure sky high.

Signed-off-by: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Existing description is worded in a way which almost encourages setting of
vfs_cache_pressure above 100, possibly way above it.

Users are left in a dark what this numeric value is - an int?  a
percentage?  what the scale is?

As a result, we are getting reports about noticeable performance
degradation from users who have set vfs_cache_pressure to ridiculously
high values - because they thought there is no downside to it.

Via code inspection it's obvious that this value is treated as a
percentage.  This patch changes text to reflect this fact, and adds a
cautionary paragraph advising against setting vfs_cache_pressure sky high.

Signed-off-by: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: disable zone_reclaim_mode by default</title>
<updated>2014-06-04T23:53:59+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-06-04T23:07:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4f9b16a64753d0bb607454347036dc997fd03b82'/>
<id>4f9b16a64753d0bb607454347036dc997fd03b82</id>
<content type='text'>
When it was introduced, zone_reclaim_mode made sense as NUMA distances
punished and workloads were generally partitioned to fit into a NUMA
node.  NUMA machines are now common but few of the workloads are
NUMA-aware and it's routine to see major performance degradation due to
zone_reclaim_mode being enabled but relatively few can identify the
problem.

Those that require zone_reclaim_mode are likely to be able to detect
when it needs to be enabled and tune appropriately so lets have a
sensible default for the bulk of users.

This patch (of 2):

zone_reclaim_mode causes processes to prefer reclaiming memory from
local node instead of spilling over to other nodes.  This made sense
initially when NUMA machines were almost exclusively HPC and the
workload was partitioned into nodes.  The NUMA penalties were
sufficiently high to justify reclaiming the memory.  On current machines
and workloads it is often the case that zone_reclaim_mode destroys
performance but not all users know how to detect this.  Favour the
common case and disable it by default.  Users that are sophisticated
enough to know they need zone_reclaim_mode will detect it.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When it was introduced, zone_reclaim_mode made sense as NUMA distances
punished and workloads were generally partitioned to fit into a NUMA
node.  NUMA machines are now common but few of the workloads are
NUMA-aware and it's routine to see major performance degradation due to
zone_reclaim_mode being enabled but relatively few can identify the
problem.

Those that require zone_reclaim_mode are likely to be able to detect
when it needs to be enabled and tune appropriately so lets have a
sensible default for the bulk of users.

This patch (of 2):

zone_reclaim_mode causes processes to prefer reclaiming memory from
local node instead of spilling over to other nodes.  This made sense
initially when NUMA machines were almost exclusively HPC and the
workload was partitioned into nodes.  The NUMA penalties were
sufficiently high to justify reclaiming the memory.  On current machines
and workloads it is often the case that zone_reclaim_mode destroys
performance but not all users know how to detect this.  Favour the
common case and disable it by default.  Users that are sophisticated
enough to know they need zone_reclaim_mode will detect it.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drop_caches: add some documentation and info message</title>
<updated>2014-04-03T23:21:04+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave@linux.vnet.ibm.com</email>
</author>
<published>2014-04-03T21:48:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5509a5d27b971a90b940e148ca9ca53312e4fa7a'/>
<id>5509a5d27b971a90b940e148ca9ca53312e4fa7a</id>
<content type='text'>
There is plenty of anecdotal evidence and a load of blog posts
suggesting that using "drop_caches" periodically keeps your system
running in "tip top shape".  Perhaps adding some kernel documentation
will increase the amount of accurate data on its use.

If we are not shrinking caches effectively, then we have real bugs.
Using drop_caches will simply mask the bugs and make them harder to
find, but certainly does not fix them, nor is it an appropriate
"workaround" to limit the size of the caches.  On the contrary, there
have been bug reports on issues that turned out to be misguided use of
cache dropping.

Dropping caches is a very drastic and disruptive operation that is good
for debugging and running tests, but if it creates bug reports from
production use, kernel developers should be aware of its use.

Add a bit more documentation about it, a syslog message to track down
abusers, and vmstat drop counters to help analyze problem reports.

[akpm@linux-foundation.org: checkpatch fixes]
[hannes@cmpxchg.org: add runtime suppression control]
Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is plenty of anecdotal evidence and a load of blog posts
suggesting that using "drop_caches" periodically keeps your system
running in "tip top shape".  Perhaps adding some kernel documentation
will increase the amount of accurate data on its use.

If we are not shrinking caches effectively, then we have real bugs.
Using drop_caches will simply mask the bugs and make them harder to
find, but certainly does not fix them, nor is it an appropriate
"workaround" to limit the size of the caches.  On the contrary, there
have been bug reports on issues that turned out to be misguided use of
cache dropping.

Dropping caches is a very drastic and disruptive operation that is good
for debugging and running tests, but if it creates bug reports from
production use, kernel developers should be aware of its use.

Add a bit more documentation about it, a syslog message to track down
abusers, and vmstat drop counters to help analyze problem reports.

[akpm@linux-foundation.org: checkpatch fixes]
[hannes@cmpxchg.org: add runtime suppression control]
Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: document improved handling of swappiness==0</title>
<updated>2014-01-30T00:22:39+00:00</updated>
<author>
<name>Aaron Tomlin</name>
<email>atomlin@redhat.com</email>
</author>
<published>2014-01-29T22:05:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8582cb96b0bfd6891766d8c30d759bf21aad3b4d'/>
<id>8582cb96b0bfd6891766d8c30d759bf21aad3b4d</id>
<content type='text'>
Prior to commit fe35004fbf9e ("mm: avoid swapping out with
swappiness==0") setting swappiness to 0, reclaim code could still evict
recently used user anonymous memory to swap even though there is a
significant amount of RAM used for page cache.

The behaviour of setting swappiness to 0 has since changed.  When set,
the reclaim code does not initiate swap until the amount of free pages
and file-backed pages, is less than the high water mark in a zone.

Let's update the documentation to reflect this.

[akpm@linux-foundation.org: remove comma, per Randy]
Signed-off-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Bryn M. Reeves &lt;bmr@redhat.com&gt;
Cc: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Prior to commit fe35004fbf9e ("mm: avoid swapping out with
swappiness==0") setting swappiness to 0, reclaim code could still evict
recently used user anonymous memory to swap even though there is a
significant amount of RAM used for page cache.

The behaviour of setting swappiness to 0 has since changed.  When set,
the reclaim code does not initiate swap until the amount of free pages
and file-backed pages, is less than the high water mark in a zone.

Let's update the documentation to reflect this.

[akpm@linux-foundation.org: remove comma, per Randy]
Signed-off-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Bryn M. Reeves &lt;bmr@redhat.com&gt;
Cc: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: add overcommit_kbytes sysctl variable</title>
<updated>2014-01-22T00:19:44+00:00</updated>
<author>
<name>Jerome Marchand</name>
<email>jmarchan@redhat.com</email>
</author>
<published>2014-01-21T23:49:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=49f0ce5f92321cdcf741e35f385669a421013cb7'/>
<id>49f0ce5f92321cdcf741e35f385669a421013cb7</id>
<content type='text'>
Some applications that run on HPC clusters are designed around the
availability of RAM and the overcommit ratio is fine tuned to get the
maximum usage of memory without swapping.  With growing memory, the
1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
for these workload (on a 2TB machine it represents no less than 20GB).

This patch adds the new overcommit_kbytes sysctl variable that allow a
much finer grain.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some applications that run on HPC clusters are designed around the
availability of RAM and the overcommit ratio is fine tuned to get the
maximum usage of memory without swapping.  With growing memory, the
1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
for these workload (on a 2TB machine it represents no less than 20GB).

This patch adds the new overcommit_kbytes sysctl variable that allow a
much finer grain.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: improve the description for dirty_background_ratio/dirty_ratio sysctl</title>
<updated>2013-11-13T03:09:11+00:00</updated>
<author>
<name>Zheng Liu</name>
<email>wenqing.lz@taobao.com</email>
</author>
<published>2013-11-12T23:08:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=715ea41e60277f28f84d6c937737350e00955d56'/>
<id>715ea41e60277f28f84d6c937737350e00955d56</id>
<content type='text'>
Now dirty_background_ratio/dirty_ratio contains a percentage of total
avaiable memory, which contains free pages and reclaimable pages.  The
number of these pages is not equal to the number of total system memory.
But they are described as a percentage of total system memory in
Documentation/sysctl/vm.txt.  So we need to fix them to avoid
misunderstanding.

Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Cc: Rob Landley &lt;rob@landley.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now dirty_background_ratio/dirty_ratio contains a percentage of total
avaiable memory, which contains free pages and reclaimable pages.  The
number of these pages is not equal to the number of total system memory.
But they are described as a percentage of total system memory in
Documentation/sysctl/vm.txt.  So we need to fix them to avoid
misunderstanding.

Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Cc: Rob Landley &lt;rob@landley.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: prepare to remove /proc/sys/vm/hugepages_treat_as_movable</title>
<updated>2013-09-11T22:57:49+00:00</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2013-09-11T21:22:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=86cdb465cf3a9d81058b517af05074157fa9dcdd'/>
<id>86cdb465cf3a9d81058b517af05074157fa9dcdd</id>
<content type='text'>
Now hugepage migration is enabled, although restricted on pmd-based
hugepages for now (due to lack of testing.) So we should allocate
migratable hugepages from ZONE_MOVABLE if possible.

This patch makes GFP flags in hugepage allocation dependent on migration
support, not only the value of hugepages_treat_as_movable.  It provides no
change on the behavior for architectures which do not support hugepage
migration,

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hillf Danton &lt;dhillf@gmail.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now hugepage migration is enabled, although restricted on pmd-based
hugepages for now (due to lack of testing.) So we should allocate
migratable hugepages from ZONE_MOVABLE if possible.

This patch makes GFP flags in hugepage allocation dependent on migration
support, not only the value of hugepages_treat_as_movable.  It provides no
change on the behavior for architectures which do not support hugepage
migration,

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hillf Danton &lt;dhillf@gmail.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_alloc: fix doc for numa_zonelist_order</title>
<updated>2013-07-09T17:33:23+00:00</updated>
<author>
<name>Wanpeng Li</name>
<email>liwanp@linux.vnet.ibm.com</email>
</author>
<published>2013-07-08T23:00:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f8f191f1addf0b31f188fd88e71e97200871c99c'/>
<id>f8f191f1addf0b31f188fd88e71e97200871c99c</id>
<content type='text'>
The default zonelist order selecter will select "node" order if any nodes
DMA zone comprises greater than 70% of its local memory instead of 60%,
according to default_zonelist_order::low_kmem_size &gt; total * 70/100.

Signed-off-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The default zonelist order selecter will select "node" order if any nodes
DMA zone comprises greater than 70% of its local memory instead of 60%,
according to default_zonelist_order::low_kmem_size &gt; total * 70/100.

Signed-off-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: replace hardcoded 3% with admin_reserve_pages knob</title>
<updated>2013-04-29T22:54:36+00:00</updated>
<author>
<name>Andrew Shewmaker</name>
<email>agshew@gmail.com</email>
</author>
<published>2013-04-29T22:08:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4eeab4f5580d11bffedc697684b91b0bca0d5009'/>
<id>4eeab4f5580d11bffedc697684b91b0bca0d5009</id>
<content type='text'>
Add an admin_reserve_kbytes knob to allow admins to change the hardcoded
memory reserve to something other than 3%, which may be multiple
gigabytes on large memory systems.  Only about 8MB is necessary to
enable recovery in the default mode, and only a few hundred MB are
required even when overcommit is disabled.

This affects OVERCOMMIT_GUESS and OVERCOMMIT_NEVER.

admin_reserve_kbytes is initialized to min(3% free pages, 8MB)

I arrived at 8MB by summing the RSS of sshd or login, bash, and top.

Please see first patch in this series for full background, motivation,
testing, and full changelog.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: make init_admin_reserve() static]
Signed-off-by: Andrew Shewmaker &lt;agshew@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add an admin_reserve_kbytes knob to allow admins to change the hardcoded
memory reserve to something other than 3%, which may be multiple
gigabytes on large memory systems.  Only about 8MB is necessary to
enable recovery in the default mode, and only a few hundred MB are
required even when overcommit is disabled.

This affects OVERCOMMIT_GUESS and OVERCOMMIT_NEVER.

admin_reserve_kbytes is initialized to min(3% free pages, 8MB)

I arrived at 8MB by summing the RSS of sshd or login, bash, and top.

Please see first patch in this series for full background, motivation,
testing, and full changelog.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: make init_admin_reserve() static]
Signed-off-by: Andrew Shewmaker &lt;agshew@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
