<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/mm/memory.c, branch v3.0-rc3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>memcg: add the pagefault count into memcg stats</title>
<updated>2011-05-27T00:12:36+00:00</updated>
<author>
<name>Ying Han</name>
<email>yinghan@google.com</email>
</author>
<published>2011-05-26T23:25:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=456f998ec817ebfa254464be4f089542fa390645'/>
<id>456f998ec817ebfa254464be4f089542fa390645</id>
<content type='text'>
Two new stats in per-memcg memory.stat which tracks the number of page
faults and number of major page faults.

  "pgfault"
  "pgmajfault"

They are different from "pgpgin"/"pgpgout" stat which count number of
pages charged/discharged to the cgroup and have no meaning of reading/
writing page to disk.

It is valuable to track the two stats for both measuring application's
performance as well as the efficiency of the kernel page reclaim path.
Counting pagefaults per process is useful, but we also need the aggregated
value since processes are monitored and controlled in cgroup basis in
memcg.

Functional test: check the total number of pgfault/pgmajfault of all
memcgs and compare with global vmstat value:

  $ cat /proc/vmstat | grep fault
  pgfault 1070751
  pgmajfault 553

  $ cat /dev/cgroup/memory.stat | grep fault
  pgfault 1071138
  pgmajfault 553
  total_pgfault 1071142
  total_pgmajfault 553

  $ cat /dev/cgroup/A/memory.stat | grep fault
  pgfault 199
  pgmajfault 0
  total_pgfault 199
  total_pgmajfault 0

Performance test: run page fault test(pft) wit 16 thread on faulting in
15G anon pages in 16G container.  There is no regression noticed on the
"flt/cpu/s"

Sample output from pft:

  TAG pft:anon-sys-default:
    Gb  Thr CLine   User     System     Wall    flt/cpu/s fault/wsec
    15   16   1     0.67s   233.41s    14.76s   16798.546 266356.260

  +-------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x  10     16682.962     17344.027     16913.524     16928.812      166.5362
  +  10     16695.568     16923.896     16820.604     16824.652     84.816568
  No difference proven at 95.0% confidence

[akpm@linux-foundation.org: fix build]
[hughd@google.com: shmem fix]
Signed-off-by: Ying Han &lt;yinghan@google.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Reviewed-by: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Two new stats in per-memcg memory.stat which tracks the number of page
faults and number of major page faults.

  "pgfault"
  "pgmajfault"

They are different from "pgpgin"/"pgpgout" stat which count number of
pages charged/discharged to the cgroup and have no meaning of reading/
writing page to disk.

It is valuable to track the two stats for both measuring application's
performance as well as the efficiency of the kernel page reclaim path.
Counting pagefaults per process is useful, but we also need the aggregated
value since processes are monitored and controlled in cgroup basis in
memcg.

Functional test: check the total number of pgfault/pgmajfault of all
memcgs and compare with global vmstat value:

  $ cat /proc/vmstat | grep fault
  pgfault 1070751
  pgmajfault 553

  $ cat /dev/cgroup/memory.stat | grep fault
  pgfault 1071138
  pgmajfault 553
  total_pgfault 1071142
  total_pgmajfault 553

  $ cat /dev/cgroup/A/memory.stat | grep fault
  pgfault 199
  pgmajfault 0
  total_pgfault 199
  total_pgmajfault 0

Performance test: run page fault test(pft) wit 16 thread on faulting in
15G anon pages in 16G container.  There is no regression noticed on the
"flt/cpu/s"

Sample output from pft:

  TAG pft:anon-sys-default:
    Gb  Thr CLine   User     System     Wall    flt/cpu/s fault/wsec
    15   16   1     0.67s   233.41s    14.76s   16798.546 266356.260

  +-------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x  10     16682.962     17344.027     16913.524     16928.812      166.5362
  +  10     16695.568     16923.896     16820.604     16824.652     84.816568
  No difference proven at 95.0% confidence

[akpm@linux-foundation.org: fix build]
[hughd@google.com: shmem fix]
Signed-off-by: Ying Han &lt;yinghan@google.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Reviewed-by: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: don't access vm_flags as 'int'</title>
<updated>2011-05-26T16:20:31+00:00</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2011-05-26T10:16:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca16d140af91febe25daeb9e032bf8bd46b8c31f'/>
<id>ca16d140af91febe25daeb9e032bf8bd46b8c31f</id>
<content type='text'>
The type of vma-&gt;vm_flags is 'unsigned long'. Neither 'int' nor
'unsigned int'. This patch fixes such misuse.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
[ Changed to use a typedef - we'll extend it to cover more cases
  later, since there has been discussion about making it a 64-bit
  type..                      - Linus ]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The type of vma-&gt;vm_flags is 'unsigned long'. Neither 'int' nor
'unsigned int'. This patch fixes such misuse.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
[ Changed to use a typedef - we'll extend it to cover more cases
  later, since there has been discussion about making it a 64-bit
  type..                      - Linus ]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: uninline large generic tlb.h functions</title>
<updated>2011-05-25T15:39:20+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9547d01bfb9c351dc19067f8a4cea9d3955f4125'/>
<id>9547d01bfb9c351dc19067f8a4cea9d3955f4125</id>
<content type='text'>
Some of these functions have grown beyond inline sanity, move them
out-of-line.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Requested-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some of these functions have grown beyond inline sanity, move them
out-of-line.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Requested-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Convert i_mmap_lock to a mutex</title>
<updated>2011-05-25T15:39:18+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3d48ae45e72390ddf8cc5256ac32ed6f7a19cbea'/>
<id>3d48ae45e72390ddf8cc5256ac32ed6f7a19cbea</id>
<content type='text'>
Straightforward conversion of i_mmap_lock to a mutex.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Straightforward conversion of i_mmap_lock to a mutex.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Remove i_mmap_lock lockbreak</title>
<updated>2011-05-25T15:39:17+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=97a894136f29802da19a15541de3c019e1ca147e'/>
<id>97a894136f29802da19a15541de3c019e1ca147e</id>
<content type='text'>
Hugh says:
 "The only significant loser, I think, would be page reclaim (when
  concurrent with truncation): could spin for a long time waiting for
  the i_mmap_mutex it expects would soon be dropped? "

Counter points:
 - cpu contention makes the spin stop (need_resched())
 - zap pages should be freeing pages at a higher rate than reclaim
   ever can

I think the simplification of the truncate code is definitely worth it.

Effectively reverts: 2aa15890f3c ("mm: prevent concurrent
unmap_mapping_range() on the same inode") and takes out the code that
caused its problem.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hugh says:
 "The only significant loser, I think, would be page reclaim (when
  concurrent with truncation): could spin for a long time waiting for
  the i_mmap_mutex it expects would soon be dropped? "

Counter points:
 - cpu contention makes the spin stop (need_resched())
 - zap pages should be freeing pages at a higher rate than reclaim
   ever can

I think the simplification of the truncate code is definitely worth it.

Effectively reverts: 2aa15890f3c ("mm: prevent concurrent
unmap_mapping_range() on the same inode") and takes out the code that
caused its problem.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: extended batches for generic mmu_gather</title>
<updated>2011-05-25T15:39:16+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e303297e6c3a7b847c4731eb14006ca6b435ecca'/>
<id>e303297e6c3a7b847c4731eb14006ca6b435ecca</id>
<content type='text'>
Instead of using a single batch (the small on-stack, or an allocated
page), try and extend the batch every time it runs out and only flush once
either the extend fails or we're done.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of using a single batch (the small on-stack, or an allocated
page), try and extend the batch every time it runs out and only flush once
either the extend fails or we're done.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, powerpc: move the RCU page-table freeing into generic code</title>
<updated>2011-05-25T15:39:16+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=267239116987d64850ad2037d8e0f3071dc3b5ce'/>
<id>267239116987d64850ad2037d8e0f3071dc3b5ce</id>
<content type='text'>
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: mmu_gather rework</title>
<updated>2011-05-25T15:39:12+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:11:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d16dfc550f5326a4000f3322582a7c05dec91d7a'/>
<id>d16dfc550f5326a4000f3322582a7c05dec91d7a</id>
<content type='text'>
Rework the existing mmu_gather infrastructure.

The direct purpose of these patches was to allow preemptible mmu_gather,
but even without that I think these patches provide an improvement to the
status quo.

The first 9 patches rework the mmu_gather infrastructure.  For review
purpose I've split them into generic and per-arch patches with the last of
those a generic cleanup.

The next patch provides generic RCU page-table freeing, and the followup
is a patch converting s390 to use this.  I've also got 4 patches from
DaveM lined up (not included in this series) that uses this to implement
gup_fast() for sparc64.

Then there is one patch that extends the generic mmu_gather batching.

After that follow the mm preemptibility patches, these make part of the mm
a lot more preemptible.  It converts i_mmap_lock and anon_vma-&gt;lock to
mutexes which together with the mmu_gather rework makes mmu_gather
preemptible as well.

Making i_mmap_lock a mutex also enables a clean-up of the truncate code.

This also allows for preemptible mmu_notifiers, something that XPMEM I
think wants.

Furthermore, it removes the new and universially detested unmap_mutex.

This patch:

Remove the first obstacle towards a fully preemptible mmu_gather.

The current scheme assumes mmu_gather is always done with preemption
disabled and uses per-cpu storage for the page batches.  Change this to
try and allocate a page for batching and in case of failure, use a small
on-stack array to make some progress.

Preemptible mmu_gather is desired in general and usable once i_mmap_lock
becomes a mutex.  Doing it before the mutex conversion saves us from
having to rework the code by moving the mmu_gather bits inside the
pte_lock.

Also avoid flushing the tlb batches from under the pte lock, this is
useful even without the i_mmap_lock conversion as it significantly reduces
pte lock hold times.

[akpm@linux-foundation.org: fix comment tpyo]
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rework the existing mmu_gather infrastructure.

The direct purpose of these patches was to allow preemptible mmu_gather,
but even without that I think these patches provide an improvement to the
status quo.

The first 9 patches rework the mmu_gather infrastructure.  For review
purpose I've split them into generic and per-arch patches with the last of
those a generic cleanup.

The next patch provides generic RCU page-table freeing, and the followup
is a patch converting s390 to use this.  I've also got 4 patches from
DaveM lined up (not included in this series) that uses this to implement
gup_fast() for sparc64.

Then there is one patch that extends the generic mmu_gather batching.

After that follow the mm preemptibility patches, these make part of the mm
a lot more preemptible.  It converts i_mmap_lock and anon_vma-&gt;lock to
mutexes which together with the mmu_gather rework makes mmu_gather
preemptible as well.

Making i_mmap_lock a mutex also enables a clean-up of the truncate code.

This also allows for preemptible mmu_notifiers, something that XPMEM I
think wants.

Furthermore, it removes the new and universially detested unmap_mutex.

This patch:

Remove the first obstacle towards a fully preemptible mmu_gather.

The current scheme assumes mmu_gather is always done with preemption
disabled and uses per-cpu storage for the page batches.  Change this to
try and allocate a page for batching and in case of failure, use a small
on-stack array to make some progress.

Preemptible mmu_gather is desired in general and usable once i_mmap_lock
becomes a mutex.  Doing it before the mutex conversion saves us from
having to rework the code by moving the mmu_gather bits inside the
pte_lock.

Also avoid flushing the tlb batches from under the pte lock, this is
useful even without the i_mmap_lock conversion as it significantly reduces
pte lock hold times.

[akpm@linux-foundation.org: fix comment tpyo]
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: make expand_downwards() symmetrical with expand_upwards()</title>
<updated>2011-05-25T15:39:12+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2011-05-25T00:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d05f3169c0fbca16132ec7c2be71685c6de638b5'/>
<id>d05f3169c0fbca16132ec7c2be71685c6de638b5</id>
<content type='text'>
Currently we have expand_upwards exported while expand_downwards is
accessible only via expand_stack or expand_stack_downwards.

check_stack_guard_page is a nice example of the asymmetry.  It uses
expand_stack for VM_GROWSDOWN while expand_upwards is called for
VM_GROWSUP case.

Let's clean this up by exporting both functions and make those names
consistent.  Let's use expand_{upwards,downwards} because expanding
doesn't always involve stack manipulation (an example is
ia64_do_page_fault which uses expand_upwards for registers backing store
expansion).  expand_downwards has to be defined for both
CONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards
version in the early process initialization phase for growsup
configuration.

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we have expand_upwards exported while expand_downwards is
accessible only via expand_stack or expand_stack_downwards.

check_stack_guard_page is a nice example of the asymmetry.  It uses
expand_stack for VM_GROWSDOWN while expand_upwards is called for
VM_GROWSUP case.

Let's clean this up by exporting both functions and make those names
consistent.  Let's use expand_{upwards,downwards} because expanding
doesn't always involve stack manipulation (an example is
ia64_do_page_fault which uses expand_upwards for registers backing store
expansion).  expand_downwards has to be defined for both
CONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards
version in the early process initialization phase for growsup
configuration.

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Don't lock guardpage if the stack is growing up</title>
<updated>2011-05-09T23:22:07+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mikulas@artax.karlin.mff.cuni.cz</email>
</author>
<published>2011-05-09T11:01:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a09a79f66874c905af35d5bb5e5f2fdc7b6b894d'/>
<id>a09a79f66874c905af35d5bb5e5f2fdc7b6b894d</id>
<content type='text'>
Linux kernel excludes guard page when performing mlock on a VMA with
down-growing stack. However, some architectures have up-growing stack
and locking the guard page should be excluded in this case too.

This patch fixes lvm2 on PA-RISC (and possibly other architectures with
up-growing stack). lvm2 calculates number of used pages when locking and
when unlocking and reports an internal error if the numbers mismatch.

[ Patch changed fairly extensively to also fix /proc/&lt;pid&gt;/maps for the
  grows-up case, and to move things around a bit to clean it all up and
  share the infrstructure with the /proc bits.

  Tested on ia64 that has both grow-up and grow-down segments  - Linus ]

Signed-off-by: Mikulas Patocka &lt;mikulas@artax.karlin.mff.cuni.cz&gt;
Tested-by: Tony Luck &lt;tony.luck@gmail.com&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Linux kernel excludes guard page when performing mlock on a VMA with
down-growing stack. However, some architectures have up-growing stack
and locking the guard page should be excluded in this case too.

This patch fixes lvm2 on PA-RISC (and possibly other architectures with
up-growing stack). lvm2 calculates number of used pages when locking and
when unlocking and reports an internal error if the numbers mismatch.

[ Patch changed fairly extensively to also fix /proc/&lt;pid&gt;/maps for the
  grows-up case, and to move things around a bit to clean it all up and
  share the infrstructure with the /proc bits.

  Tested on ia64 that has both grow-up and grow-down segments  - Linus ]

Signed-off-by: Mikulas Patocka &lt;mikulas@artax.karlin.mff.cuni.cz&gt;
Tested-by: Tony Luck &lt;tony.luck@gmail.com&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
