<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/swap.h, branch v3.9-rc3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>vmscan: change type of vm_total_pages to unsigned long</title>
<updated>2013-02-24T01:50:22+00:00</updated>
<author>
<name>Zhang Yanfei</name>
<email>zhangyanfei@cn.fujitsu.com</email>
</author>
<published>2013-02-23T00:35:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b21e0b90ccb99a377bce0167fed1e881bb5065d7'/>
<id>b21e0b90ccb99a377bce0167fed1e881bb5065d7</id>
<content type='text'>
This variable is calculated from nr_free_pagecache_pages so
change its type to unsigned long.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This variable is calculated from nr_free_pagecache_pages so
change its type to unsigned long.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fix return type for functions nr_free_*_pages</title>
<updated>2013-02-24T01:50:21+00:00</updated>
<author>
<name>Zhang Yanfei</name>
<email>zhangyanfei@cn.fujitsu.com</email>
</author>
<published>2013-02-23T00:35:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ebec3862fd6eefe8301aa55ed2e30c685d831842'/>
<id>ebec3862fd6eefe8301aa55ed2e30c685d831842</id>
<content type='text'>
Currently, the amount of RAM that functions nr_free_*_pages return is
held in unsigned int.  But in machines with big memory (exceeding 16TB),
the amount may be incorrect because of overflow, so fix it.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Simon Horman &lt;horms@verge.net.au&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Eric Van Hensbergen &lt;ericvh@gmail.com&gt;
Cc: Ron Minnich &lt;rminnich@sandia.gov&gt;
Cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the amount of RAM that functions nr_free_*_pages return is
held in unsigned int.  But in machines with big memory (exceeding 16TB),
the amount may be incorrect because of overflow, so fix it.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Simon Horman &lt;horms@verge.net.au&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Eric Van Hensbergen &lt;ericvh@gmail.com&gt;
Cc: Ron Minnich &lt;rminnich@sandia.gov&gt;
Cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>swap: add per-partition lock for swapfile</title>
<updated>2013-02-24T01:50:17+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-02-23T00:34:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ec8acf20afb8534ed511f6613dd2226b9e301010'/>
<id>ec8acf20afb8534ed511f6613dd2226b9e301010</id>
<content type='text'>
swap_lock is heavily contended when I test swap to 3 fast SSD (even
slightly slower than swap to 2 such SSD).  The main contention comes
from swap_info_get().  This patch tries to fix the gap with adding a new
per-partition lock.

Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.

nr_swap_pages is an atomic now, it can be changed without swap_lock.  In
theory, it's possible get_swap_page() finds no swap pages but actually
there are free swap pages.  But sounds not a big problem.

Accessing partition specific data (like scan_swap_map and so on) is only
protected by swap_info_struct.lock.

Changing swap_info_struct.flags need hold swap_lock and
swap_info_struct.lock, because scan_scan_map() will check it.  read the
flags is ok with either the locks hold.

If both swap_lock and swap_info_struct.lock must be hold, we always hold
the former first to avoid deadlock.

swap_entry_free() can change swap_list.  To delete that code, we add a
new highest_priority_index.  Whenever get_swap_page() is called, we
check it.  If it's valid, we use it.

It's a pity get_swap_page() still holds swap_lock().  But in practice,
swap_lock() isn't heavily contended in my test with this patch (or I can
say there are other much more heavier bottlenecks like TLB flush).  And
BTW, looks get_swap_page() doesn't really need the lock.  We never free
swap_info[] and we check SWAP_WRITEOK flag.  The only risk without the
lock is we could swapout to some low priority swap, but we can quickly
recover after several rounds of swap, so sounds not a big deal to me.
But I'd prefer to fix this if it's a real problem.

"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s.  This patch further improves the
speed to 2.3G/s, so around 15% improvement.  It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.

[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Cc: Dan Magenheimer &lt;dan.magenheimer@oracle.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
swap_lock is heavily contended when I test swap to 3 fast SSD (even
slightly slower than swap to 2 such SSD).  The main contention comes
from swap_info_get().  This patch tries to fix the gap with adding a new
per-partition lock.

Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.

nr_swap_pages is an atomic now, it can be changed without swap_lock.  In
theory, it's possible get_swap_page() finds no swap pages but actually
there are free swap pages.  But sounds not a big problem.

Accessing partition specific data (like scan_swap_map and so on) is only
protected by swap_info_struct.lock.

Changing swap_info_struct.flags need hold swap_lock and
swap_info_struct.lock, because scan_scan_map() will check it.  read the
flags is ok with either the locks hold.

If both swap_lock and swap_info_struct.lock must be hold, we always hold
the former first to avoid deadlock.

swap_entry_free() can change swap_list.  To delete that code, we add a
new highest_priority_index.  Whenever get_swap_page() is called, we
check it.  If it's valid, we use it.

It's a pity get_swap_page() still holds swap_lock().  But in practice,
swap_lock() isn't heavily contended in my test with this patch (or I can
say there are other much more heavier bottlenecks like TLB flush).  And
BTW, looks get_swap_page() doesn't really need the lock.  We never free
swap_info[] and we check SWAP_WRITEOK flag.  The only risk without the
lock is we could swapout to some low priority swap, but we can quickly
recover after several rounds of swap, so sounds not a big deal to me.
But I'd prefer to fix this if it's a real problem.

"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s.  This patch further improves the
speed to 2.3G/s, so around 15% improvement.  It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.

[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Cc: Dan Magenheimer &lt;dan.magenheimer@oracle.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>swap: make each swap partition have one address_space</title>
<updated>2013-02-24T01:50:17+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-02-23T00:34:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=33806f06da654092182410d974b6d3c5396ea3eb'/>
<id>33806f06da654092182410d974b6d3c5396ea3eb</id>
<content type='text'>
When I use several fast SSD to do swap, swapper_space.tree_lock is
heavily contended.  This makes each swap partition have one
address_space to reduce the lock contention.  There is an array of
address_space for swap.  The swap entry type is the index to the array.

In my test with 3 SSD, this increases the swapout throughput 20%.

[akpm@linux-foundation.org: revert unneeded change to  __add_to_swap_cache]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I use several fast SSD to do swap, swapper_space.tree_lock is
heavily contended.  This makes each swap partition have one
address_space to reduce the lock contention.  There is an array of
address_space for swap.  The swap entry type is the index to the array.

In my test with 3 SSD, this increases the swapout throughput 20%.

[akpm@linux-foundation.org: revert unneeded change to  __add_to_swap_cache]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: vmscan: save work scanning (almost) empty LRU lists</title>
<updated>2013-02-24T01:50:09+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2013-02-23T00:32:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d778df51c09264076fe0208c099ef7d428f21790'/>
<id>d778df51c09264076fe0208c099ef7d428f21790</id>
<content type='text'>
In certain cases (kswapd reclaim, memcg target reclaim), a fixed minimum
amount of pages is scanned from the LRU lists on each iteration, to make
progress.

Do not make this minimum bigger than the respective LRU list size,
however, and save some busy work trying to isolate and reclaim pages
that are not there.

Empty LRU lists are quite common with memory cgroups in NUMA
environments because there exists a set of LRU lists for each zone for
each memory cgroup, while the memory of a single cgroup is expected to
stay on just one node.  The number of expected empty LRU lists is thus

  memcgs * (nodes - 1) * lru types

Each attempt to reclaim from an empty LRU list does expensive size
comparisons between lists, acquires the zone's lru lock etc.  Avoid
that.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In certain cases (kswapd reclaim, memcg target reclaim), a fixed minimum
amount of pages is scanned from the LRU lists on each iteration, to make
progress.

Do not make this minimum bigger than the respective LRU list size,
however, and save some busy work trying to isolate and reclaim pages
that are not there.

Empty LRU lists are quite common with memory cgroups in NUMA
environments because there exists a set of LRU lists for each zone for
each memory cgroup, while the memory of a single cgroup is expected to
stay on just one node.  The number of expected empty LRU lists is thus

  memcgs * (nodes - 1) * lru types

Each attempt to reclaim from an empty LRU list does expensive size
comparisons between lists, acquires the zone's lru lock etc.  Avoid
that.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: remove vma arg from page_evictable</title>
<updated>2012-10-09T07:22:55+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-10-08T23:33:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=39b5f29ac1f988c1615fbc9c69f6651ab0d0c3c7'/>
<id>39b5f29ac1f988c1615fbc9c69f6651ab0d0c3c7</id>
<content type='text'>
page_evictable(page, vma) is an irritant: almost all its callers pass
NULL for vma.  Remove the vma arg and use mlocked_vma_newpage(vma, page)
explicitly in the couple of places it's needed.  But in those places we
don't even need page_evictable() itself!  They're dealing with a freshly
allocated anonymous page, which has no "mapping" and cannot be mlocked yet.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
page_evictable(page, vma) is an irritant: almost all its callers pass
NULL for vma.  Remove the vma arg and use mlocked_vma_newpage(vma, page)
explicitly in the couple of places it's needed.  But in those places we
don't even need page_evictable() itself!  They're dealing with a freshly
allocated anonymous page, which has no "mapping" and cannot be mlocked yet.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: swap: implement generic handler for swap_activate</title>
<updated>2012-08-01T01:42:47+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-07-31T23:44:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a509bc1a9e487d952d9404318f7f990166ab57a7'/>
<id>a509bc1a9e487d952d9404318f7f990166ab57a7</id>
<content type='text'>
The version of swap_activate introduced is sufficient for swap-over-NFS
but would not provide enough information to implement a generic handler.
This patch shuffles things slightly to ensure the same information is
available for aops-&gt;swap_activate() as is available to the core.

No functionality change.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The version of swap_activate introduced is sufficient for swap-over-NFS
but would not provide enough information to implement a generic handler.
This patch shuffles things slightly to ensure the same information is
available for aops-&gt;swap_activate() as is available to the core.

No functionality change.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages</title>
<updated>2012-08-01T01:42:47+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-07-31T23:44:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=62c230bc1790923a1b35da03596a68a6c9b5b100'/>
<id>62c230bc1790923a1b35da03596a68a6c9b5b100</id>
<content type='text'>
Currently swapfiles are managed entirely by the core VM by using -&gt;bmap to
allocate space and write to the blocks directly.  This effectively ensures
that the underlying blocks are allocated and avoids the need for the swap
subsystem to locate what physical blocks store offsets within a file.

If the swap subsystem is to use the filesystem information to locate the
blocks, it is critical that information such as block groups, block
bitmaps and the block descriptor table that map the swap file were
resident in memory.  This patch adds address_space_operations that the VM
can call when activating or deactivating swap backed by a file.

  int swap_activate(struct file *);
  int swap_deactivate(struct file *);

The -&gt;swap_activate() method is used to communicate to the file that the
VM relies on it, and the address_space should take adequate measures such
as reserving space in the underlying device, reserving memory for mempools
and pinning information such as the block descriptor table in memory.  The
-&gt;swap_deactivate() method is called on sys_swapoff() if -&gt;swap_activate()
returned success.

After a successful swapfile -&gt;swap_activate, the swapfile is marked
SWP_FILE and swapper_space.a_ops will proxy to
sis-&gt;swap_file-&gt;f_mappings-&gt;a_ops using -&gt;direct_io to write swapcache
pages and -&gt;readpage to read.

It is perfectly possible that direct_IO be used to read the swap pages but
it is an unnecessary complication.  Similarly, it is possible that
-&gt;writepage be used instead of direct_io to write the pages but filesystem
developers have stated that calling writepage from the VM is undesirable
for a variety of reasons and using direct_IO opens up the possibility of
writing back batches of swap pages in the future.

[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently swapfiles are managed entirely by the core VM by using -&gt;bmap to
allocate space and write to the blocks directly.  This effectively ensures
that the underlying blocks are allocated and avoids the need for the swap
subsystem to locate what physical blocks store offsets within a file.

If the swap subsystem is to use the filesystem information to locate the
blocks, it is critical that information such as block groups, block
bitmaps and the block descriptor table that map the swap file were
resident in memory.  This patch adds address_space_operations that the VM
can call when activating or deactivating swap backed by a file.

  int swap_activate(struct file *);
  int swap_deactivate(struct file *);

The -&gt;swap_activate() method is used to communicate to the file that the
VM relies on it, and the address_space should take adequate measures such
as reserving space in the underlying device, reserving memory for mempools
and pinning information such as the block descriptor table in memory.  The
-&gt;swap_deactivate() method is called on sys_swapoff() if -&gt;swap_activate()
returned success.

After a successful swapfile -&gt;swap_activate, the swapfile is marked
SWP_FILE and swapper_space.a_ops will proxy to
sis-&gt;swap_file-&gt;f_mappings-&gt;a_ops using -&gt;direct_io to write swapcache
pages and -&gt;readpage to read.

It is perfectly possible that direct_IO be used to read the swap pages but
it is an unnecessary complication.  Similarly, it is possible that
-&gt;writepage be used instead of direct_io to write the pages but filesystem
developers have stated that calling writepage from the VM is undesirable
for a variety of reasons and using direct_IO opens up the possibility of
writing back batches of swap pages in the future.

[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: methods for teaching filesystems about PG_swapcache pages</title>
<updated>2012-08-01T01:42:47+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-07-31T23:44:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f981c5950fa85916ba49bea5d9a7a5078f47e569'/>
<id>f981c5950fa85916ba49bea5d9a7a5078f47e569</id>
<content type='text'>
In order to teach filesystems to handle swap cache pages, three new page
functions are introduced:

  pgoff_t page_file_index(struct page *);
  loff_t page_file_offset(struct page *);
  struct address_space *page_file_mapping(struct page *);

page_file_index() - gives the offset of this page in the file in
PAGE_CACHE_SIZE blocks.  Like page-&gt;index is for mapped pages, this
function also gives the correct index for PG_swapcache pages.

page_file_offset() - uses page_file_index(), so that it will give the
expected result, even for PG_swapcache pages.

page_file_mapping() - gives the mapping backing the actual page; that is
for swap cache pages it will give swap_file-&gt;f_mapping.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to teach filesystems to handle swap cache pages, three new page
functions are introduced:

  pgoff_t page_file_index(struct page *);
  loff_t page_file_offset(struct page *);
  struct address_space *page_file_mapping(struct page *);

page_file_index() - gives the offset of this page in the file in
PAGE_CACHE_SIZE blocks.  Like page-&gt;index is for mapped pages, this
function also gives the correct index for PG_swapcache pages.

page_file_offset() - uses page_file_index(), so that it will give the
expected result, even for PG_swapcache pages.

page_file_mapping() - gives the mapping backing the actual page; that is
for swap cache pages it will give swap_file-&gt;f_mapping.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memcg: rename config variables</title>
<updated>2012-08-01T01:42:43+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-07-31T23:43:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c255a458055e459f65eb7b7f51dc5dbdd0caf1d8'/>
<id>c255a458055e459f65eb7b7f51dc5dbdd0caf1d8</id>
<content type='text'>
Sanity:

CONFIG_CGROUP_MEM_RES_CTLR -&gt; CONFIG_MEMCG
CONFIG_CGROUP_MEM_RES_CTLR_SWAP -&gt; CONFIG_MEMCG_SWAP
CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -&gt; CONFIG_MEMCG_SWAP_ENABLED
CONFIG_CGROUP_MEM_RES_CTLR_KMEM -&gt; CONFIG_MEMCG_KMEM

[mhocko@suse.cz: fix missed bits]
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sanity:

CONFIG_CGROUP_MEM_RES_CTLR -&gt; CONFIG_MEMCG
CONFIG_CGROUP_MEM_RES_CTLR_SWAP -&gt; CONFIG_MEMCG_SWAP
CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -&gt; CONFIG_MEMCG_SWAP_ENABLED
CONFIG_CGROUP_MEM_RES_CTLR_KMEM -&gt; CONFIG_MEMCG_KMEM

[mhocko@suse.cz: fix missed bits]
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
