<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/mm/filemap.c, branch v3.18.23</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>mm: get rid of radix tree gfp mask for pagecache_get_page</title>
<updated>2015-01-30T01:40:46+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2014-12-29T19:30:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=513c66820333a2cc7b54eaa6a7b5c34f6ffaf770'/>
<id>513c66820333a2cc7b54eaa6a7b5c34f6ffaf770</id>
<content type='text'>
commit 45f87de57f8fad59302fd263dd81ffa4843b5b24 upstream.

Commit 2457aec63745 ("mm: non-atomically mark page accessed during page
cache allocation where possible") has added a separate parameter for
specifying gfp mask for radix tree allocations.

Not only this is less than optimal from the API point of view because it
is error prone, it is also buggy currently because
grab_cache_page_write_begin is using GFP_KERNEL for radix tree and if
fgp_flags doesn't contain FGP_NOFS (mostly controlled by fs by
AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS cleared then
the radix tree allocation wouldn't obey the restriction and might
recurse into filesystem and cause deadlocks.  This is the case for most
filesystems unfortunately because only ext4 and gfs2 are using
AOP_FLAG_NOFS.

Let's simply remove radix_gfp_mask parameter because the allocation
context is same for both page cache and for the radix tree.  Just make
sure that the radix tree gets only the sane subset of the mask (e.g.  do
not pass __GFP_WRITE).

Long term it is more preferable to convert remaining users of
AOP_FLAG_NOFS to use mapping_gfp_mask instead and simplify this
interface even further.

Reported-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 45f87de57f8fad59302fd263dd81ffa4843b5b24 upstream.

Commit 2457aec63745 ("mm: non-atomically mark page accessed during page
cache allocation where possible") has added a separate parameter for
specifying gfp mask for radix tree allocations.

Not only this is less than optimal from the API point of view because it
is error prone, it is also buggy currently because
grab_cache_page_write_begin is using GFP_KERNEL for radix tree and if
fgp_flags doesn't contain FGP_NOFS (mostly controlled by fs by
AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS cleared then
the radix tree allocation wouldn't obey the restriction and might
recurse into filesystem and cause deadlocks.  This is the case for most
filesystems unfortunately because only ext4 and gfs2 are using
AOP_FLAG_NOFS.

Let's simply remove radix_gfp_mask parameter because the allocation
context is same for both page cache and for the radix tree.  Just make
sure that the radix tree gets only the sane subset of the mask (e.g.  do
not pass __GFP_WRITE).

Long term it is more preferable to convert remaining users of
AOP_FLAG_NOFS to use mapping_gfp_mask instead and simplify this
interface even further.

Reported-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm/filemap.c: remove trailing whitespace</title>
<updated>2014-10-10T02:26:00+00:00</updated>
<author>
<name>Paul McQuade</name>
<email>paulmcquad@gmail.com</email>
</author>
<published>2014-10-09T22:29:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=99dadfdde04b72ce98aa2fbebdb49526f494e4cf'/>
<id>99dadfdde04b72ce98aa2fbebdb49526f494e4cf</id>
<content type='text'>
Signed-off-by: Paul McQuade &lt;paulmcquad@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Paul McQuade &lt;paulmcquad@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs</title>
<updated>2014-10-08T16:49:23+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-08T16:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25641c0c8d72f3d235c022fd2c19181912c2ae8b'/>
<id>25641c0c8d72f3d235c022fd2c19181912c2ae8b</id>
<content type='text'>
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - fix an NFSv4.1 state renewal regression
   - fix open/lock state recovery error handling
   - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
   - fix statd when reconnection fails
   - don't wake tasks during connection abort
   - don't start reboot recovery if lease check fails
   - fix duplicate proc entries

  Features:
  - pNFS block driver fixes and clean ups from Christoph
  - More code cleanups from Anna
  - Improve mmap() writeback performance
  - Replace use of PF_TRANS with a more generic mechanism for avoiding
    deadlocks in nfs_release_page"

* tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits)
  NFSv4.1: Fix an NFSv4.1 state renewal regression
  NFSv4: fix open/lock state recovery error handling
  NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
  NFS: Fabricate fscache server index key correctly
  SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
  NFSv3: Fix missing includes of nfs3_fs.h
  NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page()
  NFS: avoid waiting at all in nfs_release_page when congested.
  NFS: avoid deadlocks with loop-back mounted NFS filesystems.
  MM: export page_wakeup functions
  SCHED: add some "wait..on_bit...timeout()" interfaces.
  NFS: don't use STABLE writes during writeback.
  NFSv4: use exponential retry on NFS4ERR_DELAY for async requests.
  rpc: Add -EPERM processing for xs_udp_send_request()
  rpc: return sent and err from xs_sendpages()
  lockd: Try to reconnect if statd has moved
  SUNRPC: Don't wake tasks during connection abort
  Fixing lease renewal
  nfs: fix duplicate proc entries
  pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - fix an NFSv4.1 state renewal regression
   - fix open/lock state recovery error handling
   - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
   - fix statd when reconnection fails
   - don't wake tasks during connection abort
   - don't start reboot recovery if lease check fails
   - fix duplicate proc entries

  Features:
  - pNFS block driver fixes and clean ups from Christoph
  - More code cleanups from Anna
  - Improve mmap() writeback performance
  - Replace use of PF_TRANS with a more generic mechanism for avoiding
    deadlocks in nfs_release_page"

* tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits)
  NFSv4.1: Fix an NFSv4.1 state renewal regression
  NFSv4: fix open/lock state recovery error handling
  NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
  NFS: Fabricate fscache server index key correctly
  SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
  NFSv3: Fix missing includes of nfs3_fs.h
  NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page()
  NFS: avoid waiting at all in nfs_release_page when congested.
  NFS: avoid deadlocks with loop-back mounted NFS filesystems.
  MM: export page_wakeup functions
  SCHED: add some "wait..on_bit...timeout()" interfaces.
  NFS: don't use STABLE writes during writeback.
  NFSv4: use exponential retry on NFS4ERR_DELAY for async requests.
  rpc: Add -EPERM processing for xs_udp_send_request()
  rpc: return sent and err from xs_sendpages()
  lockd: Try to reconnect if statd has moved
  SUNRPC: Don't wake tasks during connection abort
  Fixing lease renewal
  nfs: fix duplicate proc entries
  pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2014-10-08T01:16:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-08T01:16:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=28596c9722289b2f98fa83a2e4351eb0a031b953'/>
<id>28596c9722289b2f98fa83a2e4351eb0a031b953</id>
<content type='text'>
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&amp;" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&amp;" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>MM: export page_wakeup functions</title>
<updated>2014-09-25T12:25:09+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-09-24T01:28:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a4796e37c12e177572b80864cbab9c907ea250b0'/>
<id>a4796e37c12e177572b80864cbab9c907ea250b0</id>
<content type='text'>
This will allow NFS to wait for PG_private to be cleared and,
particularly, to send a wake-up when it is.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This will allow NFS to wait for PG_private to be cleared and,
particularly, to send a wake-up when it is.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>SCHED: add some "wait..on_bit...timeout()" interfaces.</title>
<updated>2014-09-25T12:23:57+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-09-25T03:55:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cbbce82209490df8b68da9aec0d642451fe0a668'/>
<id>cbbce82209490df8b68da9aec0d642451fe0a668</id>
<content type='text'>
In commit c1221321b7c25b53204447cff9949a6d5a7ddddc
   sched: Allow wait_on_bit_action() functions to support a timeout

I suggested that a "wait_on_bit_timeout()" interface would not meet my
need.  This isn't true - I was just over-engineering.

Including a 'private' field in wait_bit_key instead of a focused
"timeout" field was just premature generalization.  If some other
use is ever found, it can be generalized or added later.

So this patch renames "private" to "timeout" with a meaning "stop
waiting when "jiffies" reaches or passes "timeout",
and adds two of the many possible wait..bit..timeout() interfaces:

wait_on_page_bit_killable_timeout(), which is the one I want to use,
and out_of_line_wait_on_bit_timeout() which is a reasonably general
example.  Others can be added as needed.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit c1221321b7c25b53204447cff9949a6d5a7ddddc
   sched: Allow wait_on_bit_action() functions to support a timeout

I suggested that a "wait_on_bit_timeout()" interface would not meet my
need.  This isn't true - I was just over-engineering.

Including a 'private' field in wait_bit_key instead of a focused
"timeout" field was just premature generalization.  If some other
use is ever found, it can be generalized or added later.

So this patch renames "private" to "timeout" with a meaning "stop
waiting when "jiffies" reaches or passes "timeout",
and adds two of the many possible wait..bit..timeout() interfaces:

wait_on_page_bit_killable_timeout(), which is the one I want to use,
and out_of_line_wait_on_bit_timeout() which is a reasonably general
example.  Others can be added as needed.

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Documentation: Docbook: Fix generated DocBook/kernel-api.xml</title>
<updated>2014-09-09T08:34:56+00:00</updated>
<author>
<name>Masanari Iida</name>
<email>standby24x7@gmail.com</email>
</author>
<published>2014-09-08T16:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=da3dae54e4ff09886b9a19224c8d9556bb2ba096'/>
<id>da3dae54e4ff09886b9a19224c8d9556bb2ba096</id>
<content type='text'>
This patch fix spelling typo found in DocBook/kernel-api.xml.
It is because the file is generated from the source comments,
I have to fix the comments in source codes.

Signed-off-by: Masanari Iida &lt;standby24x7@gmail.com&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fix spelling typo found in DocBook/kernel-api.xml.
It is because the file is generated from the source comments,
I have to fix the comments in source codes.

Signed-off-by: Masanari Iida &lt;standby24x7@gmail.com&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2014-08-11T18:44:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-11T18:44:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f6f993328b2abcab86a3c99d7bd9f2066ab03d36'/>
<id>f6f993328b2abcab86a3c99d7bd9f2066ab03d36</id>
<content type='text'>
Pull vfs updates from Al Viro:
 "Stuff in here:

   - acct.c fixes and general rework of mnt_pin mechanism.  That allows
     to go for delayed-mntput stuff, which will permit mntput() on deep
     stack without worrying about stack overflows - fs shutdown will
     happen on shallow stack.  IOW, we can do Eric's umount-on-rmdir
     series without introducing tons of stack overflows on new mntput()
     call chains it introduces.
   - Bruce's d_splice_alias() patches
   - more Miklos' rename() stuff.
   - a couple of regression fixes (stable fodder, in the end of branch)
     and a fix for API idiocy in iov_iter.c.

  There definitely will be another pile, maybe even two.  I'd like to
  get Eric's series in this time, but even if we miss it, it'll go right
  in the beginning of for-next in the next cycle - the tricky part of
  prereqs is in this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  fix copy_tree() regression
  __generic_file_write_iter(): fix handling of sync error after DIO
  switch iov_iter_get_pages() to passing maximal number of pages
  fs: mark __d_obtain_alias static
  dcache: d_splice_alias should detect loops
  exportfs: update Exporting documentation
  dcache: d_find_alias needn't recheck IS_ROOT &amp;&amp; DCACHE_DISCONNECTED
  dcache: remove unused d_find_alias parameter
  dcache: d_obtain_alias callers don't all want DISCONNECTED
  dcache: d_splice_alias should ignore DCACHE_DISCONNECTED
  dcache: d_splice_alias mustn't create directory aliases
  dcache: close d_move race in d_splice_alias
  dcache: move d_splice_alias
  namei: trivial fix to vfs_rename_dir comment
  VFS: allow -&gt;d_manage() to declare -EISDIR in rcu_walk mode.
  cifs: support RENAME_NOREPLACE
  hostfs: support rename flags
  shmem: support RENAME_EXCHANGE
  shmem: support RENAME_NOREPLACE
  btrfs: add RENAME_NOREPLACE
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs updates from Al Viro:
 "Stuff in here:

   - acct.c fixes and general rework of mnt_pin mechanism.  That allows
     to go for delayed-mntput stuff, which will permit mntput() on deep
     stack without worrying about stack overflows - fs shutdown will
     happen on shallow stack.  IOW, we can do Eric's umount-on-rmdir
     series without introducing tons of stack overflows on new mntput()
     call chains it introduces.
   - Bruce's d_splice_alias() patches
   - more Miklos' rename() stuff.
   - a couple of regression fixes (stable fodder, in the end of branch)
     and a fix for API idiocy in iov_iter.c.

  There definitely will be another pile, maybe even two.  I'd like to
  get Eric's series in this time, but even if we miss it, it'll go right
  in the beginning of for-next in the next cycle - the tricky part of
  prereqs is in this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  fix copy_tree() regression
  __generic_file_write_iter(): fix handling of sync error after DIO
  switch iov_iter_get_pages() to passing maximal number of pages
  fs: mark __d_obtain_alias static
  dcache: d_splice_alias should detect loops
  exportfs: update Exporting documentation
  dcache: d_find_alias needn't recheck IS_ROOT &amp;&amp; DCACHE_DISCONNECTED
  dcache: remove unused d_find_alias parameter
  dcache: d_obtain_alias callers don't all want DISCONNECTED
  dcache: d_splice_alias should ignore DCACHE_DISCONNECTED
  dcache: d_splice_alias mustn't create directory aliases
  dcache: close d_move race in d_splice_alias
  dcache: move d_splice_alias
  namei: trivial fix to vfs_rename_dir comment
  VFS: allow -&gt;d_manage() to declare -EISDIR in rcu_walk mode.
  cifs: support RENAME_NOREPLACE
  hostfs: support rename flags
  shmem: support RENAME_EXCHANGE
  shmem: support RENAME_NOREPLACE
  btrfs: add RENAME_NOREPLACE
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>__generic_file_write_iter(): fix handling of sync error after DIO</title>
<updated>2014-08-11T16:27:40+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-08-08T16:39:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=60bb45297f7551833346c5cebc6d483ea17ea5f2'/>
<id>60bb45297f7551833346c5cebc6d483ea17ea5f2</id>
<content type='text'>
If DIO results in short write and sync write fails, we want to bugger off
whether the DIO part has written anything or not; the logics on the return
will take care of the right return value.

Cc: stable@vger.kernel.org [3.16]
Reported-by: Anton Altaparmakov &lt;aia21@cam.ac.uk&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If DIO results in short write and sync write fails, we want to bugger off
whether the DIO part has written anything or not; the logics on the return
will take care of the right return value.

Cc: stable@vger.kernel.org [3.16]
Reported-by: Anton Altaparmakov &lt;aia21@cam.ac.uk&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: rewrite uncharge API</title>
<updated>2014-08-08T22:57:17+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2014-08-08T21:19:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a31bc97c80c3fa87b32c091d9a930ac19cd0c40'/>
<id>0a31bc97c80c3fa87b32c091d9a930ac19cd0c40</id>
<content type='text'>
The memcg uncharging code that is involved towards the end of a page's
lifetime - truncation, reclaim, swapout, migration - is impressively
complicated and fragile.

Because anonymous and file pages were always charged before they had their
page-&gt;mapping established, uncharges had to happen when the page type
could still be known from the context; as in unmap for anonymous, page
cache removal for file and shmem pages, and swap cache truncation for swap
pages.  However, these operations happen well before the page is actually
freed, and so a lot of synchronization is necessary:

- Charging, uncharging, page migration, and charge migration all need
  to take a per-page bit spinlock as they could race with uncharging.

- Swap cache truncation happens during both swap-in and swap-out, and
  possibly repeatedly before the page is actually freed.  This means
  that the memcg swapout code is called from many contexts that make
  no sense and it has to figure out the direction from page state to
  make sure memory and memory+swap are always correctly charged.

- On page migration, the old page might be unmapped but then reused,
  so memcg code has to prevent untimely uncharging in that case.
  Because this code - which should be a simple charge transfer - is so
  special-cased, it is not reusable for replace_page_cache().

But now that charged pages always have a page-&gt;mapping, introduce
mem_cgroup_uncharge(), which is called after the final put_page(), when we
know for sure that nobody is looking at the page anymore.

For page migration, introduce mem_cgroup_migrate(), which is called after
the migration is successful and the new page is fully rmapped.  Because
the old page is no longer uncharged after migration, prevent double
charges by decoupling the page's memcg association (PCG_USED and
pc-&gt;mem_cgroup) from the page holding an actual charge.  The new bits
PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
to the new page during migration.

mem_cgroup_migrate() is suitable for replace_page_cache() as well,
which gets rid of mem_cgroup_replace_page_cache().  However, care
needs to be taken because both the source and the target page can
already be charged and on the LRU when fuse is splicing: grab the page
lock on the charge moving side to prevent changing pc-&gt;mem_cgroup of a
page under migration.  Also, the lruvecs of both pages change as we
uncharge the old and charge the new during migration, and putback may
race with us, so grab the lru lock and isolate the pages iff on LRU to
prevent races and ensure the pages are on the right lruvec afterward.

Swap accounting is massively simplified: because the page is no longer
uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
before the final put_page() in page reclaim.

Finally, page_cgroup changes are now protected by whatever protection the
page itself offers: anonymous pages are charged under the page table lock,
whereas page cache insertions, swapin, and migration hold the page lock.
Uncharging happens under full exclusion with no outstanding references.
Charging and uncharging also ensure that the page is off-LRU, which
serializes against charge migration.  Remove the very costly page_cgroup
lock and set pc-&gt;flags non-atomically.

[mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
[vdavydov@parallels.com: fix flags definition]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Tested-by: Jet Chen &lt;jet.chen@intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Tested-by: Felipe Balbi &lt;balbi@ti.com&gt;
Signed-off-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The memcg uncharging code that is involved towards the end of a page's
lifetime - truncation, reclaim, swapout, migration - is impressively
complicated and fragile.

Because anonymous and file pages were always charged before they had their
page-&gt;mapping established, uncharges had to happen when the page type
could still be known from the context; as in unmap for anonymous, page
cache removal for file and shmem pages, and swap cache truncation for swap
pages.  However, these operations happen well before the page is actually
freed, and so a lot of synchronization is necessary:

- Charging, uncharging, page migration, and charge migration all need
  to take a per-page bit spinlock as they could race with uncharging.

- Swap cache truncation happens during both swap-in and swap-out, and
  possibly repeatedly before the page is actually freed.  This means
  that the memcg swapout code is called from many contexts that make
  no sense and it has to figure out the direction from page state to
  make sure memory and memory+swap are always correctly charged.

- On page migration, the old page might be unmapped but then reused,
  so memcg code has to prevent untimely uncharging in that case.
  Because this code - which should be a simple charge transfer - is so
  special-cased, it is not reusable for replace_page_cache().

But now that charged pages always have a page-&gt;mapping, introduce
mem_cgroup_uncharge(), which is called after the final put_page(), when we
know for sure that nobody is looking at the page anymore.

For page migration, introduce mem_cgroup_migrate(), which is called after
the migration is successful and the new page is fully rmapped.  Because
the old page is no longer uncharged after migration, prevent double
charges by decoupling the page's memcg association (PCG_USED and
pc-&gt;mem_cgroup) from the page holding an actual charge.  The new bits
PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
to the new page during migration.

mem_cgroup_migrate() is suitable for replace_page_cache() as well,
which gets rid of mem_cgroup_replace_page_cache().  However, care
needs to be taken because both the source and the target page can
already be charged and on the LRU when fuse is splicing: grab the page
lock on the charge moving side to prevent changing pc-&gt;mem_cgroup of a
page under migration.  Also, the lruvecs of both pages change as we
uncharge the old and charge the new during migration, and putback may
race with us, so grab the lru lock and isolate the pages iff on LRU to
prevent races and ensure the pages are on the right lruvec afterward.

Swap accounting is massively simplified: because the page is no longer
uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
before the final put_page() in page reclaim.

Finally, page_cgroup changes are now protected by whatever protection the
page itself offers: anonymous pages are charged under the page table lock,
whereas page cache insertions, swapin, and migration hold the page lock.
Uncharging happens under full exclusion with no outstanding references.
Charging and uncharging also ensure that the page is off-LRU, which
serializes against charge migration.  Remove the very costly page_cgroup
lock and set pc-&gt;flags non-atomically.

[mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
[vdavydov@parallels.com: fix flags definition]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Tested-by: Jet Chen &lt;jet.chen@intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Tested-by: Felipe Balbi &lt;balbi@ti.com&gt;
Signed-off-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
