<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/ceph/caps.c, branch v3.2.5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ceph: use i_ceph_lock instead of i_lock</title>
<updated>2011-12-07T18:46:44+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-11-30T17:47:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=be655596b3de5873f994ddbe205751a5ffb4de39'/>
<id>be655596b3de5873f994ddbe205751a5ffb4de39</id>
<content type='text'>
We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction.  That requires grabbing
a reference to the inode with the list lock protected, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().

Reported-by: Amon Ott &lt;a.ott@m-privacy.de&gt;
Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction.  That requires grabbing
a reference to the inode with the list lock protected, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().

Reported-by: Amon Ott &lt;a.ott@m-privacy.de&gt;
Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: use new D_COMPLETE dentry flag</title>
<updated>2011-11-06T04:10:10+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-11-03T16:23:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c6ffe10015f4e6fba8a915318b319c43aed1836f'/>
<id>c6ffe10015f4e6fba8a915318b319c43aed1836f</id>
<content type='text'>
We used to use a flag on the directory inode to track whether the dcache
contents for a directory were a complete cached copy.  Switch to a dentry
flag CEPH_D_COMPLETE that is safely updated by -&gt;d_prune().

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We used to use a flag on the directory inode to track whether the dcache
contents for a directory were a complete cached copy.  Switch to a dentry
flag CEPH_D_COMPLETE that is safely updated by -&gt;d_prune().

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filesystems: add set_nlink()</title>
<updated>2011-11-02T11:53:43+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@suse.cz</email>
</author>
<published>2011-10-28T12:13:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bfe8684869601dacfcb2cd69ef8cfd9045f62170'/>
<id>bfe8684869601dacfcb2cd69ef8cfd9045f62170</id>
<content type='text'>
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Tested-by: Toshiyuki Okajima &lt;toshi.okajima@jp.fujitsu.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Tested-by: Toshiyuki Okajima &lt;toshi.okajima@jp.fujitsu.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: don't complain on msgpool alloc failures</title>
<updated>2011-10-25T23:10:15+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-08-09T22:03:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b61c27636fffbaf1980e675282777b9467254a40'/>
<id>b61c27636fffbaf1980e675282777b9467254a40</id>
<content type='text'>
The pool allocation failures are masked by the pool; there is no need to
spam the console about them.  (That's the whole point of having the pool
in the first place.)

Mark msg allocations whose failure is safely handled as such.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pool allocation failures are masked by the pool; there is no need to
spam the console about them.  (That's the whole point of having the pool
in the first place.)

Mark msg allocations whose failure is safely handled as such.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: push i_mutex and filemap_write_and_wait down into -&gt;fsync() handlers</title>
<updated>2011-07-21T00:47:59+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@redhat.com</email>
</author>
<published>2011-07-17T00:44:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=02c24a82187d5a628c68edfe71ae60dc135cd178'/>
<id>02c24a82187d5a628c68edfe71ae60dc135cd178</id>
<content type='text'>
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the -&gt;fsync() handlers.  Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2.  For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;josef@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the -&gt;fsync() handlers.  Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2.  For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;josef@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: use ihold when we already have an inode ref</title>
<updated>2011-06-08T04:34:11+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-05-27T16:24:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=70b666c3b4cb2b96098d80e6f515e4bc6d37db5a'/>
<id>70b666c3b4cb2b96098d80e6f515e4bc6d37db5a</id>
<content type='text'>
We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock.  This avoids adding new and unnecessary
locking dependencies.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock.  This avoids adding new and unnecessary
locking dependencies.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: fix cap flush race reentrancy</title>
<updated>2011-05-24T18:52:12+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-05-24T18:46:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=db3540522e955c1ebb391f4f5324dff4f20ecd09'/>
<id>db3540522e955c1ebb391f4f5324dff4f20ecd09</id>
<content type='text'>
In e9964c10 we change cap flushing to do a delicate dance because some
inodes on the cap_dirty list could be in a migrating state (got EXPORT but
not IMPORT) in which we couldn't actually flush and move from
dirty-&gt;flushing, breaking the while (!empty) { process first } loop
structure.  It worked for a single sync thread, but was not reentrant and
triggered infinite loops when multiple syncers came along.

Instead, move inodes with dirty to a separate cap_dirty_migrating list
when in the limbo export-but-no-import state, allowing us to go back to
the simple loop structure (which was reentrant).  This is cleaner and more
robust.

Audited the cap_dirty users and this looks fine:
list_empty(&amp;ci-&gt;i_dirty_item) is still a reliable indicator of whether we
have dirty caps (which list we're on is irrelevant) and list_del_init()
calls still do the right thing.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In e9964c10 we change cap flushing to do a delicate dance because some
inodes on the cap_dirty list could be in a migrating state (got EXPORT but
not IMPORT) in which we couldn't actually flush and move from
dirty-&gt;flushing, breaking the while (!empty) { process first } loop
structure.  It worked for a single sync thread, but was not reentrant and
triggered infinite loops when multiple syncers came along.

Instead, move inodes with dirty to a separate cap_dirty_migrating list
when in the limbo export-but-no-import state, allowing us to go back to
the simple loop structure (which was reentrant).  This is cleaner and more
robust.

Audited the cap_dirty users and this looks fine:
list_empty(&amp;ci-&gt;i_dirty_item) is still a reliable indicator of whether we
have dirty caps (which list we're on is irrelevant) and list_del_init()
calls still do the right thing.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: fix rare potential cap leak</title>
<updated>2011-05-19T18:25:03+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-05-12T22:13:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3540303f87115cbdae6ed2cab44ce6a7676d48d3'/>
<id>3540303f87115cbdae6ed2cab44ce6a7676d48d3</id>
<content type='text'>
If we grab new_cap, retake the lock, and find we already have a cap now
for the given mds, release new_cap.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we grab new_cap, retake the lock, and find we already have a cap now
for the given mds, release new_cap.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: do not use i_wrbuffer_ref as refcount for Fb cap</title>
<updated>2011-05-11T17:44:48+00:00</updated>
<author>
<name>Henry C Chang</name>
<email>henry.cy.chang@gmail.com</email>
</author>
<published>2011-05-11T10:29:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d3d0720d4a7a46e93e055e5b0f1a8bd612743ed6'/>
<id>d3d0720d4a7a46e93e055e5b0f1a8bd612743ed6</id>
<content type='text'>
We increments i_wrbuffer_ref when taking the Fb cap. This breaks
the dirty page accounting and causes looping in
__ceph_do_pending_vmtruncate, and ceph client hangs.

This bug can be reproduced occasionally by running blogbench.

Add a new field i_wb_ref to inode and dedicate it to Fb reference
counting.

Signed-off-by: Henry C Chang &lt;henry.cy.chang@gmail.com&gt;
Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We increments i_wrbuffer_ref when taking the Fb cap. This breaks
the dirty page accounting and causes looping in
__ceph_do_pending_vmtruncate, and ceph client hangs.

This bug can be reproduced occasionally by running blogbench.

Add a new field i_wb_ref to inode and dedicate it to Fb reference
counting.

Signed-off-by: Henry C Chang &lt;henry.cy.chang@gmail.com&gt;
Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: do not call __mark_dirty_inode under i_lock</title>
<updated>2011-05-04T19:56:45+00:00</updated>
<author>
<name>Sage Weil</name>
<email>sage@newdream.net</email>
</author>
<published>2011-05-04T18:33:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fca65b4ad72d28cbb43a029114d04b89f06faadb'/>
<id>fca65b4ad72d28cbb43a029114d04b89f06faadb</id>
<content type='text'>
The __mark_dirty_inode helper now takes i_lock as of 250df6ed.  Fix the
one ceph callers that held i_lock (__ceph_mark_dirty_caps) to return the
flags value so that the callers can do it outside of i_lock.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The __mark_dirty_inode helper now takes i_lock as of 250df6ed.  Fix the
one ceph callers that held i_lock (__ceph_mark_dirty_caps) to return the
flags value so that the callers can do it outside of i_lock.

Signed-off-by: Sage Weil &lt;sage@newdream.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
