<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/dcache.c, branch v3.2.73</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>dcache: Handle escaped paths in prepend_path</title>
<updated>2015-10-13T02:46:04+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-08-15T18:36:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=722632af3c2b4828e79f143e356489c6761035ec'/>
<id>722632af3c2b4828e79f143e356489c6761035ec</id>
<content type='text'>
commit cde93be45a8a90d8c264c776fab63487b5038a65 upstream.

A rename can result in a dentry that by walking up d_parent
will never reach it's mnt_root.  For lack of a better term
I call this an escaped path.

prepend_path is called by four different functions __d_path,
d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths are connected to the root it passes
in.  So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to
some root.  Escaped paths are not connected to any mnt_root so
d_absolute_path needs prepend_path to return an error greater
than 1.  So escaped paths will be treated like paths on lazily
unmounted mounts.

getcwd needs to prepend "(unreachable)" so getcwd also needs
prepend_path to return an error.

d_path is the interesting hold out.  d_path just wants to print
something, and does not care about the weird cases.  Which raises
the question what should be printed?

Given that &lt;escaped_path&gt;/&lt;anything&gt; should result in -ENOENT I
believe it is desirable for escaped paths to be printed as empty
paths.  As there are not really any meaninful path components when
considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with an new error
code of 3 when it encounters an escaped path.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit cde93be45a8a90d8c264c776fab63487b5038a65 upstream.

A rename can result in a dentry that by walking up d_parent
will never reach it's mnt_root.  For lack of a better term
I call this an escaped path.

prepend_path is called by four different functions __d_path,
d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths are connected to the root it passes
in.  So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to
some root.  Escaped paths are not connected to any mnt_root so
d_absolute_path needs prepend_path to return an error greater
than 1.  So escaped paths will be treated like paths on lazily
unmounted mounts.

getcwd needs to prepend "(unreachable)" so getcwd also needs
prepend_path to return an error.

d_path is the interesting hold out.  d_path just wants to print
something, and does not care about the weird cases.  Which raises
the question what should be printed?

Given that &lt;escaped_path&gt;/&lt;anything&gt; should result in -ENOENT I
believe it is desirable for escaped paths to be printed as empty
paths.  As there are not really any meaninful path components when
considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with an new error
code of 3 when it encounters an escaped path.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>d_walk() might skip too much</title>
<updated>2015-08-06T23:32:13+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-05-29T03:09:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=60593faea74ea54e84cb638f9cc59e32d8268f37'/>
<id>60593faea74ea54e84cb638f9cc59e32d8268f37</id>
<content type='text'>
commit 2159184ea01e4ae7d15f2017e296d4bc82d5aeb0 upstream.

when we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.

Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2: apply to the 3 copies of this logic we
 ended up with]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2159184ea01e4ae7d15f2017e296d4bc82d5aeb0 upstream.

when we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.

Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2: apply to the 3 copies of this logic we
 ended up with]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dcache: Fix locking bugs in backported "deal with deadlock in d_walk()"</title>
<updated>2015-02-20T00:49:41+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2015-02-11T03:16:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=20defcec264ceab2630356fb9d397f3d237b5e6d'/>
<id>20defcec264ceab2630356fb9d397f3d237b5e6d</id>
<content type='text'>
Steven Rostedt reported:
&gt; Porting -rt to the latest 3.2 stable tree I triggered this bug:
&gt; 
&gt; =====================================
&gt; [ BUG: bad unlock balance detected! ]
&gt; -------------------------------------
&gt; rm/1638 is trying to release lock (rcu_read_lock) at:
&gt; [&lt;c04fde6c&gt;] rcu_read_unlock+0x0/0x23
&gt; but there are no more locks to release!
&gt; 
&gt; other info that might help us debug this:
&gt; 2 locks held by rm/1638:
&gt;  #0:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#9/1){+.+.+.}, at: [&lt;c04f93eb&gt;] do_rmdir+0x5f/0xd2
&gt;  #1:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#9){+.+.+.}, at: [&lt;c04f9329&gt;] vfs_rmdir+0x49/0xac
&gt; 
&gt; stack backtrace:
&gt; Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
&gt; Call Trace:
&gt;  [&lt;c083f390&gt;] ? printk+0x1d/0x1f
&gt;  [&lt;c0463cdf&gt;] print_unlock_inbalance_bug+0xc3/0xcd
&gt;  [&lt;c04653a8&gt;] lock_release_non_nested+0x98/0x1ec
&gt;  [&lt;c046228d&gt;] ? trace_hardirqs_off_caller+0x18/0x90
&gt;  [&lt;c0456f1c&gt;] ? local_clock+0x2d/0x50
&gt;  [&lt;c04fde6c&gt;] ? d_hash+0x2f/0x2f
&gt;  [&lt;c04fde6c&gt;] ? d_hash+0x2f/0x2f
&gt;  [&lt;c046568e&gt;] lock_release+0x192/0x1ad
&gt;  [&lt;c04fde83&gt;] rcu_read_unlock+0x17/0x23
&gt;  [&lt;c04ff344&gt;] shrink_dcache_parent+0x227/0x270
&gt;  [&lt;c04f9348&gt;] vfs_rmdir+0x68/0xac
&gt;  [&lt;c04f9424&gt;] do_rmdir+0x98/0xd2
&gt;  [&lt;c04f03ad&gt;] ? fput+0x1a3/0x1ab
&gt;  [&lt;c084dd42&gt;] ? sysenter_exit+0xf/0x1a
&gt;  [&lt;c0465b58&gt;] ? trace_hardirqs_on_caller+0x118/0x149
&gt;  [&lt;c04fa3e0&gt;] sys_unlinkat+0x2b/0x35
&gt;  [&lt;c084dd13&gt;] sysenter_do_call+0x12/0x12
&gt; 
&gt; 
&gt; 
&gt; 
&gt; There's a path to calling rcu_read_unlock() without calling
&gt; rcu_read_lock() in have_submounts().
&gt; 
&gt; 	goto positive;
&gt; 
&gt; positive:
&gt; 	if (!locked &amp;&amp; read_seqretry(&amp;rename_lock, seq))
&gt; 		goto rename_retry;
&gt; 
&gt; rename_retry:
&gt; 	rcu_read_unlock();
&gt; 
&gt; in the above path, rcu_read_lock() is never done before calling
&gt; rcu_read_unlock();

I reviewed locking contexts in all three functions that I changed when
backporting "deal with deadlock in d_walk()".  It's actually worse
than this:

- We don't hold this_parent-&gt;d_lock at the 'positive' label in
  have_submounts(), but it is unlocked after 'rename_retry'.
- There is an rcu_read_unlock() after the 'out' label in
  select_parent(), but it's not held at the 'goto out'.

Fix all three lock imbalances.

Reported-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Steven Rostedt reported:
&gt; Porting -rt to the latest 3.2 stable tree I triggered this bug:
&gt; 
&gt; =====================================
&gt; [ BUG: bad unlock balance detected! ]
&gt; -------------------------------------
&gt; rm/1638 is trying to release lock (rcu_read_lock) at:
&gt; [&lt;c04fde6c&gt;] rcu_read_unlock+0x0/0x23
&gt; but there are no more locks to release!
&gt; 
&gt; other info that might help us debug this:
&gt; 2 locks held by rm/1638:
&gt;  #0:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#9/1){+.+.+.}, at: [&lt;c04f93eb&gt;] do_rmdir+0x5f/0xd2
&gt;  #1:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#9){+.+.+.}, at: [&lt;c04f9329&gt;] vfs_rmdir+0x49/0xac
&gt; 
&gt; stack backtrace:
&gt; Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
&gt; Call Trace:
&gt;  [&lt;c083f390&gt;] ? printk+0x1d/0x1f
&gt;  [&lt;c0463cdf&gt;] print_unlock_inbalance_bug+0xc3/0xcd
&gt;  [&lt;c04653a8&gt;] lock_release_non_nested+0x98/0x1ec
&gt;  [&lt;c046228d&gt;] ? trace_hardirqs_off_caller+0x18/0x90
&gt;  [&lt;c0456f1c&gt;] ? local_clock+0x2d/0x50
&gt;  [&lt;c04fde6c&gt;] ? d_hash+0x2f/0x2f
&gt;  [&lt;c04fde6c&gt;] ? d_hash+0x2f/0x2f
&gt;  [&lt;c046568e&gt;] lock_release+0x192/0x1ad
&gt;  [&lt;c04fde83&gt;] rcu_read_unlock+0x17/0x23
&gt;  [&lt;c04ff344&gt;] shrink_dcache_parent+0x227/0x270
&gt;  [&lt;c04f9348&gt;] vfs_rmdir+0x68/0xac
&gt;  [&lt;c04f9424&gt;] do_rmdir+0x98/0xd2
&gt;  [&lt;c04f03ad&gt;] ? fput+0x1a3/0x1ab
&gt;  [&lt;c084dd42&gt;] ? sysenter_exit+0xf/0x1a
&gt;  [&lt;c0465b58&gt;] ? trace_hardirqs_on_caller+0x118/0x149
&gt;  [&lt;c04fa3e0&gt;] sys_unlinkat+0x2b/0x35
&gt;  [&lt;c084dd13&gt;] sysenter_do_call+0x12/0x12
&gt; 
&gt; 
&gt; 
&gt; 
&gt; There's a path to calling rcu_read_unlock() without calling
&gt; rcu_read_lock() in have_submounts().
&gt; 
&gt; 	goto positive;
&gt; 
&gt; positive:
&gt; 	if (!locked &amp;&amp; read_seqretry(&amp;rename_lock, seq))
&gt; 		goto rename_retry;
&gt; 
&gt; rename_retry:
&gt; 	rcu_read_unlock();
&gt; 
&gt; in the above path, rcu_read_lock() is never done before calling
&gt; rcu_read_unlock();

I reviewed locking contexts in all three functions that I changed when
backporting "deal with deadlock in d_walk()".  It's actually worse
than this:

- We don't hold this_parent-&gt;d_lock at the 'positive' label in
  have_submounts(), but it is unlocked after 'rename_retry'.
- There is an rcu_read_unlock() after the 'out' label in
  select_parent(), but it's not held at the 'goto out'.

Fix all three lock imbalances.

Reported-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>deal with deadlock in d_walk()</title>
<updated>2015-01-01T01:27:51+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-10-26T23:31:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2d5a2e6775fadc4ac5b7a1a5cbcdec1bffc0b142'/>
<id>2d5a2e6775fadc4ac5b7a1a5cbcdec1bffc0b142</id>
<content type='text'>
commit ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.

... by not hitting rename_retry for reasons other than rename having
happened.  In other words, do _not_ restart when finding that
between unlocking the child and locking the parent the former got
into __dentry_kill().  Skip the killed siblings instead...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - As we only have try_to_ascend() and not d_walk(), apply this
   change to all callers of try_to_ascend()
 - Adjust context to make __dentry_kill() apply to d_kill()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.

... by not hitting rename_retry for reasons other than rename having
happened.  In other words, do _not_ restart when finding that
between unlocking the child and locking the parent the former got
into __dentry_kill().  Skip the killed siblings instead...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - As we only have try_to_ascend() and not d_walk(), apply this
   change to all callers of try_to_ascend()
 - Adjust context to make __dentry_kill() apply to d_kill()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>move d_rcu from overlapping d_child to overlapping d_alias</title>
<updated>2015-01-01T01:27:50+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-10-26T23:19:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=026181647a6262f4ba6d60c0847d306ad685468c'/>
<id>026181647a6262f4ba6d60c0847d306ad685468c</id>
<content type='text'>
commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - Apply name changes in all the different places we use d_alias and d_child
 - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - Apply name changes in all the different places we use d_alias and d_child
 - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/dcache.c: add cond_resched() to shrink_dcache_parent()</title>
<updated>2013-05-13T14:02:30+00:00</updated>
<author>
<name>Greg Thelen</name>
<email>gthelen@google.com</email>
</author>
<published>2013-04-30T22:26:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9a91027ba2368234da322cb32d5c13a41731dd2d'/>
<id>9a91027ba2368234da322cb32d5c13a41731dd2d</id>
<content type='text'>
commit 421348f1ca0bf17769dee0aed4d991845ae0536d upstream.

Call cond_resched() in shrink_dcache_parent() to maintain interactivity.

Before this patch:

	void shrink_dcache_parent(struct dentry * parent)
	{
		while ((found = select_parent(parent, &amp;dispose)) != 0)
			shrink_dentry_list(&amp;dispose);
	}

select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes.  select_parent() carefully uses
need_resched() to avoid doing too much work at once.  But neither
shrink_dcache_parent() nor its called functions call cond_resched().  So
once need_resched() is set select_parent() will return single dentry
dispose list which is then deleted by shrink_dentry_list().  This is
inefficient when there are a lot of dentry to process.  This can cause
softlockup and hurts interactivity on non preemptable kernels.

This change adds cond_resched() in shrink_dcache_parent().  The benefit
of this is that need_resched() is quickly cleared so that future calls
to select_parent() are able to efficiently return a big batch of dentry.

These additional cond_resched() do not seem to impact performance, at
least for the workload below.

Here is a program which can cause soft lockup if other system activity
sets need_resched().

	int main()
	{
	        struct rlimit rlim;
	        int i;
	        int f[100000];
	        char buf[20];
	        struct timeval t1, t2;
	        double diff;

	        /* cleanup past run */
	        system("rm -rf x");

	        /* boost nfile rlimit */
	        rlim.rlim_cur = 200000;
	        rlim.rlim_max = 200000;
	        if (setrlimit(RLIMIT_NOFILE, &amp;rlim))
	                err(1, "setrlimit");

	        /* make directory for files */
	        if (mkdir("x", 0700))
	                err(1, "mkdir");

	        if (gettimeofday(&amp;t1, NULL))
	                err(1, "gettimeofday");

	        /* populate directory with open files */
	        for (i = 0; i &lt; 100000; i++) {
	                snprintf(buf, sizeof(buf), "x/%d", i);
	                f[i] = open(buf, O_CREAT);
	                if (f[i] == -1)
	                        err(1, "open");
	        }

	        /* close some of the files */
	        for (i = 0; i &lt; 85000; i++)
	                close(f[i]);

	        /* unlink all files, even open ones */
	        system("rm -rf x");

	        if (gettimeofday(&amp;t2, NULL))
	                err(1, "gettimeofday");

	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));

	        printf("done: %g elapsed\n", diff/1e6);
	        return 0;
	}

Signed-off-by: Greg Thelen &lt;gthelen@google.com&gt;
Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 421348f1ca0bf17769dee0aed4d991845ae0536d upstream.

Call cond_resched() in shrink_dcache_parent() to maintain interactivity.

Before this patch:

	void shrink_dcache_parent(struct dentry * parent)
	{
		while ((found = select_parent(parent, &amp;dispose)) != 0)
			shrink_dentry_list(&amp;dispose);
	}

select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes.  select_parent() carefully uses
need_resched() to avoid doing too much work at once.  But neither
shrink_dcache_parent() nor its called functions call cond_resched().  So
once need_resched() is set select_parent() will return single dentry
dispose list which is then deleted by shrink_dentry_list().  This is
inefficient when there are a lot of dentry to process.  This can cause
softlockup and hurts interactivity on non preemptable kernels.

This change adds cond_resched() in shrink_dcache_parent().  The benefit
of this is that need_resched() is quickly cleared so that future calls
to select_parent() are able to efficiently return a big batch of dentry.

These additional cond_resched() do not seem to impact performance, at
least for the workload below.

Here is a program which can cause soft lockup if other system activity
sets need_resched().

	int main()
	{
	        struct rlimit rlim;
	        int i;
	        int f[100000];
	        char buf[20];
	        struct timeval t1, t2;
	        double diff;

	        /* cleanup past run */
	        system("rm -rf x");

	        /* boost nfile rlimit */
	        rlim.rlim_cur = 200000;
	        rlim.rlim_max = 200000;
	        if (setrlimit(RLIMIT_NOFILE, &amp;rlim))
	                err(1, "setrlimit");

	        /* make directory for files */
	        if (mkdir("x", 0700))
	                err(1, "mkdir");

	        if (gettimeofday(&amp;t1, NULL))
	                err(1, "gettimeofday");

	        /* populate directory with open files */
	        for (i = 0; i &lt; 100000; i++) {
	                snprintf(buf, sizeof(buf), "x/%d", i);
	                f[i] = open(buf, O_CREAT);
	                if (f[i] == -1)
	                        err(1, "open");
	        }

	        /* close some of the files */
	        for (i = 0; i &lt; 85000; i++)
	                close(f[i]);

	        /* unlink all files, even open ones */
	        system("rm -rf x");

	        if (gettimeofday(&amp;t2, NULL))
	                err(1, "gettimeofday");

	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));

	        printf("done: %g elapsed\n", diff/1e6);
	        return 0;
	}

Signed-off-by: Greg Thelen &lt;gthelen@google.com&gt;
Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Nest rename_lock inside vfsmount_lock</title>
<updated>2013-04-10T02:20:03+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-03-26T22:25:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a0a551fcc3199a6e9fb6c0236daff99bbe278905'/>
<id>a0a551fcc3199a6e9fb6c0236daff99bbe278905</id>
<content type='text'>
commit 7ea600b5314529f9d1b9d6d3c41cb26fce6a7a4a upstream.

... lest we get livelocks between path_is_under() and d_path() and friends.

The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks;
it is possible to have thread B spin on attempt to take lock shared while thread
A is already holding it shared, if B is on lower-numbered CPU than A and there's
a thread C spinning on attempt to take the same lock exclusive.

As the result, we need consistent ordering between vfsmount_lock (lglock) and
rename_lock (seq_lock), even though everything that takes both is going to take
vfsmount_lock only shared.

Spotted-by: Brad Spengler &lt;spender@grsecurity.net&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - s/&amp;vfsmount_lock/vfsmount_lock/]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7ea600b5314529f9d1b9d6d3c41cb26fce6a7a4a upstream.

... lest we get livelocks between path_is_under() and d_path() and friends.

The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks;
it is possible to have thread B spin on attempt to take lock shared while thread
A is already holding it shared, if B is on lower-numbered CPU than A and there's
a thread C spinning on attempt to take the same lock exclusive.

As the result, we need consistent ordering between vfsmount_lock (lglock) and
rename_lock (seq_lock), even though everything that takes both is going to take
vfsmount_lock only shared.

Spotted-by: Brad Spengler &lt;spender@grsecurity.net&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - s/&amp;vfsmount_lock/vfsmount_lock/]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: d_obtain_alias() needs to use "/" as default name.</title>
<updated>2013-01-03T03:33:45+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-11-09T00:09:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=150086ca6bd7493e7753f923a3f55e73701253b3'/>
<id>150086ca6bd7493e7753f923a3f55e73701253b3</id>
<content type='text'>
commit b911a6bdeef5848c468597d040e3407e0aee04ce upstream.

NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root.  This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted.  e.g.  if
"/mnt" is an NFS mount then

 { cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }

will cause a WARN message like
   WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
   ...
   Root dentry has weird name &lt;&gt;

to appear in kernel logs.

So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2: use named initialisers instead of QSTR_INIT()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b911a6bdeef5848c468597d040e3407e0aee04ce upstream.

NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root.  This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted.  e.g.  if
"/mnt" is an NFS mount then

 { cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }

will cause a WARN message like
   WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
   ...
   Root dentry has weird name &lt;&gt;

to appear in kernel logs.

So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2: use named initialisers instead of QSTR_INIT()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: dcache: fix deadlock in tree traversal</title>
<updated>2012-10-10T02:31:13+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>miklos@szeredi.hu</email>
</author>
<published>2012-09-17T20:23:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5e47beb778ad1c5fa901a61700f3ac1b4ee8a579'/>
<id>5e47beb778ad1c5fa901a61700f3ac1b4ee8a579</id>
<content type='text'>
commit 8110e16d42d587997bcaee0c864179e6d93603fe upstream.

IBM reported a deadlock in select_parent().  This was found to be caused
by taking rename_lock when already locked when restarting the tree
traversal.

There are two cases when the traversal needs to be restarted:

 1) concurrent d_move(); this can only happen when not already locked,
    since taking rename_lock protects against concurrent d_move().

 2) racing with final d_put() on child just at the moment of ascending
    to parent; rename_lock doesn't protect against this rare race, so it
    can happen when already locked.

Because of case 2, we need to be able to handle restarting the traversal
when rename_lock is already held.  This patch fixes all three callers of
try_to_ascend().

IBM reported that the deadlock is gone with this patch.

[ I rewrote the patch to be smaller and just do the "goto again" if the
  lock was already held, but credit goes to Miklos for the real work.
   - Linus ]

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8110e16d42d587997bcaee0c864179e6d93603fe upstream.

IBM reported a deadlock in select_parent().  This was found to be caused
by taking rename_lock when already locked when restarting the tree
traversal.

There are two cases when the traversal needs to be restarted:

 1) concurrent d_move(); this can only happen when not already locked,
    since taking rename_lock protects against concurrent d_move().

 2) racing with final d_put() on child just at the moment of ascending
    to parent; rename_lock doesn't protect against this rare race, so it
    can happen when already locked.

Because of case 2, we need to be able to handle restarting the traversal
when rename_lock is already held.  This patch fixes all three callers of
try_to_ascend().

IBM reported that the deadlock is gone with this patch.

[ I rewrote the patch to be smaller and just do the "goto again" if the
  lock was already held, but credit goes to Miklos for the real work.
   - Linus ]

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()</title>
<updated>2012-10-10T02:30:48+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@suse.cz</email>
</author>
<published>2012-09-17T20:31:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2ebf56f3e471e0c5a831d74b5cdfe735523188f7'/>
<id>2ebf56f3e471e0c5a831d74b5cdfe735523188f7</id>
<content type='text'>
commit b161dfa6937ae46d50adce8a7c6b12233e96e7bd upstream.

IBM reported a soft lockup after applying the fix for the rename_lock
deadlock.  Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.

The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed.  This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.

This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.

IBM reported successful test results with this patch.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b161dfa6937ae46d50adce8a7c6b12233e96e7bd upstream.

IBM reported a soft lockup after applying the fix for the rename_lock
deadlock.  Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.

The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed.  This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.

This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.

IBM reported successful test results with this patch.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
