<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/btrfs/async-thread.c, branch v4.0.6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>btrfs: remove unlikely from NULL checks</title>
<updated>2014-10-02T14:06:19+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.cz</email>
</author>
<published>2014-09-29T17:20:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d99a998f375b7bff7ddff0162a6eed4d4ca1318'/>
<id>5d99a998f375b7bff7ddff0162a6eed4d4ca1318</id>
<content type='text'>
Unlikely is implicit for NULL checks of pointers.

Signed-off-by: David Sterba &lt;dsterba@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unlikely is implicit for NULL checks of pointers.

Signed-off-by: David Sterba &lt;dsterba@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: implement repair function when direct read fails</title>
<updated>2014-09-17T20:39:01+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2014-09-12T10:44:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8b110e393c5a6e72d50fcdf9fa7ed8b647cfdfc9'/>
<id>8b110e393c5a6e72d50fcdf9fa7ed8b647cfdfc9</id>
<content type='text'>
This patch implement data repair function when direct read fails.

The detail of the implementation is:
- When we find the data is not right, we try to read the data from the other
  mirror.
- When the io on the mirror ends, we will insert the endio work into the
  dedicated btrfs workqueue, not common read endio workqueue, because the
  original endio work is still blocked in the btrfs endio workqueue, if we
  insert the endio work of the io on the mirror into that workqueue, deadlock
  would happen.
- After we get right data, we write it back to the corrupted mirror.
- And if the data on the new mirror is still corrupted, we will try next
  mirror until we read right data or all the mirrors are traversed.
- After the above work, we set the uptodate flag according to the result.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch implement data repair function when direct read fails.

The detail of the implementation is:
- When we find the data is not right, we try to read the data from the other
  mirror.
- When the io on the mirror ends, we will insert the endio work into the
  dedicated btrfs workqueue, not common read endio workqueue, because the
  original endio work is still blocked in the btrfs endio workqueue, if we
  insert the endio work of the io on the mirror into that workqueue, deadlock
  would happen.
- After we get right data, we write it back to the corrupted mirror.
- And if the data on the new mirror is still corrupted, we will try next
  mirror until we read right data or all the mirrors are traversed.
- After the above work, we set the uptodate flag according to the result.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: fix task hang under heavy compressed write</title>
<updated>2014-08-24T14:17:02+00:00</updated>
<author>
<name>Liu Bo</name>
<email>bo.li.liu@oracle.com</email>
</author>
<published>2014-08-15T15:36:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9e0af23764344f7f1b68e4eefbe7dc865018b63d'/>
<id>9e0af23764344f7f1b68e4eefbe7dc865018b63d</id>
<content type='text'>
This has been reported and discussed for a long time, and this hang occurs in
both 3.15 and 3.16.

Btrfs now migrates to use kernel workqueue, but it introduces this hang problem.

Btrfs has a kind of work queued as an ordered way, which means that its
ordered_func() must be processed in the way of FIFO, so it usually looks like --

normal_work_helper(arg)
    work = container_of(arg, struct btrfs_work, normal_work);

    work-&gt;func() &lt;---- (we name it work X)
    for ordered_work in wq-&gt;ordered_list
            ordered_work-&gt;ordered_func()
            ordered_work-&gt;ordered_free()

The hang is a rare case, first when we find free space, we get an uncached block
group, then we go to read its free space cache inode for free space information,
so it will

file a readahead request
    btrfs_readpages()
         for page that is not in page cache
                __do_readpage()
                     submit_extent_page()
                           btrfs_submit_bio_hook()
                                 btrfs_bio_wq_end_io()
                                 submit_bio()
                                 end_workqueue_bio() &lt;--(ret by the 1st endio)
                                      queue a work(named work Y) for the 2nd
                                      also the real endio()

So the hang occurs when work Y's work_struct and work X's work_struct happens
to share the same address.

A bit more explanation,

A,B,C -- struct btrfs_work
arg   -- struct work_struct

kthread:
worker_thread()
    pick up a work_struct from @worklist
    process_one_work(arg)
	worker-&gt;current_work = arg;  &lt;-- arg is A-&gt;normal_work
	worker-&gt;current_func(arg)
		normal_work_helper(arg)
		     A = container_of(arg, struct btrfs_work, normal_work);

		     A-&gt;func()
		     A-&gt;ordered_func()
		     A-&gt;ordered_free()  &lt;-- A gets freed

		     B-&gt;ordered_func()
			  submit_compressed_extents()
			      find_free_extent()
				  load_free_space_inode()
				      ...   &lt;-- (the above readhead stack)
				      end_workqueue_bio()
					   btrfs_queue_work(work C)
		     B-&gt;ordered_free()

As if work A has a high priority in wq-&gt;ordered_list and there are more ordered
works queued after it, such as B-&gt;ordered_func(), its memory could have been
freed before normal_work_helper() returns, which means that kernel workqueue
code worker_thread() still has worker-&gt;current_work pointer to be work
A-&gt;normal_work's, ie. arg's address.

Meanwhile, work C is allocated after work A is freed, work C-&gt;normal_work
and work A-&gt;normal_work are likely to share the same address(I confirmed this
with ftrace output, so I'm not just guessing, it's rare though).

When another kthread picks up work C-&gt;normal_work to process, and finds our
kthread is processing it(see find_worker_executing_work()), it'll think
work C as a collision and skip then, which ends up nobody processing work C.

So the situation is that our kthread is waiting forever on work C.

Besides, there're other cases that can lead to deadlock, but the real problem
is that all btrfs workqueue shares one work-&gt;func, -- normal_work_helper,
so this makes each workqueue to have its own helper function, but only a
wraper pf normal_work_helper.

With this patch, I no long hit the above hang.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This has been reported and discussed for a long time, and this hang occurs in
both 3.15 and 3.16.

Btrfs now migrates to use kernel workqueue, but it introduces this hang problem.

Btrfs has a kind of work queued as an ordered way, which means that its
ordered_func() must be processed in the way of FIFO, so it usually looks like --

normal_work_helper(arg)
    work = container_of(arg, struct btrfs_work, normal_work);

    work-&gt;func() &lt;---- (we name it work X)
    for ordered_work in wq-&gt;ordered_list
            ordered_work-&gt;ordered_func()
            ordered_work-&gt;ordered_free()

The hang is a rare case, first when we find free space, we get an uncached block
group, then we go to read its free space cache inode for free space information,
so it will

file a readahead request
    btrfs_readpages()
         for page that is not in page cache
                __do_readpage()
                     submit_extent_page()
                           btrfs_submit_bio_hook()
                                 btrfs_bio_wq_end_io()
                                 submit_bio()
                                 end_workqueue_bio() &lt;--(ret by the 1st endio)
                                      queue a work(named work Y) for the 2nd
                                      also the real endio()

So the hang occurs when work Y's work_struct and work X's work_struct happens
to share the same address.

A bit more explanation,

A,B,C -- struct btrfs_work
arg   -- struct work_struct

kthread:
worker_thread()
    pick up a work_struct from @worklist
    process_one_work(arg)
	worker-&gt;current_work = arg;  &lt;-- arg is A-&gt;normal_work
	worker-&gt;current_func(arg)
		normal_work_helper(arg)
		     A = container_of(arg, struct btrfs_work, normal_work);

		     A-&gt;func()
		     A-&gt;ordered_func()
		     A-&gt;ordered_free()  &lt;-- A gets freed

		     B-&gt;ordered_func()
			  submit_compressed_extents()
			      find_free_extent()
				  load_free_space_inode()
				      ...   &lt;-- (the above readhead stack)
				      end_workqueue_bio()
					   btrfs_queue_work(work C)
		     B-&gt;ordered_free()

As if work A has a high priority in wq-&gt;ordered_list and there are more ordered
works queued after it, such as B-&gt;ordered_func(), its memory could have been
freed before normal_work_helper() returns, which means that kernel workqueue
code worker_thread() still has worker-&gt;current_work pointer to be work
A-&gt;normal_work's, ie. arg's address.

Meanwhile, work C is allocated after work A is freed, work C-&gt;normal_work
and work A-&gt;normal_work are likely to share the same address(I confirmed this
with ftrace output, so I'm not just guessing, it's rare though).

When another kthread picks up work C-&gt;normal_work to process, and finds our
kthread is processing it(see find_worker_executing_work()), it'll think
work C as a collision and skip then, which ends up nobody processing work C.

So the situation is that our kthread is waiting forever on work C.

Besides, there're other cases that can lead to deadlock, but the real problem
is that all btrfs workqueue shares one work-&gt;func, -- normal_work_helper,
so this makes each workqueue to have its own helper function, but only a
wraper pf normal_work_helper.

With this patch, I no long hit the above hang.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: fix crash in remount(thread_pool=) case</title>
<updated>2014-04-07T17:41:52+00:00</updated>
<author>
<name>Sergei Trofimovich</name>
<email>slyfox@gentoo.org</email>
</author>
<published>2014-04-07T07:55:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=800ee2247f483b6d05ed47ef3bbc90b56451746c'/>
<id>800ee2247f483b6d05ed47ef3bbc90b56451746c</id>
<content type='text'>
Reproducer:
    mount /dev/ubda /mnt
    mount -oremount,thread_pool=42 /mnt

Gives a crash:
    ? btrfs_workqueue_set_max+0x0/0x70
    btrfs_resize_thread_pool+0xe3/0xf0
    ? sync_filesystem+0x0/0xc0
    ? btrfs_resize_thread_pool+0x0/0xf0
    btrfs_remount+0x1d2/0x570
    ? kern_path+0x0/0x80
    do_remount_sb+0xd9/0x1c0
    do_mount+0x26a/0xbf0
    ? kfree+0x0/0x1b0
    SyS_mount+0xc4/0x110

It's a call
    btrfs_workqueue_set_max(fs_info-&gt;scrub_wr_completion_workers, new_pool_size);
with
    fs_info-&gt;scrub_wr_completion_workers = NULL;

as scrub wqs get created only on user's demand.

Patch skips not-created-yet workqueues.

Signed-off-by: Sergei Trofimovich &lt;slyfox@gentoo.org&gt;
CC: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
CC: Chris Mason &lt;clm@fb.com&gt;
CC: Josef Bacik &lt;jbacik@fb.com&gt;
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reproducer:
    mount /dev/ubda /mnt
    mount -oremount,thread_pool=42 /mnt

Gives a crash:
    ? btrfs_workqueue_set_max+0x0/0x70
    btrfs_resize_thread_pool+0xe3/0xf0
    ? sync_filesystem+0x0/0xc0
    ? btrfs_resize_thread_pool+0x0/0xf0
    btrfs_remount+0x1d2/0x570
    ? kern_path+0x0/0x80
    do_remount_sb+0xd9/0x1c0
    do_mount+0x26a/0xbf0
    ? kfree+0x0/0x1b0
    SyS_mount+0xc4/0x110

It's a call
    btrfs_workqueue_set_max(fs_info-&gt;scrub_wr_completion_workers, new_pool_size);
with
    fs_info-&gt;scrub_wr_completion_workers = NULL;

as scrub wqs get created only on user's demand.

Patch skips not-created-yet workqueues.

Signed-off-by: Sergei Trofimovich &lt;slyfox@gentoo.org&gt;
CC: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
CC: Chris Mason &lt;clm@fb.com&gt;
CC: Josef Bacik &lt;jbacik@fb.com&gt;
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Add trace for btrfs_workqueue alloc/destroy</title>
<updated>2014-03-21T00:15:28+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-03-12T08:05:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c3a468915a384c0015263edd9b7263775599a323'/>
<id>c3a468915a384c0015263edd9b7263775599a323</id>
<content type='text'>
Since most of the btrfs_workqueue is printed as pointer address,
for easier analysis, add trace for btrfs_workqueue alloc/destroy.
So it is possible to determine the workqueue that a given work belongs
to(by comparing the wq pointer address with alloc trace event).

Signed-off-by: Qu Wenruo &lt;quenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since most of the btrfs_workqueue is printed as pointer address,
for easier analysis, add trace for btrfs_workqueue alloc/destroy.
So it is possible to determine the workqueue that a given work belongs
to(by comparing the wq pointer address with alloc trace event).

Signed-off-by: Qu Wenruo &lt;quenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: add missing kfree in btrfs_destroy_workqueue</title>
<updated>2014-03-21T00:15:27+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@gmail.com</email>
</author>
<published>2014-03-11T14:31:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ef66af101a261f1c86ef9ec3859ebd9c28ee2e54'/>
<id>ef66af101a261f1c86ef9ec3859ebd9c28ee2e54</id>
<content type='text'>
Signed-off-by: Filipe David Borba Manana &lt;fdmanana@gmail.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Filipe David Borba Manana &lt;fdmanana@gmail.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Add ftrace for btrfs_workqueue</title>
<updated>2014-03-10T19:17:21+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-03-06T04:19:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=52483bc26f0e95c91e8fd07f9def588bf89664f8'/>
<id>52483bc26f0e95c91e8fd07f9def588bf89664f8</id>
<content type='text'>
Add ftrace for btrfs_workqueue for further workqueue tunning.
This patch needs to applied after the workqueue replace patchset.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add ftrace for btrfs_workqueue for further workqueue tunning.
This patch needs to applied after the workqueue replace patchset.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Cleanup the btrfs_workqueue related function type</title>
<updated>2014-03-10T19:17:20+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-03-06T04:19:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6db8914f9763d3f0a7610b497d44f93a4c17e62e'/>
<id>6db8914f9763d3f0a7610b497d44f93a4c17e62e</id>
<content type='text'>
The new btrfs_workqueue still use open-coded function defition,
this patch will change them into btrfs_func_t type which is much the
same as kernel workqueue.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The new btrfs_workqueue still use open-coded function defition,
this patch will change them into btrfs_func_t type which is much the
same as kernel workqueue.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Cleanup the "_struct" suffix in btrfs_workequeue</title>
<updated>2014-03-10T19:17:16+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-02-28T02:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d458b0540ebd728b4d6ef47cc5ef0dbfd4dd361a'/>
<id>d458b0540ebd728b4d6ef47cc5ef0dbfd4dd361a</id>
<content type='text'>
Since the "_struct" suffix is mainly used for distinguish the differnt
btrfs_work between the original and the newly created one,
there is no need using the suffix since all btrfs_workers are changed
into btrfs_workqueue.

Also this patch fixed some codes whose code style is changed due to the
too long "_struct" suffix.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Tested-by: David Sterba &lt;dsterba@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since the "_struct" suffix is mainly used for distinguish the differnt
btrfs_work between the original and the newly created one,
there is no need using the suffix since all btrfs_workers are changed
into btrfs_workqueue.

Also this patch fixed some codes whose code style is changed due to the
too long "_struct" suffix.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Tested-by: David Sterba &lt;dsterba@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Cleanup the old btrfs_worker.</title>
<updated>2014-03-10T19:17:15+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-02-28T02:46:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a046e9c88b0f46677923864295eac7c92cd962cb'/>
<id>a046e9c88b0f46677923864295eac7c92cd962cb</id>
<content type='text'>
Since all the btrfs_worker is replaced with the newly created
btrfs_workqueue, the old codes can be easily remove.

Signed-off-by: Quwenruo &lt;quwenruo@cn.fujitsu.com&gt;
Tested-by: David Sterba &lt;dsterba@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since all the btrfs_worker is replaced with the newly created
btrfs_workqueue, the old codes can be easily remove.

Signed-off-by: Quwenruo &lt;quwenruo@cn.fujitsu.com&gt;
Tested-by: David Sterba &lt;dsterba@suse.cz&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
