<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/btrfs/dev-replace.c, branch v4.0.6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event()</title>
<updated>2015-01-22T02:06:48+00:00</updated>
<author>
<name>Zhao Lei</name>
<email>zhaolei@cn.fujitsu.com</email>
</author>
<published>2015-01-20T07:11:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7653947fe6d59db72fbc26c4fc9ef842febc79e3'/>
<id>7653947fe6d59db72fbc26c4fc9ef842febc79e3</id>
<content type='text'>
Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: Cleanup btrfs_bio_counter_inc_blocked()</title>
<updated>2015-01-22T02:06:48+00:00</updated>
<author>
<name>Zhao Lei</name>
<email>zhaolei@cn.fujitsu.com</email>
</author>
<published>2015-01-20T07:11:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=09dd7a01c3436344fb4c3974b275e016264e0a27'/>
<id>09dd7a01c3436344fb4c3974b275e016264e0a27</id>
<content type='text'>
1: Remove no-need DEFINE_WAIT(wait)
2: Add likely() for BTRFS_FS_STATE_DEV_REPLACING condition
3: Use while loop instead of goto

Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
1: Remove no-need DEFINE_WAIT(wait)
2: Add likely() for BTRFS_FS_STATE_DEV_REPLACING condition
3: Use while loop instead of goto

Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'raid56-scrub-replace' of git://github.com/miaoxie/linux-btrfs into for-linus</title>
<updated>2014-12-03T02:42:03+00:00</updated>
<author>
<name>Chris Mason</name>
<email>clm@fb.com</email>
</author>
<published>2014-12-03T02:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9627aeee3e203e30679549e4962633698a6bf87f'/>
<id>9627aeee3e203e30679549e4962633698a6bf87f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs, replace: enable dev-replace for raid56</title>
<updated>2014-12-03T02:20:48+00:00</updated>
<author>
<name>Zhao Lei</name>
<email>zhaolei@cn.fujitsu.com</email>
</author>
<published>2014-11-13T03:45:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d3edd8f44aac94de7b16f4c54290e24f5e8c532'/>
<id>5d3edd8f44aac94de7b16f4c54290e24f5e8c532</id>
<content type='text'>
Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Zhao Lei &lt;zhaolei@cn.fujitsu.com&gt;
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs, raid56: fix use-after-free problem in the final device replace procedure on raid56</title>
<updated>2014-12-03T02:18:47+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2014-11-25T08:39:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4245215d6a8dba1a51c50533b6667919687c0b89'/>
<id>4245215d6a8dba1a51c50533b6667919687c0b89</id>
<content type='text'>
The commit c404e0dc (Btrfs: fix use-after-free in the finishing
procedure of the device replace) fixed a use-after-free problem
which happened when removing the source device at the end of device
replace, but at that time, btrfs didn't support device replace
on raid56, so we didn't fix the problem on the raid56 profile.
Currently, we implemented device replace for raid56, so we need
kick that problem out before we enable that function for raid56.

The fix method is very simple, we just increase the bio per-cpu
counter before we submit a raid56 io, and decrease the counter
when the raid56 io ends.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The commit c404e0dc (Btrfs: fix use-after-free in the finishing
procedure of the device replace) fixed a use-after-free problem
which happened when removing the source device at the end of device
replace, but at that time, btrfs didn't support device replace
on raid56, so we didn't fix the problem on the raid56 profile.
Currently, we implemented device replace for raid56, so we need
kick that problem out before we enable that function for raid56.

The fix method is very simple, we just increase the bio per-cpu
counter before we submit a raid56 io, and decrease the counter
when the raid56 io ends.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: Fix a lockdep warning when running xfstest.</title>
<updated>2014-11-25T13:55:38+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2014-10-30T08:52:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=084b6e7c7607bbeb28544da659c3f5981a4689b0'/>
<id>084b6e7c7607bbeb28544da659c3f5981a4689b0</id>
<content type='text'>
The following lockdep warning is triggered during xfstests:

[ 1702.980872] =========================================================
[ 1702.981181] [ INFO: possible irq lock inversion dependency detected ]
[ 1702.981482] 3.18.0-rc1 #27 Not tainted
[ 1702.981781] ---------------------------------------------------------
[ 1702.982095] kswapd0/77 just changed the state of lock:
[ 1702.982415]  (&amp;delayed_node-&gt;mutex){+.+.-.}, at: [&lt;ffffffffa03b0b51&gt;] __btrfs_release_delayed_node+0x41/0x1f0 [btrfs]
[ 1702.982794] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[ 1702.983160]  (&amp;fs_info-&gt;dev_replace.lock){+.+.+.}

and interrupts could create inverse lock ordering between them.

[ 1702.984675]
other info that might help us debug this:
[ 1702.985524] Chain exists of:
  &amp;delayed_node-&gt;mutex --&gt; &amp;found-&gt;groups_sem --&gt; &amp;fs_info-&gt;dev_replace.lock

[ 1702.986799]  Possible interrupt unsafe locking scenario:

[ 1702.987681]        CPU0                    CPU1
[ 1702.988137]        ----                    ----
[ 1702.988598]   lock(&amp;fs_info-&gt;dev_replace.lock);
[ 1702.989069]                                local_irq_disable();
[ 1702.989534]                                lock(&amp;delayed_node-&gt;mutex);
[ 1702.990038]                                lock(&amp;found-&gt;groups_sem);
[ 1702.990494]   &lt;Interrupt&gt;
[ 1702.990938]     lock(&amp;delayed_node-&gt;mutex);
[ 1702.991407]
 *** DEADLOCK ***

It is because the btrfs_kobj_{add/rm}_device() will call memory
allocation with GFP_KERNEL,
which may flush fs page cache to free space, waiting for it self to do
the commit, causing the deadlock.

To solve the problem, move btrfs_kobj_{add/rm}_device() out of the
dev_replace lock range, also involing split the
btrfs_rm_dev_replace_srcdev() function into remove and free parts.

Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace
lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are
called out of the lock range.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The following lockdep warning is triggered during xfstests:

[ 1702.980872] =========================================================
[ 1702.981181] [ INFO: possible irq lock inversion dependency detected ]
[ 1702.981482] 3.18.0-rc1 #27 Not tainted
[ 1702.981781] ---------------------------------------------------------
[ 1702.982095] kswapd0/77 just changed the state of lock:
[ 1702.982415]  (&amp;delayed_node-&gt;mutex){+.+.-.}, at: [&lt;ffffffffa03b0b51&gt;] __btrfs_release_delayed_node+0x41/0x1f0 [btrfs]
[ 1702.982794] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[ 1702.983160]  (&amp;fs_info-&gt;dev_replace.lock){+.+.+.}

and interrupts could create inverse lock ordering between them.

[ 1702.984675]
other info that might help us debug this:
[ 1702.985524] Chain exists of:
  &amp;delayed_node-&gt;mutex --&gt; &amp;found-&gt;groups_sem --&gt; &amp;fs_info-&gt;dev_replace.lock

[ 1702.986799]  Possible interrupt unsafe locking scenario:

[ 1702.987681]        CPU0                    CPU1
[ 1702.988137]        ----                    ----
[ 1702.988598]   lock(&amp;fs_info-&gt;dev_replace.lock);
[ 1702.989069]                                local_irq_disable();
[ 1702.989534]                                lock(&amp;delayed_node-&gt;mutex);
[ 1702.990038]                                lock(&amp;found-&gt;groups_sem);
[ 1702.990494]   &lt;Interrupt&gt;
[ 1702.990938]     lock(&amp;delayed_node-&gt;mutex);
[ 1702.991407]
 *** DEADLOCK ***

It is because the btrfs_kobj_{add/rm}_device() will call memory
allocation with GFP_KERNEL,
which may flush fs page cache to free space, waiting for it self to do
the commit, causing the deadlock.

To solve the problem, move btrfs_kobj_{add/rm}_device() out of the
dev_replace lock range, also involing split the
btrfs_rm_dev_replace_srcdev() function into remove and free parts.

Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace
lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are
called out of the lock range.

Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: return failure if btrfs_dev_replace_finishing() failed</title>
<updated>2014-11-21T01:14:28+00:00</updated>
<author>
<name>Eryu Guan</name>
<email>guaneryu@gmail.com</email>
</author>
<published>2014-10-13T04:42:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2fc9f6baa24ac230166df41ed636224969523341'/>
<id>2fc9f6baa24ac230166df41ed636224969523341</id>
<content type='text'>
device replace could fail due to another running scrub process or any
other errors btrfs_scrub_dev() may hit, but this failure doesn't get
returned to userspace.

The following steps could reproduce this issue

	mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
	mount /dev/sdb1 /mnt/btrfs
	while true; do btrfs scrub start -B /mnt/btrfs &gt;/dev/null 2&gt;&amp;1; done &amp;
	btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
	# if this replace succeeded, do the following and repeat until
	# you see this log in dmesg
	# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
	#btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

	# once you see the error log in dmesg, check return value of
	# replace
	echo $?

Introduce a new dev replace result

BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS

to catch -EINPROGRESS explicitly and return other errors directly to
userspace.

Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
device replace could fail due to another running scrub process or any
other errors btrfs_scrub_dev() may hit, but this failure doesn't get
returned to userspace.

The following steps could reproduce this issue

	mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
	mount /dev/sdb1 /mnt/btrfs
	while true; do btrfs scrub start -B /mnt/btrfs &gt;/dev/null 2&gt;&amp;1; done &amp;
	btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
	# if this replace succeeded, do the following and repeat until
	# you see this log in dmesg
	# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
	#btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

	# once you see the error log in dmesg, check return value of
	# replace
	echo $?

Introduce a new dev replace result

BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS

to catch -EINPROGRESS explicitly and return other errors directly to
userspace.

Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: make the logic of source device removing more clear</title>
<updated>2014-09-17T20:38:46+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2014-09-03T13:35:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82372bc816d75722c24d1abadb11cd8c0a33883a'/>
<id>82372bc816d75722c24d1abadb11cd8c0a33883a</id>
<content type='text'>
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: fix use-after-free problem of the device during device replace</title>
<updated>2014-09-17T20:38:44+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2014-09-03T13:35:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=67a2c45ee7f4f250458279a2e1244679c5d9735c'/>
<id>67a2c45ee7f4f250458279a2e1244679c5d9735c</id>
<content type='text'>
The problem is:
	Task0(device scan task)		Task1(device replace task)
	scan_one_device()
	mutex_lock(&amp;uuid_mutex)
	device = find_device()
					mutex_lock(&amp;device_list_mutex)
					lock_chunk()
					rm_and_free_source_device
					unlock_chunk()
					mutex_unlock(&amp;device_list_mutex)
	check device

Destroying the target device if device replace fails also has the same problem.

We fix this problem by locking uuid_mutex during destroying source device or
target device, just like the device remove operation.

It is a temporary solution, we can fix this problem and make the code more
clear by atomic counter in the future.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The problem is:
	Task0(device scan task)		Task1(device replace task)
	scan_one_device()
	mutex_lock(&amp;uuid_mutex)
	device = find_device()
					mutex_lock(&amp;device_list_mutex)
					lock_chunk()
					rm_and_free_source_device
					unlock_chunk()
					mutex_unlock(&amp;device_list_mutex)
	check device

Destroying the target device if device replace fails also has the same problem.

We fix this problem by locking uuid_mutex during destroying source device or
target device, just like the device remove operation.

It is a temporary solution, we can fix this problem and make the code more
clear by atomic counter in the future.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: Fix misuse of chunk mutex</title>
<updated>2014-09-17T20:38:42+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2014-09-03T13:35:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2196d6e8a71fc901e31c1d81581fc6cc6c64913e'/>
<id>2196d6e8a71fc901e31c1d81581fc6cc6c64913e</id>
<content type='text'>
There were several problems about chunk mutex usage:
- Lock chunk mutex when updating metadata. It would cause the nested
  deadlock because updating metadata might need allocate new chunks
  that need acquire chunk mutex. We remove chunk mutex at this case,
  because b-tree lock and other lock mechanism can help us.
- ABBA deadlock occured between device_list_mutex and chunk_mutex.
  When we update device status, we must acquire device_list_mutex at the
  beginning, and then we might get chunk_mutex during the device status
  update because we need allocate new chunks for metadata COW. But at
  most place, we acquire chunk_mutex at first and then acquire device list
  mutex. We need change the lock order.
- Some place we needn't acquire chunk_mutex. For example we needn't get
  chunk_mutex when we free a empty seed fs_devices structure.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There were several problems about chunk mutex usage:
- Lock chunk mutex when updating metadata. It would cause the nested
  deadlock because updating metadata might need allocate new chunks
  that need acquire chunk mutex. We remove chunk mutex at this case,
  because b-tree lock and other lock mechanism can help us.
- ABBA deadlock occured between device_list_mutex and chunk_mutex.
  When we update device status, we must acquire device_list_mutex at the
  beginning, and then we might get chunk_mutex during the device status
  update because we need allocate new chunks for metadata COW. But at
  most place, we acquire chunk_mutex at first and then acquire device list
  mutex. We need change the lock order.
- Some place we needn't acquire chunk_mutex. For example we needn't get
  chunk_mutex when we free a empty seed fs_devices structure.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
