<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/md/multipath.c, branch v4.4.93</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>md: multipath: don't hardcopy bio in .make_request path</title>
<updated>2016-04-12T16:08:57+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2016-03-12T01:29:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2775a60447ae12350711b443be1083117b6f13b8'/>
<id>2775a60447ae12350711b443be1083117b6f13b8</id>
<content type='text'>
commit fafcde3ac1a418688a734365203a12483b83907a upstream.

Inside multipath_make_request(), multipath maps the incoming
bio into low level device's bio, but it is totally wrong to
copy the bio into mapped bio via '*mapped_bio = *bio'. For
example, .__bi_remaining is kept in the copy, especially if
the incoming bio is chained to via bio splitting, so .bi_end_io
can't be called for the mapped bio at all in the completing path
in this kind of situation.

This patch fixes the issue by using clone style.

Reported-and-tested-by: Andrea Righi &lt;righi.andrea@gmail.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fafcde3ac1a418688a734365203a12483b83907a upstream.

Inside multipath_make_request(), multipath maps the incoming
bio into low level device's bio, but it is totally wrong to
copy the bio into mapped bio via '*mapped_bio = *bio'. For
example, .__bi_remaining is kept in the copy, especially if
the incoming bio is chained to via bio splitting, so .bi_end_io
can't be called for the mapped bio at all in the completing path
in this kind of situation.

This patch fixes the issue by using clone style.

Reported-and-tested-by: Andrea Righi &lt;righi.andrea@gmail.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid: only permit hot-add of compatible integrity profiles</title>
<updated>2016-02-17T20:30:57+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-01-14T00:00:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2c4b95da2335088ba85dcdfa39d374491d76fbba'/>
<id>2c4b95da2335088ba85dcdfa39d374491d76fbba</id>
<content type='text'>
commit 1501efadc524a0c99494b576923091589a52d2a4 upstream.

It is not safe for an integrity profile to be changed while i/o is
in-flight in the queue.  Prevent adding new disks or otherwise online
spares to an array if the device has an incompatible integrity profile.

The original change to the blk_integrity_unregister implementation in
md, commmit c7bfced9a671 "md: suspend i/o during runtime
blk_integrity_unregister" introduced an immediate hang regression.

This policy of disallowing changes the integrity profile once one has
been established is shared with DM.

Here is an abbreviated log from a test run that:
1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]

[   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
[   59.078302] md: data integrity enabled on md0
[..]
[   90.489209] md0: incompatible integrity profile for pmem1m
[..]
[  205.671277] md: super_written gets error=-5
[  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
[  205.677386] md/raid1:md0: Operation continuing on 1 devices.
[  205.683037] RAID1 conf printout:
[  205.684699]  --- wd:1 rd:2
[  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
[  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
[  205.691717] md: recovery of RAID array md0

Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reported-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 1501efadc524a0c99494b576923091589a52d2a4 upstream.

It is not safe for an integrity profile to be changed while i/o is
in-flight in the queue.  Prevent adding new disks or otherwise online
spares to an array if the device has an incompatible integrity profile.

The original change to the blk_integrity_unregister implementation in
md, commmit c7bfced9a671 "md: suspend i/o during runtime
blk_integrity_unregister" introduced an immediate hang regression.

This policy of disallowing changes the integrity profile once one has
been established is shared with DM.

Here is an abbreviated log from a test run that:
1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]

[   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
[   59.078302] md: data integrity enabled on md0
[..]
[   90.489209] md0: incompatible integrity profile for pmem1m
[..]
[  205.671277] md: super_written gets error=-5
[  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
[  205.677386] md/raid1:md0: Operation continuing on 1 devices.
[  205.683037] RAID1 conf printout:
[  205.684699]  --- wd:1 rd:2
[  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
[  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
[  205.691717] md: recovery of RAID array md0

Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reported-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>md: suspend i/o during runtime blk_integrity_unregister</title>
<updated>2015-10-21T20:43:38+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2015-10-21T17:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c7bfced9a6716ff66c9d61f934bb60af08d4688c'/>
<id>c7bfced9a6716ff66c9d61f934bb60af08d4688c</id>
<content type='text'>
Synchronize pending i/o against a change in the integrity profile to
avoid the possibility of spurious integrity errors.  Given linear_add()
is suspending the mddev before manipulating the mddev, do the same for
the other personalities.

Acked-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Synchronize pending i/o against a change in the integrity profile to
avoid the possibility of spurious integrity errors.  Given linear_add()
is suspending the mddev before manipulating the mddev, do the same for
the other personalities.

Acked-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: drop null test before destroy functions</title>
<updated>2015-10-02T07:23:44+00:00</updated>
<author>
<name>Julia Lawall</name>
<email>Julia.Lawall@lip6.fr</email>
</author>
<published>2015-09-13T12:15:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=644df1a85fc4b0c7a16800f55717261546f4e651'/>
<id>644df1a85fc4b0c7a16800f55717261546f4e651</id>
<content type='text'>
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// &lt;smpl&gt;
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// &lt;/smpl&gt;

Signed-off-by: Julia Lawall &lt;Julia.Lawall@lip6.fr&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// &lt;smpl&gt;
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// &lt;/smpl&gt;

Signed-off-by: Julia Lawall &lt;Julia.Lawall@lip6.fr&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: kill merge_bvec_fn() completely</title>
<updated>2015-08-13T18:31:57+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@gmail.com</email>
</author>
<published>2015-04-28T06:48:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8ae126660fddbeebb9251a174e6fa45b6ad8f932'/>
<id>8ae126660fddbeebb9251a174e6fa45b6ad8f932</id>
<content type='text'>
As generic_make_request() is now able to handle arbitrarily sized bios,
it's no longer necessary for each individual block driver to define its
own -&gt;merge_bvec_fn() callback. Remove every invocation completely.

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Lars Ellenberg &lt;drbd-dev@lists.linbit.com&gt;
Cc: drbd-user@lists.linbit.com
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Yehuda Sadeh &lt;yehuda@inktank.com&gt;
Cc: Sage Weil &lt;sage@inktank.com&gt;
Cc: Alex Elder &lt;elder@kernel.org&gt;
Cc: ceph-devel@vger.kernel.org
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: dm-devel@redhat.com
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: linux-raid@vger.kernel.org
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: "Martin K. Petersen" &lt;martin.petersen@oracle.com&gt;
Acked-by: NeilBrown &lt;neilb@suse.de&gt; (for the 'md' bits)
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
[dpark: also remove -&gt;merge_bvec_fn() in dm-thin as well as
 dm-era-target, and resolve merge conflicts]
Signed-off-by: Dongsu Park &lt;dpark@posteo.net&gt;
Signed-off-by: Ming Lin &lt;ming.l@ssi.samsung.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As generic_make_request() is now able to handle arbitrarily sized bios,
it's no longer necessary for each individual block driver to define its
own -&gt;merge_bvec_fn() callback. Remove every invocation completely.

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Lars Ellenberg &lt;drbd-dev@lists.linbit.com&gt;
Cc: drbd-user@lists.linbit.com
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Yehuda Sadeh &lt;yehuda@inktank.com&gt;
Cc: Sage Weil &lt;sage@inktank.com&gt;
Cc: Alex Elder &lt;elder@kernel.org&gt;
Cc: ceph-devel@vger.kernel.org
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: dm-devel@redhat.com
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: linux-raid@vger.kernel.org
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: "Martin K. Petersen" &lt;martin.petersen@oracle.com&gt;
Acked-by: NeilBrown &lt;neilb@suse.de&gt; (for the 'md' bits)
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
[dpark: also remove -&gt;merge_bvec_fn() in dm-thin as well as
 dm-era-target, and resolve merge conflicts]
Signed-off-by: Dongsu Park &lt;dpark@posteo.net&gt;
Signed-off-by: Ming Lin &lt;ming.l@ssi.samsung.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: add a bi_error field to struct bio</title>
<updated>2015-07-29T14:55:15+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-07-20T13:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4246a0b63bd8f56a1469b12eafeb875b1041a451'/>
<id>4246a0b63bd8f56a1469b12eafeb875b1041a451</id>
<content type='text'>
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: rename -&gt;stop to -&gt;free</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=afa0f557cb15176570a18fb2a093e348a793afd4'/>
<id>afa0f557cb15176570a18fb2a093e348a793afd4</id>
<content type='text'>
Now that the -&gt;stop function only frees the private data,
rename is accordingly.

Also pass in the private pointer as an arg rather than using
mddev-&gt;private.  This flexibility will be useful in level_store().

Finally, don't clear -&gt;private.  It doesn't make sense to clear
it seeing that isn't what we free, and it is no longer necessary
to clear -&gt;private (it was some time ago before  -&gt;to_remove was
introduced).

Setting -&gt;to_remove in -&gt;free() is a bit of a wart, but not a
big problem at the moment.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the -&gt;stop function only frees the private data,
rename is accordingly.

Also pass in the private pointer as an arg rather than using
mddev-&gt;private.  This flexibility will be useful in level_store().

Finally, don't clear -&gt;private.  It doesn't make sense to clear
it seeing that isn't what we free, and it is no longer necessary
to clear -&gt;private (it was some time ago before  -&gt;to_remove was
introduced).

Setting -&gt;to_remove in -&gt;free() is a bit of a wart, but not a
big problem at the moment.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: split detach operation out from -&gt;stop.</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5aa61f427e4979be733e4847b9199ff9cc48a47e'/>
<id>5aa61f427e4979be733e4847b9199ff9cc48a47e</id>
<content type='text'>
Each md personality has a 'stop' operation which does two
things:
 1/ it finalizes some aspects of the array to ensure nothing
    is accessing the -&gt;private data
 2/ it frees the -&gt;private data.

All the steps in '1' can apply to all arrays and so can be
performed in common code.

This is useful as in the case where we change the personality which
manages an array (in level_store()), it would be helpful to do
step 1 early, and step 2 later.

So split the 'step 1' functionality out into a new mddev_detach().

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each md personality has a 'stop' operation which does two
things:
 1/ it finalizes some aspects of the array to ensure nothing
    is accessing the -&gt;private data
 2/ it frees the -&gt;private data.

All the steps in '1' can apply to all arrays and so can be
performed in common code.

This is useful as in the case where we change the personality which
manages an array (in level_store()), it would be helpful to do
step 1 early, and step 2 later.

So split the 'step 1' functionality out into a new mddev_detach().

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: make -&gt;congested robust against personality changes.</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5c675f83c68fbdf9c0e103c1090b06be747fa62c'/>
<id>5c675f83c68fbdf9c0e103c1090b06be747fa62c</id>
<content type='text'>
There is currently no locking around calls to the 'congested'
bdi function.  If called at an awkward time while an array is
being converted from one level (or personality) to another, there
is a tiny chance of running code in an unreferenced module etc.

So add a 'congested' function to the md_personality operations
structure, and call it with appropriate locking from a central
'mddev_congested'.

When the array personality is changing the array will be 'suspended'
so no IO is processed.
If mddev_congested detects this, it simply reports that the
array is congested, which is a safe guess.
As mddev_suspend calls synchronize_rcu(), mddev_congested can
avoid races by included the whole call inside an rcu_read_lock()
region.
This require that the congested functions for all subordinate devices
can be run under rcu_lock.  Fortunately this is the case.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is currently no locking around calls to the 'congested'
bdi function.  If called at an awkward time while an array is
being converted from one level (or personality) to another, there
is a tiny chance of running code in an unreferenced module etc.

So add a 'congested' function to the md_personality operations
structure, and call it with appropriate locking from a central
'mddev_congested'.

When the array personality is changing the array will be 'suspended'
so no IO is processed.
If mddev_congested detects this, it simply reports that the
array is congested, which is a safe guess.
As mddev_suspend calls synchronize_rcu(), mddev_congested can
avoid races by included the whole call inside an rcu_read_lock()
region.
This require that the congested functions for all subordinate devices
can be run under rcu_lock.  Fortunately this is the case.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: remove unwanted white space from md.c</title>
<updated>2014-10-14T02:08:29+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-09-30T04:23:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f72ffdd68616e3697bc782b21c82197aeb480fd5'/>
<id>f72ffdd68616e3697bc782b21c82197aeb480fd5</id>
<content type='text'>
My editor shows much of this is RED.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
My editor shows much of this is RED.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
