<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/md/raid5.c, branch v4.14</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>md/raid5: cap worker count</title>
<updated>2017-09-28T03:08:44+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-21T18:03:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7d5d7b5058fbd638914e42504677141a69f43011'/>
<id>7d5d7b5058fbd638914e42504677141a69f43011</id>
<content type='text'>
static checker reports a potential integer overflow. Cap the worker count to
avoid the overflow.

Reported:-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
static checker reports a potential integer overflow. Cap the worker count to
avoid the overflow.

Reported:-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2017-09-19T15:39:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-09-19T15:39:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=12fcf66e74b16b96e57fc1ce32bdf27b3a426fd0'/>
<id>12fcf66e74b16b96e57fc1ce32bdf27b3a426fd0</id>
<content type='text'>
Pull MD fixes from Shaohua Li:
 "Two small patches to fix long-lived raid5 stripe batch bugs, one from
  Dennis and the other from me"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list
  md/raid5: fix a race condition in stripe batch
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull MD fixes from Shaohua Li:
 "Two small patches to fix long-lived raid5 stripe batch bugs, one from
  Dennis and the other from me"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list
  md/raid5: fix a race condition in stripe batch
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2017-09-07T19:41:48+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-09-07T19:41:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3645e6d0dc80be4376f87acc9ee527768387c909'/>
<id>3645e6d0dc80be4376f87acc9ee527768387c909</id>
<content type='text'>
Pull MD updates from Shaohua Li:
 "This update mainly fixes bugs:

   - Make raid5 ppl support several ppl from Pawel

   - Several raid5-cache bug fixes from Song

   - Bitmap fixes from Neil and Me

   - One raid1/10 regression fix since 4.12 from Me

   - Other small fixes and cleanup"

* tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/bitmap: disable bitmap_resize for file-backed bitmaps.
  raid5-ppl: Recovery support for multiple partial parity logs
  md: Runtime support for multiple ppls
  md/raid0: attach correct cgroup info in bio
  lib/raid6: align AVX512 constants to 512 bits, not bytes
  raid5: remove raid5_build_block
  md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
  md: replace seq_release_private with seq_release
  md: notify about new spare disk in the container
  md/raid1/10: reset bio allocated from mempool
  md/raid5: release/flush io in raid5_do_work()
  md/bitmap: copy correct data for bitmap super
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull MD updates from Shaohua Li:
 "This update mainly fixes bugs:

   - Make raid5 ppl support several ppl from Pawel

   - Several raid5-cache bug fixes from Song

   - Bitmap fixes from Neil and Me

   - One raid1/10 regression fix since 4.12 from Me

   - Other small fixes and cleanup"

* tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/bitmap: disable bitmap_resize for file-backed bitmaps.
  raid5-ppl: Recovery support for multiple partial parity logs
  md: Runtime support for multiple ppls
  md/raid0: attach correct cgroup info in bio
  lib/raid6: align AVX512 constants to 512 bits, not bytes
  raid5: remove raid5_build_block
  md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
  md: replace seq_release_private with seq_release
  md: notify about new spare disk in the container
  md/raid1/10: reset bio allocated from mempool
  md/raid5: release/flush io in raid5_do_work()
  md/bitmap: copy correct data for bitmap super
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list</title>
<updated>2017-09-06T05:51:13+00:00</updated>
<author>
<name>Dennis Yang</name>
<email>dennisyang@qnap.com</email>
</author>
<published>2017-09-06T03:02:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=184a09eb9a2fe425e49c9538f1604b05ed33cfef'/>
<id>184a09eb9a2fe425e49c9538f1604b05ed33cfef</id>
<content type='text'>
In release_stripe_plug(), if a stripe_head has its STRIPE_ON_UNPLUG_LIST
set, it indicates that this stripe_head is already in the raid5_plug_cb
list and release_stripe() would be called instead to drop a reference
count. Otherwise, the STRIPE_ON_UNPLUG_LIST bit would be set for this
stripe_head and it will get queued into the raid5_plug_cb list.

Since break_stripe_batch_list() did not preserve STRIPE_ON_UNPLUG_LIST,
A stripe could be re-added to plug list while it is still on that list
in the following situation. If stripe_head A is added to another
stripe_head B's batch list, in this case A will have its
batch_head != NULL and be added into the plug list. After that,
stripe_head B gets handled and called break_stripe_batch_list() to
reset all the batched stripe_head(including A which is still on
the plug list)'s state and reset their batch_head to NULL.
Before the plug list gets processed, if there is another write request
comes in and get stripe_head A, A will have its batch_head == NULL
(cleared by calling break_stripe_batch_list() on B) and be added to
plug list once again.

Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Cc: stable@vger.kernel.org (v4.1+)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In release_stripe_plug(), if a stripe_head has its STRIPE_ON_UNPLUG_LIST
set, it indicates that this stripe_head is already in the raid5_plug_cb
list and release_stripe() would be called instead to drop a reference
count. Otherwise, the STRIPE_ON_UNPLUG_LIST bit would be set for this
stripe_head and it will get queued into the raid5_plug_cb list.

Since break_stripe_batch_list() did not preserve STRIPE_ON_UNPLUG_LIST,
A stripe could be re-added to plug list while it is still on that list
in the following situation. If stripe_head A is added to another
stripe_head B's batch list, in this case A will have its
batch_head != NULL and be added into the plug list. After that,
stripe_head B gets handled and called break_stripe_batch_list() to
reset all the batched stripe_head(including A which is still on
the plug list)'s state and reset their batch_head to NULL.
Before the plug list gets processed, if there is another write request
comes in and get stripe_head A, A will have its batch_head == NULL
(cleared by calling break_stripe_batch_list() on B) and be added to
plug list once again.

Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Cc: stable@vger.kernel.org (v4.1+)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: fix a race condition in stripe batch</title>
<updated>2017-09-05T17:57:49+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-08-25T17:40:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3664847d95e60a9a943858b7800f8484669740fc'/>
<id>3664847d95e60a9a943858b7800f8484669740fc</id>
<content type='text'>
We have a race condition in below scenario, say have 3 continuous stripes, sh1,
sh2 and sh3, sh1 is the stripe_head of sh2 and sh3:

CPU1				CPU2				CPU3
handle_stripe(sh3)
				stripe_add_to_batch_list(sh3)
				-&gt; lock(sh2, sh3)
				-&gt; lock batch_lock(sh1)
				-&gt; add sh3 to batch_list of sh1
				-&gt; unlock batch_lock(sh1)
								clear_batch_ready(sh1)
								-&gt; lock(sh1) and batch_lock(sh1)
								-&gt; clear STRIPE_BATCH_READY for all stripes in batch_list
								-&gt; unlock(sh1) and batch_lock(sh1)
-&gt;clear_batch_ready(sh3)
--&gt;test_and_clear_bit(STRIPE_BATCH_READY, sh3)
---&gt;return 0 as sh-&gt;batch == NULL
				-&gt; sh3-&gt;batch_head = sh1
				-&gt; unlock (sh2, sh3)

In CPU1, handle_stripe will continue handle sh3 even it's in batch stripe list
of sh1. By moving sh3-&gt;batch_head assignment in to batch_lock, we make it
impossible to clear STRIPE_BATCH_READY before batch_head is set.

Thanks Stephane for helping debug this tricky issue.

Reported-and-tested-by: Stephane Thiell &lt;sthiell@stanford.edu&gt;
Cc: stable@vger.kernel.org (v4.1+)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have a race condition in below scenario, say have 3 continuous stripes, sh1,
sh2 and sh3, sh1 is the stripe_head of sh2 and sh3:

CPU1				CPU2				CPU3
handle_stripe(sh3)
				stripe_add_to_batch_list(sh3)
				-&gt; lock(sh2, sh3)
				-&gt; lock batch_lock(sh1)
				-&gt; add sh3 to batch_list of sh1
				-&gt; unlock batch_lock(sh1)
								clear_batch_ready(sh1)
								-&gt; lock(sh1) and batch_lock(sh1)
								-&gt; clear STRIPE_BATCH_READY for all stripes in batch_list
								-&gt; unlock(sh1) and batch_lock(sh1)
-&gt;clear_batch_ready(sh3)
--&gt;test_and_clear_bit(STRIPE_BATCH_READY, sh3)
---&gt;return 0 as sh-&gt;batch == NULL
				-&gt; sh3-&gt;batch_head = sh1
				-&gt; unlock (sh2, sh3)

In CPU1, handle_stripe will continue handle sh3 even it's in batch stripe list
of sh1. By moving sh3-&gt;batch_head assignment in to batch_lock, we make it
impossible to clear STRIPE_BATCH_READY before batch_head is set.

Thanks Stephane for helping debug this tricky issue.

Reported-and-tested-by: Stephane Thiell &lt;sthiell@stanford.edu&gt;
Cc: stable@vger.kernel.org (v4.1+)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Runtime support for multiple ppls</title>
<updated>2017-08-28T14:45:48+00:00</updated>
<author>
<name>Pawel Baldysiak</name>
<email>pawel.baldysiak@intel.com</email>
</author>
<published>2017-08-16T15:13:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ddc088238cd6988bb4ac3776f403d7ff9d3c7a63'/>
<id>ddc088238cd6988bb4ac3776f403d7ff9d3c7a63</id>
<content type='text'>
Increase PPL area to 1MB and use it as circular buffer to store PPL. The
entry with highest generation number is the latest one. If PPL to be
written is larger then space left in a buffer, rewind the buffer to the
start (don't wrap it).

Signed-off-by: Pawel Baldysiak &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: Artur Paszkiewicz &lt;artur.paszkiewicz@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Increase PPL area to 1MB and use it as circular buffer to store PPL. The
entry with highest generation number is the latest one. If PPL to be
written is larger then space left in a buffer, rewind the buffer to the
start (don't wrap it).

Signed-off-by: Pawel Baldysiak &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: Artur Paszkiewicz &lt;artur.paszkiewicz@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>raid5: remove raid5_build_block</title>
<updated>2017-08-25T17:21:47+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>gqjiang@suse.com</email>
</author>
<published>2017-08-10T08:12:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=27a4ff8f49a9b912be76f36f7e198824cf0aecd9'/>
<id>27a4ff8f49a9b912be76f36f7e198824cf0aecd9</id>
<content type='text'>
Now raid5_build_block is just called to set the
sector of r5dev, raid5_compute_blocknr can be
used directly for the purpose.

Signed-off-by: Guoqing Jiang &lt;gqjiang@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now raid5_build_block is just called to set the
sector of r5dev, raid5_compute_blocknr can be
used directly for the purpose.

Signed-off-by: Guoqing Jiang &lt;gqjiang@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: release/flush io in raid5_do_work()</title>
<updated>2017-08-24T17:06:19+00:00</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2017-08-24T16:53:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9c72a18e46ebe0f09484cce8ebf847abdab58498'/>
<id>9c72a18e46ebe0f09484cce8ebf847abdab58498</id>
<content type='text'>
In raid5, there are scenarios where some ios are deferred to a later
time, and some IO need a flush to complete. To make sure we make
progress with these IOs, we need to call the following functions:

    flush_deferred_bios(conf);
    r5l_flush_stripe_to_raid(conf-&gt;log);

Both of these functions are called in raid5d(), but missing in
raid5_do_work(). As a result, these functions are not called
when multi-threading (group_thread_cnt &gt; 0) is enabled. This patch
adds calls to these function to raid5_do_work().

Note for stable branches:

  r5l_flush_stripe_to_raid(conf-&gt;log) is need for 4.4+
  flush_deferred_bios(conf) is only needed for 4.11+

Cc: stable@vger.kernel.org (4.4+)
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In raid5, there are scenarios where some ios are deferred to a later
time, and some IO need a flush to complete. To make sure we make
progress with these IOs, we need to call the following functions:

    flush_deferred_bios(conf);
    r5l_flush_stripe_to_raid(conf-&gt;log);

Both of these functions are called in raid5d(), but missing in
raid5_do_work(). As a result, these functions are not called
when multi-threading (group_thread_cnt &gt; 0) is enabled. This patch
adds calls to these function to raid5_do_work().

Note for stable branches:

  r5l_flush_stripe_to_raid(conf-&gt;log) is need for 4.4+
  flush_deferred_bios(conf) is only needed for 4.11+

Cc: stable@vger.kernel.org (4.4+)
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: replace bi_bdev with a gendisk pointer and partitions index</title>
<updated>2017-08-23T18:49:55+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-08-23T17:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=74d46992e0d9dee7f1f376de0d56d31614c8a17a'/>
<id>74d46992e0d9dee7f1f376de0d56d31614c8a17a</id>
<content type='text'>
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>raid5: remove a call to get_start_sect</title>
<updated>2017-08-23T18:49:49+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-08-23T17:10:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10433d04b8e647a50feffec72fd3cf40ce42b084'/>
<id>10433d04b8e647a50feffec72fd3cf40ce42b084</id>
<content type='text'>
The block layer always remaps partitions before calling into the
-&gt;make_request methods of drivers.  Thus the call to get_start_sect in
in_chunk_boundary will always return 0 and can be removed.

Reviewed-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The block layer always remaps partitions before calling into the
-&gt;make_request methods of drivers.  Thus the call to get_start_sect in
in_chunk_boundary will always return 0 and can be removed.

Reviewed-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
