summaryrefslogtreecommitdiff
path: root/drivers/md
diff options
context:
space:
mode:
authorNeilBrown <neilb@suse.com>2017-04-03 12:11:32 +1000
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2018-03-22 09:23:25 +0100
commit2b84883f74b2be66de7f424e1f7d9bdf3b1530b2 (patch)
tree170ad743991967898689e133517162dc24f354db /drivers/md
parent23e4e7b49c65500cf4a4e494ee62076b009d29e6 (diff)
md/raid6: Fix anomily when recovering a single device in RAID6.
[ Upstream commit 7471fb77ce4dc4cb81291189947fcdf621a97987 ] When recoverying a single missing/failed device in a RAID6, those stripes where the Q block is on the missing device are handled a bit differently. In these cases it is easy to check that the P block is correct, so we do. This results in the P block be destroy. Consequently the P block needs to be read a second time in order to compute Q. This causes lots of seeks and hurts performance. It shouldn't be necessary to re-read P as it can be computed from the DATA. But we only compute blocks on missing devices, since c337869d9501 ("md: do not compute parity unless it is on a failed drive"). So relax the change made in that commit to allow computing of the P block in a RAID6 which it is the only missing that block. This makes RAID6 recovery run much faster as the disk just "before" the recovering device is no longer seeking back-and-forth. Reported-by-tested-by: Brad Campbell <lists2009@fnarfbargle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'drivers/md')
-rw-r--r--drivers/md/raid5.c13
1 files changed, 12 insertions, 1 deletions
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 86ab6d14d782..ca968c3f25c7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3372,9 +3372,20 @@ static int fetch_block(struct stripe_head *sh, struct stripe_head_state *s,
BUG_ON(test_bit(R5_Wantcompute, &dev->flags));
BUG_ON(test_bit(R5_Wantread, &dev->flags));
BUG_ON(sh->batch_head);
+
+ /*
+ * In the raid6 case if the only non-uptodate disk is P
+ * then we already trusted P to compute the other failed
+ * drives. It is safe to compute rather than re-read P.
+ * In other cases we only compute blocks from failed
+ * devices, otherwise check/repair might fail to detect
+ * a real inconsistency.
+ */
+
if ((s->uptodate == disks - 1) &&
+ ((sh->qd_idx >= 0 && sh->pd_idx == disk_idx) ||
(s->failed && (disk_idx == s->failed_num[0] ||
- disk_idx == s->failed_num[1]))) {
+ disk_idx == s->failed_num[1])))) {
/* have disk failed, and we're requested to fetch it;
* do compute it
*/