linux-toradex.git/drivers/md/raid1.c, branch v2.6.32.35

md/raid1: really fix recovery looping when single good device fails.

2010-12-09T21:26:46+00:00

commit 8f9e0ee38f75d4740daa9e42c8af628d33d19a02 upstream.

Commit 4044ba58dd15cb01797c4fd034f39ef4a75f7cc3 supposedly fixed a
problem where if a raid1 with just one good device gets a read-error
during recovery, the recovery would abort and immediately restart in
an infinite loop.

However it depended on raid1_remove_disk removing the spare device
from the array.  But that does not happen in this case.  So add a test
so that in the 'recovery_disabled' case, the device will be removed.

This suitable for any kernel since 2.6.29 which is when
recovery_disabled was introduced.

Reported-by: Sebastian Färber 
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md/raid1: delay reads that could overtake behind-writes.

2010-08-13T20:20:25+00:00

commit e555190d82c0f58e825e3cbd9e6ebe2e7ac713bd upstream.

When a raid1 array is configured to support write-behind
on some devices, it normally only reads from other devices.
If all devices are write-behind (because the rest have failed)
it is possible for a read request to be serviced before a
behind-write request, which would appear as data corruption.

So when forced to read from a WriteMostly device, wait for any
write-behind to complete, and don't start any more behind-writes.

Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md: Fix read balancing in RAID1 and RAID10 on drives > 2TB

2010-07-05T18:10:46+00:00

commit af3a2cd6b8a479345786e7fe5e199ad2f6240e56 upstream.

read_balance uses a "unsigned long" for a sector number which
will get truncated beyond 2TB.
This will cause read-balancing to be non-optimal, and can cause
data to be read from the 'wrong' branch during a resync.  This has a
very small chance of returning wrong data.

Reported-by: Jordan Russell 
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md/raid1: fix counting of write targets.

2010-07-05T18:10:45+00:00

commit 964147d5c86d63be79b442c30f3783d49860c078 upstream.

There is a very small race window when writing to a
RAID1 such that if a device is marked faulty at exactly the wrong
time, the write-in-progress will not be sent to the device,
but the bitmap (if present) will be updated to say that
the write was sent.

Then if the device turned out to still be usable as was re-added
to the array, the bitmap-based-resync would skip resyncing that
block, possibly leading to corruption.  This would only be a problem
if no further writes were issued to that area of the device (i.e.
that bitmap chunk).

Suitable for any pending -stable kernel.

Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md: revert incorrect fix for read error handling in raid1.

2009-12-01T06:30:59+00:00

commit 4706b349f was a forward port of a fix that was needed
for SLES10.  But in fact it is not needed in mainline because
the earlier commit dd00a99e7a fixes the same problem in a
better way.
Further, this commit introduces a bug in the way it interacts with
the automatic read-error-correction.  If, after a read error is
successfully corrected, the same disk is chosen to re-read - the
re-read won't be attempted but an error will be returned instead.

After reverting that commit, there is the possibility that a
read error on a read-only array (where read errors cannot
be corrected as that requires a write) will repeatedly read the same
device and continue to get an error.
So in the "Array is readonly" case, fail the drive immediately on
a read error.

Signed-off-by: NeilBrown 
Cc: stable@kernel.org

md: raid1/raid10: handle allocation errors during array setup.

2009-10-16T04:55:44+00:00

Both raid1 and raid10 create a mempool during startup.
If the 'alloc' function for this mempool fails, unplug_slaves
is called.
If that happens when the pool is being initialised, unplug_slaves
will try to use the 'conf' structure that isn't filled in yet, and
badness will happen.

So ensure that unplug_slaves doesn't get called unless we know
that the conf structure if fully initialised.

Signed-off-by: NeilBrown

md/raid1/raid10: add a cond_resched

2009-10-16T04:55:32+00:00

During 'check' of a raid1 or raid10 it is possible for the management
thread to spend a lot of time running 'memcmp' on blocks from
different devices, so make sure the thread has a chance to schedule.
raid5d already has a cond_resched (in process_stripe).

Reported-By: Lee Howard 
Signed-off-by: NeilBrown

md: raid-1/10: fix RW bits manipulation

2009-09-23T08:20:15+00:00

Recently Jens has changed bio_rw_flagged() logic by following
commit 1f98a13f623e0ef666690a18c1250335fc6d7ef1. Now it returns
bool instead of int. This broke raid1/raid10 RW bits manipulation logic.
One of visible result is BUG_ON triggering due to empty barrier
here scsi_lib.c:1108 scsi_setup_fs_cmnd()

Signed-off-by: Dmitry Monakhov 
Signed-off-by: NeilBrown

md: report device as congested when suspended

2009-09-23T08:10:29+00:00

This should writeback from coming when the device is temporarily
suspended.

Signed-off-by: NeilBrown

md: Improve name of threads created by md_register_thread

2009-09-23T08:09:45+00:00

The management thread for raid4,5,6 arrays are all called
mdX_raid5, independent of the actual raid level, which is wrong and
can be confusion.

So change md_register_thread to use the name from the personality
unless no alternate name (like 'resync' or 'reshape') is given.

This is simpler and more correct.

Cc: Jinzc 
Signed-off-by: NeilBrown