linux-toradex.git/drivers/md, branch v4.1.10

md: flush ->event_work before stopping array.

2015-09-29T17:26:14+00:00

commit ee5d004fd0591536a061451eba2b187092e9127c upstream.

The 'event_work' worker used by dm-raid may still be running
when the array is stopped.  This can result in an oops.

So flush the workqueue on which it is run after detaching
and before destroying the device.

Reported-by: Heinz Mauelshagen 
Signed-off-by: NeilBrown 
Fixes: 9d09e663d550 ("dm: raid456 basic support")
Signed-off-by: Greg Kroah-Hartman

md/raid10: always set reshape_safe when initializing reshape_position.

2015-09-29T17:26:13+00:00

commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream.

'reshape_position' tracks where in the reshape we have reached.
'reshape_safe' tracks where in the reshape we have safely recorded
in the metadata.

These are compared to determine when to update the metadata.
So it is important that reshape_safe is initialised properly.
Currently it isn't.  When starting a reshape from the beginning
it usually has the correct value by luck.  But when reducing the
number of devices in a RAID10, it has the wrong value and this leads
to the metadata not being updated correctly.
This can lead to corruption if the reshape is not allowed to complete.

This patch is suitable for any -stable kernel which supports RAID10
reshape, which is 3.5 and later.

Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md/raid5: don't let shrink_slab shrink too far.

2015-09-29T17:26:13+00:00

commit 49895bcc7e566ba455eb2996607d6fbd3447ce16 upstream.

I have a report of drop_one_stripe() called from
raid5_cache_scan() apparently finding ->max_nr_stripes == 0.

This should not be allowed.

So add a test to keep max_nr_stripes above min_nr_stripes.

Also use a 'mask' rather than a 'mod' in drop_one_stripe
to ensure 'hash' is valid even if max_nr_stripes does reach zero.


Fixes: edbe83ab4c27 ("md/raid5: allow the stripe_cache to grow and shrink.")
Reported-by: Tomas Papan 
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md/raid5: avoid races when changing cache size.

2015-09-29T17:26:13+00:00

commit 2d5b569b665ea6d0b15c52529ff06300de81a7ce upstream.

Cache size can grow or shrink due to various pressures at
any time.  So when we resize the cache as part of a 'grow'
operation (i.e. change the size to allow more devices) we need
to blocks that automatic growing/shrinking.

So introduce a mutex.  auto grow/shrink uses mutex_trylock()
and just doesn't bother if there is a blockage.
Resizing the whole cache holds the mutex to ensure that
the correct number of new stripes is allocated.

This bug can result in some stripes not being freed when an
array is stopped.  This leads to the kmem_cache not being
freed and a subsequent array can try to use the same kmem_cache
and get confused.

Fixes: edbe83ab4c27 ("md/raid5: allow the stripe_cache to grow and shrink.")
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

dm thin metadata: delete btrees when releasing metadata snapshot

2015-09-13T16:07:40+00:00

commit 7f518ad0a212e2a6fd68630e176af1de395070a7 upstream.

The device details and mapping trees were just being decremented
before.  Now btree_del() is called to do a deep delete.

Signed-off-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Signed-off-by: Greg Kroah-Hartman

dm: fix dm_merge_bvec regression on 32 bit systems

2015-08-17T03:52:23+00:00

commit bd4aaf8f9b85d6b2df3231fd62b219ebb75d3568 upstream.

A DM regression on 32 bit systems was reported against v4.2-rc3 here:
https://lkml.org/lkml/2015/7/29/401

Fix this by reverting both commit 1c220c69 ("dm: fix casting bug in
dm_merge_bvec()") and 148e51ba ("dm: improve documentation and code
clarity in dm_merge_bvec").  This combined revert is done to eliminate
the possibility of a partial revert in stable@ kernels.

In hindsight the correct fix, at the time 1c220c69 was applied to fix
the regression that 148e51ba introduced, should've been to simply revert
148e51ba.

Reported-by: Josh Boyer 
Tested-by: Adam Williamson 
Acked-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Signed-off-by: Greg Kroah-Hartman

md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies

2015-08-17T03:52:23+00:00

commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the ->degaded count.
raid1_spare_active updates the In_sync bit before the ->degraded count
and so exposes an inconsistency, as does error()
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.

This should probably be part of
  Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
  last working device'.")
as it addresses the same issue.  It fixes the same bug and should go
to -stable for same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md: use kzalloc() when bitmap is disabled

2015-08-17T03:52:13+00:00

commit b6878d9e03043695dbf3fa1caa6dfc09db225b16 upstream.

In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
mdu_bitmap_file_t called "file".

5769         file = kmalloc(sizeof(*file), GFP_NOIO);
5770         if (!file)
5771                 return -ENOMEM;

This structure is copied to user space at the end of the function.

5786         if (err == 0 &&
5787             copy_to_user(arg, file, sizeof(*file)))
5788                 err = -EFAULT

But if bitmap is disabled only the first byte of "file" is initialized
with zero, so it's possible to read some bytes (up to 4095) of kernel
space memory from user space. This is an information leak.

5775         /* bitmap disabled, zero the first byte and copy out */
5776         if (!mddev->bitmap_info.file)
5777                 file->pathname[0] = '\0';

Signed-off-by: Benjamin Randazzo 
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

md/raid1: fix test for 'was read error from last working device'.

2015-08-10T19:21:55+00:00

commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If ->degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas 
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown 
Signed-off-by: Greg Kroah-Hartman

Revert "dm: only run the queue on completion if congested or no requests pending"

2015-08-10T19:21:54+00:00

commit 621739b00e16ca2d80411dc9b111cb15b91f3ba9 upstream.

This reverts commit 9a0e609e3fd8a95c96629b9fbde6b8c5b9a1456a.
(Resolved a conflict during revert due to commit bfebd1cdb4 that came
after)

This revert is motivated by a couple failure reports on request-based DM
multipath testbeds:
1) Netapp reported that their multipath fault injection test under heavy
   IO load can stall longer than 300 seconds.
2) IBM reported elevated lock contention in their testbed (likely due to
   increased back pressure due to IO not being dispatched as quickly):
   https://www.redhat.com/archives/dm-devel/2015-July/msg00057.html

Signed-off-by: Mike Snitzer 
Signed-off-by: Greg Kroah-Hartman