linux-toradex.git/drivers/md/md-cluster.c, branch v4.10

md-cluster: make resync lock also could be interruptted

2016-09-21T16:09:44+00:00

When one node is performing resync or recovery, the other
nodes can't get the resync lock and could block for a while
before they hold it, so we can't stop the array immediately
in this scenario.

To allow the array to be stopped quickly, we check MD_CLOSING
in dlm_lock_sync_interruptible so that the lock request can
be interrupted.
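
The check can be pictured with a small userspace model. This is only
a sketch, not the kernel code: struct lock_req, struct waiter_state
and both helpers are hypothetical names, the -EPERM return value is an
assumption of the sketch, and the real function sleeps on a wait queue
and cancels the pending DLM request, which the model merely records in
a flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-lock state kept by md-cluster. */
struct lock_req {
	bool granted;    /* set by the lock-granted callback */
	bool cancelled;  /* records that the pending request was cancelled */
	int status;      /* result reported by the lock manager */
};

/* Hypothetical stand-in for the state the waiter checks. */
struct waiter_state {
	bool thread_should_stop;  /* the waiting thread is being stopped */
	bool array_closing;       /* models the MD_CLOSING flag */
};

/* Condition re-evaluated each time the waiter is woken up. */
bool wait_finished(const struct lock_req *req,
		   const struct waiter_state *ws)
{
	return req->granted || ws->thread_should_stop || ws->array_closing;
}

/*
 * Called once wait_finished() is true: either hand back the lock
 * status, or cancel the request and report the interruption.
 */
int finish_lock_request(struct lock_req *req)
{
	if (!req->granted) {
		req->cancelled = true;
		return -EPERM;  /* assumed error code for "interrupted" */
	}
	req->granted = false;  /* re-arm for the next request */
	return req->status;
}
```

With array_closing set and the lock not yet granted, wait_finished()
becomes true and finish_lock_request() reports the interruption
instead of leaving the caller blocked.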

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang

2016-09-21T16:09:44+00:00

When a node leaves the cluster, its bitmap needs to be
synced by another node, so the "md*_recover" thread is triggered
for that purpose. However, with the steps below, we can see
hung tasks on either B or C.

1. Node A creates a resyncing cluster raid1, which is then
   assembled on the other two nodes (B and C).
2. Stop the array on B and C.
3. Stop the array on A.

linux44:~ # ps aux|grep md|grep D
root	5938	0.0  0.1  19852  1964 pts/0    D+   14:52   0:00 mdadm -S md0
root	5939	0.0  0.0      0     0 ?        D    14:52   0:00 [md0_recover]

linux44:~ # cat /proc/5939/stack
[] dlm_lock_sync+0x71/0x90 [md_cluster]
[] recover_bitmaps+0x125/0x220 [md_cluster]
[] md_thread+0x16d/0x180 [md_mod]
[] kthread+0xb4/0xc0
[] ret_from_fork+0x58/0x90

linux44:~ # cat /proc/5938/stack
[] kthread_stop+0x6e/0x120
[] md_unregister_thread+0x40/0x80 [md_mod]
[] leave+0x70/0x120 [md_cluster]
[] md_cluster_stop+0x14/0x30 [md_mod]
[] bitmap_free+0x14b/0x150 [md_mod]
[] do_md_stop+0x35b/0x5a0 [md_mod]
[] md_ioctl+0x873/0x1590 [md_mod]
[] blkdev_ioctl+0x214/0x7d0
[] block_ioctl+0x3d/0x40
[] do_vfs_ioctl+0x2d4/0x4b0
[] SyS_ioctl+0x88/0xa0
[] system_call_fastpath+0x16/0x1b

The problem is that recover_bitmaps can't reliably abort
when the thread is unregistered. So dlm_lock_sync_interruptible
is introduced to detect the thread's state and fix the problem.

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: convert the completion to wait queue

2016-09-21T16:09:44+00:00

Previously, we used a completion to synchronize between requesting
the dlm lock and sync_ast. However, we would have to expose
completion.wait and completion.done in dlm_lock_sync_interruptible
(introduced later), which is not a common usage of a completion,
so convert the related code to a wait queue.

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: protect md_find_rdev_nr_rcu with rcu lock

2016-09-21T16:09:44+00:00

We need to use rcu_read_lock/unlock to avoid a potential
race.

Reported-by: Shaohua Li 
Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: remove some unnecessary dlm_unlock_sync

2016-09-21T16:09:44+00:00

Since DLM_LKF_FORCEUNLOCK is used in lockres_free,
we don't need to call dlm_unlock_sync before freeing
the lock resource.

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: use FORCEUNLOCK in lockres_free

2016-09-21T16:09:44+00:00

For dlm_unlock, we need to pass the flag to dlm_unlock as the
third parameter instead of setting res->flags.

Also, DLM_LKF_FORCEUNLOCK is more suitable for dlm_unlock
since it works even when the lock is on the waiting or convert
queue.

Acked-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: fix error return code in join()

2016-08-24T17:21:51+00:00

Fix to return error code -ENOMEM from the lockres_init() error
handling case instead of 0, as done elsewhere in this function.
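
The pattern behind the fix is the usual one for functions that share
a single error label: the return variable has to be set again before
each jump. A minimal userspace sketch, where join_sketch and alloc_res
are hypothetical stand-ins for join() and lockres_init():

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for lockres_init(): returns NULL on failure. */
void *alloc_res(bool fail)
{
	return fail ? NULL : malloc(1);
}

/* Hypothetical stand-in for join(): two allocations, one error label. */
int join_sketch(bool fail_first, bool fail_second)
{
	void *first = NULL, *second = NULL;
	int ret = -ENOMEM;

	first = alloc_res(fail_first);
	if (!first)
		goto err;

	ret = 0;  /* an intermediate step succeeded and overwrote ret */

	second = alloc_res(fail_second);
	if (!second) {
		ret = -ENOMEM;  /* without this line the caller would see 0 */
		goto err;
	}

	free(second);
	free(first);
	return 0;
err:
	free(second);  /* free(NULL) is a no-op */
	free(first);
	return ret;
}
```

If the assignment before the second goto is dropped, a failed second
allocation returns 0 and the caller treats the join as successful.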

Signed-off-by: Wei Yongjun 
Signed-off-by: Shaohua Li

md-cluster: check the return value of process_recvd_msg

2016-05-09T16:24:04+00:00

We don't need to run the full path of recv_daemon
if process_recvd_msg doesn't return 0.
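
A minimal userspace sketch of such an early exit. All names are
hypothetical stand-ins (handle_msg for process_recvd_msg, recv_once
for one pass of recv_daemon), and the sketch assumes that some common
cleanup still has to run on the failure path:

```c
#include <errno.h>

struct recv_trace {
	int acks_done;       /* acknowledgement steps that actually ran */
	int cleanups_done;   /* cleanup that runs on every path */
};

/* Hypothetical stand-in for process_recvd_msg(): rejects negative types. */
int handle_msg(int msg_type)
{
	return msg_type < 0 ? -EINVAL : 0;
}

/* Hypothetical stand-in for one pass of recv_daemon(). */
int recv_once(int msg_type, struct recv_trace *t)
{
	int ret = handle_msg(msg_type);

	if (ret)
		goto out;  /* skip the acknowledgement steps on failure */

	t->acks_done++;
out:
	t->cleanups_done++;  /* common cleanup for both outcomes */
	return ret;
}
```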

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: gather resync infos and enable recv_thread after bitmap is ready

2016-05-09T16:24:03+00:00

The in-memory bitmap is not ready when a node joins the cluster,
so it doesn't make sense to call gather_all_resync_info()
that early; we need to call it after the node's bitmap is
set up. Also, recv_thread could be woken up after the node
joins the cluster, but this could cause a problem if the node
receives a RESYNCING message without a personality, since
mddev->pers->quiesce is called in process_suspend_info.

This commit introduces a new cluster interface, load_bitmaps,
to fix the above problems. load_bitmaps is called in bitmap_load,
where the bitmap and personality are ready, and it does the
following tasks:

1. Call gather_all_resync_info to load the bitmap info of
   all the nodes.
2. Set the MD_CLUSTER_ALREADY_IN_CLUSTER bit so that recv_thread
   can be woken up, and wake up recv_thread if there is a
   pending recv event.

Then ack_bast only wakes up recv_thread once the IN_CLUSTER
bit is set; otherwise MD_CLUSTER_PENDING_RESYNC_EVENT is
set.
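
The gating described above can be sketched as a small userspace
model. The flag names, struct node_state and both functions are
hypothetical and only mirror the roles in the text; the real code
uses kernel bit operations and wakes an md thread rather than
bumping a counter.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the cluster state bits. */
#define IN_CLUSTER     0x1u  /* bitmap and personality are ready */
#define PENDING_EVENT  0x2u  /* a message arrived before we were ready */

struct node_state {
	unsigned int flags;
	int recv_wakeups;      /* counts wake-ups of the receive thread */
	bool resync_gathered;  /* models gather_all_resync_info() having run */
};

/* Models ack_bast: wake the receiver only once the node is ready. */
void on_message_notify(struct node_state *n)
{
	if (n->flags & IN_CLUSTER)
		n->recv_wakeups++;
	else
		n->flags |= PENDING_EVENT;  /* remember it for later */
}

/* Models load_bitmaps: gather resync info, then open the gate. */
void load_bitmaps_sketch(struct node_state *n)
{
	n->resync_gathered = true;
	n->flags |= IN_CLUSTER;
	if (n->flags & PENDING_EVENT) {
		n->flags &= ~PENDING_EVENT;
		n->recv_wakeups++;  /* deliver the deferred wake-up */
	}
}
```

A notification that arrives before load_bitmaps_sketch() is only
recorded; the wake-up is delivered once the node is ready, and later
notifications wake the receiver directly.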

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li

md-cluster: sync bitmap when node received RESYNCING msg

2016-05-04T19:39:35+00:00

If a node receives a RESYNCING message, another node is
going to resync that area, so we don't want to resync it
again on this node.

Let's set RESYNC_MASK and clear NEEDED_MASK for the
region from old-low to new-low, which has finished
syncing, and for the region from old-hi to new-hi, which
is about to be synced. bitmap_sync_with_cluster is
introduced for this purpose.

Reviewed-by: NeilBrown 
Signed-off-by: Guoqing Jiang 
Signed-off-by: Shaohua Li