From 66099bb0ee6c20f91ace3fa5f82202fbceb67d8e Mon Sep 17 00:00:00 2001
From: Guoqing Jiang
Date: Fri, 10 Jul 2015 17:01:15 +0800
Subject: md-cluster: fix deadlock issue on message lock

There is a problem with the previous communication mechanism: we hit the
deadlock scenario below on a cluster with 3 nodes.

    Sender                      Receiver                  Receiver

    token(EX)
    message(EX)
    writes message
    downconverts message(CR)
    requests ack(EX)
                                gets message(CR)          gets message(CR)
                                reads message             reads message
                                requests EX on message    requests EX on message

To fix this problem, we make the following changes:

1. The sender down-converts MESSAGE to CW rather than CR.
2. The receiver requests a PR lock, not an EX lock, on MESSAGE.

Also, if we fail to down-convert MESSAGE from EX to CW, it is better to
unlock MESSAGE rather than keep holding the lock.

Reviewed-by: Goldwyn Rodrigues
Signed-off-by: Lidong Zhong
Signed-off-by: Guoqing Jiang
Signed-off-by: NeilBrown
---
 Documentation/md-cluster.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/md-cluster.txt b/Documentation/md-cluster.txt
index de1af7db3355..1b794369e03a 100644
--- a/Documentation/md-cluster.txt
+++ b/Documentation/md-cluster.txt
@@ -91,7 +91,7 @@ The algorithm is:
     this message inappropriate or redundant.
 
  3. sender write LVB.
-    sender down-convert MESSAGE from EX to CR
+    sender down-convert MESSAGE from EX to CW
     sender try to get EX of ACK
     [ wait until all receiver has *processed* the MESSAGE ]
 
@@ -112,7 +112,7 @@ The algorithm is:
     sender down-convert ACK from EX to CR
     sender release MESSAGE
     sender release TOKEN
-                               receiver upconvert to EX of MESSAGE
+                               receiver upconvert to PR of MESSAGE
                                receiver get CR of ACK
                                receiver release MESSAGE
 
--
cgit v1.2.3
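
The deadlock and the fix can be checked against the usual DLM lock-mode
compatibility rules. The sketch below is illustrative only and is not code
from md-cluster: the node names (sender, recv1, recv2) and the helper
blocked_by() are hypothetical, and the table is the conventional six-mode
DLM compatibility matrix, assumed to apply here. It models only the MESSAGE
resource and shows one reading of why the change helps: two receivers
holding CR block each other's conversion to EX, whereas a conversion to PR
waits only on the sender's CW.

```python
# COMPATIBLE[held] is the set of modes another node may be granted while
# `held` is in place (conventional DLM compatibility matrix, assumed here).
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


def blocked_by(node, new_mode, held):
    """Return the other nodes whose held mode conflicts with `new_mode`.

    `held` maps a (hypothetical) node name to the mode it currently holds
    on MESSAGE. An empty result means the conversion can be granted.
    """
    return sorted(
        other
        for other, mode in held.items()
        if other != node and new_mode not in COMPATIBLE[mode]
    )


# Before the patch: sender has down-converted MESSAGE to CR, and both
# receivers hold CR and ask for EX. Each receiver waits on the other.
old = {"sender": "CR", "recv1": "CR", "recv2": "CR"}
print(blocked_by("recv1", "EX", old))
print(blocked_by("recv2", "EX", old))

# After the patch: sender holds CW, receivers hold CR and ask for PR.
# Only the sender's CW stands in the way.
new = {"sender": "CW", "recv1": "CR", "recv2": "CR"}
print(blocked_by("recv1", "PR", new))

# Once the sender releases MESSAGE, both receivers can hold PR together.
released = {"recv1": "PR", "recv2": "CR"}
print(blocked_by("recv2", "PR", released))
```

In the old scheme each receiver's EX request is blocked by the other
receiver's CR, so neither conversion can complete even after the sender
lets go of MESSAGE. With CW and PR, a receiver's request conflicts only with
the sender's CW, and PR is compatible with the other receiver's CR or PR.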