summaryrefslogtreecommitdiff
path: root/fs/ocfs2/dlm/dlmmaster.c
AgeCommit message (Collapse)Author
2016-03-15ocfs2: fix a tiny race that leads file system read-onlyJiufei Xue
when o2hb detect a node down, it first set the dead node to recovery map and create ocfs2rec which will replay journal for dead node. o2hb thread then call dlm_do_local_recovery_cleanup() to delete the lock for dead node. After the lock of dead node is gone, locks for other nodes can be granted and may modify the meta data without replaying journal of the dead node. The detail is described as follows. N1 N2 N3(master) modify the extent tree of inode, and commit dirty metadata to journal, then goes down. o2hb thread detects N1 goes down, set recovery map and delete the lock of N1. dlm_thread flush ast for the lock of N2. do not detect the death of N1, so recovery map is empty. read inode from disk without replaying the journal of N1 and modify the extent tree of the inode that N1 had modified. ocfs2rec recover the journal of N1. The modification of N2 is lost. The modification of N1 and N2 are not serial, and it will lead to read-only file system. We can set recovery_waiting flag to the lock resource after delete the lock for dead node to prevent other node from getting the lock before dlm recovery. After dlm recovery, the recovery map on N2 is not empty, ocfs2_inode_lock_full_nested() will wait for ocfs2 recovery. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: return in progress if master can not clear the refmap bit right nowxuejiufei
Master returns in-progress to non-master node when it can not clear the refmap bit right now. And non-master node will not purge the lock resource until receiving deref done message. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: add DEREF_DONE messagexuejiufei
This series of patches is to fix the dis-order issue of setting/clearing refmap bit described below. Node 1 Node 2(master) dlmlock dlm_do_master_request dlm_master_request_handler -> dlm_lockres_set_refmap_bit dlmlock succeed dlmunlock succeed dlm_purge_lockres dlm_deref_handler -> find lock resource is in DLM_LOCK_RES_SETREF_INPROG state, so dispatch a deref work dlm_purge_lockres succeed. call dlmlock again dlm_do_master_request dlm_master_request_handler -> dlm_lockres_set_refmap_bit deref work trigger, call dlm_lockres_clear_refmap_bit to clear Node 1 from refmap dlm_purge_lockres succeed dlm_send_remote_lock_request return DLM_IVLOCKID because the lockres is not exist BUG if the lockres is $RECOVERY This series of patches add a new message to keep the order of set and clear. Other nodes can purge the lock resource only after the refmap bit on master is cleared. This patch is to add DEREF_DONE message and corresponding handler. Node can purge the lock resource after receiving this message. As a new message is added, so increase the minor number of dlm protocol version. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14ocfs2/dlm: do not insert a new mle when another process is already migratingxuejiufei
When two processes are migrating the same lockres, dlm_add_migration_mle() return -EEXIST, but insert a new mle in hash list. dlm_migrate_lockres() will detach the old mle and free the new one which is already in hash list, that will destroy the list. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14ocfs2/dlm: ignore cleaning the migration mle that is inusexuejiufei
We have found that migration source will trigger a BUG that the refcount of mle is already zero before put when the target is down during migration. The situation is as follows: dlm_migrate_lockres dlm_add_migration_mle dlm_mark_lockres_migrating dlm_get_mle_inuse <<<<<< Now the refcount of the mle is 2. dlm_send_one_lockres and wait for the target to become the new master. <<<<<< o2hb detect the target down and clean the migration mle. Now the refcount is 1. dlm_migrate_lockres woken, and put the mle twice when found the target goes down which trigger the BUG with the following message: "ERROR: bad mle: ". Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14ocfs2/dlm: return appropriate value when dlm_grab() returns NULLXue jiufei
dlm_grab() may return NULL when the node is doing unmount. When doing code review, we found that some dlm handlers may return error to caller when dlm_grab() returns NULL and make caller BUG or other problems. Here is an example: Node 1 Node 2 receives migration message from node 3, and send migrate request to others start unmounting receives migrate request from node 1 and call dlm_migrate_request_handler() unmount thread unregisters domain handlers and removes dlm_context from dlm_domains dlm_migrate_request_handlers() returns -EINVAL to node 1 Exit migration neither clearing the migration state nor sending assert master message to node 3 which cause node 3 hung. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14ocfs2/dlm: wait until DLM_LOCK_RES_SETREF_INPROG is cleared in ↵jiangyiwen
dlm_deref_lockres_worker Commit f3f854648de6 ("ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres") still exists a race which can't ensure the ordering is exactly correct. Node1 Node2 Node3 umount, migrate lockres to Node2 migrate finished, send migrate request to Node3 received migrate request, create a migration_mle, respond to Node2. set DLM_LOCK_RES_SETREF_INPROG and send assert master to Node3 delete migration_mle in assert_master_handler, Node3 umount without response dlm_thread purge this lockres, send drop deref message to Node2 found the flag of DLM_LOCK_RES_SETREF_INPROG is set, dispatch dlm_deref_lockres_worker to clear refmap, but in function of dlm_deref_lockres_worker, only if node in refmap it wait DLM_LOCK_RES_SETREF_INPROG to be cleared. So worker is done successfully purge lockres, send assert master response to Node1, and finish umount set Node3 in refmap, and it won't be cleared forever, thus lead to umount hung so wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29ocfs2/dlm: clear migration_pending when migration target goes downxuejiufei
We have found a BUG on res->migration_pending when migrating lock resources. The situation is as follows. dlm_mark_lockres_migration res->migration_pending = 1; __dlm_lockres_reserve_ast dlm_lockres_release_ast returns with res->migration_pending remains because other threads reserve asts wait dlm_migration_can_proceed returns 1 >>>>>>> o2hb found that target goes down and remove target from domain_map dlm_migration_can_proceed returns 1 dlm_mark_lockres_migrating returns -ESHOTDOWN with res->migration_pending still remains. When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON with res->migration_pending. So clear migration_pending when target is down. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-23ocfs2/dlm: unlock lockres spinlock before dlm_lockres_putJoseph Qi
dlm_lockres_put will call dlm_lockres_release if it is the last reference, and then it may call dlm_print_one_lock_resource and take lockres spinlock. So unlock lockres spinlock before dlm_lockres_put to avoid deadlock. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-22ocfs2/dlm: fix deadlock when dispatch assert masterJoseph Qi
The order of the following three spinlocks should be: dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock But dlm_dispatch_assert_master() is called while holding dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls dlm_grab() which will take dlm_domain_lock. Once another thread (for example, dlm_query_join_handler) has already taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock happens. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: "Junxiao Bi" <junxiao.bi@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04ocfs2: avoid access invalid address when read o2dlm debug messagesYiwen Jiang
The following case will lead to a lockres is freed but is still in use. cat /sys/kernel/debug/o2dlm/locking_state dlm_thread lockres_seq_start -> lock dlm->track_lock -> get resA resA->refs decrease to 0, call dlm_lockres_release, and wait for "cat" unlock. Although resA->refs is already set to 0, increase resA->refs, and then unlock lock dlm->track_lock -> list_del_init() -> unlock -> free resA In such a race case, invalid address access may occurs. So we should delete list res->tracking before resA->refs decrease to 0. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-05ocfs2: dlm: fix race between purge and get lock resourceJunxiao Bi
There is a race window in dlm_get_lock_resource(), which may return a lock resource which has been purged. This will cause the process to hang forever in dlmlock() as the ast msg can't be handled due to its lock resource not existing. dlm_get_lock_resource { ... spin_lock(&dlm->spinlock); tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash); if (tmpres) { spin_unlock(&dlm->spinlock); >>>>>>>> race window, dlm_run_purge_list() may run and purge the lock resource spin_lock(&tmpres->spinlock); ... spin_unlock(&tmpres->spinlock); } } Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-18ocfs2/dlm: fix race between dispatched_work and dlm_lockres_grab_inflight_workerJoseph Qi
Commit ac4fef4d23ed ("ocfs2/dlm: do not purge lockres that is queued for assert master") may have the following possible race case: dlm_dispatch_assert_master dlm_wq ======================================================================== queue_work(dlm->quedlm_worker, &dlm->dispatched_work); dispatch work, dlm_lockres_drop_inflight_worker *BUG_ON(res->inflight_assert_workers == 0)* dlm_lockres_grab_inflight_worker inflight_assert_workers++ So ensure inflight_assert_workers to be increased first. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Xue jiufei <xuejiufei@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10ocfs2: o2dlm: fix a race between purge and master querySrinivas Eeda
Node A sends master query request to node B which is the master. At this time lockres happens to be on purgelist. dlm_master_request_handler gets the dlm spinlock, finds the resource and releases the dlm spin lock. Right at this dlm_thread on this node could purge the lockres. dlm_master_request_handler can then acquire lockres spinlock and reply to Node A that node B is the master even though lockres on node B is purged. The above scenario will now make node A falsely think node B is the master which is inconsistent. Further if another node C tries to master the same resource, every node will respond they are not the master. Node C then masters the resource and sends assert master to all nodes. This will now make node A crash with the following message. dlm_assert_master_handler:1831 ERROR: DIE! Mastery assert from 9, but current owner is 10! Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Tested-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09ocfs2: remove unused code in dlm_new_lockres()Xue jiufei
Remove the branch that free res->lockname.name because the condition is never satisfied when jump to label error. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02ocfs2/dlm: should put mle when goto kill in dlm_assert_master_handleralex chen
In dlm_assert_master_handler, the mle is get in dlm_find_mle, should be put when goto kill, otherwise, this mle will never be released. Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-26ocfs2/dlm: do not get resource spinlock if lockres is newJoseph Qi
There is a deadlock case which reported by Guozhonghua: https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html This case is caused by &res->spinlock and &dlm->master_lock misordering in different threads. It was introduced by commit 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers"). Since lockres is new, it doesn't not require the &res->spinlock. So remove it. Fixes: 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers") Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Reported-by: Guozhonghua <guozhonghua@h3c.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06ocfs2: race between umount and unfinished remastering during recoveryTariq Saeed
Orabug: 19074140 When umount is issued during recovery on the new master that has not finished remastering locks, it triggers BUG() in dlm_send_mig_lockres_msg(). Here is the situation: 1) node A has a lock on resource X mastered by node B. 2) node B dies -> node A sets recovering flag for res X 3) Node C becomes the new master for resources owned by the dead node and is remastering locks of the dead node but has not finished the remastering process yet. 4) umount is issued on node C. 5) During processing of umount, ignoring unfished recovery, node C attempts to migrate resource X to node A. 6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers it a logic error and sends back -EFAULT. 7) node C asserts BUG() upon seeing EFAULT resp from node B. Fix is to delay migrating res X till remastering is finished at which point recovering flag will be cleared on both A and C. Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-23ocfs2/dlm: do not purge lockres that is queued for assert masterXue jiufei
When workqueue is delayed, it may occur that a lockres is purged while it is still queued for master assert. it may trigger BUG() as follows. N1 N2 dlm_get_lockres() ->dlm_do_master_requery is the master of lockres, so queue assert_master work dlm_thread() start running and purge the lockres dlm_assert_master_worker() send assert master message to other nodes receiving the assert_master message, set master to N2 dlmlock_remote() send create_lock message to N2, but receive DLM_IVLOCKID, if it is RECOVERY lockres, it triggers the BUG(). Another BUG() is triggered when N3 become the new master and send assert_master to N1, N1 will trigger the BUG() because owner doesn't match. So we should not purge lockres when it is queued for assert master. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-23ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop ↵jiangyiwen
during umount The following case may lead to endless loop during umount. node A node B node C node D umount volume, migrate lockres1 to B want to lock lockres1, send MASTER_REQUEST_MSG to C init block mle send MIGRATE_REQUEST_MSG to C find a block mle, and then return DLM_MIGRATE_RESPONSE_MASTERY_REF to B set C in refmap umount successfully try to umount, endless loop occurs when migrate lockres1 since C is in refmap So we can fix this endless loop case by only returning DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving MIGRATE_REQUEST_MSG. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: jiangyiwen <jiangyiwen@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Xue jiufei <xuejiufei@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04ocfs2: remove NULL assignments on staticFabian Frederick
Static values are automatically initialized to NULL. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-23ocfs2: fix double kmem_cache_destroy in dlm_initJoseph Qi
In dlm_init, if create dlm_lockname_cache failed in dlm_init_master_caches, it will destroy dlm_lockres_cache which created before twice. And this will cause system die when loading modules. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13ocfs2: break useless while loopJunxiao Bi
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13ocfs2: delay migration when the lockres is in migration stateXue jiufei
We trigger a bug in __dlm_lockres_reserve_ast() when we parallel umount 4 nodes. The situation is as follows: 1) Node A migrate all lockres it owned(eg. lockres A) to other nodes say node B when it umounts. 2) Receiving MIG_LOCKRES message from A, Node B masters the lockres A with DLM_LOCK_RES_MIGRATING state set. 3) Then we umount ocfs2 on node B. It also should migrate lockres A to another node, say node C. But now, DLM_LOCK_RES_MIGRATING state of lockers A is not cleared. Node B triggered the BUG on lockres with state DLM_LOCK_RES_MIGRATING. Signed-off-by: Xuejiufei <xuejiufei@huawei.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11ocfs2: use list_for_each_entry() instead of list_for_each()Dong Fang
[dan.carpenter@oracle.com: fix up some NULL dereference bugs] Signed-off-by: Dong Fang <yp.fangdong@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-26ocfs2/dlm: use GFP_ATOMIC inside a spin_lockDan Carpenter
My static checker complains that this is called with a spin_lock held in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason we have not received any bug reports about this is that recovery is not a common operation. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-24ocfs2/dlm: Take inflight reference count for remotely mastered resources tooSunil Mushran
The inflight reference count, in the lock resource, is taken to pin the resource in memory. We take it when a new resource is created and release it after a lock is attached to it. We do this to prevent the resource from getting purged prematurely. Earlier this reference count was being taken for locally mastered resources only. This patch extends the same functionality for remotely mastered ones. We are doing this because the same premature purging could occur for remotely mastered resources if the remote node were to die before completion of the create lock. Fix for Oracle bug#12405575. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
2011-07-24ocfs2/dlm: Clean up refmap helpersSunil Mushran
Patch cleans up helpers that set/clear refmap bits and grab/drop inflight lock ref counts. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
2011-07-24ocfs2/dlm: Clean up messages in o2dlmSunil Mushran
o2dlm messages needed a facelift. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
2011-05-25ocfs2/dlm: Do not migrate resource to a node that is leaving the domainSunil Mushran
During dlm domain shutdown, o2dlm has to free all the lock resources. Ones that have no locks and references are freed. Ones that have locks and/or references are migrated to another node. The first task in migration is finding a target. Currently we scan the lock resource and find one node that either has a lock or a reference. This is not very efficient in a parallel umount case as we might end up migrating the lock resource to a node which itself may have to migrate it to a third node. The patch scans the dlm->exit_domain_map to ensure the target node is not leaving the domain. If no valid target node is found, o2dlm does not migrate the resource but instead waits for the unlock and deref messages that will allow it to free the resource. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-05-23ocfs2/dlm: dlm_is_lockres_migrateable() returns booleanSunil Mushran
Patch cleans up the gunk added by commit 388c4bcb4e63e88fb1f312a2f5f9eb2623afcf5b. dlm_is_lockres_migrateable() now returns 1 if lockresource is deemed migrateable and 0 if not. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-05-13ocfs2/dlm: Target node death during resource migration leads to thread spinSunil Mushran
During resource migration, if the target node were to die, the thread doing the migration spins until the target node is not removed from the domain map. This patch slows the spin by making the thread wait for the recovery to kick in. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-02-21ocfs2: Remove ENTRY from masklog.Tao Ma
ENTRY is used to record the entry of a function. But because it is added in so many functions, if we enable it, the system logs get filled up quickly and cause too much I/O. So actually no one can open it for a production system or even for a test. So for mlog_entry_void, we just remove it. for mlog_entry(...), we replace it with mlog(0,...), and they will be replace by trace event later. Signed-off-by: Tao Ma <boyu.mt@taobao.com>
2010-12-09ocfs2/dlm: Migrate lockres with no locks if it has a referenceSunil Mushran
o2dlm was not migrating resources with zero locks because it assumed that that resource would get purged by dlm_thread. However, some usage patterns involve creating and dropping locks at a high rate leading to the migrate thread seeing zero locks but the purge thread seeing an active reference. When this happens, the dlm_thread cannot purge the resource and the migrate thread sees no reason to migrate that resource. The spell is broken when the migrate thread catches the resource with a lock. The fix is to make the migrate thread also consider the reference map. This usage pattern can be triggered by userspace on userdlm locks and flocks. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-09-23o2dlm: force free mles during dlm exitSrinivas Eeda
While umounting, a block mle doesn't get freed if dlm is shutdown after master request is received but before assert master. This results in unclean shutdown of dlm domain. This patch frees all mles that lie around after other nodes were notified about exiting the dlm and marking dlm state as leaving. Only block mles are expected to be around, so we log ERROR for other mles but still free them. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-08-07ocfs2/dlm: remove potential deadlock -V3Wengang Wang
When we need to take both dlm_domain_lock and dlm->spinlock, we should take them in order of: dlm_domain_lock then dlm->spinlock. There is pathes disobey this order. That is calling dlm_lockres_put() with dlm->spinlock held in dlm_run_purge_list. dlm_lockres_put() calls dlm_put() at the ref and dlm_put() locks on dlm_domain_lock. Fix: Don't grab/put the dlm when the initialising/releasing lockres. That grab is not required because we don't call dlm_unregister_domain() based on refcount. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-08-07ocfs2/dlm: fix a dead lockWengang Wang
When we have to take both dlm->master_lock and lockres->spinlock, take them in order lockres->spinlock and then dlm->master_lock. The patch fixes a violation of the rule. We can simply move taking dlm->master_lock to where we have dropped res->spinlock since when we access res->state and free mle memory we don't need master_lock's protection. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-07-15ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down nodeWengang Wang
For migration, we are waiting for DLM_LOCK_RES_MIGRATING flag to be set before sending DLM_MIG_LOCKRES_MSG message to the target. We are using dlm_migration_can_proceed() for that purpose. However, if the node is down, dlm_migration_can_proceed() will also return "go ahead". In this rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove the BUG_ON() that trips over this condition. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-21Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits) ocfs2: Silence a gcc warning. ocfs2: Don't retry xattr set in case value extension fails. ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break ocfs2: Reset xattr value size after xa_cleanup_value_truncate(). fs/ocfs2/dlm: Use kstrdup fs/ocfs2/dlm: Drop memory allocation cast Ocfs2: Optimize punching-hole code. Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public. Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing. Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead. ocfs2: Block signals for mkdir/link/symlink/O_CREAT. ocfs2: Wrap signal blocking in void functions. ocfs2/dlm: Increase o2dlm lockres hash size ocfs2: Make ocfs2_extend_trans() really extend. ocfs2/trivial: Code cleanup for allocation reservation. ocfs2: make ocfs2_adjust_resv_from_alloc simple. ocfs2: Make nointr a default mount option ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE o2net: log socket state changes ocfs2: print node # when tcp fails ...
2010-05-18fs/ocfs2/dlm: Drop memory allocation castJulia Lawall
Drop cast on the result of kmalloc and similar functions. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ type T; @@ - (T *) (\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\| kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...)) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-05ocfs2: print node # when tcp failsWengang Wang
Print the node number of a peer node if sending it a message failed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-23ocfs2: Fix a race in o2dlm lockres masterySrinivas Eeda
In o2dlm, the master of a lock resource keeps a map of all interested nodes. This prevents the master from purging the resource before an interested node can create a lock. A race between the mastery thread and the mastery handler allowed an interested node to discover who the master is without informing the master directly. This is easily fixed by holding the dlm spinlock a little longer in the mastery handler. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-01-25ocfs2/trivial: Remove trailing whitespacesSunil Mushran
Patch removes trailing whitespaces. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-12-04tree-wide: fix assorted typos all over the placeAndré Goddard Rosa
That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-23headers: utsname.h reduxAlexey Dobriyan
* remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entrySunil Mushran
This patch removes struct dlm_lock_name and adds the entries directly to struct dlm_master_list_entry. Under the new scheme, both mles that are backed by a lockres or not, will have the name populated in mle->mname. This allows us to get rid of code that was figuring out the location of the mle name. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlinedSunil Mushran
This patch inlines dlm_set_lockres_owner() and dlm_change_lockres_owner(). Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03ocfs2/dlm: Improve lockres countsSunil Mushran
This patch replaces the lockres counts that tracked the number number of locally and remotely mastered lockres' with a current and total count. The total count is the number of lockres' that have been created since the dlm domain was created. The number of locally and remotely mastered counts can be computed using the locking_state output. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03ocfs2/dlm: Track number of mlesSunil Mushran
The lifetime of a mle is limited to the duration of the lockres mastery process. While typically this lifetime is fairly short, we have noticed the number of mles explode under certain circumstances. This patch tracks the number of each different types of mles and should help us determine how best to speed up the mastery process. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>