<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/dlm/lock.c, branch v3.7.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>dlm: fix missing dir remove</title>
<updated>2012-07-16T19:24:43+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-25T18:48:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=96006ea6d4eea73466e90ef353bf34e507724e77'/>
<id>96006ea6d4eea73466e90ef353bf34e507724e77</id>
<content type='text'>
I don't know exactly how, but in some cases, a dir
record is not removed, or a new one is created when
it shouldn't be.  The result is that the dir node
lookup returns a master node where the rsb does not
exist.  In this case, the master node will repeatedly
return -EBADR for requests, and the lock requests will
be stuck.

Until all possible ways for this to happen can be
eliminated, a simple and effective way to recover from
this situation is for the supposed master node to send
a standard remove message to the dir node when it
receives a request for a resource it has no rsb for.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I don't know exactly how, but in some cases, a dir
record is not removed, or a new one is created when
it shouldn't be.  The result is that the dir node
lookup returns a master node where the rsb does not
exist.  In this case, the master node will repeatedly
return -EBADR for requests, and the lock requests will
be stuck.

Until all possible ways for this to happen can be
eliminated, a simple and effective way to recover from
this situation is for the supposed master node to send
a standard remove message to the dir node when it
receives a request for a resource it has no rsb for.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix conversion deadlock from recovery</title>
<updated>2012-07-16T19:18:22+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-05T20:55:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c503a62103c46d56447f56306b52be6f844689ba'/>
<id>c503a62103c46d56447f56306b52be6f844689ba</id>
<content type='text'>
The process of rebuilding locks on a new master during
recovery could re-order the locks on the convert queue,
creating an "in place" conversion deadlock that would
not be resolved.  Fix this by not considering queue
order when granting conversions after recovery.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The process of rebuilding locks on a new master during
recovery could re-order the locks on the convert queue,
creating an "in place" conversion deadlock that would
not be resolved.  Fix this by not considering queue
order when granting conversions after recovery.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix race between remove and lookup</title>
<updated>2012-07-16T19:18:01+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-14T17:17:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=05c32f47bfae74dabff05208957768078b53cc49'/>
<id>05c32f47bfae74dabff05208957768078b53cc49</id>
<content type='text'>
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: use rsbtbl as resource directory</title>
<updated>2012-07-16T19:16:19+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-05-10T15:18:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c04fecb4d9f7753e0cbff7edd03ec68f8721cdce'/>
<id>c04fecb4d9f7753e0cbff7edd03ec68f8721cdce</id>
<content type='text'>
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  The dirtbl has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  The dirtbl has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fixes for nodir mode</title>
<updated>2012-05-02T19:15:27+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-26T20:54:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4875647a08e35f77274838d97ca8fa44158d50e2'/>
<id>4875647a08e35f77274838d97ca8fa44158d50e2</id>
<content type='text'>
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progress locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progress locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: improve error and debug messages</title>
<updated>2012-04-26T20:41:46+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-23T21:36:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d40c4a708e0e996fd9c60d4093aebba5fe1f749'/>
<id>6d40c4a708e0e996fd9c60d4093aebba5fe1f749</id>
<content type='text'>
Change some existing error/debug messages to
collect more useful information, and add
some new error/debug messages to address
recently found problems.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change some existing error/debug messages to
collect more useful information, and add
some new error/debug messages to address
recently found problems.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: avoid unnecessary search in search_rsb</title>
<updated>2012-04-26T20:37:56+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-23T19:08:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=57638bf3aa64facd9eba0e018b5773f5d2da6c2b'/>
<id>57638bf3aa64facd9eba0e018b5773f5d2da6c2b</id>
<content type='text'>
If the rsb is found in the "keep" tree, but is
not the right type (i.e. not MASTER), we can
return immediately with the result.  There's
no point in going on to search the "toss" list
as if we hadn't found it.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the rsb is found in the "keep" tree, but is
not the right type (i.e. not MASTER), we can
return immediately with the result.  There's
no point in going on to search the "toss" list
as if we hadn't found it.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix waiter recovery</title>
<updated>2012-04-26T20:36:04+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-23T17:18:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13ef11110fa2173b9d03e6616574914e12e2a90f'/>
<id>13ef11110fa2173b9d03e6616574914e12e2a90f</id>
<content type='text'>
An outstanding remote operation (an lkb on the "waiter"
list) could sometimes miss being resent during recovery.
The decision was based on the lkb_nodeid field, which
could have changed during an earlier aborted recovery,
so it no longer represents the actual remote destination.
The lkb_wait_nodeid is always the actual remote node,
so it is the best value to use.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
An outstanding remote operation (an lkb on the "waiter"
list) could sometimes miss being resent during recovery.
The decision was based on the lkb_nodeid field, which
could have changed during an earlier aborted recovery,
so it no longer represents the actual remote destination.
The lkb_wait_nodeid is always the actual remote node,
so it is the best value to use.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix QUECVT when convert queue is empty</title>
<updated>2012-04-23T16:30:59+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-04T14:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=53ad1c980d4fb450722a575ca17c188808939340'/>
<id>53ad1c980d4fb450722a575ca17c188808939340</id>
<content type='text'>
The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix slow rsb search in dir recovery</title>
<updated>2012-03-08T20:46:30+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-03-08T18:37:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7210cb7a72a22303cdb225bd1aea28697a17bbae'/>
<id>7210cb7a72a22303cdb225bd1aea28697a17bbae</id>
<content type='text'>
The function used to find an rsb during directory
recovery was searching the single linear list of
rsb's.  This wasted a lot of time compared to
using the standard hash table to find the rsb.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function used to find an rsb during directory
recovery was searching the single linear list of
rsb's.  This wasted a lot of time compared to
using the standard hash table to find the rsb.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
