<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/block/drbd/drbd_state.c, branch v4.1.10</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>drbd: fix resync throttling initialization</title>
<updated>2014-11-10T16:27:37+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-11-10T16:21:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ff8bd88b73fad369a12465dfa52ad5c0bf088ced'/>
<id>ff8bd88b73fad369a12465dfa52ad5c0bf088ced</id>
<content type='text'>
If for some reason DRBD resync was the only activity on a backend
device, drbd_rs_c_min_rate_throttle() would mistakenly decide that it is
still initialization time, and keep throttling the resync.

This patch explicitly initializes -&gt;rs_last_events to the current
backend event counters, and drops the rs_last_events == 0 from the
throttle condition.

Reported-by: Mikhail Sugakov &lt;msugakov@amazon.de&gt;

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If for some reason DRBD resync was the only activity on a backend
device, drbd_rs_c_min_rate_throttle() would mistakenly decide that it is
still initialization time, and keep throttling the resync.

This patch explicitly initializes -&gt;rs_last_events to the current
backend event counters, and drops the rs_last_events == 0 from the
throttle condition.

Reported-by: Mikhail Sugakov &lt;msugakov@amazon.de&gt;

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: fix race between role change and handshake</title>
<updated>2014-11-10T16:27:35+00:00</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-11-10T16:21:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a88215312c5ed74697973f6c9f0fce718bcf18ad'/>
<id>a88215312c5ed74697973f6c9f0fce718bcf18ad</id>
<content type='text'>
Symptoms:
If DRBD was "cleanly shut down" (all in sync, both Secondary before
disconnect, identical data generation uuids), and then one side was
promoted *during* the next connection handshake, the role change
could confuse the handshake.

The Primary would get stuck in WFBitmapS, the Secondary would log
unexpected cstate (Connected) in receive_bitmap
and get stuck in WFBitmapT.

Fix:
The test in is_valid_soft_transition wrong. It works because
the not allowed actions (promote/attach) do not touch the
cstate. The previous condition failed to demand a cstate change
in one clause.

In order to avoid deadlocks give up the state_mutex while waiting
for the transient state to go away.

Conflicts:
	drbd/drbd_state.c
	drbd/drbd_state.h
	drbd/drbd_wrappers.h

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Symptoms:
If DRBD was "cleanly shut down" (all in sync, both Secondary before
disconnect, identical data generation uuids), and then one side was
promoted *during* the next connection handshake, the role change
could confuse the handshake.

The Primary would get stuck in WFBitmapS, the Secondary would log
unexpected cstate (Connected) in receive_bitmap
and get stuck in WFBitmapT.

Fix:
The test in is_valid_soft_transition wrong. It works because
the not allowed actions (promote/attach) do not touch the
cstate. The previous condition failed to demand a cstate change
in one clause.

In order to avoid deadlocks give up the state_mutex while waiting
for the transient state to go away.

Conflicts:
	drbd/drbd_state.c
	drbd/drbd_state.h
	drbd/drbd_wrappers.h

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: Use better variable names</title>
<updated>2014-09-11T14:41:29+00:00</updated>
<author>
<name>Andreas Gruenbacher</name>
<email>agruen@linbit.com</email>
</author>
<published>2014-09-11T12:29:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=11f8b2b69d32d43a6d9b45c60c1fee48ab91f440'/>
<id>11f8b2b69d32d43a6d9b45c60c1fee48ab91f440</id>
<content type='text'>
Rename local variable 'ds' to 'disk_state' or 'data_size'.
'dgs' to 'digest_size'

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename local variable 'ds' to 'disk_state' or 'data_size'.
'dgs' to 'digest_size'

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: clear CRASHED_PRIMARY only after successful resync</title>
<updated>2014-07-10T16:35:05+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-03-27T09:34:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2f2abeae3cd139fb3581249388bf2018612a87e5'/>
<id>2f2abeae3cd139fb3581249388bf2018612a87e5</id>
<content type='text'>
If we lost a disk during the first resync after primary crash,
we could have prematurely cleared the CRASHED_PRIMARY flag.
Testing on C_CONNECTED is not what we meant there,
but testing for both peers to become D_UP_TO_DATE.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we lost a disk during the first resync after primary crash,
we could have prematurely cleared the CRASHED_PRIMARY flag.
Testing on C_CONNECTED is not what we meant there,
but testing for both peers to become D_UP_TO_DATE.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers/block: Use RCU_INIT_POINTER(x, NULL) in drbd/drbd_state.c</title>
<updated>2014-07-10T16:35:03+00:00</updated>
<author>
<name>Monam Agarwal</name>
<email>monamagarwal123@gmail.com</email>
</author>
<published>2014-03-22T20:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ccdd6a93ee0c4a68cfaeddc80a87fc092ff35b96'/>
<id>ccdd6a93ee0c4a68cfaeddc80a87fc092ff35b96</id>
<content type='text'>
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)

Signed-off-by: Monam Agarwal &lt;monamagarwal123@gmail.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)

Signed-off-by: Monam Agarwal &lt;monamagarwal123@gmail.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: silence -Wmissing-prototypes warnings</title>
<updated>2014-07-10T16:34:57+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-02-27T08:46:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8ce953aa39e2bfd66036a27abdf761c2cb93f02c'/>
<id>8ce953aa39e2bfd66036a27abdf761c2cb93f02c</id>
<content type='text'>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: close race when detaching from disk</title>
<updated>2014-07-10T16:34:54+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-02-11T07:57:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba3c6fb87d2df008eed8faaf01bb198e512fa72f'/>
<id>ba3c6fb87d2df008eed8faaf01bb198e512fa72f</id>
<content type='text'>
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: bd_release+0x21/0x70
Process drbd_w_t7146
Call Trace:
 close_bdev_exclusive
 drbd_free_ldev		[drbd]
 drbd_ldev_destroy	[drbd]
 w_after_state_ch	[drbd]

Race probably went like this:
  state.disk = D_FAILED

... first one to hit zero during D_FAILED:
   put_ldev() /* ----------------&gt; 0 */
     i = atomic_dec_return()
     if (i == 0)
       if (state.disk == D_FAILED)
         schedule_work(go_diskless)
                                /* 1 &lt;------ */ get_ldev_if_state()
   go_diskless()
      do_some_pre_cleanup()                     corresponding put_ldev():
      force_state(D_DISKLESS)   /* 0 &lt;------ */ i = atomic_dec_return()
                                                if (i == 0)
        atomic_inc() /* ---------&gt; 1 */
        state.disk = D_DISKLESS
        schedule_work(after_state_ch)           /* execution pre-empted by IRQ ? */

   after_state_ch()
     put_ldev()
       i = atomic_dec_return()  /* 0 */
       if (i == 0)
         if (state.disk == D_DISKLESS)            if (state.disk == D_DISKLESS)
           drbd_ldev_destroy()                      drbd_ldev_destroy();

Trying to fix this by checking the disk state *before* the
atomic_dec_return(), which implies memory barriers, and by inserting
extra memory barriers around the state assignment in __drbd_set_state().

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: bd_release+0x21/0x70
Process drbd_w_t7146
Call Trace:
 close_bdev_exclusive
 drbd_free_ldev		[drbd]
 drbd_ldev_destroy	[drbd]
 w_after_state_ch	[drbd]

Race probably went like this:
  state.disk = D_FAILED

... first one to hit zero during D_FAILED:
   put_ldev() /* ----------------&gt; 0 */
     i = atomic_dec_return()
     if (i == 0)
       if (state.disk == D_FAILED)
         schedule_work(go_diskless)
                                /* 1 &lt;------ */ get_ldev_if_state()
   go_diskless()
      do_some_pre_cleanup()                     corresponding put_ldev():
      force_state(D_DISKLESS)   /* 0 &lt;------ */ i = atomic_dec_return()
                                                if (i == 0)
        atomic_inc() /* ---------&gt; 1 */
        state.disk = D_DISKLESS
        schedule_work(after_state_ch)           /* execution pre-empted by IRQ ? */

   after_state_ch()
     put_ldev()
       i = atomic_dec_return()  /* 0 */
       if (i == 0)
         if (state.disk == D_DISKLESS)            if (state.disk == D_DISKLESS)
           drbd_ldev_destroy()                      drbd_ldev_destroy();

Trying to fix this by checking the disk state *before* the
atomic_dec_return(), which implies memory barriers, and by inserting
extra memory barriers around the state assignment in __drbd_set_state().

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: fix resync finished detection</title>
<updated>2014-07-10T16:34:50+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-01-27T14:58:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5ab7d2c005135849cf0bb1485d954c98f2cca57c'/>
<id>5ab7d2c005135849cf0bb1485d954c98f2cca57c</id>
<content type='text'>
This fixes one recent regresion,
and one long existing bug.

The bug:
drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
accounted in the resync extent corresponding to the start sector.

Since we allow application requests to cross our "extent" boundaries,
this assumption is no longer true, resulting in possible misaccounting,
scary messages
("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
and potentially, if the last bit to be cleared during resync would
reside in previously misaccounted resync extent, the resync would never
be recognized as finished, but would be "stalled" forever, even though
all blocks are in sync again and all bits have been cleared...

The regression was introduced by
    drbd: get rid of atomic update on disk bitmap works

For an "empty" resync (rs_total == 0), we must not "finish" the
resync on the SyncSource before the SyncTarget knows all relevant
information (sync uuid).  We need to wait for the full round-trip,
the SyncTarget will then explicitly notify us.

Also for normal, non-empty resyncs (rs_total &gt; 0), the resync-finished
condition needs to be tested before the schedule() in wait_for_work, or
it is likely to be missed.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes one recent regresion,
and one long existing bug.

The bug:
drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
accounted in the resync extent corresponding to the start sector.

Since we allow application requests to cross our "extent" boundaries,
this assumption is no longer true, resulting in possible misaccounting,
scary messages
("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
and potentially, if the last bit to be cleared during resync would
reside in previously misaccounted resync extent, the resync would never
be recognized as finished, but would be "stalled" forever, even though
all blocks are in sync again and all bits have been cleared...

The regression was introduced by
    drbd: get rid of atomic update on disk bitmap works

For an "empty" resync (rs_total == 0), we must not "finish" the
resync on the SyncSource before the SyncTarget knows all relevant
information (sync uuid).  We need to wait for the full round-trip,
the SyncTarget will then explicitly notify us.

Also for normal, non-empty resyncs (rs_total &gt; 0), the resync-finished
condition needs to be tested before the schedule() in wait_for_work, or
it is likely to be missed.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: refactor use of first_peer_device()</title>
<updated>2014-07-10T13:22:22+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-11-22T11:40:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=44a4d551846b8c61aa430b9432c1fcdf88444708'/>
<id>44a4d551846b8c61aa430b9432c1fcdf88444708</id>
<content type='text'>
Reduce the number of calls to first_peer_device(). Instead, call
first_peer_device() just once to assign a local variable peer_device.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reduce the number of calls to first_peer_device(). Instead, call
first_peer_device() just once to assign a local variable peer_device.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: Do not BUG() when connection breaks in a special way</title>
<updated>2014-04-30T19:46:54+00:00</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-04-28T16:43:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=88ea685d33c75932fe90ed0b5d2cf68845462876'/>
<id>88ea685d33c75932fe90ed0b5d2cf68845462876</id>
<content type='text'>
When a 'cluster wide' disconnect executes, the result comes back
from the peer, and immediately after that the connection breaks
then _conn_rq_cond() reported back SS_CW_SUCCESS.
Therefore _conn_request_state() calls conn_set_state(), which
has a BUG() in it.
The BUG() is hit because conn_is_valid_transition() does not like
the transaction. Which goes back to is_valid_soft_transition()
returning SS_OUTDATE_WO_CONN.

This fix is to consider an error reported by is_valid_soft_transition()
even when the peer agreed to the transaction.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a 'cluster wide' disconnect executes, the result comes back
from the peer, and immediately after that the connection breaks
then _conn_rq_cond() reported back SS_CW_SUCCESS.
Therefore _conn_request_state() calls conn_set_state(), which
has a BUG() in it.
The BUG() is hit because conn_is_valid_transition() does not like
the transaction. Which goes back to is_valid_soft_transition()
returning SS_OUTDATE_WO_CONN.

This fix is to consider an error reported by is_valid_soft_transition()
even when the peer agreed to the transaction.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
