<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/block/drbd/drbd_req.c, branch v3.12.29</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>drbd: fix drbd epoch write count for ahead/behind mode</title>
<updated>2013-03-28T16:10:25+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-27T13:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=607f25e56ee0a31e451f6bd8a7109fa1f5dcbe29'/>
<id>607f25e56ee0a31e451f6bd8a7109fa1f5dcbe29</id>
<content type='text'>
The sanity check when receiving P_BARRIER_ACK does expect all write
requests with a given req-&gt;epoch to have been either all replicated,
or all not replicated.

Because req-&gt;epoch was assigned before calling maybe_pull_ahead(),
this expectation was not met, leading to an off-by-one in the sanity
check, and further to a "Protocol Error".

Fix: move the call to maybe_pull_ahead() a few lines up,
and assign req-&gt;epoch only after that.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The sanity check when receiving P_BARRIER_ACK does expect all write
requests with a given req-&gt;epoch to have been either all replicated,
or all not replicated.

Because req-&gt;epoch was assigned before calling maybe_pull_ahead(),
this expectation was not met, leading to an off-by-one in the sanity
check, and further to a "Protocol Error".

Fix: move the call to maybe_pull_ahead() a few lines up,
and assign req-&gt;epoch only after that.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: only fail empty flushes if no good data is reachable</title>
<updated>2013-03-28T16:10:25+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-27T13:08:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7074e4a745799d521b17775f6d076d84dc7f8c50'/>
<id>7074e4a745799d521b17775f6d076d84dc7f8c50</id>
<content type='text'>
We completed empty flushes (blkdev_issue_flush()) with IO error
if we lost the local disk, even if we still have an established
replication link to a healthy remote disk.

Fix this to only report errors to upper layers,
if neither local nor remote data is reachable.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We completed empty flushes (blkdev_issue_flush()) with IO error
if we lost the local disk, even if we still have an established
replication link to a healthy remote disk.

Fix this to only report errors to upper layers,
if neither local nor remote data is reachable.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: try hard to max out the updates per AL transaction</title>
<updated>2013-03-23T04:18:09+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=45ad07b3ac1e3062188fb760fe71cafb4a100215'/>
<id>45ad07b3ac1e3062188fb760fe71cafb4a100215</id>
<content type='text'>
There may have been more incoming requests while we where preparing
the current transaction. Try to consolidate more updates into this
transaction until we make no more progres.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There may have been more incoming requests while we where preparing
the current transaction. Try to consolidate more updates into this
transaction until we make no more progres.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: move start io accounting before activity log transaction</title>
<updated>2013-03-23T04:18:09+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7e8c288f6cde950a6ca001ec06a32c8c2cf4180e'/>
<id>7e8c288f6cde950a6ca001ec06a32c8c2cf4180e</id>
<content type='text'>
The IO accounting of the drbd "queue depth" was misleading.
We only started IO accounting once we already wrote the activity log.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The IO accounting of the drbd "queue depth" was misleading.
We only started IO accounting once we already wrote the activity log.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: consolidate as many updates as possible into one AL transaction</title>
<updated>2013-03-23T04:18:09+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=08a1ddab6df7d3c7b6341774cb1cf4b21b96a214'/>
<id>08a1ddab6df7d3c7b6341774cb1cf4b21b96a214</id>
<content type='text'>
Depending on current IO depth, try to consolidate as many updates
as possible into one activity log transaction.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Depending on current IO depth, try to consolidate as many updates
as possible into one activity log transaction.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: queue writes on submitter thread, unless they pass the activity log fastpath</title>
<updated>2013-03-23T00:15:17+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=779b3fe4c0e9dea19ae3ddef0b5fd1a663b63ee6'/>
<id>779b3fe4c0e9dea19ae3ddef0b5fd1a663b63ee6</id>
<content type='text'>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: prepare to queue write requests on a submit worker</title>
<updated>2013-03-23T00:14:40+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-23T00:14:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=113fef9e20e0d614b3f5940b67c96e719c559eea'/>
<id>113fef9e20e0d614b3f5940b67c96e719c559eea</id>
<content type='text'>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: split __drbd_make_request in before and after drbd_al_begin_io</title>
<updated>2013-03-23T00:14:00+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d9febe237146156947f0da8407c620b5c33c1df'/>
<id>6d9febe237146156947f0da8407c620b5c33c1df</id>
<content type='text'>
This is in preparation to be able to defer requests that need to wait
for an activity log transaction to a submitter workqueue.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is in preparation to be able to defer requests that need to wait
for an activity log transaction to a submitter workqueue.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: Clarify when activity log I/O is delegated to the worker thread</title>
<updated>2013-03-23T00:14:00+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-03-19T17:16:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=56392d2f40aac4b520fc50bc356f40e07f7e1c7d'/>
<id>56392d2f40aac4b520fc50bc356f40e07f7e1c7d</id>
<content type='text'>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drbd: fix potential protocol error and resulting disconnect/reconnect</title>
<updated>2013-01-21T21:58:36+00:00</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2013-01-21T14:43:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2681f7f6ce6c7416eb619d0fb19422bcc68bd9e1'/>
<id>2681f7f6ce6c7416eb619d0fb19422bcc68bd9e1</id>
<content type='text'>
When we notice a disk failure on the receiving side,
we stop sending it new incoming writes.

Depending on exact timing of various events, the same transfer log epoch
could end up containing both replicated (before we noticed the failure)
and local-only requests (after we noticed the failure).

The sanity checks in tl_release(), called when receiving a
P_BARRIER_ACK, check that the ack'ed transfer log epoch matches
the expected epoch, and the number of contained writes matches
the number of ack'ed writes.

In this case, they counted both replicated and local-only writes,
but the peer only acknowledges those it has seen.  We get a mismatch,
resulting in a protocol error and disconnect/reconnect cycle.

Messages logged are
  "BAD! BarrierAck #%u received with n_writes=%u, expected n_writes=%u!\n"

A similar issue can also be triggered when starting a resync while
having a healthy replication link, by invalidating one side, forcing a
full sync, or attaching to a diskless node.

Fix this by closing the current epoch if the state changes in a way
that would cause the replication intent of the next write.

Epochs now contain either only non-replicated,
or only replicated writes.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we notice a disk failure on the receiving side,
we stop sending it new incoming writes.

Depending on exact timing of various events, the same transfer log epoch
could end up containing both replicated (before we noticed the failure)
and local-only requests (after we noticed the failure).

The sanity checks in tl_release(), called when receiving a
P_BARRIER_ACK, check that the ack'ed transfer log epoch matches
the expected epoch, and the number of contained writes matches
the number of ack'ed writes.

In this case, they counted both replicated and local-only writes,
but the peer only acknowledges those it has seen.  We get a mismatch,
resulting in a protocol error and disconnect/reconnect cycle.

Messages logged are
  "BAD! BarrierAck #%u received with n_writes=%u, expected n_writes=%u!\n"

A similar issue can also be triggered when starting a resync while
having a healthy replication link, by invalidating one side, forcing a
full sync, or attaching to a diskless node.

Fix this by closing the current epoch if the state changes in a way
that would cause the replication intent of the next write.

Epochs now contain either only non-replicated,
or only replicated writes.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
