<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/block/bio.c, branch v4.17</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Fix slab name "biovec-(1&lt;&lt;(21-12))"</title>
<updated>2018-03-22T01:25:22+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2018-03-21T16:49:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bd5c4facf59648581d2f1692dad7b107bf429954'/>
<id>bd5c4facf59648581d2f1692dad7b107bf429954</id>
<content type='text'>
I'm getting a slab named "biovec-(1&lt;&lt;(21-12))". It is caused by unintended
expansion of the macro BIO_MAX_PAGES. This patch renames it to biovec-max.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: stable@vger.kernel.org	# v4.14+
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I'm getting a slab named "biovec-(1&lt;&lt;(21-12))". It is caused by unintended
expansion of the macro BIO_MAX_PAGES. This patch renames it to biovec-max.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: stable@vger.kernel.org	# v4.14+
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block</title>
<updated>2018-01-29T19:51:49+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-01-29T19:51:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a4b6e2f80aad46fb55a5cf7b1664c0aef030ee0'/>
<id>0a4b6e2f80aad46fb55a5cf7b1664c0aef030ee0</id>
<content type='text'>
Pull block updates from Jens Axboe:
 "This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block updates from Jens Axboe:
 "This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>block: Set BIO_TRACE_COMPLETION on new bio during split</title>
<updated>2018-01-23T16:10:19+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2018-01-23T16:10:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=20d59023c5ec4426284af492808bcea1f39787ef'/>
<id>20d59023c5ec4426284af492808bcea1f39787ef</id>
<content type='text'>
We inadvertently set it again on the source bio, but we need
to set it on the new split bio instead.

Fixes: fbbaf700e7b1 ("block: trace completion of all bios.")
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We inadvertently set it again on the source bio, but we need
to set it on the new split bio instead.

Fixes: fbbaf700e7b1 ("block: trace completion of all bios.")
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: move bio_alloc_pages() to bcache</title>
<updated>2018-01-06T16:18:00+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2017-12-18T12:22:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25d8be77e19224d8f21b363d77b5283c5dc21a57'/>
<id>25d8be77e19224d8f21b363d77b5283c5dc21a57</id>
<content type='text'>
bcache is the only user of bio_alloc_pages(), so move this function into
bcache, and avoid it being misused in the future.

Also rename it to bch_bio_allo_pages() since it is bcache only.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bcache is the only user of bio_alloc_pages(), so move this function into
bcache, and avoid it being misused in the future.

Also rename it to bch_bio_allo_pages() since it is bcache only.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block-throttle: avoid double charge</title>
<updated>2017-12-20T18:10:17+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-12-20T18:10:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=111be883981748acc9a56e855c8336404a8e787c'/>
<id>111be883981748acc9a56e855c8336404a8e787c</id>
<content type='text'>
If a bio is throttled and split after throttling, the bio could be
resubmited and enters the throttling again. This will cause part of the
bio to be charged multiple times. If the cgroup has an IO limit, the
double charge will significantly harm the performance. The bio split
becomes quite common after arbitrary bio size change.

To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
If the bio is cloned/split, we copy the flag to new bio too to avoid a
double charge. However, cloned bio could be directed to a new disk,
keeping the flag be a problem. The observation is we always set new disk
for the bio in this case, so we can clear the flag in bio_set_dev().

This issue exists for a long time, arbitrary bio size change just makes
it worse, so this should go into stable at least since v4.2.

V1-&gt; V2: Not add extra field in bio based on discussion with Tejun

Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: stable@vger.kernel.org
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a bio is throttled and split after throttling, the bio could be
resubmited and enters the throttling again. This will cause part of the
bio to be charged multiple times. If the cgroup has an IO limit, the
double charge will significantly harm the performance. The bio split
becomes quite common after arbitrary bio size change.

To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
If the bio is cloned/split, we copy the flag to new bio too to avoid a
double charge. However, cloned bio could be directed to a new disk,
keeping the flag be a problem. The observation is we always set new disk
for the bio in this case, so we can clear the flag in bio_set_dev().

This issue exists for a long time, arbitrary bio size change just makes
it worse, so this should go into stable at least since v4.2.

V1-&gt; V2: Not add extra field in bio based on discussion with Tejun

Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: stable@vger.kernel.org
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove useless assignment in bio_split</title>
<updated>2017-11-22T18:26:05+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2017-11-22T18:18:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f341a4d384f7fced2ca0d9472ed88fe94de32726'/>
<id>f341a4d384f7fced2ca0d9472ed88fe94de32726</id>
<content type='text'>
Remove useless assignment to the variable "split" because the variable is
unconditionally assigned later.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove useless assignment to the variable "split" because the variable is
unconditionally assigned later.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2017-11-17T20:08:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-17T20:08:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=16382e17c0ff583df2d5eed56ca7c771d637e9d1'/>
<id>16382e17c0ff583df2d5eed56ca7c771d637e9d1</id>
<content type='text'>
Pull iov_iter updates from Al Viro:

 - bio_{map,copy}_user_iov() series; those are cleanups - fixes from the
   same pile went into mainline (and stable) in late September.

 - fs/iomap.c iov_iter-related fixes

 - new primitive - iov_iter_for_each_range(), which applies a function
   to kernel-mapped segments of an iov_iter.

   Usable for kvec and bvec ones, the latter does kmap()/kunmap() around
   the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the
   latter is not hard to fix if the need ever appears, the former is by
   design.

   Another related primitive will have to wait for the next cycle - it
   passes page + offset + size instead of pointer + size, and that one
   will be usable for everything _except_ kvec. Unfortunately, that one
   didn't get exposure in -next yet, so...

 - a bit more lustre iov_iter work, including a use case for
   iov_iter_for_each_range() (checksum calculation)

 - vhost/scsi leak fix in failure exit

 - misc cleanups and detritectomy...

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
  iomap_dio_actor(): fix iov_iter bugs
  switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range()
  lustre: switch struct ksock_conn to iov_iter
  vhost/scsi: switch to iov_iter_get_pages()
  fix a page leak in vhost_scsi_iov_to_sgl() error recovery
  new primitive: iov_iter_for_each_range()
  lnet_return_rx_credits_locked: don't abuse list_entry
  xen: don't open-code iov_iter_kvec()
  orangefs: remove detritus from struct orangefs_kiocb_s
  kill iov_shorten()
  bio_alloc_map_data(): do bmd-&gt;iter setup right there
  bio_copy_user_iov(): saner bio size calculation
  bio_map_user_iov(): get rid of copying iov_iter
  bio_copy_from_iter(): get rid of copying iov_iter
  move more stuff down into bio_copy_user_iov()
  blk_rq_map_user_iov(): move iov_iter_advance() down
  bio_map_user_iov(): get rid of the iov_for_each()
  bio_map_user_iov(): move alignment check into the main loop
  don't rely upon subsequent bio_add_pc_page() calls failing
  ... and with iov_iter_get_pages_alloc() it becomes even simpler
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull iov_iter updates from Al Viro:

 - bio_{map,copy}_user_iov() series; those are cleanups - fixes from the
   same pile went into mainline (and stable) in late September.

 - fs/iomap.c iov_iter-related fixes

 - new primitive - iov_iter_for_each_range(), which applies a function
   to kernel-mapped segments of an iov_iter.

   Usable for kvec and bvec ones, the latter does kmap()/kunmap() around
   the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the
   latter is not hard to fix if the need ever appears, the former is by
   design.

   Another related primitive will have to wait for the next cycle - it
   passes page + offset + size instead of pointer + size, and that one
   will be usable for everything _except_ kvec. Unfortunately, that one
   didn't get exposure in -next yet, so...

 - a bit more lustre iov_iter work, including a use case for
   iov_iter_for_each_range() (checksum calculation)

 - vhost/scsi leak fix in failure exit

 - misc cleanups and detritectomy...

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
  iomap_dio_actor(): fix iov_iter bugs
  switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range()
  lustre: switch struct ksock_conn to iov_iter
  vhost/scsi: switch to iov_iter_get_pages()
  fix a page leak in vhost_scsi_iov_to_sgl() error recovery
  new primitive: iov_iter_for_each_range()
  lnet_return_rx_credits_locked: don't abuse list_entry
  xen: don't open-code iov_iter_kvec()
  orangefs: remove detritus from struct orangefs_kiocb_s
  kill iov_shorten()
  bio_alloc_map_data(): do bmd-&gt;iter setup right there
  bio_copy_user_iov(): saner bio size calculation
  bio_map_user_iov(): get rid of copying iov_iter
  bio_copy_from_iter(): get rid of copying iov_iter
  move more stuff down into bio_copy_user_iov()
  blk_rq_map_user_iov(): move iov_iter_advance() down
  bio_map_user_iov(): get rid of the iov_for_each()
  bio_map_user_iov(): move alignment check into the main loop
  don't rely upon subsequent bio_add_pc_page() calls failing
  ... and with iov_iter_get_pages_alloc() it becomes even simpler
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>bio: ensure __bio_clone_fast copies bi_partno</title>
<updated>2017-11-17T15:29:34+00:00</updated>
<author>
<name>Michael Lyle</name>
<email>mlyle@lyle.org</email>
</author>
<published>2017-11-17T07:47:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=62530ed8b1d07a45dec94d46e521c0c6c2d476e6'/>
<id>62530ed8b1d07a45dec94d46e521c0c6c2d476e6</id>
<content type='text'>
A new field was introduced in 74d46992e0d9, bi_partno, instead of using
bdev-&gt;bd_contains and encoding the partition information in the bi_bdev
field.  __bio_clone_fast was changed to copy the disk information, but
not the partition information.  At minimum, this regressed bcache and
caused data corruption.

Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reported-by: Pavel Goran &lt;via-bcache@pvgoran.name&gt;
Reported-by: Campbell Steven &lt;casteven@gmail.com&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.14
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A new field was introduced in 74d46992e0d9, bi_partno, instead of using
bdev-&gt;bd_contains and encoding the partition information in the bi_bdev
field.  __bio_clone_fast was changed to copy the disk information, but
not the partition information.  At minimum, this regressed bcache and
caused data corruption.

Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reported-by: Pavel Goran &lt;via-bcache@pvgoran.name&gt;
Reported-by: Campbell Steven &lt;casteven@gmail.com&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.14
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-11-14T23:32:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-14T23:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e2c5923c349c1738fe8fda980874d93f6fb2e5b6'/>
<id>e2c5923c349c1738fe8fda980874d93f6fb2e5b6</id>
<content type='text'>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()</title>
<updated>2017-10-26T05:54:17+00:00</updated>
<author>
<name>Byungchul Park</name>
<email>byungchul.park@lge.com</email>
</author>
<published>2017-10-25T08:56:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e319e1fbd9d42420ab6eec0bfd75eb9ad7ca63b1'/>
<id>e319e1fbd9d42420ab6eec0bfd75eb9ad7ca63b1</id>
<content type='text'>
Darrick posted the following warning and Dave Chinner analyzed it:

&gt; ======================================================
&gt; WARNING: possible circular locking dependency detected
&gt; 4.14.0-rc1-fixes #1 Tainted: G        W
&gt; ------------------------------------------------------
&gt; loop0/31693 is trying to acquire lock:
&gt;  (&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock){++++}, at: [&lt;ffffffffa00f1b0c&gt;] xfs_ilock+0x23c/0x330 [xfs]
&gt;
&gt; but now in release context of a crosslock acquired at the following:
&gt;  ((complete)&amp;ret.event){+.+.}, at: [&lt;ffffffff81326c1f&gt;] submit_bio_wait+0x7f/0xb0
&gt;
&gt; which lock already depends on the new lock.
&gt;
&gt; the existing dependency chain (in reverse order) is:
&gt;
&gt; -&gt; #2 ((complete)&amp;ret.event){+.+.}:
&gt;        lock_acquire+0xab/0x200
&gt;        wait_for_completion_io+0x4e/0x1a0
&gt;        submit_bio_wait+0x7f/0xb0
&gt;        blkdev_issue_zeroout+0x71/0xa0
&gt;        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
&gt;        xfs_bmapi_write+0x374/0x11f0 [xfs]
&gt;        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
&gt;        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
&gt;        iomap_apply+0x43/0xe0
&gt;        dax_iomap_rw+0x89/0xf0
&gt;        xfs_file_dax_write+0xcc/0x220 [xfs]
&gt;        xfs_file_write_iter+0xf0/0x130 [xfs]
&gt;        __vfs_write+0xd9/0x150
&gt;        vfs_write+0xc8/0x1c0
&gt;        SyS_write+0x45/0xa0
&gt;        entry_SYSCALL_64_fastpath+0x1f/0xbe
&gt;
&gt; -&gt; #1 (&amp;xfs_nondir_ilock_class){++++}:
&gt;        lock_acquire+0xab/0x200
&gt;        down_write_nested+0x4a/0xb0
&gt;        xfs_ilock+0x263/0x330 [xfs]
&gt;        xfs_setattr_size+0x152/0x370 [xfs]
&gt;        xfs_vn_setattr+0x6b/0x90 [xfs]
&gt;        notify_change+0x27d/0x3f0
&gt;        do_truncate+0x5b/0x90
&gt;        path_openat+0x237/0xa90
&gt;        do_filp_open+0x8a/0xf0
&gt;        do_sys_open+0x11c/0x1f0
&gt;        entry_SYSCALL_64_fastpath+0x1f/0xbe
&gt;
&gt; -&gt; #0 (&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock){++++}:
&gt;        up_write+0x1c/0x40
&gt;        xfs_iunlock+0x1d0/0x310 [xfs]
&gt;        xfs_file_fallocate+0x8a/0x310 [xfs]
&gt;        loop_queue_work+0xb7/0x8d0
&gt;        kthread_worker_fn+0xb9/0x1f0
&gt;
&gt; Chain exists of:
&gt;   &amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock --&gt; &amp;xfs_nondir_ilock_class --&gt; (complete)&amp;ret.event
&gt;
&gt;  Possible unsafe locking scenario by crosslock:
&gt;
&gt;        CPU0                    CPU1
&gt;        ----                    ----
&gt;   lock(&amp;xfs_nondir_ilock_class);
&gt;   lock((complete)&amp;ret.event);
&gt;                                lock(&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock);
&gt;                                unlock((complete)&amp;ret.event);
&gt;
&gt;                *** DEADLOCK ***

The warning is a false positive, caused by the fact that all
wait_for_completion()s in submit_bio_wait() are waiting with the same
lock class.

However, some bios have nothing to do with others, for example in the case
of loop devices, there's no direct connection between the bios of an upper
device and the bios of a lower device(=loop device).

The safest way to assign different lock classes to different devices is
to do it for each gendisk. In other words, this patch assigns a
lockdep_map per gendisk and uses it when initializing completion in
submit_bio_wait().

Analyzed-by: Dave Chinner &lt;david@fromorbit.com&gt;
Reported-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Byungchul Park &lt;byungchul.park@lge.com&gt;
Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: amir73il@gmail.com
Cc: axboe@kernel.dk
Cc: david@fromorbit.com
Cc: hch@infradead.org
Cc: idryomov@gmail.com
Cc: johan@kernel.org
Cc: johannes.berg@intel.com
Cc: kernel-team@lge.com
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-xfs@vger.kernel.org
Cc: oleg@redhat.com
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1508921765-15396-10-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Darrick posted the following warning and Dave Chinner analyzed it:

&gt; ======================================================
&gt; WARNING: possible circular locking dependency detected
&gt; 4.14.0-rc1-fixes #1 Tainted: G        W
&gt; ------------------------------------------------------
&gt; loop0/31693 is trying to acquire lock:
&gt;  (&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock){++++}, at: [&lt;ffffffffa00f1b0c&gt;] xfs_ilock+0x23c/0x330 [xfs]
&gt;
&gt; but now in release context of a crosslock acquired at the following:
&gt;  ((complete)&amp;ret.event){+.+.}, at: [&lt;ffffffff81326c1f&gt;] submit_bio_wait+0x7f/0xb0
&gt;
&gt; which lock already depends on the new lock.
&gt;
&gt; the existing dependency chain (in reverse order) is:
&gt;
&gt; -&gt; #2 ((complete)&amp;ret.event){+.+.}:
&gt;        lock_acquire+0xab/0x200
&gt;        wait_for_completion_io+0x4e/0x1a0
&gt;        submit_bio_wait+0x7f/0xb0
&gt;        blkdev_issue_zeroout+0x71/0xa0
&gt;        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
&gt;        xfs_bmapi_write+0x374/0x11f0 [xfs]
&gt;        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
&gt;        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
&gt;        iomap_apply+0x43/0xe0
&gt;        dax_iomap_rw+0x89/0xf0
&gt;        xfs_file_dax_write+0xcc/0x220 [xfs]
&gt;        xfs_file_write_iter+0xf0/0x130 [xfs]
&gt;        __vfs_write+0xd9/0x150
&gt;        vfs_write+0xc8/0x1c0
&gt;        SyS_write+0x45/0xa0
&gt;        entry_SYSCALL_64_fastpath+0x1f/0xbe
&gt;
&gt; -&gt; #1 (&amp;xfs_nondir_ilock_class){++++}:
&gt;        lock_acquire+0xab/0x200
&gt;        down_write_nested+0x4a/0xb0
&gt;        xfs_ilock+0x263/0x330 [xfs]
&gt;        xfs_setattr_size+0x152/0x370 [xfs]
&gt;        xfs_vn_setattr+0x6b/0x90 [xfs]
&gt;        notify_change+0x27d/0x3f0
&gt;        do_truncate+0x5b/0x90
&gt;        path_openat+0x237/0xa90
&gt;        do_filp_open+0x8a/0xf0
&gt;        do_sys_open+0x11c/0x1f0
&gt;        entry_SYSCALL_64_fastpath+0x1f/0xbe
&gt;
&gt; -&gt; #0 (&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock){++++}:
&gt;        up_write+0x1c/0x40
&gt;        xfs_iunlock+0x1d0/0x310 [xfs]
&gt;        xfs_file_fallocate+0x8a/0x310 [xfs]
&gt;        loop_queue_work+0xb7/0x8d0
&gt;        kthread_worker_fn+0xb9/0x1f0
&gt;
&gt; Chain exists of:
&gt;   &amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock --&gt; &amp;xfs_nondir_ilock_class --&gt; (complete)&amp;ret.event
&gt;
&gt;  Possible unsafe locking scenario by crosslock:
&gt;
&gt;        CPU0                    CPU1
&gt;        ----                    ----
&gt;   lock(&amp;xfs_nondir_ilock_class);
&gt;   lock((complete)&amp;ret.event);
&gt;                                lock(&amp;(&amp;ip-&gt;i_mmaplock)-&gt;mr_lock);
&gt;                                unlock((complete)&amp;ret.event);
&gt;
&gt;                *** DEADLOCK ***

The warning is a false positive, caused by the fact that all
wait_for_completion()s in submit_bio_wait() are waiting with the same
lock class.

However, some bios have nothing to do with others, for example in the case
of loop devices, there's no direct connection between the bios of an upper
device and the bios of a lower device(=loop device).

The safest way to assign different lock classes to different devices is
to do it for each gendisk. In other words, this patch assigns a
lockdep_map per gendisk and uses it when initializing completion in
submit_bio_wait().

Analyzed-by: Dave Chinner &lt;david@fromorbit.com&gt;
Reported-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Byungchul Park &lt;byungchul.park@lge.com&gt;
Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: amir73il@gmail.com
Cc: axboe@kernel.dk
Cc: david@fromorbit.com
Cc: hch@infradead.org
Cc: idryomov@gmail.com
Cc: johan@kernel.org
Cc: johannes.berg@intel.com
Cc: kernel-team@lge.com
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-xfs@vger.kernel.org
Cc: oleg@redhat.com
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1508921765-15396-10-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
