<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/backing-dev-defs.h, branch v4.17-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-11-14T23:32:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-14T23:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e2c5923c349c1738fe8fda980874d93f6fb2e5b6'/>
<id>e2c5923c349c1738fe8fda980874d93f6fb2e5b6</id>
<content type='text'>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>License cleanup: add SPDX GPL-2.0 license identifier to files with no license</title>
<updated>2017-11-02T10:10:55+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2017-11-01T14:07:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b24413180f5600bcb3bb70fbed5cf186b60864bd'/>
<id>b24413180f5600bcb3bb70fbed5cf186b60864bd</id>
<content type='text'>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode &amp; Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained &gt;5
   lines of source
 - File already had some variant of a license header in it (even if &lt;5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Reviewed-by: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode &amp; Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained &gt;5
   lines of source
 - File already had some variant of a license header in it (even if &lt;5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Reviewed-by: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>writeback: eliminate work item allocation in bd_start_writeback()</title>
<updated>2017-10-04T17:24:12+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2017-09-30T08:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=85009b4f5f0399669a44f07cb9a5622c0e71d419'/>
<id>85009b4f5f0399669a44f07cb9a5622c0e71d419</id>
<content type='text'>
Handle start-all writeback like we do periodic or kupdate
style writeback - by marking the bdi_writeback as needing a full
flush, and simply waking the thread. This eliminates the need to
allocate and queue a specific work item just for this purpose.

After this change, we truly only ever have one of them running at
any point in time. We mark the need to start all flushes, and the
writeback thread will clear it once it has processed the request.

Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Handle start-all writeback like we do periodic or kupdate
style writeback - by marking the bdi_writeback as needing a full
flush, and simply waking the thread. This eliminates the need to
allocate and queue a specific work item just for this purpose.

After this change, we truly only ever have one of them running at
any point in time. We mark the need to start all flushes, and the
writeback thread will clear it once it has processed the request.

Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>writeback: only allow one inflight and pending full flush</title>
<updated>2017-10-03T14:38:17+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2017-09-28T17:31:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=aac8d41cd438f25bf3110fc6b98f1d16d7dbc169'/>
<id>aac8d41cd438f25bf3110fc6b98f1d16d7dbc169</id>
<content type='text'>
When someone calls wakeup_flusher_threads() or
wakeup_flusher_threads_bdi(), they schedule writeback of all dirty
pages in the system (or on that bdi). If we are tight on memory, we
can get tons of these queued from kswapd/vmscan. This causes (at
least) two problems:

1) We consume a ton of memory just allocating writeback work items.
   We've seen as much as 600 million of these writeback work items
   pending. That's a lot of memory to pointlessly hold hostage,
   while the box is under memory pressure.

2) We spend so much time processing these work items, that we
   introduce a softlockup in writeback processing. This is because
   each of the writeback work items don't end up doing any work (it's
   hard when you have millions of identical ones coming in to the
   flush machinery), so we just sit in a tight loop pulling work
   items and deleting/freeing them.

Fix this by adding a 'start_all' bit to the writeback structure, and
set that when someone attempts to flush all dirty pages. The bit is
cleared when we start writeback on that work item. If the bit is
already set when we attempt to queue !nr_pages writeback, then we
simply ignore it.

This provides us one full flush in flight, with one pending as well,
and makes for more efficient handling of this type of writeback.

Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Tested-by: Chris Mason &lt;clm@fb.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When someone calls wakeup_flusher_threads() or
wakeup_flusher_threads_bdi(), they schedule writeback of all dirty
pages in the system (or on that bdi). If we are tight on memory, we
can get tons of these queued from kswapd/vmscan. This causes (at
least) two problems:

1) We consume a ton of memory just allocating writeback work items.
   We've seen as much as 600 million of these writeback work items
   pending. That's a lot of memory to pointlessly hold hostage,
   while the box is under memory pressure.

2) We spend so much time processing these work items, that we
   introduce a softlockup in writeback processing. This is because
   each of the writeback work items don't end up doing any work (it's
   hard when you have millions of identical ones coming in to the
   flush machinery), so we just sit in a tight loop pulling work
   items and deleting/freeing them.

Fix this by adding a 'start_all' bit to the writeback structure, and
set that when someone attempts to flush all dirty pages. The bit is
cleared when we start writeback on that work item. If the bit is
already set when we attempt to queue !nr_pages writeback, then we
simply ignore it.

This provides us one full flush in flight, with one pending as well,
and makes for more efficient handling of this type of writeback.

Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Tested-by: Chris Mason &lt;clm@fb.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: Provide infrastructure for dynamic BDIs in filesystems</title>
<updated>2017-04-20T18:09:55+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-04-12T10:24:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fca39346a55bb7196888ffc77d9e3557340d1d0b'/>
<id>fca39346a55bb7196888ffc77d9e3557340d1d0b</id>
<content type='text'>
Provide helper functions for setting up dynamically allocated
backing_dev_info structures for filesystems and cleaning them up on
superblock destruction.

CC: linux-mtd@lists.infradead.org
CC: linux-nfs@vger.kernel.org
CC: Petr Vandrovec &lt;petr@vandrovec.name&gt;
CC: linux-nilfs@vger.kernel.org
CC: cluster-devel@redhat.com
CC: osd-dev@open-osd.org
CC: codalist@coda.cs.cmu.edu
CC: linux-afs@lists.infradead.org
CC: ecryptfs@vger.kernel.org
CC: linux-cifs@vger.kernel.org
CC: ceph-devel@vger.kernel.org
CC: linux-btrfs@vger.kernel.org
CC: v9fs-developer@lists.sourceforge.net
CC: lustre-devel@lists.lustre.org
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Provide helper functions for setting up dynamically allocated
backing_dev_info structures for filesystems and cleaning them up on
superblock destruction.

CC: linux-mtd@lists.infradead.org
CC: linux-nfs@vger.kernel.org
CC: Petr Vandrovec &lt;petr@vandrovec.name&gt;
CC: linux-nilfs@vger.kernel.org
CC: cluster-devel@redhat.com
CC: osd-dev@open-osd.org
CC: codalist@coda.cs.cmu.edu
CC: linux-afs@lists.infradead.org
CC: ecryptfs@vger.kernel.org
CC: linux-cifs@vger.kernel.org
CC: ceph-devel@vger.kernel.org
CC: linux-btrfs@vger.kernel.org
CC: v9fs-developer@lists.sourceforge.net
CC: lustre-devel@lists.lustre.org
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bdi: Do not wait for cgwbs release in bdi_unregister()</title>
<updated>2017-03-23T02:11:30+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-03-23T00:36:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4514451e79ae5baabb85d22ba3523602e59d5218'/>
<id>4514451e79ae5baabb85d22ba3523602e59d5218</id>
<content type='text'>
Currently we wait for all cgwbs to get released in cgwb_bdi_destroy()
(called from bdi_unregister()). That is however unnecessary now when
cgwb-&gt;bdi is a proper refcounted reference (thus bdi cannot get
released before all cgwbs are released) and when cgwb_bdi_destroy()
shuts down writeback directly.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we wait for all cgwbs to get released in cgwb_bdi_destroy()
(called from bdi_unregister()). That is however unnecessary now when
cgwb-&gt;bdi is a proper refcounted reference (thus bdi cannot get
released before all cgwbs are released) and when cgwb_bdi_destroy()
shuts down writeback directly.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()</title>
<updated>2017-03-23T02:11:28+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-03-23T00:36:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5318ce7d46866e1dbc20ab9349b93753edba0b3e'/>
<id>5318ce7d46866e1dbc20ab9349b93753edba0b3e</id>
<content type='text'>
Currently we waited for all cgwbs to get freed in cgwb_bdi_destroy()
which also means that writeback has been shutdown on them. Since this
wait is going away, directly shutdown writeback on cgwbs from
cgwb_bdi_destroy() to avoid live writeback structures after
bdi_unregister() has finished. To make that safe with concurrent
shutdown from cgwb_release_workfn(), we also have to make sure
wb_shutdown() returns only after the bdi_writeback structure is really
shutdown.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we waited for all cgwbs to get freed in cgwb_bdi_destroy()
which also means that writeback has been shutdown on them. Since this
wait is going away, directly shutdown writeback on cgwbs from
cgwb_bdi_destroy() to avoid live writeback structures after
bdi_unregister() has finished. To make that safe with concurrent
shutdown from cgwb_release_workfn(), we also have to make sure
wb_shutdown() returns only after the bdi_writeback structure is really
shutdown.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bdi: Mark congested-&gt;bdi as internal</title>
<updated>2017-03-23T02:11:24+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-03-23T00:36:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7d680d7bf584bce6023343304b819009a7c3336'/>
<id>b7d680d7bf584bce6023343304b819009a7c3336</id>
<content type='text'>
congested-&gt;bdi pointer is used only to be able to remove congested
structure from bdi-&gt;cgwb_congested_tree on structure release. Moreover
the pointer can become NULL when we unregister the bdi. Rename the field
to __bdi and add a comment to make it more explicit this is internal
stuff of memcg writeback code and people should not use the field as
such use will be likely race prone.

We do not bother with converting congested-&gt;bdi to a proper refcounted
reference. It will be slightly ugly to special-case bdi-&gt;wb.congested to
avoid effectively a cyclic reference of bdi to itself and the reference
gets cleared from bdi_unregister() making it impossible to reference
a freed bdi.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
congested-&gt;bdi pointer is used only to be able to remove congested
structure from bdi-&gt;cgwb_congested_tree on structure release. Moreover
the pointer can become NULL when we unregister the bdi. Rename the field
to __bdi and add a comment to make it more explicit this is internal
stuff of memcg writeback code and people should not use the field as
such use will be likely race prone.

We do not bother with converting congested-&gt;bdi to a proper refcounted
reference. It will be slightly ugly to special-case bdi-&gt;wb.congested to
avoid effectively a cyclic reference of bdi to itself and the reference
gets cleared from bdi_unregister() making it impossible to reference
a freed bdi.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: Dynamically allocate and refcount backing_dev_info</title>
<updated>2017-02-02T15:20:50+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-02-02T14:56:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d03f6cdc1fc422accb734c7c07a661a0018d8631'/>
<id>d03f6cdc1fc422accb734c7c07a661a0018d8631</id>
<content type='text'>
Instead of storing backing_dev_info inside struct request_queue,
allocate it dynamically, reference count it, and free it when the last
reference is dropped. Currently only request_queue holds the reference
but in the following patch we add other users referencing
backing_dev_info.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of storing backing_dev_info inside struct request_queue,
allocate it dynamically, reference count it, and free it when the last
reference is dropped. Currently only request_queue holds the reference
but in the following patch we add other users referencing
backing_dev_info.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block</title>
<updated>2016-12-13T18:19:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-13T18:19:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36869cb93d36269f34800b3384ba7991060a69cf'/>
<id>36869cb93d36269f34800b3384ba7991060a69cf</id>
<content type='text'>
Pull block layer updates from Jens Axboe:
 "This is the main block pull request this series. Contrary to previous
  release, I've kept the core and driver changes in the same branch. We
  always ended up having dependencies between the two for obvious
  reasons, so makes more sense to keep them together. That said, I'll
  probably try and keep more topical branches going forward, especially
  for cycles that end up being as busy as this one.

  The major parts of this pull request is:

   - Improved support for O_DIRECT on block devices, with a small
     private implementation instead of using the pig that is
     fs/direct-io.c. From Christoph.

   - Request completion tracking in a scalable fashion. This is utilized
     by two components in this pull, the new hybrid polling and the
     writeback queue throttling code.

   - Improved support for polling with O_DIRECT, adding a hybrid mode
     that combines pure polling with an initial sleep. From me.

   - Support for automatic throttling of writeback queues on the block
     side. This uses feedback from the device completion latencies to
     scale the queue on the block side up or down. From me.

   - Support from SMR drives in the block layer and for SD. From Hannes
     and Shaun.

   - Multi-connection support for nbd. From Josef.

   - Cleanup of request and bio flags, so we have a clear split between
     which are bio (or rq) private, and which ones are shared. From
     Christoph.

   - A set of patches from Bart, that improve how we handle queue
     stopping and starting in blk-mq.

   - Support for WRITE_ZEROES from Chaitanya.

   - Lightnvm updates from Javier/Matias.

   - Supoort for FC for the nvme-over-fabrics code. From James Smart.

   - A bunch of fixes from a whole slew of people, too many to name
     here"

* 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
  blk-stat: fix a few cases of missing batch flushing
  blk-flush: run the queue when inserting blk-mq flush
  elevator: make the rqhash helpers exported
  blk-mq: abstract out blk_mq_dispatch_rq_list() helper
  blk-mq: add blk_mq_start_stopped_hw_queue()
  block: improve handling of the magic discard payload
  blk-wbt: don't throttle discard or write zeroes
  nbd: use dev_err_ratelimited in io path
  nbd: reset the setup task for NBD_CLEAR_SOCK
  nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
  nvme-fabrics: Add target support for FC transport
  nvme-fabrics: Add host support for FC transport
  nvme-fabrics: Add FC transport LLDD api definitions
  nvme-fabrics: Add FC transport FC-NVME definitions
  nvme-fabrics: Add FC transport error codes to nvme.h
  Add type 0x28 NVME type code to scsi fc headers
  nvme-fabrics: patch target code in prep for FC transport support
  nvme-fabrics: set sqe.command_id in core not transports
  parser: add u64 number parser
  nvme-rdma: align to generic ib_event logging helper
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block layer updates from Jens Axboe:
 "This is the main block pull request this series. Contrary to previous
  release, I've kept the core and driver changes in the same branch. We
  always ended up having dependencies between the two for obvious
  reasons, so makes more sense to keep them together. That said, I'll
  probably try and keep more topical branches going forward, especially
  for cycles that end up being as busy as this one.

  The major parts of this pull request is:

   - Improved support for O_DIRECT on block devices, with a small
     private implementation instead of using the pig that is
     fs/direct-io.c. From Christoph.

   - Request completion tracking in a scalable fashion. This is utilized
     by two components in this pull, the new hybrid polling and the
     writeback queue throttling code.

   - Improved support for polling with O_DIRECT, adding a hybrid mode
     that combines pure polling with an initial sleep. From me.

   - Support for automatic throttling of writeback queues on the block
     side. This uses feedback from the device completion latencies to
     scale the queue on the block side up or down. From me.

   - Support from SMR drives in the block layer and for SD. From Hannes
     and Shaun.

   - Multi-connection support for nbd. From Josef.

   - Cleanup of request and bio flags, so we have a clear split between
     which are bio (or rq) private, and which ones are shared. From
     Christoph.

   - A set of patches from Bart, that improve how we handle queue
     stopping and starting in blk-mq.

   - Support for WRITE_ZEROES from Chaitanya.

   - Lightnvm updates from Javier/Matias.

   - Supoort for FC for the nvme-over-fabrics code. From James Smart.

   - A bunch of fixes from a whole slew of people, too many to name
     here"

* 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
  blk-stat: fix a few cases of missing batch flushing
  blk-flush: run the queue when inserting blk-mq flush
  elevator: make the rqhash helpers exported
  blk-mq: abstract out blk_mq_dispatch_rq_list() helper
  blk-mq: add blk_mq_start_stopped_hw_queue()
  block: improve handling of the magic discard payload
  blk-wbt: don't throttle discard or write zeroes
  nbd: use dev_err_ratelimited in io path
  nbd: reset the setup task for NBD_CLEAR_SOCK
  nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
  nvme-fabrics: Add target support for FC transport
  nvme-fabrics: Add host support for FC transport
  nvme-fabrics: Add FC transport LLDD api definitions
  nvme-fabrics: Add FC transport FC-NVME definitions
  nvme-fabrics: Add FC transport error codes to nvme.h
  Add type 0x28 NVME type code to scsi fc headers
  nvme-fabrics: patch target code in prep for FC transport support
  nvme-fabrics: set sqe.command_id in core not transports
  parser: add u64 number parser
  nvme-rdma: align to generic ib_event logging helper
  ...
</pre>
</div>
</content>
</entry>
</feed>
