<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/mtd/mtdcore.c, branch v3.11</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>drivers: avoid format string in dev_set_name</title>
<updated>2013-07-03T23:07:41+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2013-07-03T22:04:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=02aa2a37636c8fa4fb9322d91be46ff8225b7de0'/>
<id>02aa2a37636c8fa4fb9322d91be46ff8225b7de0</id>
<content type='text'>
Calling dev_set_name with a single paramter causes it to be handled as a
format string.  Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents,
including wrappers like device_create*() and bdi_register().

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Calling dev_set_name with a single paramter causes it to be handled as a
format string.  Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents,
including wrappers like device_create*() and bdi_register().

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-20130509' of git://git.infradead.org/linux-mtd</title>
<updated>2013-05-09T17:15:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-05-09T17:15:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a637b0d45947df686979b85361ad5bfa9d19fdd3'/>
<id>a637b0d45947df686979b85361ad5bfa9d19fdd3</id>
<content type='text'>
Pull MTD update from David Woodhouse:

 - Lots of cleanups from Artem, including deletion of some obsolete
   drivers

 - Support partitions larger than 4GiB in device tree

 - Support for new SPI chips

* tag 'for-linus-20130509' of git://git.infradead.org/linux-mtd: (83 commits)
  mtd: omap2: Use module_platform_driver()
  mtd: bf5xx_nand: Use module_platform_driver()
  mtd: denali_dt: Remove redundant use of of_match_ptr
  mtd: denali_dt: Change return value to fix smatch warning
  mtd: denali_dt: Use module_platform_driver()
  mtd: denali_dt: Fix incorrect error check
  mtd: nand: subpage write support for hardware based ECC schemes
  mtd: omap2: use msecs_to_jiffies()
  mtd: nand_ids: use size macros
  mtd: nand_ids: improve LEGACY_ID_NAND macro a bit
  mtd: add 4 Toshiba nand chips for the full-id case
  mtd: add the support to parse out the full-id nand type
  mtd: add new fields to nand_flash_dev{}
  mtd: sh_flctl: Use of_match_ptr() macro
  mtd: gpio: Use of_match_ptr() macro
  mtd: gpio: Use devm_kzalloc()
  mtd: davinci_nand: Use of_match_ptr()
  mtd: dataflash: Use of_match_ptr() macro
  mtd: remove h720x flash support
  mtd: onenand: remove OneNAND simulator
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull MTD update from David Woodhouse:

 - Lots of cleanups from Artem, including deletion of some obsolete
   drivers

 - Support partitions larger than 4GiB in device tree

 - Support for new SPI chips

* tag 'for-linus-20130509' of git://git.infradead.org/linux-mtd: (83 commits)
  mtd: omap2: Use module_platform_driver()
  mtd: bf5xx_nand: Use module_platform_driver()
  mtd: denali_dt: Remove redundant use of of_match_ptr
  mtd: denali_dt: Change return value to fix smatch warning
  mtd: denali_dt: Use module_platform_driver()
  mtd: denali_dt: Fix incorrect error check
  mtd: nand: subpage write support for hardware based ECC schemes
  mtd: omap2: use msecs_to_jiffies()
  mtd: nand_ids: use size macros
  mtd: nand_ids: improve LEGACY_ID_NAND macro a bit
  mtd: add 4 Toshiba nand chips for the full-id case
  mtd: add the support to parse out the full-id nand type
  mtd: add new fields to nand_flash_dev{}
  mtd: sh_flctl: Use of_match_ptr() macro
  mtd: gpio: Use of_match_ptr() macro
  mtd: gpio: Use devm_kzalloc()
  mtd: davinci_nand: Use of_match_ptr()
  mtd: dataflash: Use of_match_ptr() macro
  mtd: remove h720x flash support
  mtd: onenand: remove OneNAND simulator
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Include missing linux/slab.h inclusions</title>
<updated>2013-04-29T19:42:01+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2013-04-11T22:51:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0d01ff2583086fd532181d2ee16112f5342eb78d'/>
<id>0d01ff2583086fd532181d2ee16112f5342eb78d</id>
<content type='text'>
Include missing linux/slab.h inclusions where the source file is currently
expecting to get kmalloc() and co. through linux/proc_fs.h.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
cc: linux-s390@vger.kernel.org
cc: sparclinux@vger.kernel.org
cc: linux-efi@vger.kernel.org
cc: linux-mtd@lists.infradead.org
cc: devel@driverdev.osuosl.org
cc: x86@kernel.org
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Include missing linux/slab.h inclusions where the source file is currently
expecting to get kmalloc() and co. through linux/proc_fs.h.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
cc: linux-s390@vger.kernel.org
cc: sparclinux@vger.kernel.org
cc: linux-efi@vger.kernel.org
cc: linux-mtd@lists.infradead.org
cc: devel@driverdev.osuosl.org
cc: x86@kernel.org
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mtd: merge mtdchar module with mtdcore</title>
<updated>2013-04-05T12:16:54+00:00</updated>
<author>
<name>Artem Bityutskiy</name>
<email>artem.bityutskiy@linux.intel.com</email>
</author>
<published>2013-03-14T11:27:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=660685d9d1b4730f0b5ca97fa95f272f99c63bce'/>
<id>660685d9d1b4730f0b5ca97fa95f272f99c63bce</id>
<content type='text'>
The MTD subsystem has historically tried to be as configurable as possible. The
side-effect of this is that its configuration menu is rather large, and we are
gradually shrinking it. For example, we recently merged partitions support with
the mtdcore.

This patch does the next step - it merges the mtdchar module to mtdcore. And in
this case this is not only about eliminating too fine-grained separation and
simplifying the configuration menu. This is also about eliminating seemingly
useless kernel module.

Indeed, mtdchar is a module that allows user-space making use of MTD devices
via /dev/mtd* character devices. If users do not enable it, they simply cannot
use MTD devices at all. They cannot read or write the flash contents. Is it a
sane and useful setup? I believe not. And everyone just enables mtdchar.

Having mtdchar separate is also a little bit harmful. People sometimes miss the
fact that they need to enable an additional configuration option to have
user-space MTD interfaces, and then they wonder why on earth the kernel does
not allow using the flash? They spend time asking around.

Thus, let's just get rid of this module and make it part of mtd core.

Note, mtdchar had additional configuration option to enable OTP interfaces,
which are present on some flashes. I removed that option as well - it saves a
really tiny amount space.

[dwmw2: Strictly speaking, you can mount file systems on MTD devices just
        fine without the mtdchar (or mtdblock) devices; you just can't do
        other manipulations directly on the underlying device. But still I
        agree that it makes sense to make this unconditional. And Yay! we
        get to kill off an instance of checking CONFIG_foo_MODULE, which is
        an abomination that should never happen.]

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The MTD subsystem has historically tried to be as configurable as possible. The
side-effect of this is that its configuration menu is rather large, and we are
gradually shrinking it. For example, we recently merged partitions support with
the mtdcore.

This patch does the next step - it merges the mtdchar module to mtdcore. And in
this case this is not only about eliminating too fine-grained separation and
simplifying the configuration menu. This is also about eliminating seemingly
useless kernel module.

Indeed, mtdchar is a module that allows user-space making use of MTD devices
via /dev/mtd* character devices. If users do not enable it, they simply cannot
use MTD devices at all. They cannot read or write the flash contents. Is it a
sane and useful setup? I believe not. And everyone just enables mtdchar.

Having mtdchar separate is also a little bit harmful. People sometimes miss the
fact that they need to enable an additional configuration option to have
user-space MTD interfaces, and then they wonder why on earth the kernel does
not allow using the flash? They spend time asking around.

Thus, let's just get rid of this module and make it part of mtd core.

Note, mtdchar had additional configuration option to enable OTP interfaces,
which are present on some flashes. I removed that option as well - it saves a
really tiny amount space.

[dwmw2: Strictly speaking, you can mount file systems on MTD devices just
        fine without the mtdchar (or mtdblock) devices; you just can't do
        other manipulations directly on the underlying device. But still I
        agree that it makes sense to make this unconditional. And Yay! we
        get to kill off an instance of checking CONFIG_foo_MODULE, which is
        an abomination that should never happen.]

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mtd: mtdcore: remove few useless #ifdef's</title>
<updated>2013-04-05T12:15:31+00:00</updated>
<author>
<name>Artem Bityutskiy</name>
<email>artem.bityutskiy@linux.intel.com</email>
</author>
<published>2013-03-15T10:56:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=93e562141ac58452bcea48718af92cc62a20669d'/>
<id>93e562141ac58452bcea48718af92cc62a20669d</id>
<content type='text'>
Remove a couple of useles '#ifdef CONFIG_PROC_FS's around procfs functions
which anyway turn into empty function in 'proc_fs.h' file when CONFIG_PROC_FS
is not defined.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove a couple of useles '#ifdef CONFIG_PROC_FS's around procfs functions
which anyway turn into empty function in 'proc_fs.h' file when CONFIG_PROC_FS
is not defined.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mtd: add 'const' qualifier to a couple of register functions</title>
<updated>2013-04-05T12:02:16+00:00</updated>
<author>
<name>Artem Bityutskiy</name>
<email>artem.bityutskiy@linux.intel.com</email>
</author>
<published>2013-03-11T13:38:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=26a4734623e4f06752014336b05cf3ae77158892'/>
<id>26a4734623e4f06752014336b05cf3ae77158892</id>
<content type='text'>
'mtd_device_parse_register()' and 'parse_mtd_partitions()' functions accept a
an array of character pointers. These functions modify neither the pointers nor
the characters they point to. The characters are actually names of the MTD
parsers.

At the moment, the argument type is 'const char **', which means that only the
names of the parsers are constant. Let's turn the argument type into 'const
char * const *', which means that both names and the pointers which point to
them are constant.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'mtd_device_parse_register()' and 'parse_mtd_partitions()' functions accept a
an array of character pointers. These functions modify neither the pointers nor
the characters they point to. The characters are actually names of the MTD
parsers.

At the moment, the argument type is 'const char **', which means that only the
names of the parsers are constant. Let's turn the argument type into 'const
char * const *', which means that both names and the pointers which point to
them are constant.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mtd: convert to idr_alloc()</title>
<updated>2013-02-28T03:10:18+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-02-28T01:04:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=589e9c4dace6995440c119486919ce95b180dd38'/>
<id>589e9c4dace6995440c119486919ce95b180dd38</id>
<content type='text'>
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Tested-by: Ezequiel Garcia &lt;ezequiel.garcia@free-electrons.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Tested-by: Ezequiel Garcia &lt;ezequiel.garcia@free-electrons.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage</title>
<updated>2012-12-10T19:03:05+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-10T18:51:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=caf491916b1c1e939a2c7575efb7a77f11fc9bdf'/>
<id>caf491916b1c1e939a2c7575efb7a77f11fc9bdf</id>
<content type='text'>
This reverts commits a50915394f1fc02c2861d3b7ce7014788aa5066e and
d7c3b937bdf45f0b844400b7bf6fd3ed50bac604.

This is a revert of a revert of a revert.  In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.

It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all.  We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.

When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation.  That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.

So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake.  Let's hope we never revisit
this mistake again - and certainly not this many times ;)

Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commits a50915394f1fc02c2861d3b7ce7014788aa5066e and
d7c3b937bdf45f0b844400b7bf6fd3ed50bac604.

This is a revert of a revert of a revert.  In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.

It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all.  We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.

When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation.  That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.

So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake.  Let's hope we never revisit
this mistake again - and certainly not this many times ;)

Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>revert "Revert "mm: remove __GFP_NO_KSWAPD""</title>
<updated>2012-11-30T16:51:17+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-11-29T21:54:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a50915394f1fc02c2861d3b7ce7014788aa5066e'/>
<id>a50915394f1fc02c2861d3b7ce7014788aa5066e</id>
<content type='text'>
It apepars that this patch was innocent, and we hope that "mm: avoid
waking kswapd for THP allocations when compaction is deferred or
contended" will fix the final kswapd-spinning cause.

Cc: Zdenek Kabelac &lt;zkabelac@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Valdis Kletnieks &lt;Valdis.Kletnieks@vt.edu&gt;
Cc: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Robert Jennings &lt;rcj@linux.vnet.ibm.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It apepars that this patch was innocent, and we hope that "mm: avoid
waking kswapd for THP allocations when compaction is deferred or
contended" will fix the final kswapd-spinning cause.

Cc: Zdenek Kabelac &lt;zkabelac@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Valdis Kletnieks &lt;Valdis.Kletnieks@vt.edu&gt;
Cc: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Robert Jennings &lt;rcj@linux.vnet.ibm.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "mm: remove __GFP_NO_KSWAPD"</title>
<updated>2012-11-27T01:41:24+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-11-27T00:29:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82b212f40059bffd6808c07266a942d444d5558a'/>
<id>82b212f40059bffd6808c07266a942d444d5558a</id>
<content type='text'>
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following

  Hmm,  so it's just took longer to hit the problem and observe
  kswapd0 spinning on my CPU again - it's not as endless like before -
  but still it easily eats minutes - it helps to	turn off  Firefox
  or TB  (memory hungry apps) so kswapd0 stops soon - and restart
  those apps again.  (And I still have like &gt;1GB of cached memory)

  kswapd0         R  running task        0    30      2 0x00000000
  Call Trace:
    preempt_schedule+0x42/0x60
    _raw_spin_unlock+0x55/0x60
    put_super+0x31/0x40
    drop_super+0x22/0x30
    prune_super+0x149/0x1b0
    shrink_slab+0xba/0x510

The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction.  That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.

The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.

If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided.  However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat-&gt;kswapd_max_order is updated each time.  This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.

The temptation is to supply a patch that checks if kswapd was woken for
THP and if so ignore pgdat-&gt;kswapd_max_order but it'll be a hack and not
backed up by proper testing.  As 3.7 is very close to release and this
is not a bug we should release with, a safer path is to revert "mm:
remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
out the balance_pgdat() logic in general.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Zdenek Kabelac &lt;zkabelac@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Valdis Kletnieks &lt;Valdis.Kletnieks@vt.edu&gt;
Cc: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Robert Jennings &lt;rcj@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following

  Hmm,  so it's just took longer to hit the problem and observe
  kswapd0 spinning on my CPU again - it's not as endless like before -
  but still it easily eats minutes - it helps to	turn off  Firefox
  or TB  (memory hungry apps) so kswapd0 stops soon - and restart
  those apps again.  (And I still have like &gt;1GB of cached memory)

  kswapd0         R  running task        0    30      2 0x00000000
  Call Trace:
    preempt_schedule+0x42/0x60
    _raw_spin_unlock+0x55/0x60
    put_super+0x31/0x40
    drop_super+0x22/0x30
    prune_super+0x149/0x1b0
    shrink_slab+0xba/0x510

The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction.  That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.

The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.

If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided.  However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat-&gt;kswapd_max_order is updated each time.  This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.

The temptation is to supply a patch that checks if kswapd was woken for
THP and if so ignore pgdat-&gt;kswapd_max_order but it'll be a hack and not
backed up by proper testing.  As 3.7 is very close to release and this
is not a bug we should release with, a safer path is to revert "mm:
remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
out the balance_pgdat() logic in general.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Zdenek Kabelac &lt;zkabelac@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Valdis Kletnieks &lt;Valdis.Kletnieks@vt.edu&gt;
Cc: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Robert Jennings &lt;rcj@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
