linux-toradex.git/mm/compaction.c, branch v4.7

mm, compaction: prevent VM_BUG_ON when terminating freeing scanner

2016-07-15T05:54:27+00:00

It's possible to isolate some freepages in a pageblock and then fail
split_free_page() due to the low watermark check.  In this case, we hit
VM_BUG_ON() because the freeing scanner terminated early without a
contended lock or enough freepages.

This should never have been a VM_BUG_ON() since it's not a fatal
condition.  It should have been a VM_WARN_ON() at best, or even handled
gracefully.

Regardless, we need to terminate anytime the full pageblock scan was not
done.  The logic belongs in isolate_freepages_block(), so handle its
state gracefully by terminating the pageblock loop and making a note to
restart at the same pageblock next time since it was not possible to
complete the scan this time.
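
The control flow described above can be sketched in user space as follows. This is a toy model, not the kernel code: `struct scanner`, `scan_block()`, `scan_zone()` and `BLOCK_PAGES` are hypothetical stand-ins for the scanner state and isolate_freepages_block(), and the failed split is simulated by a single `fail_at` pfn.

```c
#include <stddef.h>

#define BLOCK_PAGES 8   /* toy pageblock size, an assumption of this sketch */

struct scanner {
	size_t resume_pfn;  /* where the next compaction attempt restarts */
	size_t isolated;    /* pages isolated so far */
};

/*
 * Hypothetical stand-in for isolate_freepages_block(): scans [start, end)
 * and returns the pfn at which scanning stopped.  Stopping before `end`
 * models a failed split at pfn `fail_at`.
 */
static size_t scan_block(struct scanner *s, size_t start, size_t end,
			 size_t fail_at)
{
	size_t pfn;

	for (pfn = start; pfn < end; pfn++) {
		if (pfn == fail_at)
			return pfn;     /* scan of this block not completed */
		s->isolated++;
	}
	return end;
}

/*
 * Walk pageblocks from the end of the zone towards its start.  The zone
 * length is assumed to be a multiple of BLOCK_PAGES.
 */
static void scan_zone(struct scanner *s, size_t zone_start, size_t zone_end,
		      size_t fail_at)
{
	size_t block = zone_end;

	while (block > zone_start) {
		size_t block_start = block - BLOCK_PAGES;
		size_t stopped = scan_block(s, block_start, block, fail_at);

		if (stopped < block) {
			/*
			 * Partial scan: terminate the pageblock loop and
			 * remember where to resume, so pages already visited
			 * are not rescanned next time.
			 */
			s->resume_pfn = stopped;
			return;
		}
		block = block_start;
	}
	s->resume_pfn = zone_start;
}
```

Calling `scan_zone()` on a 32-page zone with `fail_at = 19` scans the top block fully, stops inside the next one, and leaves `resume_pfn` at 19 rather than tripping an assertion.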

[rientjes@google.com: don't rescan pages in a pageblock]
  Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1607111244150.83138@chino.kir.corp.google.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606291436300.145590@chino.kir.corp.google.com
Signed-off-by: David Rientjes 
Reported-by: Minchan Kim 
Tested-by: Minchan Kim 
Cc: Joonsoo Kim 
Cc: Hugh Dickins 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, compaction: abort free scanner if split fails

2016-06-25T00:23:52+00:00

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly.  If the
watermark is insufficient for a free page of order <= cc->order, then
terminate the scanner since all future splits will also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail while holding zone->lock.

compaction_alloc() iterating a 128GB zone has been benchmarked to take
over 400ms on some systems whereas any free page isolated and ready to
be split ends up failing in split_free_page() because of the low
watermark check and thus the iteration continues.

The next time compaction occurs, the freeing scanner will likely start
at the end of the zone again since no success was made previously and we
get the same lengthy iteration until the zone is brought above the low
watermark.  All thp page faults can take >400ms in such a state without
this fix.
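
To illustrate why aborting matters, here is a minimal user-space sketch. `try_split()` and `free_scan()` are hypothetical helpers, not kernel functions; the watermark check is reduced to a single boolean. It counts how many candidate free pages the scanner visits with and without the early abort.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for split_free_page(): returns 0 pages when the
 * zone is below its low watermark, otherwise 1 << order pages.
 */
static unsigned int try_split(bool below_low_watermark, unsigned int order)
{
	return below_low_watermark ? 0 : 1u << order;
}

/* Returns the number of candidate free pages the scanner visited. */
static size_t free_scan(size_t nr_candidates, bool below_low_watermark,
			bool abort_on_failure)
{
	size_t visited = 0;
	size_t i;

	for (i = 0; i < nr_candidates; i++) {
		visited++;
		/* A failed split means later splits will likely fail too. */
		if (try_split(below_low_watermark, 0) == 0 && abort_on_failure)
			break;
	}
	return visited;
}
```

With the zone below the low watermark, `free_scan(1000, true, false)` visits all 1000 candidates, while `free_scan(1000, true, true)` stops after the first failed split.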

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.com
Signed-off-by: David Rientjes 
Acked-by: Vlastimil Babka 
Cc: Minchan Kim 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm/compaction.c: fix zoneindex in kcompactd()

2016-05-21T00:58:30+00:00

While testing kcompactd on my platform, which has 3G of memory and only
a DMA zone, I found that kcompactd never wakes up.  It seems the
zoneindex has already been decremented by 1 beforehand, so the loop
condition here should be <=.

This fixes a regression where kswapd could previously compact, but
kcompactd could not.  It is not a crash fix, though.
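
The off-by-one can be reproduced with a small user-space sketch. `node_has_work()`, the `populated` array and the `inclusive` switch are hypothetical and exist only to contrast the two loop bounds; `MAX_NR_ZONES` is defined locally for the toy node.

```c
#include <stdbool.h>

#define MAX_NR_ZONES 4  /* toy value for this sketch */

/*
 * populated[i] says whether zone i has pages.  classzone_idx is the
 * highest zone index that should be considered.  With inclusive == false
 * the loop uses `<` and never looks at classzone_idx itself.
 */
static bool node_has_work(const bool populated[MAX_NR_ZONES],
			  int classzone_idx, bool inclusive)
{
	int last = inclusive ? classzone_idx : classzone_idx - 1;
	int zoneid;

	for (zoneid = 0; zoneid <= last; zoneid++) {
		if (populated[zoneid])
			return true;
	}
	return false;
}
```

On a node whose only populated zone is index 0, the exclusive bound never enters the loop, so no work is ever found; the inclusive bound finds the zone.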

[akpm@linux-foundation.org: fix kcompactd_do_work() as well, per Hugh]
Link: http://lkml.kernel.org/r/1463659121-84124-1-git-send-email-puck.chen@hisilicon.com
Fixes: accf62422b3a ("mm, kswapd: replace kswapd compaction with waking up kcompactd")
Signed-off-by: Chen Feng 
Acked-by: Vlastimil Babka 
Cc: Hugh Dickins 
Cc: Michal Hocko 
Cc: Kirill A. Shutemov 
Cc: Johannes Weiner 
Cc: Tejun Heo 
Cc: Zhuangluan Su 
Cc: Yiping Xu 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, oom, compaction: prevent from should_compact_retry looping for ever for costly orders

2016-05-21T00:58:30+00:00

"mm: consider compaction feedback also for costly allocation" removed
the upper bound for the reclaim/compaction retries based on the
number of reclaimed pages for costly orders.  While this is desirable,
the patch missed a bad interaction between reclaim, compaction and the
retry logic.  Direct reclaim tries to get zones over the min watermark
while compaction backs off and returns COMPACT_SKIPPED when all zones
are below low watermark + 1<
Acked-by: Hillf Danton 
Acked-by: Vlastimil Babka 
Cc: David Rientjes 
Cc: Johannes Weiner 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Tetsuo Handa 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, compaction: distinguish between full and partial COMPACT_COMPLETE

2016-05-21T00:58:30+00:00

COMPACT_COMPLETE now means that the compaction and free scanners met.
This is not very useful information if somebody just wants to use this
feedback and make decisions based on it.  The current caller might
be a poor guy who just happened to scan a tiny portion of the zone, and
that could be the reason no suitable pages were compacted.  Make sure we
distinguish between full and partial zone walks.

Consumers should treat COMPACT_PARTIAL_SKIPPED as a potential success
and be optimistic in retrying.
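
A minimal sketch of the distinction, under stated assumptions: the enum and both helpers are hypothetical, and the `whole_zone` flag stands for "this attempt started at the zone boundaries", which is how a full walk is assumed to be recognised here.

```c
#include <stdbool.h>

/* Hypothetical two-value subset of the compaction results. */
enum sketch_result {
	SKETCH_PARTIAL_SKIPPED, /* scanners met, but only part of the zone was walked */
	SKETCH_COMPLETE,        /* scanners met after a full zone walk */
};

/* Result to report once the two scanners have met. */
static enum sketch_result scanners_met_result(bool whole_zone)
{
	return whole_zone ? SKETCH_COMPLETE : SKETCH_PARTIAL_SKIPPED;
}

/* A partial walk says little about the zone, so stay optimistic. */
static bool worth_retrying(enum sketch_result r)
{
	return r == SKETCH_PARTIAL_SKIPPED;
}
```

A caller that sees the partial result can retry, while the complete result tells it that the whole zone was already walked without success.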

The existing users of COMPACT_COMPLETE are conservatively changed to use
COMPACT_PARTIAL_SKIPPED as well, but some of them should probably be
reconsidered and defer compaction only for COMPACT_COMPLETE under the
new semantics.

This patch shouldn't introduce any functional changes.

Signed-off-by: Michal Hocko 
Acked-by: Vlastimil Babka 
Acked-by: Hillf Danton 
Cc: David Rientjes 
Cc: Johannes Weiner 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Tetsuo Handa 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, compaction: distinguish COMPACT_DEFERRED from COMPACT_SKIPPED

2016-05-21T00:58:30+00:00

try_to_compact_pages() can currently return COMPACT_SKIPPED even when
compaction is deferred for some zone, just because zone DMA is skipped
in 99% of cases due to watermark checks.  This makes COMPACT_DEFERRED
basically unusable for the page allocator as a feedback mechanism.

Make sure we distinguish those two states properly and switch their
ordering in the enum.  This would mean that the COMPACT_SKIPPED will be
returned only when all eligible zones are skipped.

As a result COMPACT_DEFERRED handling for THP in __alloc_pages_slowpath
will be more precise and we would bail out rather than reclaim.
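
Why the enum ordering matters can be shown with a short sketch. It assumes the per-zone results are combined by keeping the numerically largest one; the enum and `combine()` are hypothetical and cover only the two states discussed here.

```c
#include <stddef.h>

/* Hypothetical ordering: DEFERRED ranks above SKIPPED. */
enum sketch_result {
	SKETCH_SKIPPED = 0,
	SKETCH_DEFERRED = 1,
};

/* Combine per-zone results by keeping the highest-ranked one. */
static enum sketch_result combine(const enum sketch_result *per_zone, size_t n)
{
	enum sketch_result rc = SKETCH_SKIPPED;
	size_t i;

	for (i = 0; i < n; i++) {
		if (per_zone[i] > rc)
			rc = per_zone[i];
	}
	return rc;
}
```

With this ordering, one skipped zone can no longer hide a deferred one: the combined result is SKETCH_SKIPPED only when every zone was skipped.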

Signed-off-by: Michal Hocko 
Acked-by: Vlastimil Babka 
Acked-by: Hillf Danton 
Cc: David Rientjes 
Cc: Johannes Weiner 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Tetsuo Handa 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, compaction: cover all compaction mode in compact_zone

2016-05-21T00:58:30+00:00

The compiler is complaining after "mm, compaction: change COMPACT_
constants into enum":

  mm/compaction.c: In function `compact_zone':
  mm/compaction.c:1350:2: warning: enumeration value `COMPACT_DEFERRED' not handled in switch [-Wswitch]
    switch (ret) {
    ^
  mm/compaction.c:1350:2: warning: enumeration value `COMPACT_COMPLETE' not handled in switch [-Wswitch]
  mm/compaction.c:1350:2: warning: enumeration value `COMPACT_NO_SUITABLE_PAGE' not handled in switch [-Wswitch]
  mm/compaction.c:1350:2: warning: enumeration value `COMPACT_NOT_SUITABLE_ZONE' not handled in switch [-Wswitch]
  mm/compaction.c:1350:2: warning: enumeration value `COMPACT_CONTENDED' not handled in switch [-Wswitch]

compaction_suitable is allowed to return only COMPACT_PARTIAL,
COMPACT_SKIPPED and COMPACT_CONTINUE so other cases are simply
impossible.  Put a VM_BUG_ON to catch an impossible return value.
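
One way to handle every enumerator and still trap impossible values is sketched below. This is an illustration, not necessarily the form the patch takes: the enum is a hypothetical subset, `should_proceed()` is invented for the example, and `assert()` stands in for VM_BUG_ON.

```c
#include <assert.h>

/* Hypothetical subset of the compaction results. */
enum sketch_result {
	S_DEFERRED,
	S_SKIPPED,
	S_CONTINUE,
	S_PARTIAL,
	S_COMPLETE,
};

/*
 * Returns 1 if compaction should proceed, 0 if it should return early.
 * Only S_PARTIAL, S_SKIPPED and S_CONTINUE are expected as input.
 */
static int should_proceed(enum sketch_result suitable)
{
	switch (suitable) {
	case S_PARTIAL:
	case S_SKIPPED:
		return 0;       /* nothing to do for this zone */
	case S_CONTINUE:
		return 1;
	default:
		/* The default label covers the rest, so -Wswitch is quiet. */
		assert(0 && "unexpected suitability result");
		return 0;
	}
}
```

The `default` label is what silences -Wswitch here; an `if` chain followed by a single assertion on the remaining value would work equally well.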

Signed-off-by: Michal Hocko 
Acked-by: Vlastimil Babka 
Acked-by: Hillf Danton 
Cc: David Rientjes 
Cc: Johannes Weiner 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Tetsuo Handa 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, compaction: change COMPACT_ constants into enum

2016-05-21T00:58:30+00:00

Compaction code is doing weird dances between COMPACT_FOO -> int ->
unsigned long.

But there doesn't seem to be any reason for that.  All functions which
return/use one of those constants are not expecting any other value so it
really makes sense to define an enum for them and make it clear that no
other values are expected.

This is a pure cleanup and shouldn't introduce any functional changes.

Signed-off-by: Michal Hocko 
Acked-by: Vlastimil Babka 
Acked-by: Hillf Danton 
Cc: David Rientjes 
Cc: Johannes Weiner 
Cc: Joonsoo Kim 
Cc: Mel Gorman 
Cc: Tetsuo Handa 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, page_alloc: remove field from alloc_context

2016-05-20T02:12:14+00:00

The classzone_idx can be inferred from preferred_zoneref so remove the
unnecessary field and save stack space.
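
The idea can be sketched with toy types. All names below are hypothetical simplifications of the allocator's structures: instead of caching the index in its own field, an accessor derives it from the preferred zone reference on demand.

```c
/* Toy zone reference: just the index of the zone it points at. */
struct sketch_zoneref {
	int zone_idx;
};

/* Toy allocation context without a separate classzone_idx field. */
struct sketch_alloc_context {
	const struct sketch_zoneref *preferred_zoneref;
};

/* Derive the class zone index instead of storing it. */
static int sketch_classzone_idx(const struct sketch_alloc_context *ac)
{
	return ac->preferred_zoneref->zone_idx;
}
```

The context shrinks by one field, and the derived value can never fall out of sync with the preferred zone reference.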

Signed-off-by: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Jesper Dangaard Brouer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm, page_alloc: convert alloc_flags to unsigned

2016-05-20T02:12:14+00:00

alloc_flags is a bitmask of flags but it is signed which does not
necessarily generate the best code depending on the compiler.  Even
without an impact, it makes more sense that this be unsigned.
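
A small sketch of why an unsigned type suits a bitmask. The flag values and `flag_bit()` are hypothetical, and `unsigned int` is assumed to be at least 32 bits wide. Right-shifting an unsigned value is fully defined in C, whereas right-shifting a negative signed value is implementation-defined.

```c
/* Hypothetical flag values for the sketch; not the kernel's ALLOC_* bits. */
#define SKETCH_ALLOC_LOW    0x1u
#define SKETCH_ALLOC_TOPBIT (1u << 31)

/* Returns 1 if bit number `bit` is set in `flags`, else 0. */
static unsigned int flag_bit(unsigned int flags, unsigned int bit)
{
	return (flags >> bit) & 1u;
}
```

Because `flags` is unsigned, testing the top bit needs no special care about sign extension.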

Signed-off-by: Mel Gorman 
Acked-by: Vlastimil Babka 
Cc: Jesper Dangaard Brouer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds