linux-toradex.git/mm/memcontrol.c, branch v5.12

mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument

2021-03-13T19:27:31+00:00

Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly pass
in page number argument.

In this way, the interface name is more common and can be used by
potential users.  In addition, the complete info(memcg and flag) of the
memcg needs to be set to the tail pages.

Link: https://lkml.kernel.org/r/20210304074053.65527-2-zhouguanghui1@huawei.com
Signed-off-by: Zhou Guanghui 
Acked-by: Johannes Weiner 
Reviewed-by: Zi Yan 
Reviewed-by: Shakeel Butt 
Acked-by: Michal Hocko 
Cc: Hugh Dickins 
Cc: "Kirill A. Shutemov" 
Cc: Nicholas Piggin 
Cc: Kefeng Wang 
Cc: Hanjun Guo 
Cc: Tianhong Ding 
Cc: Weilong Chen 
Cc: Rui Xiang 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: memcontrol: fix get_active_memcg return value

2021-02-24T21:38:30+00:00

We use a global percpu int_active_memcg variable to store the remote memcg
when we are in the interrupt context.  But get_active_memcg always return
the current->active_memcg or root_mem_cgroup.  The remote memcg (set in
the interrupt context) is ignored.  This is not what we want.  So fix it.

Link: https://lkml.kernel.org/r/20210223091101.42150-1-songmuchun@bytedance.com
Fixes: 37d5985c003d ("mm: kmem: prepare remote memcg charging infra for interrupt contexts")
Signed-off-by: Muchun Song 
Reviewed-by: Shakeel Butt 
Reviewed-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: memcontrol: fix swap undercounting in cgroup2

2021-02-24T21:38:30+00:00

When pages are swapped in, the VM may retain the swap copy to avoid
repeated writes in the future.  It's also retained if shared pages are
faulted back in some processes, but not in others.  During that time we
have an in-memory copy of the page, as well as an on-swap copy.  Cgroup1
and cgroup2 handle these overlapping lifetimes slightly differently due to
the nature of how they account memory and swap:

Cgroup1 has a unified memory+swap counter that tracks a data page
regardless whether it's in-core or swapped out.  On swapin, we transfer
the charge from the swap entry to the newly allocated swapcache page, even
though the swap entry might stick around for a while.  That's why we have
a mem_cgroup_uncharge_swap() call inside mem_cgroup_charge().

Cgroup2 tracks memory and swap as separate, independent resources and thus
has split memory and swap counters.  On swapin, we charge the newly
allocated swapcache page as memory, while the swap slot in turn must
remain charged to the swap counter as long as its allocated too.

The cgroup2 logic was broken by commit 2d1c498072de ("mm: memcontrol: make
swap tracking an integral part of memory control"), because it
accidentally removed the do_memsw_account() check in the branch inside
mem_cgroup_uncharge() that was supposed to tell the difference between the
charge transfer in cgroup1 and the separate counters in cgroup2.

As a result, cgroup2 currently undercounts retained swap to varying
degrees: swap slots are cached up to 50% of the configured limit or total
available swap space; partially faulted back shared pages are only limited
by physical capacity.  This in turn allows cgroups to significantly
overconsume their alloted swap space.

Add the do_memsw_account() check back to fix this problem.

Link: https://lkml.kernel.org/r/20210217153237.92484-1-songmuchun@bytedance.com
Fixes: 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control")
Signed-off-by: Muchun Song 
Acked-by: Johannes Weiner 
Reviewed-by: Shakeel Butt 
Acked-by: Michal Hocko 
Cc: Vladimir Davydov 
Cc: 	[5.8+]
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

fs: buffer: use raw page_memcg() on locked page

2021-02-24T21:38:30+00:00

alloc_page_buffers() currently uses get_mem_cgroup_from_page() for
charging the buffers to the page owner, which does an rcu-protected
page->memcg lookup and acquires a reference.  But buffer allocation has
the page lock held throughout, which pins the page to the memcg and
thereby the memcg - neither rcu nor holding an extra reference during the
allocation are necessary.  Use a raw page_memcg() instead.

This was the last user of get_mem_cgroup_from_page(), delete it.

Link: https://lkml.kernel.org/r/20210209190126.97842-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner 
Reported-by: Muchun Song 
Reviewed-by: Shakeel Butt 
Acked-by: Roman Gushchin 
Acked-by: Michal Hocko 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: memcontrol: replace the loop with a list_for_each_entry()

2021-02-24T21:38:30+00:00

The rule of list walk has gone since commit a9d5adeeb4b2
("mm/memcontrol: allow to uncharge page without using page->lru field")

So remove the strange comment and replace the loop with a
list_for_each_entry().

There is only one caller of the uncharge_list().  So just fold it into
mem_cgroup_uncharge_list() and remove it.

Link: https://lkml.kernel.org/r/20210204163055.56080-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song 
Acked-by: Johannes Weiner 
Acked-by: Roman Gushchin 
Reviewed-by: Miaohe Lin 
Acked-by: Michal Hocko 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm/memcontrol: remove redundant NULL check

2021-02-24T21:38:29+00:00

Fix below warnings reported by coccicheck:

  mm/memcontrol.c:451:3-9: WARNING: NULL check before some freeing functions is not needed.

Link: https://lkml.kernel.org/r/1611216029-34397-1-git-send-email-abaci-bugfix@linux.alibaba.com
Signed-off-by: Yang Li 
Reported-by: Abaci Robot 
Reviewed-by: Andrew Morton 
Reviewed-by: David Hildenbrand 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: kmem: make __memcg_kmem_(un)charge static

2021-02-24T21:38:29+00:00

I've noticed that __memcg_kmem_charge() and __memcg_kmem_uncharge() are
not used anywhere except memcontrol.c.  Yet they are not declared as
non-static and are declared in memcontrol.h.

This patch makes them static.

Link: https://lkml.kernel.org/r/20210108020332.4096911-1-guro@fb.com
Signed-off-by: Roman Gushchin 
Reviewed-by: Shakeel Butt 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: memcg: add swapcache stat for memcg v2

2021-02-24T21:38:29+00:00

This patch adds swapcache stat for the cgroup v2.  The swapcache
represents the memory that is accounted against both the memory and the
swap limit of the cgroup.  The main motivation behind exposing the
swapcache stat is for enabling users to gracefully migrate from cgroup
v1's memsw counter to cgroup v2's memory and swap counters.

Cgroup v1's memsw limit allows users to limit the memory+swap usage of a
workload but without control on the exact proportion of memory and swap.
Cgroup v2 provides separate limits for memory and swap which enables more
control on the exact usage of memory and swap individually for the
workload.

With some little subtleties, the v1's memsw limit can be switched with the
sum of the v2's memory and swap limits.  However the alternative for memsw
usage is not yet available in cgroup v2.  Exposing per-cgroup swapcache
stat enables that alternative.  Adding the memory usage and swap usage and
subtracting the swapcache will approximate the memsw usage.  This will
help in the transparent migration of the workloads depending on memsw
usage and limit to v2' memory and swap counters.

The reasons these applications are still interested in this approximate
memsw usage are: (1) these applications are not really interested in two
separate memory and swap usage metrics.  A single usage metric is more
simple to use and reason about for them.

(2) The memsw usage metric hides the underlying system's swap setup from
the applications.  Applications with multiple instances running in a
datacenter with heterogeneous systems (some have swap and some don't) will
keep seeing a consistent view of their usage.

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]

Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com
Signed-off-by: Shakeel Butt 
Acked-by: Michal Hocko 
Reviewed-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Muchun Song 
Cc: Yang Shi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm/memcg: remove rcu locking for lock_page_lruvec function series

2021-02-24T21:38:29+00:00

lock_page_lruvec() and its variants used rcu_read_lock() with the
intention of safeguarding against the mem_cgroup being destroyed
concurrently; but so long as they are called under the specified
conditions (as they are), there is no way for the page's mem_cgroup to be
destroyed.  Delete the unnecessary rcu_read_lock() and _unlock().

Hugh Dickins polished the commit log.  Thanks a lot!

Link: https://lkml.kernel.org/r/1608614453-10739-2-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm/memcg: revise the using condition of lock_page_lruvec function series

2021-02-24T21:38:29+00:00

lock_page_lruvec() and its variants are safe to use under the same
conditions as commit_charge(): add lock_page_memcg() to the comment.

Polished with Hugh Dickins' suggestions, thanks!

Link: https://lkml.kernel.org/r/1608614453-10739-1-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds