<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/mm/percpu.c, branch v5.12-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>percpu: fix clang modpost section mismatch</title>
<updated>2021-02-14T18:15:15+00:00</updated>
<author>
<name>Dennis Zhou</name>
<email>dennis@kernel.org</email>
</author>
<published>2021-02-14T17:16:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=258e0815e2b1706e87c0d874211097aa8a7aa52f'/>
<id>258e0815e2b1706e87c0d874211097aa8a7aa52f</id>
<content type='text'>
pcpu_build_alloc_info() is an __init function that makes a call to
cpumask_clear_cpu(). With CONFIG_GCOV_PROFILE_ALL enabled, the inline
heuristics are modified and such cpumask_clear_cpu() which is marked
inline doesn't get inlined. Because it works on mask in __initdata,
modpost throws a section mismatch error.

Arnd sent a patch with the flatten attribute as an alternative [2]. I've
added it to compiler_attributes.h.

modpost complaint:
  WARNING: modpost: vmlinux.o(.text+0x735425): Section mismatch in reference from the function cpumask_clear_cpu() to the variable .init.data:pcpu_build_alloc_info.mask
  The function cpumask_clear_cpu() references
  the variable __initdata pcpu_build_alloc_info.mask.
  This is often because cpumask_clear_cpu lacks a __initdata
  annotation or the annotation of pcpu_build_alloc_info.mask is wrong.

clang output:
  mm/percpu.c:2724:5: remark: cpumask_clear_cpu not inlined into pcpu_build_alloc_info because too costly to inline (cost=725, threshold=325) [-Rpass-missed=inline]

[1] https://lore.kernel.org/linux-mm/202012220454.9F6Bkz9q-lkp@intel.com/
[2] https://lore.kernel.org/lkml/CAK8P3a2ZWfNeXKSm8K_SUhhwkor17jFo3xApLXjzfPqX0eUDUA@mail.gmail.com/

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
pcpu_build_alloc_info() is an __init function that makes a call to
cpumask_clear_cpu(). With CONFIG_GCOV_PROFILE_ALL enabled, the inline
heuristics are modified and such cpumask_clear_cpu() which is marked
inline doesn't get inlined. Because it works on mask in __initdata,
modpost throws a section mismatch error.

Arnd sent a patch with the flatten attribute as an alternative [2]. I've
added it to compiler_attributes.h.

modpost complaint:
  WARNING: modpost: vmlinux.o(.text+0x735425): Section mismatch in reference from the function cpumask_clear_cpu() to the variable .init.data:pcpu_build_alloc_info.mask
  The function cpumask_clear_cpu() references
  the variable __initdata pcpu_build_alloc_info.mask.
  This is often because cpumask_clear_cpu lacks a __initdata
  annotation or the annotation of pcpu_build_alloc_info.mask is wrong.

clang output:
  mm/percpu.c:2724:5: remark: cpumask_clear_cpu not inlined into pcpu_build_alloc_info because too costly to inline (cost=725, threshold=325) [-Rpass-missed=inline]

[1] https://lore.kernel.org/linux-mm/202012220454.9F6Bkz9q-lkp@intel.com/
[2] https://lore.kernel.org/lkml/CAK8P3a2ZWfNeXKSm8K_SUhhwkor17jFo3xApLXjzfPqX0eUDUA@mail.gmail.com/

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>percpu: reduce the number of cpu distance comparisons</title>
<updated>2021-02-14T17:34:05+00:00</updated>
<author>
<name>Wonhyuk Yang</name>
<email>vvghjk1234@gmail.com</email>
</author>
<published>2020-10-30T01:38:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d7d29ac76f7efb506bcecc092641e704f791d92d'/>
<id>d7d29ac76f7efb506bcecc092641e704f791d92d</id>
<content type='text'>
To build group_map[] and group_cnt[], we find out which group
CPUs belong to by comparing the distance of the cpu. However,
this includes cases where comparisons are not required.

This patch uses a bitmap to record CPUs that is not classified in
the group. CPUs that we know which group they belong to should be
cleared from the bitmap. In result, we can reduce the number of
unnecessary comparisons.

Signed-off-by: Wonhyuk Yang &lt;vvghjk1234@gmail.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
[Dennis: added cpumask_clear() call and #include cpumask.h.]
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To build group_map[] and group_cnt[], we find out which group
CPUs belong to by comparing the distance of the cpu. However,
this includes cases where comparisons are not required.

This patch uses a bitmap to record CPUs that is not classified in
the group. CPUs that we know which group they belong to should be
cleared from the bitmap. In result, we can reduce the number of
unnecessary comparisons.

Signed-off-by: Wonhyuk Yang &lt;vvghjk1234@gmail.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
[Dennis: added cpumask_clear() call and #include cpumask.h.]
</pre>
</div>
</content>
</entry>
<entry>
<title>percpu: convert flexible array initializers to use struct_size()</title>
<updated>2020-10-30T23:02:28+00:00</updated>
<author>
<name>Dennis Zhou</name>
<email>dennis@kernel.org</email>
</author>
<published>2020-10-30T20:40:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=61cf93d3e14a29288e4d5522aecb6e58268eec62'/>
<id>61cf93d3e14a29288e4d5522aecb6e58268eec62</id>
<content type='text'>
Use the safer macro as sparked by the long discussion in [1].

[1] https://lore.kernel.org/lkml/20200917204514.GA2880159@google.com/

Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the safer macro as sparked by the long discussion in [1].

[1] https://lore.kernel.org/lkml/20200917204514.GA2880159@google.com/

Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current()</title>
<updated>2020-10-18T16:27:09+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-10-17T23:13:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=279c3393e2c113365c999f16cd096bcf3d34319e'/>
<id>279c3393e2c113365c999f16cd096bcf3d34319e</id>
<content type='text'>
Patch series "mm: kmem: kernel memory accounting in an interrupt context".

This patchset implements memcg-based memory accounting of allocations made
from an interrupt context.

Historically, such allocations were passed unaccounted mostly because
charging the memory cgroup of the current process wasn't an option.  Also
performance reasons were likely a reason too.

The remote charging API allows to temporarily overwrite the currently
active memory cgroup, so that all memory allocations are accounted towards
some specified memory cgroup instead of the memory cgroup of the current
process.

This patchset extends the remote charging API so that it can be used from
an interrupt context.  Then it removes the fence that prevented the
accounting of allocations made from an interrupt context.  It also
contains a couple of optimizations/code refactorings.

This patchset doesn't directly enable accounting for any specific
allocations, but prepares the code base for it.  The bpf memory accounting
will likely be the first user of it: a typical example is a bpf program
parsing an incoming network packet, which allocates an entry in hashmap
map to store some information.

This patch (of 4):

Currently memcg_kmem_bypass() is called before obtaining the current
memory/obj cgroup using get_mem/obj_cgroup_from_current().  Moving
memcg_kmem_bypass() into get_mem/obj_cgroup_from_current() reduces the
number of call sites and allows further code simplifications.

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Link: http://lkml.kernel.org/r/20200827225843.1270629-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200827225843.1270629-2-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "mm: kmem: kernel memory accounting in an interrupt context".

This patchset implements memcg-based memory accounting of allocations made
from an interrupt context.

Historically, such allocations were passed unaccounted mostly because
charging the memory cgroup of the current process wasn't an option.  Also
performance reasons were likely a reason too.

The remote charging API allows to temporarily overwrite the currently
active memory cgroup, so that all memory allocations are accounted towards
some specified memory cgroup instead of the memory cgroup of the current
process.

This patchset extends the remote charging API so that it can be used from
an interrupt context.  Then it removes the fence that prevented the
accounting of allocations made from an interrupt context.  It also
contains a couple of optimizations/code refactorings.

This patchset doesn't directly enable accounting for any specific
allocations, but prepares the code base for it.  The bpf memory accounting
will likely be the first user of it: a typical example is a bpf program
parsing an incoming network packet, which allocates an entry in hashmap
map to store some information.

This patch (of 4):

Currently memcg_kmem_bypass() is called before obtaining the current
memory/obj cgroup using get_mem/obj_cgroup_from_current().  Moving
memcg_kmem_bypass() into get_mem/obj_cgroup_from_current() reduces the
number of call sites and allows further code simplifications.

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Link: http://lkml.kernel.org/r/20200827225843.1270629-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200827225843.1270629-2-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>percpu: fix first chunk size calculation for populated bitmap</title>
<updated>2020-09-17T17:34:39+00:00</updated>
<author>
<name>Sunghyun Jin</name>
<email>mcsmonk@gmail.com</email>
</author>
<published>2020-09-03T12:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b3b33d3c43bbe0177d70653f4e889c78cc37f097'/>
<id>b3b33d3c43bbe0177d70653f4e889c78cc37f097</id>
<content type='text'>
Variable populated, which is a member of struct pcpu_chunk, is used as a
unit of size of unsigned long.
However, size of populated is miscounted. So, I fix this minor part.

Fixes: 8ab16c43ea79 ("percpu: change the number of pages marked in the first_chunk pop bitmap")
Cc: &lt;stable@vger.kernel.org&gt; # 4.14+
Signed-off-by: Sunghyun Jin &lt;mcsmonk@gmail.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Variable populated, which is a member of struct pcpu_chunk, is used as a
unit of size of unsigned long.
However, size of populated is miscounted. So, I fix this minor part.

Fixes: 8ab16c43ea79 ("percpu: change the number of pages marked in the first_chunk pop bitmap")
Cc: &lt;stable@vger.kernel.org&gt; # 4.14+
Signed-off-by: Sunghyun Jin &lt;mcsmonk@gmail.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcg/percpu: per-memcg percpu memory statistics</title>
<updated>2020-08-12T17:57:55+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-08-12T01:30:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=772616b031f06e05846488b01dab46a7c832da13'/>
<id>772616b031f06e05846488b01dab46a7c832da13</id>
<content type='text'>
Percpu memory can represent a noticeable chunk of the total memory
consumption, especially on big machines with many CPUs.  Let's track
percpu memory usage for each memcg and display it in memory.stat.

A percpu allocation is usually scattered over multiple pages (and nodes),
and can be significantly smaller than a page.  So let's add a byte-sized
counter on the memcg level: MEMCG_PERCPU_B.  Byte-sized vmstat infra
created for slabs can be perfectly reused for percpu case.

[guro@fb.com: v3]
  Link: http://lkml.kernel.org/r/20200623184515.4132564-4-guro@fb.com

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200608230819.832349-4-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Percpu memory can represent a noticeable chunk of the total memory
consumption, especially on big machines with many CPUs.  Let's track
percpu memory usage for each memcg and display it in memory.stat.

A percpu allocation is usually scattered over multiple pages (and nodes),
and can be significantly smaller than a page.  So let's add a byte-sized
counter on the memcg level: MEMCG_PERCPU_B.  Byte-sized vmstat infra
created for slabs can be perfectly reused for percpu case.

[guro@fb.com: v3]
  Link: http://lkml.kernel.org/r/20200623184515.4132564-4-guro@fb.com

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200608230819.832349-4-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcg/percpu: account percpu memory to memory cgroups</title>
<updated>2020-08-12T17:57:55+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-08-12T01:30:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3c7be18ac9a06bc67196bfdabb7c21e1bbacdc13'/>
<id>3c7be18ac9a06bc67196bfdabb7c21e1bbacdc13</id>
<content type='text'>
Percpu memory is becoming more and more widely used by various subsystems,
and the total amount of memory controlled by the percpu allocator can make
a good part of the total memory.

As an example, bpf maps can consume a lot of percpu memory, and they are
created by a user.  Also, some cgroup internals (e.g.  memory controller
statistics) can be quite large.  On a machine with many CPUs and big
number of cgroups they can consume hundreds of megabytes.

So the lack of memcg accounting is creating a breach in the memory
isolation.  Similar to the slab memory, percpu memory should be accounted
by default.

To implement the perpcu accounting it's possible to take the slab memory
accounting as a model to follow.  Let's introduce two types of percpu
chunks: root and memcg.  What makes memcg chunks different is an
additional space allocated to store memcg membership information.  If
__GFP_ACCOUNT is passed on allocation, a memcg chunk should be be used.
If it's possible to charge the corresponding size to the target memory
cgroup, allocation is performed, and the memcg ownership data is recorded.
System-wide allocations are performed using root chunks, so there is no
additional memory overhead.

To implement a fast reparenting of percpu memory on memcg removal, we
don't store mem_cgroup pointers directly: instead we use obj_cgroup API,
introduced for slab accounting.

[akpm@linux-foundation.org: fix CONFIG_MEMCG_KMEM=n build errors and warning]
[akpm@linux-foundation.org: move unreachable code, per Roman]
[cuibixuan@huawei.com: mm/percpu: fix 'defined but not used' warning]
  Link: http://lkml.kernel.org/r/6d41b939-a741-b521-a7a2-e7296ec16219@huawei.com

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200623184515.4132564-3-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Percpu memory is becoming more and more widely used by various subsystems,
and the total amount of memory controlled by the percpu allocator can make
a good part of the total memory.

As an example, bpf maps can consume a lot of percpu memory, and they are
created by a user.  Also, some cgroup internals (e.g.  memory controller
statistics) can be quite large.  On a machine with many CPUs and big
number of cgroups they can consume hundreds of megabytes.

So the lack of memcg accounting is creating a breach in the memory
isolation.  Similar to the slab memory, percpu memory should be accounted
by default.

To implement the perpcu accounting it's possible to take the slab memory
accounting as a model to follow.  Let's introduce two types of percpu
chunks: root and memcg.  What makes memcg chunks different is an
additional space allocated to store memcg membership information.  If
__GFP_ACCOUNT is passed on allocation, a memcg chunk should be be used.
If it's possible to charge the corresponding size to the target memory
cgroup, allocation is performed, and the memcg ownership data is recorded.
System-wide allocations are performed using root chunks, so there is no
additional memory overhead.

To implement a fast reparenting of percpu memory on memcg removal, we
don't store mem_cgroup pointers directly: instead we use obj_cgroup API,
introduced for slab accounting.

[akpm@linux-foundation.org: fix CONFIG_MEMCG_KMEM=n build errors and warning]
[akpm@linux-foundation.org: move unreachable code, per Roman]
[cuibixuan@huawei.com: mm/percpu: fix 'defined but not used' warning]
  Link: http://lkml.kernel.org/r/6d41b939-a741-b521-a7a2-e7296ec16219@huawei.com

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200623184515.4132564-3-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>percpu: return number of released bytes from pcpu_free_area()</title>
<updated>2020-08-12T17:57:55+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-08-12T01:30:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5b32af91b5de6f95ad99e4eaaf57777376af124f'/>
<id>5b32af91b5de6f95ad99e4eaaf57777376af124f</id>
<content type='text'>
Patch series "mm: memcg accounting of percpu memory", v3.

This patchset adds percpu memory accounting to memory cgroups.  It's based
on the rework of the slab controller and reuses concepts and features
introduced for the per-object slab accounting.

Percpu memory is becoming more and more widely used by various subsystems,
and the total amount of memory controlled by the percpu allocator can make
a good part of the total memory.

As an example, bpf maps can consume a lot of percpu memory, and they are
created by a user.  Also, some cgroup internals (e.g.  memory controller
statistics) can be quite large.  On a machine with many CPUs and big
number of cgroups they can consume hundreds of megabytes.

So the lack of memcg accounting is creating a breach in the memory
isolation.  Similar to the slab memory, percpu memory should be accounted
by default.

Percpu allocations by their nature are scattered over multiple pages, so
they can't be tracked on the per-page basis.  So the per-object tracking
introduced by the new slab controller is reused.

The patchset implements charging of percpu allocations, adds memcg-level
statistics, enables accounting for percpu allocations made by memory
cgroup internals and provides some basic tests.

To implement the accounting of percpu memory without a significant memory
and performance overhead the following approach is used: all accounted
allocations are placed into a separate percpu chunk (or chunks).  These
chunks are similar to default chunks, except that they do have an attached
vector of pointers to obj_cgroup objects, which is big enough to save a
pointer for each allocated object.  On the allocation, if the allocation
has to be accounted (__GFP_ACCOUNT is passed, the allocating process
belongs to a non-root memory cgroup, etc), the memory cgroup is getting
charged and if the maximum limit is not exceeded the allocation is
performed using a memcg-aware chunk.  Otherwise -ENOMEM is returned or the
allocation is forced over the limit, depending on gfp (as any other kernel
memory allocation).  The memory cgroup information is saved in the
obj_cgroup vector at the corresponding offset.  On the release time the
memcg information is restored from the vector and the cgroup is getting
uncharged.  Unaccounted allocations (at this point the absolute majority
of all percpu allocations) are performed in the old way, so no additional
overhead is expected.

To avoid pinning dying memory cgroups by outstanding allocations,
obj_cgroup API is used instead of directly saving memory cgroup pointers.
obj_cgroup is basically a pointer to a memory cgroup with a standalone
reference counter.  The trick is that it can be atomically swapped to
point at the parent cgroup, so that the original memory cgroup can be
released prior to all objects, which has been charged to it.  Because all
charges and statistics are fully recursive, it's perfectly correct to
uncharge the parent cgroup instead.  This scheme is used in the slab
memory accounting, and percpu memory can just follow the scheme.

This patch (of 5):

To implement accounting of percpu memory we need the information about the
size of freed object.  Return it from pcpu_free_area().

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
cC: Michal Koutnýutny@suse.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200623184515.4132564-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200608230819.832349-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200608230819.832349-2-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "mm: memcg accounting of percpu memory", v3.

This patchset adds percpu memory accounting to memory cgroups.  It's based
on the rework of the slab controller and reuses concepts and features
introduced for the per-object slab accounting.

Percpu memory is becoming more and more widely used by various subsystems,
and the total amount of memory controlled by the percpu allocator can make
a good part of the total memory.

As an example, bpf maps can consume a lot of percpu memory, and they are
created by a user.  Also, some cgroup internals (e.g.  memory controller
statistics) can be quite large.  On a machine with many CPUs and big
number of cgroups they can consume hundreds of megabytes.

So the lack of memcg accounting is creating a breach in the memory
isolation.  Similar to the slab memory, percpu memory should be accounted
by default.

Percpu allocations by their nature are scattered over multiple pages, so
they can't be tracked on the per-page basis.  So the per-object tracking
introduced by the new slab controller is reused.

The patchset implements charging of percpu allocations, adds memcg-level
statistics, enables accounting for percpu allocations made by memory
cgroup internals and provides some basic tests.

To implement the accounting of percpu memory without a significant memory
and performance overhead the following approach is used: all accounted
allocations are placed into a separate percpu chunk (or chunks).  These
chunks are similar to default chunks, except that they do have an attached
vector of pointers to obj_cgroup objects, which is big enough to save a
pointer for each allocated object.  On the allocation, if the allocation
has to be accounted (__GFP_ACCOUNT is passed, the allocating process
belongs to a non-root memory cgroup, etc), the memory cgroup is getting
charged and if the maximum limit is not exceeded the allocation is
performed using a memcg-aware chunk.  Otherwise -ENOMEM is returned or the
allocation is forced over the limit, depending on gfp (as any other kernel
memory allocation).  The memory cgroup information is saved in the
obj_cgroup vector at the corresponding offset.  On the release time the
memcg information is restored from the vector and the cgroup is getting
uncharged.  Unaccounted allocations (at this point the absolute majority
of all percpu allocations) are performed in the old way, so no additional
overhead is expected.

To avoid pinning dying memory cgroups by outstanding allocations,
obj_cgroup API is used instead of directly saving memory cgroup pointers.
obj_cgroup is basically a pointer to a memory cgroup with a standalone
reference counter.  The trick is that it can be atomically swapped to
point at the parent cgroup, so that the original memory cgroup can be
released prior to all objects, which has been charged to it.  Because all
charges and statistics are fully recursive, it's perfectly correct to
uncharge the parent cgroup instead.  This scheme is used in the slab
memory accounting, and percpu memory can just follow the scheme.

This patch (of 5):

To implement accounting of percpu memory we need the information about the
size of freed object.  Return it from pcpu_free_area().

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
cC: Michal Koutnýutny@suse.com&gt;
Cc: Bixuan Cui &lt;cuibixuan@huawei.com&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200623184515.4132564-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200608230819.832349-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200608230819.832349-2-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Remove uninitialized_var() usage</title>
<updated>2020-07-16T19:35:15+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2020-06-03T20:09:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3f649ab728cda8038259d8f14492fe400fbab911'/>
<id>3f649ab728cda8038259d8f14492fe400fbab911</id>
<content type='text'>
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky &lt;leonro@mellanox.com&gt; # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt; # IB
Acked-by: Kalle Valo &lt;kvalo@codeaurora.org&gt; # wireless drivers
Reviewed-by: Chao Yu &lt;yuchao0@huawei.com&gt; # erofs
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky &lt;leonro@mellanox.com&gt; # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt; # IB
Acked-by: Kalle Valo &lt;kvalo@codeaurora.org&gt; # wireless drivers
Reviewed-by: Chao Yu &lt;yuchao0@huawei.com&gt; # erofs
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: remove the pgprot argument to __vmalloc</title>
<updated>2020-06-02T17:59:11+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-06-02T04:51:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=88dca4ca5a93d2c09e5bbc6a62fbfc3af83c4fca'/>
<id>88dca4ca5a93d2c09e5bbc6a62fbfc3af83c4fca</id>
<content type='text'>
The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt; [hyperv]
Acked-by: Gao Xiang &lt;xiang@kernel.org&gt; [erofs]
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Cc: Robin Murphy &lt;robin.murphy@arm.com&gt;
Cc: Sakari Ailus &lt;sakari.ailus@linux.intel.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt; [hyperv]
Acked-by: Gao Xiang &lt;xiang@kernel.org&gt; [erofs]
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Cc: Robin Murphy &lt;robin.murphy@arm.com&gt;
Cc: Sakari Ailus &lt;sakari.ailus@linux.intel.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
