<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/base/node.c, branch v5.12-rc6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>mm: memcg: add swapcache stat for memcg v2</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2021-02-24T20:03:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b6038942480e574c697ea1a80019bbe586c1d654'/>
<id>b6038942480e574c697ea1a80019bbe586c1d654</id>
<content type='text'>
This patch adds swapcache stat for the cgroup v2.  The swapcache
represents the memory that is accounted against both the memory and the
swap limit of the cgroup.  The main motivation behind exposing the
swapcache stat is for enabling users to gracefully migrate from cgroup
v1's memsw counter to cgroup v2's memory and swap counters.

Cgroup v1's memsw limit allows users to limit the memory+swap usage of a
workload but without control on the exact proportion of memory and swap.
Cgroup v2 provides separate limits for memory and swap which enables more
control on the exact usage of memory and swap individually for the
workload.

With some little subtleties, the v1's memsw limit can be switched with the
sum of the v2's memory and swap limits.  However the alternative for memsw
usage is not yet available in cgroup v2.  Exposing per-cgroup swapcache
stat enables that alternative.  Adding the memory usage and swap usage and
subtracting the swapcache will approximate the memsw usage.  This will
help in the transparent migration of the workloads depending on memsw
usage and limit to v2' memory and swap counters.

The reasons these applications are still interested in this approximate
memsw usage are: (1) these applications are not really interested in two
separate memory and swap usage metrics.  A single usage metric is more
simple to use and reason about for them.

(2) The memsw usage metric hides the underlying system's swap setup from
the applications.  Applications with multiple instances running in a
datacenter with heterogeneous systems (some have swap and some don't) will
keep seeing a consistent view of their usage.

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]

Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds swapcache stat for the cgroup v2.  The swapcache
represents the memory that is accounted against both the memory and the
swap limit of the cgroup.  The main motivation behind exposing the
swapcache stat is for enabling users to gracefully migrate from cgroup
v1's memsw counter to cgroup v2's memory and swap counters.

Cgroup v1's memsw limit allows users to limit the memory+swap usage of a
workload but without control on the exact proportion of memory and swap.
Cgroup v2 provides separate limits for memory and swap which enables more
control on the exact usage of memory and swap individually for the
workload.

With some little subtleties, the v1's memsw limit can be switched with the
sum of the v2's memory and swap limits.  However the alternative for memsw
usage is not yet available in cgroup v2.  Exposing per-cgroup swapcache
stat enables that alternative.  Adding the memory usage and swap usage and
subtracting the swapcache will approximate the memsw usage.  This will
help in the transparent migration of the workloads depending on memsw
usage and limit to v2' memory and swap counters.

The reasons these applications are still interested in this approximate
memsw usage are: (1) these applications are not really interested in two
separate memory and swap usage metrics.  A single usage metric is more
simple to use and reason about for them.

(2) The memsw usage metric hides the underlying system's swap setup from
the applications.  Applications with multiple instances running in a
datacenter with heterogeneous systems (some have swap and some don't) will
keep seeing a consistent view of their usage.

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]

Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-02-24T20:03:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=380780e71895ae301505ffcec8f954ab3666a4c7'/>
<id>380780e71895ae301505ffcec8f954ab3666a4c7</id>
<content type='text'>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_PMDMAPPED account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-7-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_PMDMAPPED account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-7-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pages</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-02-24T20:03:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a1528e21f8915e16252cda1137fe29672c918361'/>
<id>a1528e21f8915e16252cda1137fe29672c918361</id>
<content type='text'>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_PMDMAPPED account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-6-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_PMDMAPPED account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-6-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: convert NR_SHMEM_THPS account to pages</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-02-24T20:03:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=57b2847d3c1dc154923578efb47a12302a57d700'/>
<id>57b2847d3c1dc154923578efb47a12302a57d700</id>
<content type='text'>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_THPS account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_THPS account to pages.  This patch is
consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: convert NR_FILE_THPS account to pages</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-02-24T20:03:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf9ecead53c89d3d2cf60acbc460174ebbcf0027'/>
<id>bf9ecead53c89d3d2cf60acbc460174ebbcf0027</id>
<content type='text'>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with if hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_THPS account to pages.  This patch is consistent
with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with if hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_THPS account to pages.  This patch is consistent
with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: convert NR_ANON_THPS account to pages</title>
<updated>2021-02-24T21:38:29+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-02-24T20:03:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=69473e5de87389be6c0fa4a5d574a50c8f904fb3'/>
<id>69473e5de87389be6c0fa4a5d574a50c8f904fb3</id>
<content type='text'>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_ANON_THPS account to pages.  This patch is consistent
with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_ANON_THPS account to pages.  This patch is consistent
with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael. J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta@cloud.ionos.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: memcontrol: account pagetables per node</title>
<updated>2020-12-15T20:13:40+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2020-12-15T03:07:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f0c0c115fb81940f4dba0644ac2a8a43b39c83f3'/>
<id>f0c0c115fb81940f4dba0644ac2a8a43b39c83f3</id>
<content type='text'>
For many workloads, pagetable consumption is significant and it makes
sense to expose it in the memory.stat for the memory cgroups.  However at
the moment, the pagetables are accounted per-zone.  Converting them to
per-node and using the right interface will correctly account for the
memory cgroups as well.

[akpm@linux-foundation.org: export __mod_lruvec_page_state to modules for arch/mips/kvm/]

Link: https://lkml.kernel.org/r/20201130212541.2781790-3-shakeelb@google.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For many workloads, pagetable consumption is significant and it makes
sense to expose it in the memory.stat for the memory cgroups.  However at
the moment, the pagetables are accounted per-zone.  Converting them to
per-node and using the right interface will correctly account for the
memory cgroups as well.

[akpm@linux-foundation.org: export __mod_lruvec_page_state to modules for arch/mips/kvm/]

Link: https://lkml.kernel.org/r/20201130212541.2781790-3-shakeelb@google.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: don't panic when links can't be created in sysfs</title>
<updated>2020-10-16T18:11:18+00:00</updated>
<author>
<name>Laurent Dufour</name>
<email>ldufour@linux.ibm.com</email>
</author>
<published>2020-10-16T03:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=90c7eaeb14a325a760d732184ff1fbed47e5fa98'/>
<id>90c7eaeb14a325a760d732184ff1fbed47e5fa98</id>
<content type='text'>
At boot time, or when doing memory hot-add operations, if the links in
sysfs can't be created, the system is still able to run, so just report
the error in the kernel log rather than BUG_ON and potentially make system
unusable because the callpath can be called with locks held.

Since the number of memory blocks managed could be high, the messages are
rate limited.

As a consequence, link_mem_sections() has no status to report anymore.

Signed-off-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Rafael J . Wysocki" &lt;rafael@kernel.org&gt;
Cc: Scott Cheloha &lt;cheloha@linux.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200915094143.79181-4-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At boot time, or when doing memory hot-add operations, if the links in
sysfs can't be created, the system is still able to run, so just report
the error in the kernel log rather than BUG_ON and potentially make system
unusable because the callpath can be called with locks held.

Since the number of memory blocks managed could be high, the messages are
rate limited.

As a consequence, link_mem_sections() has no status to report anymore.

Signed-off-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Rafael J . Wysocki" &lt;rafael@kernel.org&gt;
Cc: Scott Cheloha &lt;cheloha@linux.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200915094143.79181-4-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core</title>
<updated>2020-10-14T23:09:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-10-14T23:09:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe151462bd0f7ad0e758f1cdcbeb6426e3d1ee8e'/>
<id>fe151462bd0f7ad0e758f1cdcbeb6426e3d1ee8e</id>
<content type='text'>
Pull driver core updates from Greg KH:
 "Here is the "big" set of driver core patches for 5.10-rc1

  They include a lot of different things, all related to the driver core
  and/or some driver logic:

   - sysfs common write functions to make it easier to audit sysfs
     attributes

   - device connection cleanups and fixes

   - devm helpers for a few functions

   - NOIO allocations for when devices are being removed

   - minor cleanups and fixes

  All have been in linux-next for a while with no reported issues"

* tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
  regmap: debugfs: use semicolons rather than commas to separate statements
  platform/x86: intel_pmc_core: do not create a static struct device
  drivers core: node: Use a more typical macro definition style for ACCESS_ATTR
  drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show
  mm: and drivers core: Convert hugetlb_report_node_meminfo to sysfs_emit
  drivers core: Miscellaneous changes for sysfs_emit
  drivers core: Reindent a couple uses around sysfs_emit
  drivers core: Remove strcat uses around sysfs_emit and neaten
  drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
  sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output
  dyndbg: use keyword, arg varnames for query term pairs
  driver core: force NOIO allocations during unplug
  platform_device: switch to simpler IDA interface
  driver core: platform: Document return type of more functions
  Revert "driver core: Annotate dev_err_probe() with __must_check"
  Revert "test_firmware: Test platform fw loading on non-EFI systems"
  iio: adc: xilinx-xadc: use devm_krealloc()
  hwmon: pmbus: use more devres helpers
  devres: provide devm_krealloc()
  syscore: Use pm_pr_dbg() for syscore_{suspend,resume}()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull driver core updates from Greg KH:
 "Here is the "big" set of driver core patches for 5.10-rc1

  They include a lot of different things, all related to the driver core
  and/or some driver logic:

   - sysfs common write functions to make it easier to audit sysfs
     attributes

   - device connection cleanups and fixes

   - devm helpers for a few functions

   - NOIO allocations for when devices are being removed

   - minor cleanups and fixes

  All have been in linux-next for a while with no reported issues"

* tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
  regmap: debugfs: use semicolons rather than commas to separate statements
  platform/x86: intel_pmc_core: do not create a static struct device
  drivers core: node: Use a more typical macro definition style for ACCESS_ATTR
  drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show
  mm: and drivers core: Convert hugetlb_report_node_meminfo to sysfs_emit
  drivers core: Miscellaneous changes for sysfs_emit
  drivers core: Reindent a couple uses around sysfs_emit
  drivers core: Remove strcat uses around sysfs_emit and neaten
  drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
  sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output
  dyndbg: use keyword, arg varnames for query term pairs
  driver core: force NOIO allocations during unplug
  platform_device: switch to simpler IDA interface
  driver core: platform: Document return type of more functions
  Revert "driver core: Annotate dev_err_probe() with __must_check"
  Revert "test_firmware: Test platform fw loading on non-EFI systems"
  iio: adc: xilinx-xadc: use devm_krealloc()
  hwmon: pmbus: use more devres helpers
  devres: provide devm_krealloc()
  syscore: Use pm_pr_dbg() for syscore_{suspend,resume}()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'acpi-numa'</title>
<updated>2020-10-13T12:44:50+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2020-10-13T12:44:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e4174ff78b9ec2179faeffe66640dbddce52b449'/>
<id>e4174ff78b9ec2179faeffe66640dbddce52b449</id>
<content type='text'>
* acpi-numa:
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ACPI: Support Generic Initiator only domains
  ACPI / NUMA: Add stub function for pxm_to_node()
  irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory
  ACPI: Remove side effect of partly creating a node in acpi_get_node()
  ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node()
  ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node()
  ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
  ACPI: Add out of bounds and numa_off protections to pxm_to_node()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* acpi-numa:
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ACPI: Support Generic Initiator only domains
  ACPI / NUMA: Add stub function for pxm_to_node()
  irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory
  ACPI: Remove side effect of partly creating a node in acpi_get_node()
  ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node()
  ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node()
  ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
  ACPI: Add out of bounds and numa_off protections to pxm_to_node()
</pre>
</div>
</content>
</entry>
</feed>
