<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/psi.h, branch v6.4-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sched/psi: Allow unprivileged polling of N*2s period</title>
<updated>2023-04-05T07:58:50+00:00</updated>
<author>
<name>Domenico Cerasuolo</name>
<email>cerasuolodomenico@gmail.com</email>
</author>
<published>2023-03-30T10:54:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d82caa273565b45fcf103148950549af76c314b0'/>
<id>d82caa273565b45fcf103148950549af76c314b0</id>
<content type='text'>
PSI offers 2 mechanisms to get information about a specific resource
pressure. One is reading from /proc/pressure/&lt;resource&gt;, which gives
average pressures aggregated every 2s. The other is creating a pollable
fd for a specific resource and cgroup.

The trigger creation requires CAP_SYS_RESOURCE, and gives the
possibility to pick specific time window and threshold, spawing an RT
thread to aggregate the data.

Systemd would like to provide containers the option to monitor pressure
on their own cgroup and sub-cgroups. For example, if systemd launches a
container that itself then launches services, the container should have
the ability to poll() for pressure in individual services. But neither
the container nor the services are privileged.

This patch implements a mechanism to allow unprivileged users to create
pressure triggers. The difference with privileged triggers creation is
that unprivileged ones must have a time window that's a multiple of 2s.
This is so that we can avoid unrestricted spawning of rt threads, and
use instead the same aggregation mechanism done for the averages, which
runs independently of any triggers.

Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Domenico Cerasuolo &lt;cerasuolodomenico@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20230330105418.77061-5-cerasuolodomenico@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PSI offers 2 mechanisms to get information about a specific resource
pressure. One is reading from /proc/pressure/&lt;resource&gt;, which gives
average pressures aggregated every 2s. The other is creating a pollable
fd for a specific resource and cgroup.

The trigger creation requires CAP_SYS_RESOURCE, and gives the
possibility to pick specific time window and threshold, spawing an RT
thread to aggregate the data.

Systemd would like to provide containers the option to monitor pressure
on their own cgroup and sub-cgroups. For example, if systemd launches a
container that itself then launches services, the container should have
the ability to poll() for pressure in individual services. But neither
the container nor the services are privileged.

This patch implements a mechanism to allow unprivileged users to create
pressure triggers. The difference with privileged triggers creation is
that unprivileged ones must have a time window that's a multiple of 2s.
This is so that we can avoid unrestricted spawning of rt threads, and
use instead the same aggregation mechanism done for the averages, which
runs independently of any triggers.

Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Domenico Cerasuolo &lt;cerasuolodomenico@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20230330105418.77061-5-cerasuolodomenico@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/psi: Per-cgroup PSI accounting disable/re-enable interface</title>
<updated>2022-09-09T09:08:33+00:00</updated>
<author>
<name>Chengming Zhou</name>
<email>zhouchengming@bytedance.com</email>
</author>
<published>2022-09-07T09:03:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=34f26a15611afb03c33df6819359d36f5b382589'/>
<id>34f26a15611afb03c33df6819359d36f5b382589</id>
<content type='text'>
PSI accounts stalls for each cgroup separately and aggregates it
at each level of the hierarchy. This may cause non-negligible overhead
for some workloads when under deep level of the hierarchy.

commit 3958e2d0c34e ("cgroup: make per-cgroup pressure stall tracking configurable")
make PSI to skip per-cgroup stall accounting, only account system-wide
to avoid this each level overhead.

But for our use case, we also want leaf cgroup PSI stats accounted for
userspace adjustment on that cgroup, apart from only system-wide adjustment.

So this patch introduce a per-cgroup PSI accounting disable/re-enable
interface "cgroup.pressure", which is a read-write single value file that
allowed values are "0" and "1", the defaults is "1" so per-cgroup
PSI stats is enabled by default.

Implementation details:

It should be relatively straight-forward to disable and re-enable
state aggregation, time tracking, averaging on a per-cgroup level,
if we can live with losing history from while it was disabled.
I.e. the avgs will restart from 0, total= will have gaps.

But it's hard or complex to stop/restart groupc-&gt;tasks[] updates,
which is not implemented in this patch. So we always update
groupc-&gt;tasks[] and PSI_ONCPU bit in psi_group_change() even when
the cgroup PSI stats is disabled.

Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PSI accounts stalls for each cgroup separately and aggregates it
at each level of the hierarchy. This may cause non-negligible overhead
for some workloads when under deep level of the hierarchy.

commit 3958e2d0c34e ("cgroup: make per-cgroup pressure stall tracking configurable")
make PSI to skip per-cgroup stall accounting, only account system-wide
to avoid this each level overhead.

But for our use case, we also want leaf cgroup PSI stats accounted for
userspace adjustment on that cgroup, apart from only system-wide adjustment.

So this patch introduce a per-cgroup PSI accounting disable/re-enable
interface "cgroup.pressure", which is a read-write single value file that
allowed values are "0" and "1", the defaults is "1" so per-cgroup
PSI stats is enabled by default.

Implementation details:

It should be relatively straight-forward to disable and re-enable
state aggregation, time tracking, averaging on a per-cgroup level,
if we can live with losing history from while it was disabled.
I.e. the avgs will restart from 0, total= will have gaps.

But it's hard or complex to stop/restart groupc-&gt;tasks[] updates,
which is not implemented in this patch. So we always update
groupc-&gt;tasks[] and PSI_ONCPU bit in psi_group_change() even when
the cgroup PSI stats is disabled.

Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/psi: Consolidate cgroup_psi()</title>
<updated>2022-09-09T09:08:33+00:00</updated>
<author>
<name>Chengming Zhou</name>
<email>zhouchengming@bytedance.com</email>
</author>
<published>2022-08-25T16:41:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=57899a6610e67ba26fa3251ebbef4a5ed21efc5d'/>
<id>57899a6610e67ba26fa3251ebbef4a5ed21efc5d</id>
<content type='text'>
cgroup_psi() can't return psi_group for root cgroup, so we have many
open code "psi = cgroup_ino(cgrp) == 1 ? &amp;psi_system : cgrp-&gt;psi".

This patch move cgroup_psi() definition to &lt;linux/psi.h&gt;, in which
we can return psi_system for root cgroup, so can handle all cgroups.

Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20220825164111.29534-9-zhouchengming@bytedance.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cgroup_psi() can't return psi_group for root cgroup, so we have many
open code "psi = cgroup_ino(cgrp) == 1 ? &amp;psi_system : cgrp-&gt;psi".

This patch move cgroup_psi() definition to &lt;linux/psi.h&gt;, in which
we can return psi_system for root cgroup, so can handle all cgroups.

Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20220825164111.29534-9-zhouchengming@bytedance.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/psi: Move private helpers to sched/stats.h</title>
<updated>2022-09-09T09:08:31+00:00</updated>
<author>
<name>Chengming Zhou</name>
<email>zhouchengming@bytedance.com</email>
</author>
<published>2022-08-25T16:41:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d79ddb069c5257a924456eb99b53fc1ea715c0a3'/>
<id>d79ddb069c5257a924456eb99b53fc1ea715c0a3</id>
<content type='text'>
This patch move psi_task_change/psi_task_switch declarations out of
PSI public header, since they are only needed for implementing the
PSI stats tracking in sched/stats.h

psi_task_switch is obvious, psi_task_change can't be public helper
since it doesn't check psi_disabled static key. And there is no
any user now, so put it in sched/stats.h too.

Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20220825164111.29534-5-zhouchengming@bytedance.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch move psi_task_change/psi_task_switch declarations out of
PSI public header, since they are only needed for implementing the
PSI stats tracking in sched/stats.h

psi_task_switch is obvious, psi_task_change can't be public helper
since it doesn't check psi_disabled static key. And there is no
any user now, so put it in sched/stats.h too.

Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/20220825164111.29534-5-zhouchengming@bytedance.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/psi: Remove unused parameter nbytes of psi_trigger_create()</title>
<updated>2022-08-15T22:35:25+00:00</updated>
<author>
<name>Hao Jia</name>
<email>jiahao.os@bytedance.com</email>
</author>
<published>2022-08-06T12:05:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=76b079ef4cc954fc2c2e0333a01855b0b2b6bdee'/>
<id>76b079ef4cc954fc2c2e0333a01855b0b2b6bdee</id>
<content type='text'>
psi_trigger_create()'s 'nbytes' parameter is not used, so we can remove it.

Signed-off-by: Hao Jia &lt;jiahao.os@bytedance.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
psi_trigger_create()'s 'nbytes' parameter is not used, so we can remove it.

Signed-off-by: Hao Jia &lt;jiahao.os@bytedance.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>headers/prep: Fix header to build standalone: &lt;linux/psi.h&gt;</title>
<updated>2022-03-21T10:13:49+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2022-03-21T10:05:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3387ce4d8a5f2956fab827edf499fe6780e83faa'/>
<id>3387ce4d8a5f2956fab827edf499fe6780e83faa</id>
<content type='text'>
Add the &lt;linux/cgroup-defs.h&gt; dependency to &lt;linux/psi.h&gt;, because
cgroup_move_task() will dereference 'struct css_set'.

( Only older toolchains are affected, due to variations in
  the implementation of rcu_assign_pointer() et al. )

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reported-by: Sachin Sant &lt;sachinp@linux.ibm.com&gt;
Reported-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reported-by: Borislav Petkov &lt;bp@alien8.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the &lt;linux/cgroup-defs.h&gt; dependency to &lt;linux/psi.h&gt;, because
cgroup_move_task() will dereference 'struct css_set'.

( Only older toolchains are affected, due to variations in
  the implementation of rcu_assign_pointer() et al. )

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reported-by: Sachin Sant &lt;sachinp@linux.ibm.com&gt;
Reported-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reported-by: Borislav Petkov &lt;bp@alien8.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>psi: fix "no previous prototype" warnings when CONFIG_CGROUPS=n</title>
<updated>2022-01-30T07:56:58+00:00</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-01-29T21:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=51e50fbd3efc6064c30ed73a5e009018b36e290a'/>
<id>51e50fbd3efc6064c30ed73a5e009018b36e290a</id>
<content type='text'>
When CONFIG_CGROUPS is disabled psi code generates the following
warnings:

  kernel/sched/psi.c:1112:21: warning: no previous prototype for 'psi_trigger_create' [-Wmissing-prototypes]
      1112 | struct psi_trigger *psi_trigger_create(struct psi_group *group,
           |                     ^~~~~~~~~~~~~~~~~~
  kernel/sched/psi.c:1182:6: warning: no previous prototype for 'psi_trigger_destroy' [-Wmissing-prototypes]
      1182 | void psi_trigger_destroy(struct psi_trigger *t)
           |      ^~~~~~~~~~~~~~~~~~~
  kernel/sched/psi.c:1249:10: warning: no previous prototype for 'psi_trigger_poll' [-Wmissing-prototypes]
      1249 | __poll_t psi_trigger_poll(void **trigger_ptr,
           |          ^~~~~~~~~~~~~~~~

Change the declarations of these functions in the header to provide the
prototypes even when they are unused.

Link: https://lkml.kernel.org/r/20220119223940.787748-2-surenb@google.com
Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When CONFIG_CGROUPS is disabled psi code generates the following
warnings:

  kernel/sched/psi.c:1112:21: warning: no previous prototype for 'psi_trigger_create' [-Wmissing-prototypes]
      1112 | struct psi_trigger *psi_trigger_create(struct psi_group *group,
           |                     ^~~~~~~~~~~~~~~~~~
  kernel/sched/psi.c:1182:6: warning: no previous prototype for 'psi_trigger_destroy' [-Wmissing-prototypes]
      1182 | void psi_trigger_destroy(struct psi_trigger *t)
           |      ^~~~~~~~~~~~~~~~~~~
  kernel/sched/psi.c:1249:10: warning: no previous prototype for 'psi_trigger_poll' [-Wmissing-prototypes]
      1249 | __poll_t psi_trigger_poll(void **trigger_ptr,
           |          ^~~~~~~~~~~~~~~~

Change the declarations of these functions in the header to provide the
prototypes even when they are unused.

Link: https://lkml.kernel.org/r/20220119223940.787748-2-surenb@google.com
Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>psi: Fix uaf issue when psi trigger is destroyed while being polled</title>
<updated>2022-01-18T11:09:57+00:00</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-01-11T23:23:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a06247c6804f1a7c86a2e5398a4c1f1db1471848'/>
<id>a06247c6804f1a7c86a2e5398a4c1f1db1471848</id>
<content type='text'>
With write operation on psi files replacing old trigger with a new one,
the lifetime of its waitqueue is totally arbitrary. Overwriting an
existing trigger causes its waitqueue to be freed and pending poll()
will stumble on trigger-&gt;event_wait which was destroyed.
Fix this by disallowing to redefine an existing psi trigger. If a write
operation is used on a file descriptor with an already existing psi
trigger, the operation will fail with EBUSY error.
Also bypass a check for psi_disabled in the psi_trigger_destroy as the
flag can be flipped after the trigger is created, leading to a memory
leak.

Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Reported-by: syzbot+cdb5dd11c97cc532efad@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Analyzed-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220111232309.1786347-1-surenb@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With write operation on psi files replacing old trigger with a new one,
the lifetime of its waitqueue is totally arbitrary. Overwriting an
existing trigger causes its waitqueue to be freed and pending poll()
will stumble on trigger-&gt;event_wait which was destroyed.
Fix this by disallowing to redefine an existing psi trigger. If a write
operation is used on a file descriptor with an already existing psi
trigger, the operation will fail with EBUSY error.
Also bypass a check for psi_disabled in the psi_trigger_destroy as the
flag can be flipped after the trigger is created, leading to a memory
leak.

Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Reported-by: syzbot+cdb5dd11c97cc532efad@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Analyzed-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220111232309.1786347-1-surenb@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>psi: Add a missing SPDX license header</title>
<updated>2021-11-17T13:48:59+00:00</updated>
<author>
<name>Liu Xinpeng</name>
<email>liuxp11@chinatelecom.cn</email>
</author>
<published>2021-10-25T03:46:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2fb75e1b642f49253d8848c9e47e8942f5366221'/>
<id>2fb75e1b642f49253d8848c9e47e8942f5366221</id>
<content type='text'>
Add the missing SPDX license header to
include/linux/psi.h
include/linux/psi_types.h
kernel/sched/psi.c

Signed-off-by: Liu Xinpeng &lt;liuxp11@chinatelecom.cn&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/1635133586-84611-2-git-send-email-liuxp11@chinatelecom.cn
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the missing SPDX license header to
include/linux/psi.h
include/linux/psi_types.h
kernel/sched/psi.c

Signed-off-by: Liu Xinpeng &lt;liuxp11@chinatelecom.cn&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lore.kernel.org/r/1635133586-84611-2-git-send-email-liuxp11@chinatelecom.cn
</pre>
</div>
</content>
</entry>
<entry>
<title>psi: Use ONCPU state tracking machinery to detect reclaim</title>
<updated>2021-03-06T11:40:22+00:00</updated>
<author>
<name>Chengming Zhou</name>
<email>zhouchengming@bytedance.com</email>
</author>
<published>2021-03-03T03:46:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7fae6c8171d20ac55402930ee8ae760cf85dff7b'/>
<id>7fae6c8171d20ac55402930ee8ae760cf85dff7b</id>
<content type='text'>
Move the reclaim detection from the timer tick to the task state
tracking machinery using the recently added ONCPU state. And we
also add task psi_flags changes checking in the psi_task_switch()
optimization to update the parents properly.

In terms of performance and cost, this ONCPU task state tracking
is not cheaper than previous timer tick in aggregate. But the code is
simpler and shorter this way, so it's a maintainability win. And
Johannes did some testing with perf bench, the performace and cost
changes would be acceptable for real workloads.

Thanks to Johannes Weiner for pointing out the psi_task_switch()
optimization things and the clearer changelog.

Co-developed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lkml.kernel.org/r/20210303034659.91735-3-zhouchengming@bytedance.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the reclaim detection from the timer tick to the task state
tracking machinery using the recently added ONCPU state. And we
also add task psi_flags changes checking in the psi_task_switch()
optimization to update the parents properly.

In terms of performance and cost, this ONCPU task state tracking
is not cheaper than previous timer tick in aggregate. But the code is
simpler and shorter this way, so it's a maintainability win. And
Johannes did some testing with perf bench, the performace and cost
changes would be acceptable for real workloads.

Thanks to Johannes Weiner for pointing out the psi_task_switch()
optimization things and the clearer changelog.

Co-developed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Chengming Zhou &lt;zhouchengming@bytedance.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Link: https://lkml.kernel.org/r/20210303034659.91735-3-zhouchengming@bytedance.com
</pre>
</div>
</content>
</entry>
</feed>
