summaryrefslogtreecommitdiff
path: root/include
diff options
context:
space:
mode:
authorTejun Heo <tj@kernel.org>2026-04-19 08:36:45 -1000
committerTejun Heo <tj@kernel.org>2026-04-20 06:55:33 -1000
commit41e3312861eafba171d9620150aaf2e99165d044 (patch)
tree0f22d2d6af12df3f4f39c7c203171df3eaae79d1 /include
parented859d4319863263665b239cd2c62c3aad1664ce (diff)
sched_ext: add p->scx.tid and SCX_OPS_TID_TO_TASK lookup
BPF schedulers that can't hold task_struct pointers (arena-backed ones in particular) key tasks by pid. During exit, pid is released before the task finishes passing through scheduler callbacks, so a dying task becomes invisible to the BPF side mid-schedule. scx_qmap hits this: an exiting task's dispatch callback can't recover its queue entry, stalling dispatch until SCX_EXIT_ERROR_STALL. Add a unique non-zero u64 p->scx.tid assigned at fork that survives the full task lifetime including exit. scx_bpf_tid_to_task() looks up the task; unlike bpf_task_from_pid(), it handles exiting tasks. The lookup costs an rhashtable insert/remove under scx_tasks_lock, so root schedulers opt in via SCX_OPS_TID_TO_TASK. Sub-schedulers that set the flag to declare a dependency are rejected at attach if root didn't opt in. scx_qmap converted: keys tasks by tid and enables SCX_OPS_ENQ_EXITING. Pre-patch it stalls within seconds under a non-leader-exec workload; with the patch it runs cleanly. v3: Warn on rhashtable_lookup_insert_fast() failure via new scx_tid_hash_insert() helper (Cheng-Yang Chou). v2: Guard scx_root deref in scx_bpf_tid_to_task() error path. The kfunc is registered via scx_kfunc_set_any and reachable from tracing and syscall programs when no scheduler is attached (Cheng-Yang Chou). Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Reviewed-by: Andrea Righi <arighi@nvidia.com>
Diffstat (limited to 'include')
-rw-r--r--include/linux/sched/ext.h9
1 files changed, 9 insertions, 0 deletions
diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index 1a3af2ea2a79..d05efcac794d 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -203,6 +203,15 @@ struct sched_ext_entity {
u64 core_sched_at; /* see scx_prio_less() */
#endif
+ /*
+ * Unique non-zero task ID assigned at fork. Persists across exec and
+ * is never reused. Lets BPF schedulers identify tasks without storing
+ * kernel pointers - arena-backed schedulers being one example. See
+ * scx_bpf_tid_to_task().
+ */
+ u64 tid;
+ struct rhash_head tid_hash_node; /* see SCX_OPS_TID_TO_TASK */
+
/* BPF scheduler modifiable fields */
/*