| field | value | date |
|---|---|---|
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-14 20:28:40 -0700 |
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-14 20:28:40 -0700 |
| commit | 5c0f43e8535d619ff32400e2e916075109fc7a56 | |
| tree | 996739729de3adb4b7e43a925a3e25f8ce8a663e /kernel/pid.c | |
| parent | 7c8a4671dc3247a26a702e5f5996e9f453d7070d | |
| parent | 4c68d150246d7e1d826a807a82e6eb6b4669f42c | |
Merge tag 'kernel-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pid_namespace updates from Christian Brauner:

- pid_namespace: make init creation more flexible

  Annotate ->child_reaper accesses with {READ,WRITE}_ONCE() to protect
  the unlocked readers from CPU/compiler reordering, and enforce that
  pid 1 in a pid namespace is always the first allocated pid (the
  set_tid path already required this).

  On top of that, allow opening pid_for_children before the pid
  namespace init has been created. This lets one process create the pid
  namespace and a different process create the init via setns(), which
  makes clone3(set_tid) usable uniformly in all cases and is particularly
  useful to CRIU when restoring nested containers.

  A new selftest covers both the basic create-pidns-then-init flow and
  the cross-process variant, and a MAINTAINERS entry for the pid
  namespace code is added.

- unrelated signal cleanup: update outdated comment for the removed
  freezable_schedule()

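
The cross-process flow described above could look roughly like the following userspace sketch. It is not the selftest from this series. The helper names (`wait_exit_code`, `join_and_spawn_init`, `create_init_via_setns`) are hypothetical, and it assumes that a second process reaches the still-empty namespace through `/proc/<pid>/ns/pid_for_children` of the process that called `unshare(CLONE_NEWPID)`. It needs CAP_SYS_ADMIN and a kernel carrying this change; anywhere else it simply reports failure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exit code of a reaped child, or 1 if it did not exit normally. */
static int wait_exit_code(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return 1;
	return WEXITSTATUS(status);
}

/* Second process: join the namespace that has no init yet and fork its init. */
static int join_and_spawn_init(pid_t creator)
{
	char path[64];
	pid_t init;
	int fd;

	assert(creator > 0);
	snprintf(path, sizeof(path), "/proc/%d/ns/pid_for_children", (int)creator);
	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0 || setns(fd, CLONE_NEWPID) != 0)
		return 1;
	init = fork();
	if (init < 0)
		return 1;
	if (init == 0)
		_exit(getpid() == 1 ? 0 : 1);	/* expected to be pid 1 in the new ns */
	return wait_exit_code(init);
}

/* Hypothetical driver: returns 0 if the flow worked, -1 on any failure. */
int create_init_via_setns(void)
{
	int ready[2], rc = -1;
	pid_t creator, joiner;
	char ok = 0;

	if (pipe(ready) != 0)
		return -1;
	creator = fork();
	if (creator < 0) {
		close(ready[0]);
		close(ready[1]);
		return -1;
	}
	if (creator == 0) {
		/* First process: create the namespace but never fork into it. */
		close(ready[0]);
		ok = (unshare(CLONE_NEWPID) == 0);
		if (write(ready[1], &ok, 1) != 1 || !ok)
			_exit(1);
		for (;;)
			pause();	/* stay alive until the parent kills us */
	}
	close(ready[1]);
	if (read(ready[0], &ok, 1) == 1 && ok) {
		joiner = fork();
		if (joiner == 0)
			_exit(join_and_spawn_init(creator));
		if (joiner > 0 && wait_exit_code(joiner) == 0)
			rc = 0;
	}
	close(ready[0]);
	kill(creator, SIGKILL);
	waitpid(creator, NULL, 0);
	return rc;
}
```

Both halves run in forked children so that the calling process keeps its own pid_for_children untouched. On a kernel without this change, opening pid_for_children before init exists is expected to fail, and the helper returns -1.
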
* tag 'kernel-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
signal: update outdated comment for removed freezable_schedule()
MAINTAINERS: add a pid namespace entry
selftests: Add tests for creating pidns init via setns
pid_namespace: allow opening pid_for_children before init was created
pid: check init is created first after idr alloc
pid_namespace: avoid optimization of accesses to ->child_reaper
Diffstat (limited to 'kernel/pid.c')
| -rw-r--r-- | kernel/pid.c | 19 |
1 files changed, 11 insertions, 8 deletions
diff --git a/kernel/pid.c b/kernel/pid.c
index 3b96571d0fe6..677c84e319dd 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -128,7 +128,7 @@ void free_pid(struct pid *pid)
 			 * is the reaper wake up the reaper. The reaper
 			 * may be sleeping in zap_pid_ns_processes().
 			 */
-			wake_up_process(ns->child_reaper);
+			wake_up_process(READ_ONCE(ns->child_reaper));
 			break;
 		case PIDNS_ADDING:
 			/* Handle a fork failure of the first process */
@@ -215,12 +215,6 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
 			retval = -EINVAL;
 			if (tid < 1 || tid >= pid_max[ns->level - i])
 				goto out_abort;
-			/*
-			 * Also fail if a PID != 1 is requested and
-			 * no PID 1 exists.
-			 */
-			if (tid != 1 && !tmp->child_reaper)
-				goto out_abort;
 			retval = -EPERM;
 			if (!checkpoint_restore_ns_capable(tmp->user_ns))
 				goto out_abort;
@@ -296,9 +290,18 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
 
 		pid->numbers[i].nr = nr;
 		pid->numbers[i].ns = tmp;
-		tmp = tmp->parent;
 		i--;
 		retried_preload = false;
+
+		/*
+		 * PID 1 (init) must be created first.
+		 */
+		if (!READ_ONCE(tmp->child_reaper) && nr != 1) {
+			retval = -EINVAL;
+			goto out_free;
+		}
+
+		tmp = tmp->parent;
 	}
 
 	/*
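
The check added in the last hunk can be modelled in userspace as a minimal sketch. `struct toy_pid_ns` and `toy_check_init_first()` are hypothetical stand-ins, not kernel code, and the `READ_ONCE()` below is a simplified volatile-cast approximation rather than the kernel's real definition. The point is only the rule itself: while a namespace has no reaper, the only PID number that may be handed out is 1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel macro (assumption, GNU C). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct task;	/* opaque placeholder for struct task_struct */

/* Toy namespace: only the field the check looks at. */
struct toy_pid_ns {
	struct task *child_reaper;	/* NULL until init exists */
};

/*
 * Model of the check done after the PID number has been allocated:
 * with no reaper yet, any number other than 1 is rejected.
 */
int toy_check_init_first(struct toy_pid_ns *ns, int nr)
{
	assert(ns != NULL);
	if (!READ_ONCE(ns->child_reaper) && nr != 1)
		return -EINVAL;
	return 0;
}
```

With `child_reaper` set to NULL, `toy_check_init_first(&ns, 2)` returns `-EINVAL` and `toy_check_init_first(&ns, 1)` returns 0. Once a reaper is set, any number passes. Unlike the removed `set_tid`-only test, the new check runs on the allocated `nr`, so it also covers PIDs that were not explicitly requested.
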
