summaryrefslogtreecommitdiff
path: root/Documentation/filesystems
diff options
context:
space:
mode:
authorAleksa Sarai <cyphar@cyphar.com>2025-08-05 15:45:09 +1000
committerChristian Brauner <brauner@kernel.org>2025-09-02 11:37:24 +0200
commitfe49652e36bb4337330d24dce7d571a49f824605 (patch)
treebb0900fb479c448176a12e5399f5871c531e39b8 /Documentation/filesystems
parent7df87820122acd3204565109f636a1367912655a (diff)
procfs: add "pidns" mount option
Since the introduction of pid namespaces, their interaction with procfs has been entirely implicit in ways that require a lot of dancing around by programs that need to construct sandboxes with different PID namespaces. Being able to explicitly specify the pid namespace to use when constructing a procfs super block will allow programs to no longer need to fork off a process which does then does unshare(2) / setns(2) and forks again in order to construct a procfs in a pidns. So, provide a "pidns" mount option which allows such users to just explicitly state which pid namespace they want that procfs instance to use. This interface can be used with fsconfig(2) either with a file descriptor or a path: fsconfig(procfd, FSCONFIG_SET_FD, "pidns", NULL, nsfd); fsconfig(procfd, FSCONFIG_SET_STRING, "pidns", "/proc/self/ns/pid", 0); or with classic mount(2) / mount(8): // mount -t proc -o pidns=/proc/self/ns/pid proc /tmp/proc mount("proc", "/tmp/proc", "proc", MS_..., "pidns=/proc/self/ns/pid"); As this new API is effectively shorthand for setns(2) followed by mount(2), the permission model for this mirrors pidns_install() to avoid opening up new attack surfaces by loosening the existing permission model. In order to avoid having to RCU-protect all users of proc_pid_ns() (to avoid UAFs), attempting to reconfigure an existing procfs instance's pid namespace will error out with -EBUSY. Creating new procfs instances is quite cheap, so this should not be an impediment to most users, and lets us avoid a lot of churn in fs/proc/* for a feature that it seems unlikely userspace would use. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/20250805-procfs-pidns-api-v4-2-705f984940e7@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/proc.rst8
1 files changed, 8 insertions, 0 deletions
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 2971551b7235..b7e3147ba3d4 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -2362,6 +2362,7 @@ The following mount options are supported:
hidepid= Set /proc/<pid>/ access mode.
gid= Set the group authorized to learn processes information.
subset= Show only the specified subset of procfs.
+ pidns= Specify a the namespace used by this procfs.
========= ========================================================
hidepid=off or hidepid=0 means classic mode - everybody may access all
@@ -2394,6 +2395,13 @@ information about processes information, just add identd to this group.
subset=pid hides all top level files and directories in the procfs that
are not related to tasks.
+pidns= specifies a pid namespace (either as a string path to something like
+`/proc/$pid/ns/pid`, or a file descriptor when using `FSCONFIG_SET_FD`) that
+will be used by the procfs instance when translating pids. By default, procfs
+will use the calling process's active pid namespace. Note that the pid
+namespace of an existing procfs instance cannot be modified (attempting to do
+so will give an `-EBUSY` error).
+
Chapter 5: Filesystem behavior
==============================