<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/security/commoncap.c, branch v2.6.25-rc2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>capabilities: introduce per-process capability bounding set</title>
<updated>2008-02-05T17:44:20+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2008-02-05T06:29:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3b7391de67da515c91f48aa371de77cb6cc5c07e'/>
<id>3b7391de67da515c91f48aa371de77cb6cc5c07e</id>
<content type='text'>
The capability bounding set is a set beyond which capabilities cannot grow.
 Currently cap_bset is per-system.  It can be manipulated through sysctl,
but only init can add capabilities.  Root can remove capabilities.  By
default it includes all caps except CAP_SETPCAP.

This patch makes the bounding set per-process when file capabilities are
enabled.  It is inherited at fork from parent.  Noone can add elements,
CAP_SETPCAP is required to remove them.

One example use of this is to start a safer container.  For instance, until
device namespaces or per-container device whitelists are introduced, it is
best to take CAP_MKNOD away from a container.

The bounding set will not affect pP and pE immediately.  It will only
affect pP' and pE' after subsequent exec()s.  It also does not affect pI,
and exec() does not constrain pI'.  So to really start a shell with no way
of regain CAP_MKNOD, you would do

	prctl(PR_CAPBSET_DROP, CAP_MKNOD);
	cap_t cap = cap_get_proc();
	cap_value_t caparray[1];
	caparray[0] = CAP_MKNOD;
	cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
	cap_set_proc(cap);
	cap_free(cap);

The following test program will get and set the bounding
set (but not pI).  For instance

	./bset get
		(lists capabilities in bset)
	./bset drop cap_net_raw
		(starts shell with new bset)
		(use capset, setuid binary, or binary with
		file capabilities to try to increase caps)

************************************************************
cap_bound.c
************************************************************
 #include &lt;sys/prctl.h&gt;
 #include &lt;linux/capability.h&gt;
 #include &lt;sys/types.h&gt;
 #include &lt;unistd.h&gt;
 #include &lt;stdio.h&gt;
 #include &lt;stdlib.h&gt;
 #include &lt;string.h&gt;

 #ifndef PR_CAPBSET_READ
 #define PR_CAPBSET_READ 23
 #endif

 #ifndef PR_CAPBSET_DROP
 #define PR_CAPBSET_DROP 24
 #endif

int usage(char *me)
{
	printf("Usage: %s get\n", me);
	printf("       %s drop &lt;capability&gt;\n", me);
	return 1;
}

 #define numcaps 32
char *captable[numcaps] = {
	"cap_chown",
	"cap_dac_override",
	"cap_dac_read_search",
	"cap_fowner",
	"cap_fsetid",
	"cap_kill",
	"cap_setgid",
	"cap_setuid",
	"cap_setpcap",
	"cap_linux_immutable",
	"cap_net_bind_service",
	"cap_net_broadcast",
	"cap_net_admin",
	"cap_net_raw",
	"cap_ipc_lock",
	"cap_ipc_owner",
	"cap_sys_module",
	"cap_sys_rawio",
	"cap_sys_chroot",
	"cap_sys_ptrace",
	"cap_sys_pacct",
	"cap_sys_admin",
	"cap_sys_boot",
	"cap_sys_nice",
	"cap_sys_resource",
	"cap_sys_time",
	"cap_sys_tty_config",
	"cap_mknod",
	"cap_lease",
	"cap_audit_write",
	"cap_audit_control",
	"cap_setfcap"
};

int getbcap(void)
{
	int comma=0;
	unsigned long i;
	int ret;

	printf("i know of %d capabilities\n", numcaps);
	printf("capability bounding set:");
	for (i=0; i&lt;numcaps; i++) {
		ret = prctl(PR_CAPBSET_READ, i);
		if (ret &lt; 0)
			perror("prctl");
		else if (ret==1)
			printf("%s%s", (comma++) ? ", " : " ", captable[i]);
	}
	printf("\n");
	return 0;
}

int capdrop(char *str)
{
	unsigned long i;

	int found=0;
	for (i=0; i&lt;numcaps; i++) {
		if (strcmp(captable[i], str) == 0) {
			found=1;
			break;
		}
	}
	if (!found)
		return 1;
	if (prctl(PR_CAPBSET_DROP, i)) {
		perror("prctl");
		return 1;
	}
	return 0;
}

int main(int argc, char *argv[])
{
	if (argc&lt;2)
		return usage(argv[0]);
	if (strcmp(argv[1], "get")==0)
		return getbcap();
	if (strcmp(argv[1], "drop")!=0 || argc&lt;3)
		return usage(argv[0]);
	if (capdrop(argv[2])) {
		printf("unknown capability\n");
		return 1;
	}
	return execl("/bin/bash", "/bin/bash", NULL);
}
************************************************************

[serue@us.ibm.com: fix typo]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;a
Signed-off-by: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Tested-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The capability bounding set is a set beyond which capabilities cannot grow.
 Currently cap_bset is per-system.  It can be manipulated through sysctl,
but only init can add capabilities.  Root can remove capabilities.  By
default it includes all caps except CAP_SETPCAP.

This patch makes the bounding set per-process when file capabilities are
enabled.  It is inherited at fork from parent.  Noone can add elements,
CAP_SETPCAP is required to remove them.

One example use of this is to start a safer container.  For instance, until
device namespaces or per-container device whitelists are introduced, it is
best to take CAP_MKNOD away from a container.

The bounding set will not affect pP and pE immediately.  It will only
affect pP' and pE' after subsequent exec()s.  It also does not affect pI,
and exec() does not constrain pI'.  So to really start a shell with no way
of regain CAP_MKNOD, you would do

	prctl(PR_CAPBSET_DROP, CAP_MKNOD);
	cap_t cap = cap_get_proc();
	cap_value_t caparray[1];
	caparray[0] = CAP_MKNOD;
	cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
	cap_set_proc(cap);
	cap_free(cap);

The following test program will get and set the bounding
set (but not pI).  For instance

	./bset get
		(lists capabilities in bset)
	./bset drop cap_net_raw
		(starts shell with new bset)
		(use capset, setuid binary, or binary with
		file capabilities to try to increase caps)

************************************************************
cap_bound.c
************************************************************
 #include &lt;sys/prctl.h&gt;
 #include &lt;linux/capability.h&gt;
 #include &lt;sys/types.h&gt;
 #include &lt;unistd.h&gt;
 #include &lt;stdio.h&gt;
 #include &lt;stdlib.h&gt;
 #include &lt;string.h&gt;

 #ifndef PR_CAPBSET_READ
 #define PR_CAPBSET_READ 23
 #endif

 #ifndef PR_CAPBSET_DROP
 #define PR_CAPBSET_DROP 24
 #endif

int usage(char *me)
{
	printf("Usage: %s get\n", me);
	printf("       %s drop &lt;capability&gt;\n", me);
	return 1;
}

 #define numcaps 32
char *captable[numcaps] = {
	"cap_chown",
	"cap_dac_override",
	"cap_dac_read_search",
	"cap_fowner",
	"cap_fsetid",
	"cap_kill",
	"cap_setgid",
	"cap_setuid",
	"cap_setpcap",
	"cap_linux_immutable",
	"cap_net_bind_service",
	"cap_net_broadcast",
	"cap_net_admin",
	"cap_net_raw",
	"cap_ipc_lock",
	"cap_ipc_owner",
	"cap_sys_module",
	"cap_sys_rawio",
	"cap_sys_chroot",
	"cap_sys_ptrace",
	"cap_sys_pacct",
	"cap_sys_admin",
	"cap_sys_boot",
	"cap_sys_nice",
	"cap_sys_resource",
	"cap_sys_time",
	"cap_sys_tty_config",
	"cap_mknod",
	"cap_lease",
	"cap_audit_write",
	"cap_audit_control",
	"cap_setfcap"
};

int getbcap(void)
{
	int comma=0;
	unsigned long i;
	int ret;

	printf("i know of %d capabilities\n", numcaps);
	printf("capability bounding set:");
	for (i=0; i&lt;numcaps; i++) {
		ret = prctl(PR_CAPBSET_READ, i);
		if (ret &lt; 0)
			perror("prctl");
		else if (ret==1)
			printf("%s%s", (comma++) ? ", " : " ", captable[i]);
	}
	printf("\n");
	return 0;
}

int capdrop(char *str)
{
	unsigned long i;

	int found=0;
	for (i=0; i&lt;numcaps; i++) {
		if (strcmp(captable[i], str) == 0) {
			found=1;
			break;
		}
	}
	if (!found)
		return 1;
	if (prctl(PR_CAPBSET_DROP, i)) {
		perror("prctl");
		return 1;
	}
	return 0;
}

int main(int argc, char *argv[])
{
	if (argc&lt;2)
		return usage(argv[0]);
	if (strcmp(argv[1], "get")==0)
		return getbcap();
	if (strcmp(argv[1], "drop")!=0 || argc&lt;3)
		return usage(argv[0]);
	if (capdrop(argv[2])) {
		printf("unknown capability\n");
		return 1;
	}
	return execl("/bin/bash", "/bin/bash", NULL);
}
************************************************************

[serue@us.ibm.com: fix typo]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;a
Signed-off-by: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Tested-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add 64-bit capability support to the kernel</title>
<updated>2008-02-05T17:44:20+00:00</updated>
<author>
<name>Andrew Morgan</name>
<email>morgan@kernel.org</email>
</author>
<published>2008-02-05T06:29:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e338d263a76af78fe8f38a72131188b58fceb591'/>
<id>e338d263a76af78fe8f38a72131188b58fceb591</id>
<content type='text'>
The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities.  If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.

FWIW libcap-2.00 supports this change (and earlier capability formats)

 http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/

[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Erez Zadok &lt;ezk@cs.sunysb.edu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities.  If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.

FWIW libcap-2.00 supports this change (and earlier capability formats)

 http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/

[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Erez Zadok &lt;ezk@cs.sunysb.edu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>revert "capabilities: clean up file capability reading"</title>
<updated>2008-02-05T17:44:20+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2008-02-05T06:29:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8f6936f4d29aa14e54a2470b954a2e1f96322988'/>
<id>8f6936f4d29aa14e54a2470b954a2e1f96322988</id>
<content type='text'>
Revert b68680e4731abbd78863063aaa0dca2a6d8cc723 to make way for the next
patch: "Add 64-bit capability support to the kernel".

We want to keep the vfs_cap_data.data[] structure, using two 'data's for
64-bit caps (and later three for 96-bit caps), whereas
b68680e4731abbd78863063aaa0dca2a6d8cc723 had gotten rid of the 'data' struct
made its members inline.

The 64-bit caps patch keeps the stack abuse fix at get_file_caps(), which was
the more important part of that patch.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Revert b68680e4731abbd78863063aaa0dca2a6d8cc723 to make way for the next
patch: "Add 64-bit capability support to the kernel".

We want to keep the vfs_cap_data.data[] structure, using two 'data's for
64-bit caps (and later three for 96-bit caps), whereas
b68680e4731abbd78863063aaa0dca2a6d8cc723 had gotten rid of the 'data' struct
made its members inline.

The 64-bit caps patch keeps the stack abuse fix at get_file_caps(), which was
the more important part of that patch.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix filesystem capability support</title>
<updated>2008-01-22T03:39:41+00:00</updated>
<author>
<name>Andrew G. Morgan</name>
<email>morgan@kernel.org</email>
</author>
<published>2008-01-22T01:18:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a6dbb1ef2fc8d73578eacd02ac701f4233175c9f'/>
<id>a6dbb1ef2fc8d73578eacd02ac701f4233175c9f</id>
<content type='text'>
In linux-2.6.24-rc1, security/commoncap.c:cap_inh_is_capped() was
introduced. It has the exact reverse of its intended behavior. This
led to an unintended privilege esculation involving a process'
inheritable capability set.

To be exposed to this bug, you need to have Filesystem Capabilities
enabled and in use. That is:

- CONFIG_SECURITY_FILE_CAPABILITIES must be defined for the buggy code
  to be compiled in.

- You also need to have files on your system marked with fI bits raised.

Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;

Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In linux-2.6.24-rc1, security/commoncap.c:cap_inh_is_capped() was
introduced. It has the exact reverse of its intended behavior. This
led to an unintended privilege esculation involving a process'
inheritable capability set.

To be exposed to this bug, you need to have Filesystem Capabilities
enabled and in use. That is:

- CONFIG_SECURITY_FILE_CAPABILITIES must be defined for the buggy code
  to be compiled in.

- You also need to have files on your system marked with fI bits raised.

Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;

Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>file capabilities: don't prevent signaling setuid root programs</title>
<updated>2007-11-29T17:24:53+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2007-11-29T00:21:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8ec2328f1138a58eaea55ec6150985a1623b01c5'/>
<id>8ec2328f1138a58eaea55ec6150985a1623b01c5</id>
<content type='text'>
An unprivileged process must be able to kill a setuid root program started
by the same user.  This is legacy behavior needed for instance for xinit to
kill X when the window manager exits.

When an unprivileged user runs a setuid root program in !SECURE_NOROOT
mode, fP, fI, and fE are set full on, so pP' and pE' are full on.  Then
cap_task_kill() prevents the user from signaling the setuid root task.
This is a change in behavior compared to when
!CONFIG_SECURITY_FILE_CAPABILITIES.

This patch introduces a special check into cap_task_kill() just to check
whether a non-root user is signaling a setuid root program started by the
same user.  If so, then signal is allowed.

Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
An unprivileged process must be able to kill a setuid root program started
by the same user.  This is legacy behavior needed for instance for xinit to
kill X when the window manager exits.

When an unprivileged user runs a setuid root program in !SECURE_NOROOT
mode, fP, fI, and fE are set full on, so pP' and pE' are full on.  Then
cap_task_kill() prevents the user from signaling the setuid root task.
This is a change in behavior compared to when
!CONFIG_SECURITY_FILE_CAPABILITIES.

This patch introduces a special check into cap_task_kill() just to check
whether a non-root user is signaling a setuid root program started by the
same user.  If so, then signal is allowed.

Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>file capabilities: allow sigcont within session</title>
<updated>2007-11-15T02:45:44+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2007-11-15T01:00:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=91ad997a34d7abca1f04e819e31eb9f3d4e20585'/>
<id>91ad997a34d7abca1f04e819e31eb9f3d4e20585</id>
<content type='text'>
Fix http://bugzilla.kernel.org/show_bug.cgi?id=9247

Allow sigcont to be sent to a process with greater capabilities if it is in
the same session.  Otherwise, a shell from which I've started a root shell
and done 'suspend' can't be restarted by the parent shell.

Also don't do file-capabilities signaling checks when uids for the
processes don't match, since the standard check_kill_permission will have
done those checks.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Acked-by: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Tested-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@sisk.pl&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix http://bugzilla.kernel.org/show_bug.cgi?id=9247

Allow sigcont to be sent to a process with greater capabilities if it is in
the same session.  Otherwise, a shell from which I've started a root shell
and done 'suspend' can't be restarted by the parent shell.

Also don't do file-capabilities signaling checks when uids for the
processes don't match, since the standard check_kill_permission will have
done those checks.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Acked-by: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Tested-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@sisk.pl&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>capabilities: clean up file capability reading</title>
<updated>2007-10-22T15:13:18+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2007-10-21T23:41:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b68680e4731abbd78863063aaa0dca2a6d8cc723'/>
<id>b68680e4731abbd78863063aaa0dca2a6d8cc723</id>
<content type='text'>
Simplify the vfs_cap_data structure.

Also fix get_file_caps which was declaring
__le32 v1caps[XATTR_CAPS_SZ] on the stack, but
XATTR_CAPS_SZ is already * sizeof(__le32).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Simplify the vfs_cap_data structure.

Also fix get_file_caps which was declaring
__le32 v1caps[XATTR_CAPS_SZ] on the stack, but
XATTR_CAPS_SZ is already * sizeof(__le32).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>pid namespaces: define is_global_init() and is_container_init()</title>
<updated>2007-10-19T18:53:37+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2007-10-19T06:39:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b460cbc581a53cc088ceba80608021dd49c63c43'/>
<id>b460cbc581a53cc088ceba80608021dd49c63c43</id>
<content type='text'>
is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk-&gt;pid == 1.

A global init also has it's tsk-&gt;pid == 1 and it's active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

	2.6.22-rc4-mm2-pidns1:
	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
	  global init (/sbin/init) process. This would improve performance
	  and remove dependence on the task_pid().

	2.6.21-mm2-pidns2:

	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
	  This way, we kill only the cgroup if the cgroup's init has a
	  bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Acked-by: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Cedric Le Goater &lt;clg@fr.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Herbert Poetzel &lt;herbert@13thfloor.at&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk-&gt;pid == 1.

A global init also has it's tsk-&gt;pid == 1 and it's active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

	2.6.22-rc4-mm2-pidns1:
	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
	  global init (/sbin/init) process. This would improve performance
	  and remove dependence on the task_pid().

	2.6.21-mm2-pidns2:

	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
	  This way, we kill only the cgroup if the cgroup's init has a
	  bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Acked-by: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Cedric Le Goater &lt;clg@fr.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Herbert Poetzel &lt;herbert@13thfloor.at&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>V3 file capabilities: alter behavior of cap_setpcap</title>
<updated>2007-10-18T21:37:24+00:00</updated>
<author>
<name>Andrew Morgan</name>
<email>morgan@kernel.org</email>
</author>
<published>2007-10-18T10:05:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=72c2d5823fc7be799a12184974c3bdc57acea3c4'/>
<id>72c2d5823fc7be799a12184974c3bdc57acea3c4</id>
<content type='text'>
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
can change the capabilities of another process, p2.  This is not the
meaning that was intended for this capability at all, and this
implementation came about purely because, without filesystem capabilities,
there was no way to use capabilities without one process bestowing them on
another.

Since we now have a filesystem support for capabilities we can fix the
implementation of CAP_SETPCAP.

The most significant thing about this change is that, with it in effect, no
process can set the capabilities of another process.

The capabilities of a program are set via the capability convolution
rules:

   pI(post-exec) = pI(pre-exec)
   pP(post-exec) = (X(aka cap_bset) &amp; fP) | (pI(post-exec) &amp; fI)
   pE(post-exec) = fE ? pP(post-exec) : 0

at exec() time.  As such, the only influence the pre-exec() program can
have on the post-exec() program's capabilities are through the pI
capability set.

The correct implementation for CAP_SETPCAP (and that enabled by this patch)
is that it can be used to add extra pI capabilities to the current process
- to be picked up by subsequent exec()s when the above convolution rules
are applied.

Here is how it works:

Let's say we have a process, p. It has capability sets, pE, pP and pI.
Generally, p, can change the value of its own pI to pI' where

   (pI' &amp; ~pI) &amp; ~pP = 0.

That is, the only new things in pI' that were not present in pI need to
be present in pP.

The role of CAP_SETPCAP is basically to permit changes to pI beyond
the above:

   if (pE &amp; CAP_SETPCAP) {
      pI' = anything; /* ie., even (pI' &amp; ~pI) &amp; ~pP != 0  */
   }

This capability is useful for things like login, which (say, via
pam_cap) might want to raise certain inheritable capabilities for use
by the children of the logged-in user's shell, but those capabilities
are not useful to or needed by the login program itself.

One such use might be to limit who can run ping. You set the
capabilities of the 'ping' program to be "= cap_net_raw+i", and then
only shells that have (pI &amp; CAP_NET_RAW) will be able to run
it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
would have to also have (pP &amp; CAP_NET_RAW) in order to raise this
capability and pass it on through the inheritable set.

Signed-off-by: Andrew Morgan &lt;morgan@kernel.org&gt;
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
can change the capabilities of another process, p2.  This is not the
meaning that was intended for this capability at all, and this
implementation came about purely because, without filesystem capabilities,
there was no way to use capabilities without one process bestowing them on
another.

Since we now have a filesystem support for capabilities we can fix the
implementation of CAP_SETPCAP.

The most significant thing about this change is that, with it in effect, no
process can set the capabilities of another process.

The capabilities of a program are set via the capability convolution
rules:

   pI(post-exec) = pI(pre-exec)
   pP(post-exec) = (X(aka cap_bset) &amp; fP) | (pI(post-exec) &amp; fI)
   pE(post-exec) = fE ? pP(post-exec) : 0

at exec() time.  As such, the only influence the pre-exec() program can
have on the post-exec() program's capabilities are through the pI
capability set.

The correct implementation for CAP_SETPCAP (and that enabled by this patch)
is that it can be used to add extra pI capabilities to the current process
- to be picked up by subsequent exec()s when the above convolution rules
are applied.

Here is how it works:

Let's say we have a process, p. It has capability sets, pE, pP and pI.
Generally, p, can change the value of its own pI to pI' where

   (pI' &amp; ~pI) &amp; ~pP = 0.

That is, the only new things in pI' that were not present in pI need to
be present in pP.

The role of CAP_SETPCAP is basically to permit changes to pI beyond
the above:

   if (pE &amp; CAP_SETPCAP) {
      pI' = anything; /* ie., even (pI' &amp; ~pI) &amp; ~pP != 0  */
   }

This capability is useful for things like login, which (say, via
pam_cap) might want to raise certain inheritable capabilities for use
by the children of the logged-in user's shell, but those capabilities
are not useful to or needed by the login program itself.

One such use might be to limit who can run ping. You set the
capabilities of the 'ping' program to be "= cap_net_raw+i", and then
only shells that have (pI &amp; CAP_NET_RAW) will be able to run
it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
would have to also have (pP &amp; CAP_NET_RAW) in order to raise this
capability and pass it on through the inheritable set.

Signed-off-by: Andrew Morgan &lt;morgan@kernel.org&gt;
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>security/ cleanups</title>
<updated>2007-10-17T15:43:07+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@kernel.org</email>
</author>
<published>2007-10-17T06:31:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cbfee34520666862f8ff539e580c48958fbb7706'/>
<id>cbfee34520666862f8ff539e580c48958fbb7706</id>
<content type='text'>
This patch contains the following cleanups that are now possible:
- remove the unused security_operations-&gt;inode_xattr_getsuffix
- remove the no longer used security_operations-&gt;unregister_security
- remove some no longer required exit code
- remove a bunch of no longer used exports

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Acked-by: James Morris &lt;jmorris@namei.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch contains the following cleanups that are now possible:
- remove the unused security_operations-&gt;inode_xattr_getsuffix
- remove the no longer used security_operations-&gt;unregister_security
- remove some no longer required exit code
- remove a bunch of no longer used exports

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Acked-by: James Morris &lt;jmorris@namei.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
