Age | Commit message (Collapse) | Author |
|
commit a3a8784454692dd72e5d5d34dcdab17b4420e74c upstream.
When a key is being garbage collected, it's key->user would get put before
the ->destroy() callback is called, where the key is removed from it's
respective tracking structures.
This leaves a key hanging in a semi-invalid state which leaves a window open
for a different task to try an access key->user. An example is
find_keyring_by_name() which would dereference key->user for a key that is
in the process of being garbage collected (where key->user was freed but
->destroy() wasn't called yet - so it's still present in the linked list).
This would cause either a panic, or corrupt memory.
Fixes CVE-2014-9529.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b26bdde5bb27f3f900e25a95e33a0c476c8c2c48 upstream.
When loading encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type. This results in the stale
entry in the list. In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.
This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 923190d32de4428afbea5e5773be86bea60a9925 upstream.
sb_finish_set_opts() can race with inode_free_security()
when initializing inode security structures for inodes
created prior to initial policy load or by the filesystem
during ->mount(). This appears to have always been
a possible race, but commit 3dc91d4 ("SELinux: Fix possible
NULL pointer dereference in selinux_inode_permission()")
made it more evident by immediately reusing the unioned
list/rcu element of the inode security structure for call_rcu()
upon an inode_free_security(). But the underlying issue
was already present before that commit as a possible use-after-free
of isec.
Shivnandan Kumar reported the list corruption and proposed
a patch to split the list and rcu elements out of the union
as separate fields of the inode_security_struct so that setting
the rcu element would not affect the list element. However,
this would merely hide the issue and not truly fix the code.
This patch instead moves up the deletion of the list entry
prior to dropping the sbsec->isec_lock initially. Then,
if the inode is dropped subsequently, there will be no further
references to the isec.
Reported-by: Shivnandan Kumar <shivnandan.k@samsung.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3b1deef6b1289a99505858a3b212c5b50adf0c2f upstream.
evm_inode_setxattr() can be called with no value. The function does not
check the length so that following command can be used to produce the
kernel oops: setfattr -n security.evm FOO. This patch fixes it.
Changes in v3:
* there is no reason to return different error codes for EVM_XATTR_HMAC
and non EVM_XATTR_HMAC. Remove unnecessary test then.
Changes in v2:
* testing for validity of xattr type
[ 1106.396921] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1106.398192] IP: [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.399244] PGD 29048067 PUD 290d7067 PMD 0
[ 1106.399953] Oops: 0000 [#1] SMP
[ 1106.400020] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
[ 1106.400020] CPU: 0 PID: 3635 Comm: setxattr Not tainted 3.16.0-kds+ #2936
[ 1106.400020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1106.400020] task: ffff8800291a0000 ti: ffff88002917c000 task.ti: ffff88002917c000
[ 1106.400020] RIP: 0010:[<ffffffff812af7b8>] [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020] RSP: 0018:ffff88002917fd50 EFLAGS: 00010246
[ 1106.400020] RAX: 0000000000000000 RBX: ffff88002917fdf8 RCX: 0000000000000000
[ 1106.400020] RDX: 0000000000000000 RSI: ffffffff818136d3 RDI: ffff88002917fdf8
[ 1106.400020] RBP: ffff88002917fd68 R08: 0000000000000000 R09: 00000000003ec1df
[ 1106.400020] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800438a0a00
[ 1106.400020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1106.400020] FS: 00007f7dfa7d7740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
[ 1106.400020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1106.400020] CR2: 0000000000000000 CR3: 000000003763e000 CR4: 00000000000006f0
[ 1106.400020] Stack:
[ 1106.400020] ffff8800438a0a00 ffff88002917fdf8 0000000000000000 ffff88002917fd98
[ 1106.400020] ffffffff812a1030 ffff8800438a0a00 ffff88002917fdf8 0000000000000000
[ 1106.400020] 0000000000000000 ffff88002917fde0 ffffffff8116d08a ffff88002917fdc8
[ 1106.400020] Call Trace:
[ 1106.400020] [<ffffffff812a1030>] security_inode_setxattr+0x5d/0x6a
[ 1106.400020] [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
[ 1106.400020] [<ffffffff8116d1e0>] setxattr+0x122/0x16c
[ 1106.400020] [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020] [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
[ 1106.400020] [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020] [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
[ 1106.400020] [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
[ 1106.400020] [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
[ 1106.400020] Code: c3 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 41 54 49 89 fc 53 48 89 f3 48 c7 c6 d3 36 81 81 48 89 df e8 18 22 04 00 85 c0 75 07 <41> 80 7d 00 02 74 0d 48 89 de 4c 89 e7 e8 5a fe ff ff eb 03 83
[ 1106.400020] RIP [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020] RSP <ffff88002917fd50>
[ 1106.400020] CR2: 0000000000000000
[ 1106.428061] ---[ end trace ae08331628ba3050 ]---
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3dcbad52cf18c3c379e96b992d22815439ebbe53 upstream.
Unless an LSM labels a file during d_instantiate(), newly created
files are not labeled with an initial security.evm xattr, until
the file closes. EVM, before allowing a protected, security xattr
to be written, verifies the existing 'security.evm' value is good.
For newly created files without a security.evm label, this
verification prevents writing any protected, security xattrs,
until the file closes.
Following is the example when this happens:
fd = open("foo", O_CREAT | O_WRONLY, 0644);
setxattr("foo", "security.SMACK64", value, sizeof(value), 0);
close(fd);
While INTEGRITY_NOXATTRS status is handled in other places, such
as evm_inode_setattr(), it does not handle it in all cases in
evm_protect_xattr(). By limiting the use of INTEGRITY_NOXATTRS to
newly created files, we can now allow setting "protected" xattrs.
Changelog:
- limit the use of INTEGRITY_NOXATTRS to IMA identified new files
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b151d6b00bbb798c58f2f21305e7d43fa763f34f upstream.
On ima_file_free(), newly created empty files are not labeled with
an initial security.ima value, because the iversion did not change.
Commit dff6efc "fs: fix iversion handling" introduced a change in
iversion behavior. To verify this change use the shell command:
$ (exec >foo)
$ getfattr -h -e hex -d -m security foo
This patch defines the IMA_NEW_FILE flag. The flag is initially
set, when IMA detects that a new file is created, and subsequently
checked on the ima_file_free() hook to set the initial security.ima
value.
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7d8b6c63751cfbbe5eef81a48c22978b3407a3ad upstream.
This is effectively a revert of 7b9a7ec565505699f503b4fcf61500dceb36e744
plus fixing it a different way...
We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits. This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.
Consider a root application which drops all capabilities from ALL 4
capability sets. We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.
The BSET gets cleared differently. Instead it is cleared one bit at a
time. The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read. So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.
So the 'parent' will look something like:
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffc000000000
All of this 'should' be fine. Given that these are undefined bits that
aren't supposed to have anything to do with permissions. But they do...
So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel). We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets. If that root task calls execve()
the child task will pick up all caps not blocked by the bset. The bset
however does not block bits higher than CAP_LAST_CAP. So now the child
task has bits in eff which are not in the parent. These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.
The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits! So now we set durring commit creds that
the child is not dumpable. Given it is 'more priv' than its parent. It
also means the parent cannot ptrace the child and other stupidity.
The solution here:
1) stop hiding capability bits in status
This makes debugging easier!
2) stop giving any task undefined capability bits. it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
This fixes the cap_issubset() tests and resulting fallout (which
made the init task in a docker container untraceable among other
things)
3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.
4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
This lets 'setcap all+pe /bin/bash; /bin/bash' run
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2fb1c9a4f2dbc2f0bd2431c7fa64d0b5483864e4 upstream.
Calculating the 'security.evm' HMAC value requires access to the
EVM encrypted key. Only the kernel should have access to it. This
patch prevents userspace tools(eg. setfattr, cp --preserve=xattr)
from setting/modifying the 'security.evm' HMAC value directly.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0430e49b6e7c6b5e076be8fefdee089958c9adad upstream.
Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify"
introduced the kernel opps since the kernel v3.10, which happens when
Apparmor and IMA-appraisal are enabled at the same time.
----------------------------------------------------------------------
[ 106.750167] BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
[ 106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[ 106.750241] PGD 0
[ 106.750254] Oops: 0000 [#1] SMP
[ 106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm
bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp
kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul
ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw
snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp
parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci
pps_core
[ 106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15
[ 106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08
09/19/2012
[ 106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti:
ffff880400fca000
[ 106.750704] RIP: 0010:[<ffffffff811ec7da>] [<ffffffff811ec7da>]
our_mnt+0x1a/0x30
[ 106.750725] RSP: 0018:ffff880400fcba60 EFLAGS: 00010286
[ 106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX:
ffff8800d51523e7
[ 106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI:
ffff880402d20020
[ 106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09:
0000000000000001
[ 106.750817] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff8800d5152300
[ 106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15:
ffff8800d51523e7
[ 106.750871] FS: 0000000000000000(0000) GS:ffff88040d200000(0000)
knlGS:0000000000000000
[ 106.750910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4:
00000000001407e0
[ 106.750962] Stack:
[ 106.750981] ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18
0000000000000000
[ 106.751037] ffff8800de804920 ffffffff8101b9b9 0001800000000000
0000000000000100
[ 106.751093] 0000010000000000 0000000000000002 000000000000000e
ffff8803eb8df500
[ 106.751149] Call Trace:
[ 106.751172] [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430
[ 106.751199] [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10
[ 106.751225] [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170
[ 106.751250] [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80
[ 106.751276] [<ffffffff8134aa73>] aa_file_perm+0x33/0x40
[ 106.751301] [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0
[ 106.751327] [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20
[ 106.751355] [<ffffffff8130c853>] security_file_permission+0x23/0xa0
[ 106.751382] [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0
[ 106.751407] [<ffffffff811c789d>] vfs_read+0x6d/0x170
[ 106.751432] [<ffffffff811cda31>] kernel_read+0x41/0x60
[ 106.751457] [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280
[ 106.751483] [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280
[ 106.751509] [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160
[ 106.751536] [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10
[ 106.751562] [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0
[ 106.751587] [<ffffffff81352824>] ima_update_xattr+0x34/0x60
[ 106.751612] [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0
[ 106.751637] [<ffffffff811c9635>] __fput+0xd5/0x300
[ 106.751662] [<ffffffff811c98ae>] ____fput+0xe/0x10
[ 106.751687] [<ffffffff81086774>] task_work_run+0xc4/0xe0
[ 106.751712] [<ffffffff81066fad>] do_exit+0x2bd/0xa90
[ 106.751738] [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b
[ 106.751763] [<ffffffff8106780c>] do_group_exit+0x4c/0xc0
[ 106.751788] [<ffffffff81067894>] SyS_exit_group+0x14/0x20
[ 106.751814] [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f
[ 106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3
0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89
e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00
[ 106.752185] RIP [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[ 106.752214] RSP <ffff880400fcba60>
[ 106.752236] CR2: 0000000000000018
[ 106.752258] ---[ end trace 3c520748b4732721 ]---
----------------------------------------------------------------------
The reason for the oops is that IMA-appraisal uses "kernel_read()" when
file is closed. kernel_read() honors LSM security hook which calls
Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty'
commit changed the order of cleanup code so that nsproxy->mnt_ns was
not already available for Apparmor.
Discussion about the issue with Al Viro and Eric W. Biederman suggested
that kernel_read() is too high-level for IMA. Another issue, except
security checking, that was identified is mandatory locking. kernel_read
honors it as well and it might prevent IMA from calculating necessary hash.
It was suggested to use simplified version of the function without security
and locking checks.
This patch introduces special version ima_kernel_read(), which skips security
and mandatory locking checking. It prevents the kernel oops to happen.
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f9b2a735bdddf836214b5dca74f6ca7712e5a08c upstream.
Files are measured or appraised based on the IMA policy. When a
file, in policy, is opened with the O_DIRECT flag, a deadlock
occurs.
The first attempt at resolving this lockdep temporarily removed the
O_DIRECT flag and restored it, after calculating the hash. The
second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this
flag, do_blockdev_direct_IO() would skip taking the i_mutex a second
time. The third attempt, by Dmitry Kasatkin, resolves the i_mutex
locking issue, by re-introducing the IMA mutex, but uncovered
another problem. Reading a file with O_DIRECT flag set, writes
directly to userspace pages. A second patch allocates a user-space
like memory. This works for all IMA hooks, except ima_file_free(),
which is called on __fput() to recalculate the file hash.
Until this last issue is addressed, do not 'collect' the
measurement for measuring, appraising, or auditing files opened
with the O_DIRECT flag set. Based on policy, permit or deny file
access. This patch defines a new IMA policy rule option named
'permit_directio'. Policy rules could be defined, based on LSM
or other criteria, to permit specific applications to open files
with the O_DIRECT flag set.
Changelog v1:
- permit or deny file access based IMA policy rules
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d2c2b11cfa134f4fbdcc34088824da26a084d8de upstream.
[PATCH v3 1/2] device_cgroup: check if exception removal is allowed
When the device cgroup hierarchy was introduced in
bd2953ebbb53 - devcg: propagate local changes down the hierarchy
a specific case was overlooked. Consider the hierarchy bellow:
A default policy: ALLOW, exceptions will deny access
\
B default policy: ALLOW, exceptions will deny access
There's no need to verify when an new exception is added to B because
in this case exceptions will deny access to further devices, which is
always fine. Hierarchy in device cgroup only makes sure B won't have
more access than A.
But when an exception is removed (by writing devices.allow), it isn't
checked if the user is in fact removing an inherited exception from A,
thus giving more access to B.
Example:
# echo 'a' >A/devices.allow
# echo 'c 1:3 rw' >A/devices.deny
# echo $$ >A/B/tasks
# echo >/dev/null
-bash: /dev/null: Operation not permitted
# echo 'c 1:3 w' >A/B/devices.allow
# echo >/dev/null
#
This shouldn't be allowed and this patch fixes it by making sure to never allow
exceptions in this case to be removed if the exception is partially or fully
present on the parent.
v3: missing '*' in function description
v2: improved log message and formatting fixes
Cc: cgroups@vger.kernel.org
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 79d719749d23234e9b725098aa49133f3ef7299d upstream.
Whenever a device file is opened and checked against current device
cgroup rules, it uses the same function (may_access()) as when a new
exception rule is added by writing devices.{allow,deny}. And in both
cases, the algorithm is the same, doesn't matter the behavior.
First problem is having device access to be considered the same as rule
checking. Consider the following structure:
A (default behavior: allow, exceptions disallow access)
\
B (default behavior: allow, exceptions disallow access)
A new exception is added to B by writing devices.deny:
c 12:34 rw
When checking if that exception is allowed in may_access():
if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) {
if (behavior == DEVCG_DEFAULT_ALLOW) {
/* the exception will deny access to certain devices */
return true;
Which is ok, since B is not getting more privileges than A, it doesn't
matter and the rule is accepted
Now, consider it's a device file open check and the process belongs to
cgroup B. The access will be generated as:
behavior: allow
exception: c 12:34 rw
The very same chunk of code will allow it, even if there's an explicit
exception telling to do otherwise.
A simple test case:
# mkdir new_group
# cd new_group
# echo $$ >tasks
# echo "c 1:3 w" >devices.deny
# echo >/dev/null
# echo $?
0
This is a serious bug and was introduced on
c39a2a3018f8 devcg: prepare may_access() for hierarchy support
To solve this problem, the device file open function was split from the
new exception check.
Second problem is how exceptions are processed by may_access(). The
first part of the said function tries to match fully with an existing
exception:
list_for_each_entry_rcu(ex, &dev_cgroup->exceptions, list) {
if ((refex->type & DEV_BLOCK) && !(ex->type & DEV_BLOCK))
continue;
if ((refex->type & DEV_CHAR) && !(ex->type & DEV_CHAR))
continue;
if (ex->major != ~0 && ex->major != refex->major)
continue;
if (ex->minor != ~0 && ex->minor != refex->minor)
continue;
if (refex->access & (~ex->access))
continue;
match = true;
break;
}
That means the new exception should be contained into an existing one to
be considered a match:
New exception Existing match? notes
b 12:34 rwm b 12:34 rwm yes
b 12:34 r b *:34 rw yes
b 12:34 rw b 12:34 w no extra "r"
b *:34 rw b 12:34 rw no too broad "*"
b *:34 rw b *:34 rwm yes
Which is fine in some cases. Consider:
A (default behavior: deny, exceptions allow access)
\
B (default behavior: deny, exceptions allow access)
In this case the full match makes sense, the new exception cannot add
more access than the parent allows
But this doesn't always work, consider:
A (default behavior: allow, exceptions disallow access)
\
B (default behavior: deny, exceptions allow access)
In this case, a new exception in B shouldn't match any of the exceptions
in A, after all you can't allow something that was forbidden by A. But
consider this scenario:
New exception Existing in A match? outcome
b 12:34 rw b 12:34 r no exception is accepted
Because the new exception has "w" as extra, it doesn't match, so it'll
be added to B's exception list.
The same problem can happen during a file access check. Consider a
cgroup with allow as default behavior:
Access Exception match?
b 12:34 rw b 12:34 r no
In this case, the access didn't match any of the exceptions in the
cgroup, which is required since exceptions will disallow access.
To solve this problem, two new functions were created to match an
exception either fully or partially. In the example above, a partial
check will be performed and it'll produce a match since at least
"b 12:34 r" from "b 12:34 rw" access matches.
Cc: cgroups@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c019e307ad82a8ee652b8ccbacf69ae94263b07b upstream.
With the new template mechanism introduced in IMA since kernel 3.13,
the format of data sent through the binary_runtime_measurements interface
is slightly changed. Now, for a generic measurement, the format of
template data (after the template name) is:
template_len | field1_len | field1 | ... | fieldN_len | fieldN
In addition, fields containing a string now include the '\0' termination
character.
Instead, the format for the 'ima' template should be:
SHA1 digest | event name length | event name
It must be noted that while in the IMA 3.13 code 'event name length' is
'IMA_EVENT_NAME_LEN_MAX + 1' (256 bytes), so that the template digest
is calculated correctly, and 'event name' contains '\0', in the pre 3.13
code 'event name length' is exactly the string length and 'event name'
does not contain the termination character.
The patch restores the behavior of the IMA code pre 3.13 for the 'ima'
template so that legacy userspace tools obtain a consistent behavior
when receiving data from the binary_runtime_measurements interface
regardless of which kernel version is used.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f64410ec665479d7b4b77b7519e814253ed0f686 upstream.
This patch is based on an earlier patch by Eric Paris, he describes
the problem below:
"If an inode is accessed before policy load it will get placed on a
list of inodes to be initialized after policy load. After policy
load we call inode_doinit() which calls inode_doinit_with_dentry()
on all inodes accessed before policy load. In the case of inodes
in procfs that means we'll end up at the bottom where it does:
/* Default to the fs superblock SID. */
isec->sid = sbsec->sid;
if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
if (opt_dentry) {
isec->sclass = inode_mode_to_security_class(...)
rc = selinux_proc_get_sid(opt_dentry,
isec->sclass,
&sid);
if (rc)
goto out_unlock;
isec->sid = sid;
}
}
Since opt_dentry is null, we'll never call selinux_proc_get_sid()
and will leave the inode labeled with the label on the superblock.
I believe a fix would be to mimic the behavior of xattrs. Look
for an alias of the inode. If it can't be found, just leave the
inode uninitialized (and pick it up later) if it can be found, we
should be able to call selinux_proc_get_sid() ..."
On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:
# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax
However, with this patch in place we see the expected result:
# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
1) Fix a sleep in atomic when pfkey_sadb2xfrm_user_sec_ctx()
is called from pfkey_compile_policy().
Fix from Nikolay Aleksandrov.
2) security_xfrm_policy_alloc() can be called in process and atomic
context. Add an argument to let the callers choose the appropriate
way. Fix from Nikolay Aleksandrov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)
Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.
CC: Paul Moore <paul@paul-moore.com>
CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: LSM list <linux-security-module@vger.kernel.org>
CC: SELinux list <selinux@tycho.nsa.gov>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This fixes CVE-2014-0102.
The following command sequence produces an oops:
keyctl new_session
i=`keyctl newring _ses @s`
keyctl link @s $i
The problem is that search_nested_keyrings() sees two keyrings that have
matching type and description, so keyring_compare_object() returns true.
s_n_k() then passes the key to the iterator function -
keyring_detect_cycle_iterator() - which *should* check to see whether this is
the keyring of interest, not just one with the same name.
Because assoc_array_find() will return one and only one match, I assumed that
the iterator function would only see an exact match or never be called - but
the iterator isn't only called from assoc_array_find()...
The oops looks something like this:
kernel BUG at /data/fs/linux-2.6-fscache/security/keys/keyring.c:1003!
invalid opcode: 0000 [#1] SMP
...
RIP: keyring_detect_cycle_iterator+0xe/0x1f
...
Call Trace:
search_nested_keyrings+0x76/0x2aa
__key_link_check_live_key+0x50/0x5f
key_link+0x4e/0x85
keyctl_keyring_link+0x60/0x81
SyS_keyctl+0x65/0xe4
tracesys+0xdd/0xe2
The fix is to make keyring_detect_cycle_iterator() check that the key it
has is the key it was actually looking for rather than calling BUG_ON().
A testcase has been included in the keyutils testsuite for this:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=891f3365d07f1996778ade0e3428f01878a1790b
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
into for-linus
|
|
When writing policy via /sys/fs/selinux/policy I wrote the type and class
of filename trans rules in CPU endian instead of little endian. On
x86_64 this works just fine, but it means that on big endian arch's like
ppc64 and s390 userspace reads the policy and converts it from
le32_to_cpu. So the values are all screwed up. Write the values in le
format like it should have been to start.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
into for-linus
|
|
Setting an empty security context (length=0) on a file will
lead to incorrectly dereferencing the type and other fields
of the security context structure, yielding a kernel BUG.
As a zero-length security context is never valid, just reject
all such security contexts whether coming from userspace
via setxattr or coming from the filesystem upon a getxattr
request by SELinux.
Setting a security context value (empty or otherwise) unknown to
SELinux in the first place is only possible for a root process
(CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only
if the corresponding SELinux mac_admin permission is also granted
to the domain by policy. In Fedora policies, this is only allowed for
specific domains such as livecd for setting down security contexts
that are not defined in the build host policy.
Reproducer:
su
setenforce 0
touch foo
setfattr -n security.selinux foo
Caveat:
Relabeling or removing foo after doing the above may not be possible
without booting with SELinux disabled. Any subsequent access to foo
after doing the above will also trigger the BUG.
BUG output from Matthew Thode:
[ 473.893141] ------------[ cut here ]------------
[ 473.962110] kernel BUG at security/selinux/ss/services.c:654!
[ 473.995314] invalid opcode: 0000 [#6] SMP
[ 474.027196] Modules linked in:
[ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I
3.13.0-grsec #1
[ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0
07/29/10
[ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti:
ffff8805f50cd488
[ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246
[ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX:
0000000000000100
[ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI:
ffff8805e8aaa000
[ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09:
0000000000000006
[ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12:
0000000000000006
[ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15:
0000000000000000
[ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000)
knlGS:0000000000000000
[ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4:
00000000000207f0
[ 474.556058] Stack:
[ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98
ffff8805f1190a40
[ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990
ffff8805e8aac860
[ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060
ffff8805c0ac3d94
[ 474.690461] Call Trace:
[ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a
[ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b
[ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179
[ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4
[ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31
[ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e
[ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22
[ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d
[ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91
[ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b
[ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30
[ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3
[ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b
[ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48
8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7
75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8
[ 475.255884] RIP [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[ 475.296120] RSP <ffff8805c0ac3c38>
[ 475.328734] ---[ end trace f076482e9d754adc ]---
Reported-by: Matthew Thode <mthode@mthode.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
The SELinux AF_NETLINK/NETLINK_SOCK_DIAG socket class was missing the
SOCK_DIAG_BY_FAMILY definition which caused SELINUX_ERR messages when
the ss tool was run.
# ss
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
u_str ESTAB 0 0 * 14189 * 14190
u_str ESTAB 0 0 * 14145 * 14144
u_str ESTAB 0 0 * 14151 * 14150
{...}
# ausearch -m SELINUX_ERR
----
time->Thu Jan 23 11:11:16 2014
type=SYSCALL msg=audit(1390493476.445:374):
arch=c000003e syscall=44 success=yes exit=40
a0=3 a1=7fff03aa11f0 a2=28 a3=0 items=0 ppid=1852 pid=1895
auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0
tty=pts0 ses=1 comm="ss" exe="/usr/sbin/ss"
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=SELINUX_ERR msg=audit(1390493476.445:374):
SELinux: unrecognized netlink message type=20 for sclass=32
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
Linux 3.13
Conflicts:
security/selinux/hooks.c
Trivial merge issue in selinux_inet_conn_request() likely due to me
including patches that I sent to the stable folks in my next tree
resulting in the patch hitting twice (I think). Thankfully it was an
easy fix this time, but regardless, lesson learned, I will not do that
again.
|
|
Binaries compiled for arm may run on arm64 if CONFIG_COMPAT is
selected. Set LSM_MMAP_MIN_ADDR to 32768 if ARM64 && COMPAT to
prevent selinux failures launching 32-bit static executables that
are mapped at 0x8000.
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Pull audit update from Eric Paris:
"Again we stayed pretty well contained inside the audit system.
Venturing out was fixing a couple of function prototypes which were
inconsistent (didn't hurt anything, but we used the same value as an
int, uint, u32, and I think even a long in a couple of places).
We also made a couple of minor changes to when a couple of LSMs called
the audit system. We hoped to add aarch64 audit support this go
round, but it wasn't ready.
I'm disappearing on vacation on Thursday. I should have internet
access, but it'll be spotty. If anything goes wrong please be sure to
cc rgb@redhat.com. He'll make fixing things his top priority"
* git://git.infradead.org/users/eparis/audit: (50 commits)
audit: whitespace fix in kernel-parameters.txt
audit: fix location of __net_initdata for audit_net_ops
audit: remove pr_info for every network namespace
audit: Modify a set of system calls in audit class definitions
audit: Convert int limit uses to u32
audit: Use more current logging style
audit: Use hex_byte_pack_upper
audit: correct a type mismatch in audit_syscall_exit()
audit: reorder AUDIT_TTY_SET arguments
audit: rework AUDIT_TTY_SET to only grab spin_lock once
audit: remove needless switch in AUDIT_SET
audit: use define's for audit version
audit: documentation of audit= kernel parameter
audit: wait_for_auditd rework for readability
audit: update MAINTAINERS
audit: log task info on feature change
audit: fix incorrect set of audit_sock
audit: print error message when fail to create audit socket
audit: fix dangling keywords in audit_log_set_loginuid() output
audit: log on errors from filter user rules
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"The bulk of changes are cleanups and preparations for the upcoming
kernfs conversion.
- cgroup_event mechanism which is and will be used only by memcg is
moved to memcg.
- pidlist handling is updated so that it can be served by seq_file.
Also, the list is not sorted if sane_behavior. cgroup
documentation explicitly states that the file is not sorted but it
has been for quite some time.
- All cgroup file handling now happens on top of seq_file. This is
to prepare for kernfs conversion. In addition, all operations are
restructured so that they map 1-1 to kernfs operations.
- Other cleanups and low-pri fixes"
* 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (40 commits)
cgroup: trivial style updates
cgroup: remove stray references to css_id
doc: cgroups: Fix typo in doc/cgroups
cgroup: fix fail path in cgroup_load_subsys()
cgroup: fix missing unlock on error in cgroup_load_subsys()
cgroup: remove for_each_root_subsys()
cgroup: implement for_each_css()
cgroup: factor out cgroup_subsys_state creation into create_css()
cgroup: combine css handling loops in cgroup_create()
cgroup: reorder operations in cgroup_create()
cgroup: make for_each_subsys() useable under cgroup_root_mutex
cgroup: css iterations and css_from_dir() are safe under cgroup_mutex
cgroup: unify pidlist and other file handling
cgroup: replace cftype->read_seq_string() with cftype->seq_show()
cgroup: attach cgroup_open_file to all cgroup files
cgroup: generalize cgroup_pidlist_open_file
cgroup: unify read path so that seq_file is always used
cgroup: unify cgroup_write_X64() and cgroup_write_string()
cgroup: remove cftype->read(), ->read_map() and ->write()
hugetlb_cgroup: convert away from cftype->read()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
"Changes for this kernel include maintenance updates for Smack, SELinux
(and several networking fixes), IMA and TPM"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
SELinux: Fix memory leak upon loading policy
tpm/tpm-sysfs: active_show() can be static
tpm: tpm_tis: Fix compile problems with CONFIG_PM_SLEEP/CONFIG_PNP
tpm: Make tpm-dev allocate a per-file structure
tpm: Use the ops structure instead of a copy in tpm_vendor_specific
tpm: Create a tpm_class_ops structure and use it in the drivers
tpm: Pull all driver sysfs code into tpm-sysfs.c
tpm: Move sysfs functions from tpm-interface to tpm-sysfs
tpm: Pull everything related to /dev/tpmX into tpm-dev.c
char: tpm: nuvoton: remove unused variable
tpm: MAINTAINERS: Cleanup TPM Maintainers file
tpm/tpm_i2c_atmel: fix coccinelle warnings
tpm/tpm_ibmvtpm: fix unreachable code warning (smatch warning)
tpm/tpm_i2c_stm_st33: Check return code of get_burstcount
tpm/tpm_ppi: Check return value of acpi_get_name
tpm/tpm_ppi: Do not compare strcmp(a,b) == -1
ima: remove unneeded size_limit argument from ima_eventdigest_init_common()
ima: update IMA-templates.txt documentation
ima: pass HASH_ALGO__LAST as hash algo in ima_eventdigest_init()
ima: change the default hash algorithm to SHA1 in ima_eventdigest_ng_init()
...
|
|
Remove the call to audit_log() (which call audit_log_start()) and deal with
the errors in the caller, logging only once if the condition is met. Calling
audit_log_start() in this location makes buffer allocation and locking more
complicated in the calling tree (audit_filter_user()).
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
Two of the conditions in selinux_audit_rule_match() should never happen and
the third indicates a race that should be retried. Remove the calls to
audit_log() (which call audit_log_start()) and deal with the errors in the
caller, logging only once if the condition is met. Calling audit_log_start()
in this location makes buffer allocation and locking more complicated in the
calling tree (audit_filter_user()).
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
While running stress tests on adding and deleting ftrace instances I hit
this bug:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0
Oops: 0000 [#1] PREEMPT
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Call Trace:
security_inode_permission+0x1c/0x30
__inode_permission+0x41/0xa0
inode_permission+0x18/0x50
link_path_walk+0x66/0x920
path_openat+0xa6/0x6c0
do_filp_open+0x43/0xa0
do_sys_open+0x146/0x240
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
RIP selinux_inode_permission+0x85/0x160
CR2: 0000000000000020
Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.
in selinux_inode_permission():
isec = inode->i_security;
rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.
What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().
The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here. (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).
Note, this is a hack, but it fixes the problem at hand. A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback. But that is a major job to do, and requires a
lot of work. For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.
Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
Hello.
I got below leak with linux-3.10.0-54.0.1.el7.x86_64 .
[ 681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
Below is a patch, but I don't know whether we need special handing for undoing
ebitmap_set_bit() call.
----------
>>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 6 Jan 2014 16:30:21 +0900
Subject: [PATCH] SELinux: Fix memory leak upon loading policy
Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not
check return value from hashtab_insert() in filename_trans_read(). It leaks
memory if hashtab_insert() returns error.
unreferenced object 0xffff88005c9160d0 (size 8):
comm "systemd", pid 1, jiffies 4294688674 (age 235.265s)
hex dump (first 8 bytes):
57 0b 00 00 6b 6b 6b a5 W...kkk.
backtrace:
[<ffffffff816604ae>] kmemleak_alloc+0x4e/0xb0
[<ffffffff811cba5e>] kmem_cache_alloc_trace+0x12e/0x360
[<ffffffff812aec5d>] policydb_read+0xd1d/0xf70
[<ffffffff812b345c>] security_load_policy+0x6c/0x500
[<ffffffff812a623c>] sel_write_load+0xac/0x750
[<ffffffff811eb680>] vfs_write+0xc0/0x1f0
[<ffffffff811ec08c>] SyS_write+0x4c/0xa0
[<ffffffff81690419>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
However, we should not return EEXIST error to the caller, or the systemd will
show below message and the boot sequence freezes.
systemd[1]: Failed to load SELinux policy. Freezing.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
Conflicts:
security/selinux/hooks.c
Resolved using request struct.
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
|
|
This patch removes the 'size_limit' argument from
ima_eventdigest_init_common(). Since the 'd' field will never include
the hash algorithm as prefix and the 'd-ng' will always have it, we can
use the hash algorithm to differentiate the two cases in the modified
function (it is equal to HASH_ALGO__LAST in the first case, the opposite
in the second).
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
|
|
Replace the '-1' value with HASH_ALGO__LAST in ima_eventdigest_init()
as the called function ima_eventdigest_init_common() expects an unsigned
char.
Fix commit:
4d7aeee ima: define new template ima-ng and template fields d-ng and n-ng
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
|
|
Replace HASH_ALGO__LAST with HASH_ALGO_SHA1 as the initial value of
the hash algorithm so that the prefix 'sha1:' is added to violation
digests.
Fix commit:
4d7aeee ima: define new template ima-ng and template fields d-ng and n-ng
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Cc: <stable@vger.kernel.org> # 3.13.x
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
|
|
Eric Paris politely points out:
Inside smack_file_receive() it seems like you are initting the audit
field with LSM_AUDIT_DATA_TASK. And then use
smk_ad_setfield_u_fs_path().
Seems like LSM_AUDIT_DATA_PATH would make more sense. (and depending
on how it's used fix a crash...)
He is correct. This puts things in order.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
The mount restrictions imposed by Smack rely heavily on the
use of the filesystem "floor", which is the label that all
processes writing to the filesystem must have access to. It
turns out that while the "floor" notion is sound, it has yet
to be fully implemented and has never been used.
The sb_mount and sb_umount hooks only make sense if the
filesystem floor is used actively, and it isn't. They can
be reintroduced if a rational restriction comes up. Until
then, they get removed.
The sb_kern_mount hook is required for the option processing.
It is too permissive in the case of unprivileged mounts,
effectively bypassing the CAP_MAC_ADMIN restrictions if
any of the smack options are specified. Unprivileged mounts
are no longer allowed to set Smack filesystem options.
Additionally, the root and default values are set to the
label of the caller, in keeping with the policy that objects
get the label of their creator.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
smk_write_change_rule() is calling capable rather than
the more correct smack_privileged(). This allows for setting
rules in violation of the onlycap facility. This is the
simple repair.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
The syslog control requires that the calling proccess
have the floor ("_") Smack label. Tizen does not run any
processes except for kernel helpers with the floor label.
This changes allows the admin to configure a specific
label for syslog. The default value is the star ("*")
label, effectively removing the restriction. The value
can be set using smackfs/syslog for anyone who wants
a more restrictive behavior.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().
And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.
Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
Fix a broken networking check. Return an error if peer recv fails. If
secmark is active and the packet recv succeeds the peer recv error is
ignored.
Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
Smack prohibits processes from using the star ("*") and web ("@") labels
because we don't want files with those labels getting created implicitly.
All setting of those labels should be done explicitly. The trouble is that
there is no check for these labels in the processing of SMACK64EXEC. That
is repaired.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().
And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.
Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
Remove duplicated include.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SELinux fixes from James Morris.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
selinux: look for IPsec labels on both inbound and outbound packets
selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
selinux: fix possible memory leak
|
|
This reverts commit 102aefdda4d8275ce7d7100bc16c88c74272b260.
Tom London reports that it causes sync() to hang on Fedora rawhide:
https://bugzilla.redhat.com/show_bug.cgi?id=1033965
and Josh Boyer bisected it down to this commit. Reverting the commit in
the rawhide kernel fixes the problem.
Eric Paris root-caused it to incorrect subtype matching in that commit
breaking fuse, and has a tentative patch, but by now we're better off
retrying this in 3.14 rather than playing with it any more.
Reported-by: Tom London <selinux@gmail.com>
Bisected-by: Josh Boyer <jwboyer@fedoraproject.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Anand Avati <avati@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Revert "selinux: consider filesystem subtype in policies"
This reverts commit 102aefdda4d8275ce7d7100bc16c88c74272b260.
Explanation from Eric Paris:
SELinux policy can specify if it should use a filesystem's
xattrs or not. In current policy we have a specification that
fuse should not use xattrs but fuse.glusterfs should use
xattrs. This patch has a bug in which non-glusterfs
filesystems would match the rule saying fuse.glusterfs should
use xattrs. If both fuse and the particular filesystem in
question are not written to handle xattr calls during the mount
command, they will deadlock.
I have fixed the bug to do proper matching, however I believe a
revert is still the correct solution. The reason I believe
that is because the code still does not work. The s_subtype is
not set until after the SELinux hook which attempts to match on
the ".gluster" portion of the rule. So we cannot match on the
rule in question. The code is useless.
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|
into for-linus
|