linux-toradex.git/kernel, branch v3.18.21

kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

2015-08-27T17:26:05+00:00

[ Upstream commit 7e01b5acd88b3f3108d8c4ce44e3205d67437202 ]

Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
to override the gfp flags of the allocation for the kexec control
page. The loop in kimage_alloc_normal_control_pages allocates pages
with GFP_KERNEL until a page is found that happens to have an
address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
the loop will keep allocating memory until the oom killer steps in.

Signed-off-by: Martin Schwidefsky 
Signed-off-by: Sasha Levin

genirq: Prevent resend to interrupts marked IRQ_NESTED_THREAD

2015-08-27T17:25:40+00:00

[ Upstream commit 75a06189fc508a2acf470b0b12710362ffb2c4b1 ]

The resend mechanism happily calls the interrupt handler of interrupts
which are marked IRQ_NESTED_THREAD from softirq context. This can
result in crashes because the interrupt handler is not the proper way
to invoke the device handlers. They must be invoked via
handle_nested_irq.

Prevent the resend even if the interrupt has no valid parent irq
set. Its better to have a lost interrupt than a crashing machine.

Reported-by: Uwe Kleine-König 
Signed-off-by: Thomas Gleixner 
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin

signal: fix information leak in copy_siginfo_from_user32

2015-08-27T17:25:37+00:00

[ Upstream commit 3c00cb5e68dc719f2fc73a33b1b230aadfcb1309 ]

This function can leak kernel stack data when the user siginfo_t has a
positive si_code value.  The top 16 bits of si_code descibe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.

copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
of si_code.

This fixes the following information leaks:
x86:   8 bytes leaked when sending a signal from a 32-bit process to
       itself. This leak grows to 16 bytes if the process uses x32.
       (si_code = __SI_CHLD)
x86:   100 bytes leaked when sending a signal from a 32-bit process to
       a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
       64-bit process. (si_code = any)

parsic and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process.  These bugs are also fixed for consistency.

Signed-off-by: Amanieu d'Antras 
Cc: Oleg Nesterov 
Cc: Ingo Molnar 
Cc: Russell King 
Cc: Ralf Baechle 
Cc: Benjamin Herrenschmidt 
Cc: Chris Metcalf 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin

signal: fix information leak in copy_siginfo_to_user

2015-08-27T17:25:36+00:00

[ Upstream commit c08a75d950725cdd87c19232a5a3850c51520359 ]

commit 26135022f85105ad725cda103fa069e29e83bd16 upstream.

This function may copy the si_addr_lsb, si_lower and si_upper fields to
user mode when they haven't been initialized, which can leak kernel
stack data to user mode.

Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals.  This is solved by
checking the value of si_signo in addition to si_code.

Signed-off-by: Amanieu d'Antras 
Cc: Oleg Nesterov 
Cc: Ingo Molnar 
Cc: Russell King 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin

security_syslog() should be called once only

2015-08-04T18:36:42+00:00

[ Upstream commit d194e5d666225b04c7754471df0948f645b6ab3a ]

The final version of commit 637241a900cb ("kmsg: honor dmesg_restrict
sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are
processed incorrectly:

- open of /dev/kmsg checks syslog access permissions by using
  check_syslog_permissions() where security_syslog() is not called if
  dmesg_restrict is set.

- syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
  can be executed twice (inside check_syslog_permissions() and then
  directly in do_syslog())

With this patch security_syslog() is called once only in all
syslog-related operations regardless of dmesg_restrict value.

Fixes: 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg")
Signed-off-by: Vasily Averin 
Cc: Kees Cook 
Cc: Josh Boyer 
Cc: Eric Paris 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin

tracing: Have branch tracer use recursive field of task struct

2015-08-04T18:30:44+00:00

[ Upstream commit 6224beb12e190ff11f3c7d4bf50cb2922878f600 ]

Fengguang Wu's tests triggered a bug in the branch tracer's start up
test when CONFIG_DEBUG_PREEMPT set. This was because that config
adds some debug logic in the per cpu field, which calls back into
the branch tracer.

The branch tracer has its own recursive checks, but uses a per cpu
variable to implement it. If retrieving the per cpu variable calls
back into the branch tracer, you can see how things will break.

Instead of using a per cpu variable, use the trace_recursion field
of the current task struct. Simply set a bit when entering the
branch tracing and clear it when leaving. If the bit is set on
entry, just don't do the tracing.

There's also the case with lockdep, as the local_irq_save() called
before the recursion can also trigger code that can call back into
the function. Changing that to a raw_local_irq_save() will protect
that as well.

This prevents the recursion and the inevitable crash that follows.

Link: http://lkml.kernel.org/r/20150630141803.GA28071@wfg-t540p.sh.intel.com

Cc: stable@vger.kernel.org # 3.10+
Reported-by: Fengguang Wu 
Tested-by: Fengguang Wu 
Signed-off-by: Steven Rostedt 
Signed-off-by: Sasha Levin

tracing/filter: Do not WARN on operand count going below zero

2015-08-04T18:30:09+00:00

[ Upstream commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 ]

When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.

Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com

Reported-by: Vince Weaver 
Reported-by: Sasha Levin 
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt 
Signed-off-by: Sasha Levin

perf: Fix ring_buffer_attach() RCU sync, again

2015-07-13T12:50:09+00:00

[ Upstream commit 2f993cf093643b98477c421fa2b9a98dcc940323 ]

While looking for other users of get_state/cond_sync. I Found
ring_buffer_attach() and it looks obviously buggy?

Don't we need to ensure that we have "synchronize" _between_
list_del() and list_add() ?

IOW. Suppose that ring_buffer_attach() preempts right_after
get_state_synchronize_rcu() and gp completes before spin_lock().

In this case cond_synchronize_rcu() does nothing and we reuse
->rb_entry without waiting for gp in between?

It also moves the ->rcu_pending check under "if (rb)", to make it
more readable imo.

Signed-off-by: Oleg Nesterov 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: josh@joshtriplett.org
Cc: tj@kernel.org
Fixes: b69cf53640da ("perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()")
Link: http://lkml.kernel.org/r/20150530200425.GA15748@redhat.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Sasha Levin

tracing: Have filter check for balanced ops

2015-07-13T12:50:03+00:00

[ Upstream commit 2cf30dc180cea808077f003c5116388183e54f9e ]

When the following filter is used it causes a warning to trigger:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
 Modules linked in: bnep lockd grace bluetooth  ...
 CPU: 3 PID: 1223 Comm: bash Tainted: G        W       4.1.0-rc3-test+ #450
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
  0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
  ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
 Call Trace:
  [] dump_stack+0x4c/0x6e
  [] warn_slowpath_common+0x97/0xe0
  [] ? _kstrtoull+0x2c/0x80
  [] warn_slowpath_null+0x1a/0x20
  [] replace_preds+0x3c5/0x990
  [] create_filter+0x82/0xb0
  [] apply_event_filter+0xd4/0x180
  [] event_filter_write+0x8f/0x120
  [] __vfs_write+0x28/0xe0
  [] ? __sb_start_write+0x53/0xf0
  [] ? security_file_permission+0x30/0xc0
  [] vfs_write+0xb8/0x1b0
  [] SyS_write+0x4f/0xb0
  [] system_call_fastpath+0x12/0x6a
 ---[ end trace e11028bd95818dcd ]---

Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.

This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression

And give no kernel warning.

Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: stable@vger.kernel.org # 2.6.31+
Reported-by: Vince Weaver 
Tested-by: Vince Weaver 
Signed-off-by: Steven Rostedt 
Signed-off-by: Sasha Levin

tracing/filter: Do not allow infix to exceed end of string

2015-07-05T14:12:52+00:00

[ Upstream commit 6b88f44e161b9ee2a803e5b2b1fbcf4e20e8b980 ]

While debugging a WARN_ON() for filtering, I found that it is possible
for the filter string to be referenced after its end. With the filter:

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

The filter_parse() function can call infix_get_op() which calls
infix_advance() that updates the infix filter pointers for the cnt
and tail without checking if the filter is already at the end, which
will put the cnt to zero and the tail beyond the end. The loop then calls
infix_next() that has

	ps->infix.cnt--;
	return ps->infix.string[ps->infix.tail++];

The cnt will now be below zero, and the tail that is returned is
already passed the end of the filter string. So far the allocation
of the filter string usually has some buffer that is zeroed out, but
if the filter string is of the exact size of the allocated buffer
there's no guarantee that the charater after the nul terminating
character will be zero.

Luckily, only root can write to the filter.

Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt 
Signed-off-by: Sasha Levin