<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/signal.c, branch v4.9.128</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>kernel/signal.c: avoid undefined behaviour in kill_something_info</title>
<updated>2018-05-30T05:50:18+00:00</updated>
<author>
<name>zhongjiang</name>
<email>zhongjiang@huawei.com</email>
</author>
<published>2017-07-10T22:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ec1975ac988686eba0f105f87ed0b587da43d384'/>
<id>ec1975ac988686eba0f105f87ed0b587da43d384</id>
<content type='text'>
commit 4ea77014af0d6205b05503d1c7aac6eace11d473 upstream.

When running kill(72057458746458112, 0) in userspace I hit the following
issue.

  UBSAN: Undefined behaviour in kernel/signal.c:1462:11
  negation of -2147483648 cannot be represented in type 'int':
  CPU: 226 PID: 9849 Comm: test Tainted: G    B          ---- -------   3.10.0-327.53.58.70.x86_64_ubsan+ #116
  Hardware name: Huawei Technologies Co., Ltd. RH8100 V3/BC61PBIA, BIOS BLHSV028 11/11/2014
  Call Trace:
    dump_stack+0x19/0x1b
    ubsan_epilogue+0xd/0x50
    __ubsan_handle_negate_overflow+0x109/0x14e
    SYSC_kill+0x43e/0x4d0
    SyS_kill+0xe/0x10
    system_call_fastpath+0x16/0x1b

Add code to avoid the UBSAN detection.

[akpm@linux-foundation.org: tweak comment]
Link: http://lkml.kernel.org/r/1496670008-59084-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang &lt;zhongjiang@huawei.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4ea77014af0d6205b05503d1c7aac6eace11d473 upstream.

When running kill(72057458746458112, 0) in userspace I hit the following
issue.

  UBSAN: Undefined behaviour in kernel/signal.c:1462:11
  negation of -2147483648 cannot be represented in type 'int':
  CPU: 226 PID: 9849 Comm: test Tainted: G    B          ---- -------   3.10.0-327.53.58.70.x86_64_ubsan+ #116
  Hardware name: Huawei Technologies Co., Ltd. RH8100 V3/BC61PBIA, BIOS BLHSV028 11/11/2014
  Call Trace:
    dump_stack+0x19/0x1b
    ubsan_epilogue+0xd/0x50
    __ubsan_handle_negate_overflow+0x109/0x14e
    SYSC_kill+0x43e/0x4d0
    SyS_kill+0xe/0x10
    system_call_fastpath+0x16/0x1b

Add code to avoid the UBSAN detection.

[akpm@linux-foundation.org: tweak comment]
Link: http://lkml.kernel.org/r/1496670008-59084-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang &lt;zhongjiang@huawei.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>signals: avoid unnecessary taking of sighand-&gt;siglock</title>
<updated>2018-05-22T14:57:57+00:00</updated>
<author>
<name>Waiman Long</name>
<email>Waiman.Long@hpe.com</email>
</author>
<published>2016-12-14T23:04:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=20a30619b3315c1f11565cfbc0dbe2ea92a8a003'/>
<id>20a30619b3315c1f11565cfbc0dbe2ea92a8a003</id>
<content type='text'>
commit c7be96af89d4b53211862d8599b2430e8900ed92 upstream.

When running certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.

  9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
  [k] _raw_spin_lock_irq
            |
            ---_raw_spin_lock_irq
               |
               |--99.34%-- __set_current_blocked
               |          sigprocmask
               |          sys_rt_sigprocmask
               |          system_call_fastpath
               |          |
               |          |--50.63%-- __swapcontext
               |          |          |
               |          |          |--99.91%-- upsleepgeneric
               |          |
               |          |--49.36%-- __setcontext
               |          |          ktskRun

Looking further into the swapcontext function in glibc, it was found that
the function always call sigprocmask() without checking if there are
changes in the signal mask.

A check was added to the __set_current_blocked() function to avoid taking
the sighand-&gt;siglock spinlock if there is no change in the signal mask.
This will prevent unneeded spinlock contention when many threads are
trying to call sigprocmask().

With this patch applied, the spinlock contention in sigprocmask() was
gone.

Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Stas Sergeev &lt;stsp@list.ru&gt;
Cc: Scott J Norton &lt;scott.norton@hpe.com&gt;
Cc: Douglas Hatch &lt;doug.hatch@hpe.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c7be96af89d4b53211862d8599b2430e8900ed92 upstream.

When running certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.

  9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
  [k] _raw_spin_lock_irq
            |
            ---_raw_spin_lock_irq
               |
               |--99.34%-- __set_current_blocked
               |          sigprocmask
               |          sys_rt_sigprocmask
               |          system_call_fastpath
               |          |
               |          |--50.63%-- __swapcontext
               |          |          |
               |          |          |--99.91%-- upsleepgeneric
               |          |
               |          |--49.36%-- __setcontext
               |          |          ktskRun

Looking further into the swapcontext function in glibc, it was found that
the function always call sigprocmask() without checking if there are
changes in the signal mask.

A check was added to the __set_current_blocked() function to avoid taking
the sighand-&gt;siglock spinlock if there is no change in the signal mask.
This will prevent unneeded spinlock contention when many threads are
trying to call sigprocmask().

With this patch applied, the spinlock contention in sigprocmask() was
gone.

Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Stas Sergeev &lt;stsp@list.ru&gt;
Cc: Scott J Norton &lt;scott.norton@hpe.com&gt;
Cc: Douglas Hatch &lt;doug.hatch@hpe.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal()</title>
<updated>2018-01-10T08:29:53+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4d53eb494950de4dc898a014600b56254a913a6d'/>
<id>4d53eb494950de4dc898a014600b56254a913a6d</id>
<content type='text'>
commit 426915796ccaf9c2bd9bb06dc5702225957bc2e5 upstream.

complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.

If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except -&gt;group_exit_task if it is not NULL) is
killed, this check breaks the rule.

After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) &amp;&amp; SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.

This should hopefully fix the problem reported by Dmitry.  This
test-case

	static int init(void *arg)
	{
		for (;;)
			pause();
	}

	int main(void)
	{
		char stack[16 * 1024];

		for (;;) {
			int pid = clone(init, stack + sizeof(stack)/2,
					CLONE_NEWPID | SIGCHLD, NULL);
			assert(pid &gt; 0);

			assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
			assert(waitpid(-1, NULL, WSTOPPED) == pid);

			assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
			assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
			assert(pid == wait(NULL));
		}
	}

triggers the WARN_ON_ONCE(!(task-&gt;jobctl &amp; JOBCTL_STOP_PENDING)) in
task_participate_group_stop().  do_signal_stop()-&gt;signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.

And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.

Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reported-by: Kyle Huey &lt;me@kylehuey.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 426915796ccaf9c2bd9bb06dc5702225957bc2e5 upstream.

complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.

If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except -&gt;group_exit_task if it is not NULL) is
killed, this check breaks the rule.

After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) &amp;&amp; SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.

This should hopefully fix the problem reported by Dmitry.  This
test-case

	static int init(void *arg)
	{
		for (;;)
			pause();
	}

	int main(void)
	{
		char stack[16 * 1024];

		for (;;) {
			int pid = clone(init, stack + sizeof(stack)/2,
					CLONE_NEWPID | SIGCHLD, NULL);
			assert(pid &gt; 0);

			assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
			assert(waitpid(-1, NULL, WSTOPPED) == pid);

			assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
			assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
			assert(pid == wait(NULL));
		}
	}

triggers the WARN_ON_ONCE(!(task-&gt;jobctl &amp; JOBCTL_STOP_PENDING)) in
task_participate_group_stop().  do_signal_stop()-&gt;signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.

And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.

Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reported-by: Kyle Huey &lt;me@kylehuey.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals</title>
<updated>2018-01-10T08:29:52+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=794ac8ef9b06bd51186f6e54b763ecf4f94d7171'/>
<id>794ac8ef9b06bd51186f6e54b763ecf4f94d7171</id>
<content type='text'>
commit ac25385089f673560867eb5179228a44ade0cfc1 upstream.

Change sig_task_ignored() to drop the SIG_DFL &amp;&amp; !sig_kernel_only()
signals even if force == T.  This simplifies the next change and this
matches the same check in get_signal() which will drop these signals
anyway.

Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ac25385089f673560867eb5179228a44ade0cfc1 upstream.

Change sig_task_ignored() to drop the SIG_DFL &amp;&amp; !sig_kernel_only()
signals even if force == T.  This simplifies the next change and this
matches the same check in get_signal() which will drop these signals
anyway.

Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL</title>
<updated>2018-01-10T08:29:52+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1453b3ac6cf8c62abb5a50680c6626d460c63044'/>
<id>1453b3ac6cf8c62abb5a50680c6626d460c63044</id>
<content type='text'>
commit 628c1bcba204052d19b686b5bac149a644cdb72e upstream.

The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.

Change sig_ignored() to ignore -&gt;ptrace if sig == SIGKILL and rely on
sig_task_ignored().

SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work.  Perhaps we will add
another -&gt;ptrace check later, we will see.

Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 628c1bcba204052d19b686b5bac149a644cdb72e upstream.

The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.

Change sig_ignored() to ignore -&gt;ptrace if sig == SIGKILL and rely on
sig_task_ignored().

SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work.  Perhaps we will add
another -&gt;ptrace check later, we will see.

Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>signal: protect SIGNAL_UNKILLABLE from unintentional clearing.</title>
<updated>2017-08-11T15:49:36+00:00</updated>
<author>
<name>Jamie Iles</name>
<email>jamie.iles@oracle.com</email>
</author>
<published>2017-01-11T00:57:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=916a05b90d8322f38ea133feb3a194617e507c3c'/>
<id>916a05b90d8322f38ea133feb3a194617e507c3c</id>
<content type='text'>
[ Upstream commit 2d39b3cd34e6d323720d4c61bd714f5ae202c022 ]

Since commit 00cd5c37afd5 ("ptrace: permit ptracing of /sbin/init") we
can now trace init processes.  init is initially protected with
SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
there are a number of paths during tracing where SIGNAL_UNKILLABLE can
be implicitly cleared.

This can result in init becoming stoppable/killable after tracing.  For
example, running:

  while true; do kill -STOP 1; done &amp;
  strace -p 1

and then stopping strace and the kill loop will result in init being
left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
init will now respond to future SIGSTOP signals rather than ignoring
them.

Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
that we don't clear SIGNAL_UNKILLABLE.

Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles &lt;jamie.iles@oracle.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2d39b3cd34e6d323720d4c61bd714f5ae202c022 ]

Since commit 00cd5c37afd5 ("ptrace: permit ptracing of /sbin/init") we
can now trace init processes.  init is initially protected with
SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
there are a number of paths during tracing where SIGNAL_UNKILLABLE can
be implicitly cleared.

This can result in init becoming stoppable/killable after tracing.  For
example, running:

  while true; do kill -STOP 1; done &amp;
  strace -p 1

and then stopping strace and the kill loop will result in init being
left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
init will now respond to future SIGSTOP signals rather than ignoring
them.

Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
that we don't clear SIGNAL_UNKILLABLE.

Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles &lt;jamie.iles@oracle.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: Only reschedule timers on signals timers have sent</title>
<updated>2017-06-29T11:00:29+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-06-13T09:31:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f719f20abe2a52fff61ffc3b230308279b841475'/>
<id>f719f20abe2a52fff61ffc3b230308279b841475</id>
<content type='text'>
commit 57db7e4a2d92c2d3dfbca4ef8057849b2682436b upstream.

Thomas Gleixner  wrote:
&gt; The CRIU support added a 'feature' which allows a user space task to send
&gt; arbitrary (kernel) signals to itself. The changelog says:
&gt;
&gt;   The kernel prevents sending of siginfo with positive si_code, because
&gt;   these codes are reserved for kernel.  I think we can allow a task to
&gt;   send such a siginfo to itself.  This operation should not be dangerous.
&gt;
&gt; Quite contrary to that claim, it turns out that it is outright dangerous
&gt; for signals with info-&gt;si_code == SI_TIMER. The following code sequence in
&gt; a user space task allows to crash the kernel:
&gt;
&gt;    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
&gt;    timer_set(id, ....);
&gt;    info-&gt;si_signo = SIGX;
&gt;    info-&gt;si_code = SI_TIMER:
&gt;    info-&gt;_sifields._timer._tid = id;
&gt;    info-&gt;_sifields._timer._sys_private = 2;
&gt;    rt_[tg]sigqueueinfo(..., SIGX, info);
&gt;    sigemptyset(&amp;sigset);
&gt;    sigaddset(&amp;sigset, SIGX);
&gt;    rt_sigtimedwait(sigset, info);
&gt;
&gt; For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
&gt; results in a kernel crash because sigwait() dequeues the signal and the
&gt; dequeue code observes:
&gt;
&gt;   info-&gt;si_code == SI_TIMER &amp;&amp; info-&gt;_sifields._timer._sys_private != 0
&gt;
&gt; which triggers the following callchain:
&gt;
&gt;  do_schedule_next_timer() -&gt; posix_cpu_timer_schedule() -&gt; arm_timer()
&gt;
&gt; arm_timer() executes a list_add() on the timer, which is already armed via
&gt; the timer_set() syscall. That's a double list add which corrupts the posix
&gt; cpu timer list. As a consequence the kernel crashes on the next operation
&gt; touching the posix cpu timer list.
&gt;
&gt; Posix clocks which are internally implemented based on hrtimers are not
&gt; affected by this because hrtimer_start() can handle already armed timers
&gt; nicely, but it's a reliable way to trigger the WARN_ON() in
&gt; hrtimer_forward(), which complains about calling that function on an
&gt; already armed timer.

This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.

The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.

Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.

Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers.   This makes it clear why the code cares about preallocated
timers.

Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference: 66dd34ad31e5 ("signal: allow to send any siginfo to itself")
Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks &amp; timers")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 57db7e4a2d92c2d3dfbca4ef8057849b2682436b upstream.

Thomas Gleixner  wrote:
&gt; The CRIU support added a 'feature' which allows a user space task to send
&gt; arbitrary (kernel) signals to itself. The changelog says:
&gt;
&gt;   The kernel prevents sending of siginfo with positive si_code, because
&gt;   these codes are reserved for kernel.  I think we can allow a task to
&gt;   send such a siginfo to itself.  This operation should not be dangerous.
&gt;
&gt; Quite contrary to that claim, it turns out that it is outright dangerous
&gt; for signals with info-&gt;si_code == SI_TIMER. The following code sequence in
&gt; a user space task allows to crash the kernel:
&gt;
&gt;    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
&gt;    timer_set(id, ....);
&gt;    info-&gt;si_signo = SIGX;
&gt;    info-&gt;si_code = SI_TIMER:
&gt;    info-&gt;_sifields._timer._tid = id;
&gt;    info-&gt;_sifields._timer._sys_private = 2;
&gt;    rt_[tg]sigqueueinfo(..., SIGX, info);
&gt;    sigemptyset(&amp;sigset);
&gt;    sigaddset(&amp;sigset, SIGX);
&gt;    rt_sigtimedwait(sigset, info);
&gt;
&gt; For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
&gt; results in a kernel crash because sigwait() dequeues the signal and the
&gt; dequeue code observes:
&gt;
&gt;   info-&gt;si_code == SI_TIMER &amp;&amp; info-&gt;_sifields._timer._sys_private != 0
&gt;
&gt; which triggers the following callchain:
&gt;
&gt;  do_schedule_next_timer() -&gt; posix_cpu_timer_schedule() -&gt; arm_timer()
&gt;
&gt; arm_timer() executes a list_add() on the timer, which is already armed via
&gt; the timer_set() syscall. That's a double list add which corrupts the posix
&gt; cpu timer list. As a consequence the kernel crashes on the next operation
&gt; touching the posix cpu timer list.
&gt;
&gt; Posix clocks which are internally implemented based on hrtimers are not
&gt; affected by this because hrtimer_start() can handle already armed timers
&gt; nicely, but it's a reliable way to trigger the WARN_ON() in
&gt; hrtimer_forward(), which complains about calling that function on an
&gt; already armed timer.

This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.

The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.

Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.

Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers.   This makes it clear why the code cares about preallocated
timers.

Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference: 66dd34ad31e5 ("signal: allow to send any siginfo to itself")
Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks &amp; timers")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>sigaltstack: support SS_AUTODISARM for CONFIG_COMPAT</title>
<updated>2017-03-12T05:41:44+00:00</updated>
<author>
<name>Stas Sergeev</name>
<email>stsp@list.ru</email>
</author>
<published>2017-02-27T22:27:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d94a6b32e97c56dfbe96cc489517b338608f091'/>
<id>6d94a6b32e97c56dfbe96cc489517b338608f091</id>
<content type='text'>
commit 441398d378f29a5ad6d0fcda07918e54e4961800 upstream.

Currently SS_AUTODISARM is not supported in compatibility mode, but does
not return -EINVAL either.  This makes dosemu built with -m32 on x86_64
to crash.  Also the kernel's sigaltstack selftest fails if compiled with
-m32.

This patch adds the needed support.

Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru
Signed-off-by: Stas Sergeev &lt;stsp@users.sourceforge.net&gt;
Cc: Milosz Tanski &lt;milosz@adfin.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Cc: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Cc: Wang Xiaoqiang &lt;wangxq10@lzu.edu.cn&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 441398d378f29a5ad6d0fcda07918e54e4961800 upstream.

Currently SS_AUTODISARM is not supported in compatibility mode, but does
not return -EINVAL either.  This makes dosemu built with -m32 on x86_64
to crash.  Also the kernel's sigaltstack selftest fails if compiled with
-m32.

This patch adds the needed support.

Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru
Signed-off-by: Stas Sergeev &lt;stsp@users.sourceforge.net&gt;
Cc: Milosz Tanski &lt;milosz@adfin.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Cc: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Cc: Wang Xiaoqiang &lt;wangxq10@lzu.edu.cn&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/signal: Add SA_{X32,IA32}_ABI sa_flags</title>
<updated>2016-09-14T19:28:11+00:00</updated>
<author>
<name>Dmitry Safonov</name>
<email>dsafonov@virtuozzo.com</email>
</author>
<published>2016-09-05T13:33:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6846351052e685c2d1428e80ead2d7ca3d7ed913'/>
<id>6846351052e685c2d1428e80ead2d7ca3d7ed913</id>
<content type='text'>
Introduce new flags that defines which ABI to use on creating sigframe.
Those flags kernel will set according to sigaction syscall ABI,
which set handler for the signal being delivered.

So that will drop the dependency on TIF_IA32/TIF_X32 flags on signal deliver.
Those flags will be used only under CONFIG_COMPAT.

Similar way ARM uses sa_flags to differ in which mode deliver signal
for 26-bit applications (look at SA_THIRYTWO).

Signed-off-by: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-7-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce new flags that defines which ABI to use on creating sigframe.
Those flags kernel will set according to sigaction syscall ABI,
which set handler for the signal being delivered.

So that will drop the dependency on TIF_IA32/TIF_X32 flags on signal deliver.
Those flags will be used only under CONFIG_COMPAT.

Similar way ARM uses sa_flags to differ in which mode deliver signal
for 26-bit applications (look at SA_THIRYTWO).

Signed-off-by: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-7-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>signals: Use hrtimer for sigtimedwait()</title>
<updated>2016-07-07T08:35:07+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-07-04T09:50:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2b1ecc3d1a6b10f8fbac7f83d80db30b5a2c2791'/>
<id>2b1ecc3d1a6b10f8fbac7f83d80db30b5a2c2791</id>
<content type='text'>
We've converted most timeout related syscalls to hrtimers, but
sigtimedwait() did not get this treatment.

Convert it so we get a reasonable accuracy and remove the
user space exposure to the timer wheel properties.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Cyril Hrubis &lt;chrubis@suse.cz&gt;
Cc: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We've converted most timeout related syscalls to hrtimers, but
sigtimedwait() did not get this treatment.

Convert it so we get a reasonable accuracy and remove the
user space exposure to the timer wheel properties.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Cyril Hrubis &lt;chrubis@suse.cz&gt;
Cc: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
