<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel, branch v4.9.76</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal()</title>
<updated>2018-01-10T08:29:53+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4d53eb494950de4dc898a014600b56254a913a6d'/>
<id>4d53eb494950de4dc898a014600b56254a913a6d</id>
<content type='text'>
commit 426915796ccaf9c2bd9bb06dc5702225957bc2e5 upstream.

complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.

If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except -&gt;group_exit_task if it is not NULL) is
killed, this check breaks the rule.

After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) &amp;&amp; SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.

This should hopefully fix the problem reported by Dmitry.  This
test-case

	static int init(void *arg)
	{
		for (;;)
			pause();
	}

	int main(void)
	{
		char stack[16 * 1024];

		for (;;) {
			int pid = clone(init, stack + sizeof(stack)/2,
					CLONE_NEWPID | SIGCHLD, NULL);
			assert(pid &gt; 0);

			assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
			assert(waitpid(-1, NULL, WSTOPPED) == pid);

			assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
			assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
			assert(pid == wait(NULL));
		}
	}

triggers the WARN_ON_ONCE(!(task-&gt;jobctl &amp; JOBCTL_STOP_PENDING)) in
task_participate_group_stop().  do_signal_stop()-&gt;signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.

And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.

Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reported-by: Kyle Huey &lt;me@kylehuey.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 426915796ccaf9c2bd9bb06dc5702225957bc2e5 upstream.

complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.

If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except -&gt;group_exit_task if it is not NULL) is
killed, this check breaks the rule.

After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) &amp;&amp; SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.

This should hopefully fix the problem reported by Dmitry.  This
test-case

	static int init(void *arg)
	{
		for (;;)
			pause();
	}

	int main(void)
	{
		char stack[16 * 1024];

		for (;;) {
			int pid = clone(init, stack + sizeof(stack)/2,
					CLONE_NEWPID | SIGCHLD, NULL);
			assert(pid &gt; 0);

			assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
			assert(waitpid(-1, NULL, WSTOPPED) == pid);

			assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
			assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
			assert(pid == wait(NULL));
		}
	}

triggers the WARN_ON_ONCE(!(task-&gt;jobctl &amp; JOBCTL_STOP_PENDING)) in
task_participate_group_stop().  do_signal_stop()-&gt;signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.

And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.

Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reported-by: Kyle Huey &lt;me@kylehuey.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals</title>
<updated>2018-01-10T08:29:52+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=794ac8ef9b06bd51186f6e54b763ecf4f94d7171'/>
<id>794ac8ef9b06bd51186f6e54b763ecf4f94d7171</id>
<content type='text'>
commit ac25385089f673560867eb5179228a44ade0cfc1 upstream.

Change sig_task_ignored() to drop the SIG_DFL &amp;&amp; !sig_kernel_only()
signals even if force == T.  This simplifies the next change and this
matches the same check in get_signal() which will drop these signals
anyway.

Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ac25385089f673560867eb5179228a44ade0cfc1 upstream.

Change sig_task_ignored() to drop the SIG_DFL &amp;&amp; !sig_kernel_only()
signals even if force == T.  This simplifies the next change and this
matches the same check in get_signal() which will drop these signals
anyway.

Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL</title>
<updated>2018-01-10T08:29:52+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2017-11-17T23:30:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1453b3ac6cf8c62abb5a50680c6626d460c63044'/>
<id>1453b3ac6cf8c62abb5a50680c6626d460c63044</id>
<content type='text'>
commit 628c1bcba204052d19b686b5bac149a644cdb72e upstream.

The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.

Change sig_ignored() to ignore -&gt;ptrace if sig == SIGKILL and rely on
sig_task_ignored().

SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work.  Perhaps we will add
another -&gt;ptrace check later, we will see.

Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 628c1bcba204052d19b686b5bac149a644cdb72e upstream.

The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.

Change sig_ignored() to ignore -&gt;ptrace if sig == SIGKILL and rely on
sig_task_ignored().

SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work.  Perhaps we will add
another -&gt;ptrace check later, we will see.

Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Tested-by: Kyle Huey &lt;me@kylehuey.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel: make groups_sort calling a responsibility group_info allocators</title>
<updated>2018-01-10T08:29:52+00:00</updated>
<author>
<name>Thiago Rafael Becker</name>
<email>thiago.becker@gmail.com</email>
</author>
<published>2017-12-14T23:33:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=79258d9834803518a80b0ed0603c790638f0478b'/>
<id>79258d9834803518a80b0ed0603c790638f0478b</id>
<content type='text'>
commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream.

In testing, we found that nfsd threads may call set_groups in parallel
for the same entry cached in auth.unix.gid, racing in the call of
groups_sort, corrupting the groups for that entry and leading to
permission denials for the client.

This patch:
 - Make groups_sort globally visible.
 - Move the call to groups_sort to the modifiers of group_info
 - Remove the call to groups_sort from set_groups

Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
Signed-off-by: Thiago Rafael Becker &lt;thiago.becker@gmail.com&gt;
Reviewed-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Acked-by: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream.

In testing, we found that nfsd threads may call set_groups in parallel
for the same entry cached in auth.unix.gid, racing in the call of
groups_sort, corrupting the groups for that entry and leading to
permission denials for the client.

This patch:
 - Make groups_sort globally visible.
 - Move the call to groups_sort to the modifiers of group_info
 - Remove the call to groups_sort from set_groups

Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
Signed-off-by: Thiago Rafael Becker &lt;thiago.becker@gmail.com&gt;
Reviewed-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: NeilBrown &lt;neilb@suse.com&gt;
Acked-by: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/acct.c: fix the acct-&gt;needcheck check in check_free_space()</title>
<updated>2018-01-10T08:29:51+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2018-01-05T00:17:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=790080ce0e3288c2d289d0771aa63814949c6da5'/>
<id>790080ce0e3288c2d289d0771aa63814949c6da5</id>
<content type='text'>
commit 4d9570158b6260f449e317a5f9ed030c2504a615 upstream.

As Tsukada explains, the time_is_before_jiffies(acct-&gt;needcheck) check
is very wrong, we need time_is_after_jiffies() to make sys_acct() work.

Ignoring the overflows, the code should "goto out" if needcheck &gt;
jiffies, while currently it checks "needcheck &lt; jiffies" and thus in the
likely case check_free_space() does nothing until jiffies overflow.

In particular this means that sys_acct() is simply broken, acct_on()
sets acct-&gt;needcheck = jiffies and expects that check_free_space()
should set acct-&gt;active = 1 after the free-space check, but this won't
happen if jiffies increments in between.

This was broken by commit 32dc73086015 ("get rid of timer in
kern/acct.c") in 2011, then another (correct) commit 795a2f22a8ea
("acct() should honour the limits from the very beginning") made the
problem more visible.

Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com
Fixes: 32dc73086015 ("get rid of timer in kern/acct.c")
Reported-by: TSUKADA Koutaro &lt;tsukada@ascade.co.jp&gt;
Suggested-by: TSUKADA Koutaro &lt;tsukada@ascade.co.jp&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4d9570158b6260f449e317a5f9ed030c2504a615 upstream.

As Tsukada explains, the time_is_before_jiffies(acct-&gt;needcheck) check
is very wrong, we need time_is_after_jiffies() to make sys_acct() work.

Ignoring the overflows, the code should "goto out" if needcheck &gt;
jiffies, while currently it checks "needcheck &lt; jiffies" and thus in the
likely case check_free_space() does nothing until jiffies overflow.

In particular this means that sys_acct() is simply broken, acct_on()
sets acct-&gt;needcheck = jiffies and expects that check_free_space()
should set acct-&gt;active = 1 after the free-space check, but this won't
happen if jiffies increments in between.

This was broken by commit 32dc73086015 ("get rid of timer in
kern/acct.c") in 2011, then another (correct) commit 795a2f22a8ea
("acct() should honour the limits from the very beginning") made the
problem more visible.

Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com
Fixes: 32dc73086015 ("get rid of timer in kern/acct.c")
Reported-by: TSUKADA Koutaro &lt;tsukada@ascade.co.jp&gt;
Suggested-by: TSUKADA Koutaro &lt;tsukada@ascade.co.jp&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE</title>
<updated>2018-01-05T14:46:32+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2017-09-04T01:57:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0994a2cf8fe4e884bad4810681117a7d0096c8e7'/>
<id>0994a2cf8fe4e884bad4810681117a7d0096c8e7</id>
<content type='text'>
Kaiser only needs to map one page of the stack; and
kernel/fork.c did not build on powerpc (no __PAGE_KERNEL).
It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack()
and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's
kaiser_add_mapping() and kaiser_remove_mapping().  And use
linux/kaiser.h in init/main.c to avoid the #ifdefs there.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kaiser only needs to map one page of the stack; and
kernel/fork.c did not build on powerpc (no __PAGE_KERNEL).
It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack()
and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's
kaiser_add_mapping() and kaiser_remove_mapping().  And use
linux/kaiser.h in init/main.c to avoid the #ifdefs there.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kaiser: merged update</title>
<updated>2018-01-05T14:46:32+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave.hansen@linux.intel.com</email>
</author>
<published>2017-08-30T23:23:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8f0baadf2bea3861217763734b57e1dd2db703dd'/>
<id>8f0baadf2bea3861217763734b57e1dd2db703dd</id>
<content type='text'>
Merged fixes and cleanups, rebased to 4.9.51 tree (no 5-level paging).

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merged fixes and cleanups, rebased to 4.9.51 tree (no 5-level paging).

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KAISER: Kernel Address Isolation</title>
<updated>2018-01-05T14:46:32+00:00</updated>
<author>
<name>Richard Fellner</name>
<email>richard.fellner@student.tugraz.at</email>
</author>
<published>2017-05-04T12:26:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13be4483bb487176c48732b887780630a141ae96'/>
<id>13be4483bb487176c48732b887780630a141ae96</id>
<content type='text'>
This patch introduces our implementation of KAISER (Kernel Address Isolation to
have Side-channels Efficiently Removed), a kernel isolation technique to close
hardware side channels on kernel address information.

More information about the patch can be found on:

        https://github.com/IAIK/KAISER

From: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
From: Daniel Gruss &lt;daniel.gruss@iaik.tugraz.at&gt;
Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode
Date: Thu, 4 May 2017 14:26:50 +0200
Link: http://marc.info/?l=linux-kernel&amp;m=149390087310405&amp;w=2
Kaiser-4.10-SHA1: c4b1831d44c6144d3762ccc72f0c4e71a0c713e5

To: &lt;linux-kernel@vger.kernel.org&gt;
To: &lt;kernel-hardening@lists.openwall.com&gt;
Cc: &lt;clementine.maurice@iaik.tugraz.at&gt;
Cc: &lt;moritz.lipp@iaik.tugraz.at&gt;
Cc: Michael Schwarz &lt;michael.schwarz@iaik.tugraz.at&gt;
Cc: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: &lt;kirill.shutemov@linux.intel.com&gt;
Cc: &lt;anders.fogh@gdata-adan.de&gt;

After several recent works [1,2,3] KASLR on x86_64 was basically
considered dead by many researchers. We have been working on an
efficient but effective fix for this problem and found that not mapping
the kernel space when running in user mode is the solution to this
problem [4] (the corresponding paper [5] will be presented at ESSoS17).

With this RFC patch we allow anybody to configure their kernel with the
flag CONFIG_KAISER to add our defense mechanism.

If there are any questions we would love to answer them.
We also appreciate any comments!

Cheers,
Daniel (+ the KAISER team from Graz University of Technology)

[1] http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf
[3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf
[4] https://github.com/IAIK/KAISER
[5] https://gruss.cc/files/kaiser.pdf

[patch based also on
https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch]

Signed-off-by: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
Signed-off-by: Moritz Lipp &lt;moritz.lipp@iaik.tugraz.at&gt;
Signed-off-by: Daniel Gruss &lt;daniel.gruss@iaik.tugraz.at&gt;
Signed-off-by: Michael Schwarz &lt;michael.schwarz@iaik.tugraz.at&gt;
Acked-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces our implementation of KAISER (Kernel Address Isolation to
have Side-channels Efficiently Removed), a kernel isolation technique to close
hardware side channels on kernel address information.

More information about the patch can be found on:

        https://github.com/IAIK/KAISER

From: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
From: Daniel Gruss &lt;daniel.gruss@iaik.tugraz.at&gt;
Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode
Date: Thu, 4 May 2017 14:26:50 +0200
Link: http://marc.info/?l=linux-kernel&amp;m=149390087310405&amp;w=2
Kaiser-4.10-SHA1: c4b1831d44c6144d3762ccc72f0c4e71a0c713e5

To: &lt;linux-kernel@vger.kernel.org&gt;
To: &lt;kernel-hardening@lists.openwall.com&gt;
Cc: &lt;clementine.maurice@iaik.tugraz.at&gt;
Cc: &lt;moritz.lipp@iaik.tugraz.at&gt;
Cc: Michael Schwarz &lt;michael.schwarz@iaik.tugraz.at&gt;
Cc: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: &lt;kirill.shutemov@linux.intel.com&gt;
Cc: &lt;anders.fogh@gdata-adan.de&gt;

After several recent works [1,2,3] KASLR on x86_64 was basically
considered dead by many researchers. We have been working on an
efficient but effective fix for this problem and found that not mapping
the kernel space when running in user mode is the solution to this
problem [4] (the corresponding paper [5] will be presented at ESSoS17).

With this RFC patch we allow anybody to configure their kernel with the
flag CONFIG_KAISER to add our defense mechanism.

If there are any questions we would love to answer them.
We also appreciate any comments!

Cheers,
Daniel (+ the KAISER team from Graz University of Technology)

[1] http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf
[3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf
[4] https://github.com/IAIK/KAISER
[5] https://gruss.cc/files/kaiser.pdf

[patch based also on
https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch]

Signed-off-by: Richard Fellner &lt;richard.fellner@student.tugraz.at&gt;
Signed-off-by: Moritz Lipp &lt;moritz.lipp@iaik.tugraz.at&gt;
Signed-off-by: Daniel Gruss &lt;daniel.gruss@iaik.tugraz.at&gt;
Signed-off-by: Michael Schwarz &lt;michael.schwarz@iaik.tugraz.at&gt;
Acked-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()</title>
<updated>2018-01-02T19:35:17+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-22T14:51:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e8119ac05d7160cce59ce1ff04c210c22e147a6c'/>
<id>e8119ac05d7160cce59ce1ff04c210c22e147a6c</id>
<content type='text'>
commit 5d62c183f9e9df1deeea0906d099a94e8a43047a upstream.

The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:

  if ((idle_cpu(cpu) &amp;&amp; !need_resched()) || tick_nohz_full_cpu(cpu))

If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.

Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....

Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.

Reported-by: Sebastian Siewior &lt;bigeasy@linutronix.d&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5d62c183f9e9df1deeea0906d099a94e8a43047a upstream.

The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:

  if ((idle_cpu(cpu) &amp;&amp; !need_resched()) || tick_nohz_full_cpu(cpu))

If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.

Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....

Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.

Reported-by: Sebastian Siewior &lt;bigeasy@linutronix.d&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timers: Reinitialize per cpu bases on hotplug</title>
<updated>2018-01-02T19:35:17+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-27T20:37:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=249d4a9b3246f4ec92433ba8ea3bae5ceb4dc1ed'/>
<id>249d4a9b3246f4ec92433ba8ea3bae5ceb4dc1ed</id>
<content type='text'>
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
