<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/sched, branch v4.16-rc4</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2018-02-22T18:45:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-02-22T18:45:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=238ca357079b1b378e55187e8ea34e9e9c4c5695'/>
<id>238ca357079b1b378e55187e8ea34e9e9c4c5695</id>
<content type='text'>
Merge misc fixes from Andrew Morton:
 "16 fixes"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;:
  mm: don't defer struct page initialization for Xen pv guests
  lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
  vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
  selftests/memfd: add run_fuse_test.sh to TEST_FILES
  bug.h: work around GCC PR82365 in BUG()
  mm/swap.c: make functions and their kernel-doc agree (again)
  mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
  ida: do zeroing in ida_pre_get()
  mm, swap, frontswap: fix THP swap if frontswap enabled
  certs/blacklist_nohashes.c: fix const confusion in certs blacklist
  kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
  mm, mlock, vmscan: no more skipping pagevecs
  mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
  Kbuild: always define endianess in kconfig.h
  include/linux/sched/mm.h: re-inline mmdrop()
  tools: fix cross-compile var clobbering
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge misc fixes from Andrew Morton:
 "16 fixes"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;:
  mm: don't defer struct page initialization for Xen pv guests
  lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
  vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
  selftests/memfd: add run_fuse_test.sh to TEST_FILES
  bug.h: work around GCC PR82365 in BUG()
  mm/swap.c: make functions and their kernel-doc agree (again)
  mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
  ida: do zeroing in ida_pre_get()
  mm, swap, frontswap: fix THP swap if frontswap enabled
  certs/blacklist_nohashes.c: fix const confusion in certs blacklist
  kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
  mm, mlock, vmscan: no more skipping pagevecs
  mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
  Kbuild: always define endianess in kconfig.h
  include/linux/sched/mm.h: re-inline mmdrop()
  tools: fix cross-compile var clobbering
</pre>
</div>
</content>
</entry>
<entry>
<title>efivarfs: Limit the rate for non-root to read files</title>
<updated>2018-02-22T18:21:02+00:00</updated>
<author>
<name>Luck, Tony</name>
<email>tony.luck@intel.com</email>
</author>
<published>2018-02-22T17:15:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bef3efbeb897b56867e271cdbc5f8adaacaeb9cd'/>
<id>bef3efbeb897b56867e271cdbc5f8adaacaeb9cd</id>
<content type='text'>
Each read from a file in efivarfs results in two calls to EFI
(one to get the file size, another to get the actual data).

On X86 these EFI calls result in broadcast system management
interrupts (SMI) which affect performance of the whole system.
A malicious user can loop performing reads from efivarfs bringing
the system to its knees.

Linus suggested per-user rate limit to solve this.

So we add a ratelimit structure to "user_struct" and initialize
it for the root user for no limit. When allocating user_struct for
other users we set the limit to 100 per second. This could be used
for other places that want to limit the rate of some detrimental
user action.

In efivarfs if the limit is exceeded when reading, we take an
interruptible nap for 50ms and check the rate limit again.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each read from a file in efivarfs results in two calls to EFI
(one to get the file size, another to get the actual data).

On X86 these EFI calls result in broadcast system management
interrupts (SMI) which affect performance of the whole system.
A malicious user can loop performing reads from efivarfs bringing
the system to its knees.

Linus suggested per-user rate limit to solve this.

So we add a ratelimit structure to "user_struct" and initialize
it for the root user for no limit. When allocating user_struct for
other users we set the limit to 100 per second. This could be used
for other places that want to limit the rate of some detrimental
user action.

In efivarfs if the limit is exceeded when reading, we take an
interruptible nap for 50ms and check the rate limit again.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>include/linux/sched/mm.h: re-inline mmdrop()</title>
<updated>2018-02-21T23:35:42+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2018-02-21T22:45:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d34bc48f8275b6ce0da44f639d68344891268ee9'/>
<id>d34bc48f8275b6ce0da44f639d68344891268ee9</id>
<content type='text'>
As Peter points out, Doing a CALL+RET for just the decrement is a bit silly.

Fixes: d70f2a14b72a4bc ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infraded.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As Peter points out, Doing a CALL+RET for just the decrement is a bit silly.

Fixes: d70f2a14b72a4bc ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infraded.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'linus' into sched/urgent, to resolve conflicts</title>
<updated>2018-02-06T20:12:31+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2018-02-06T20:12:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82845079160817cc6ac64e5321bbd935e0a47b3a'/>
<id>82845079160817cc6ac64e5321bbd935e0a47b3a</id>
<content type='text'>
 Conflicts:
	arch/arm64/kernel/entry.S
	arch/x86/Kconfig
	include/linux/sched/mm.h
	kernel/fork.c

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
 Conflicts:
	arch/arm64/kernel/entry.S
	arch/x86/Kconfig
	include/linux/sched/mm.h
	kernel/fork.c

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>membarrier: Provide core serializing command, *_SYNC_CORE</title>
<updated>2018-02-05T20:35:03+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2018-01-29T20:20:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=70216e18e519a54a2f13569e8caff99a092a92d6'/>
<id>70216e18e519a54a2f13569e8caff99a092a92d6</id>
<content type='text'>
Provide core serializing membarrier command to support memory reclaim
by JIT.

Each architecture needs to explicitly opt into that support by
documenting in their architecture code how they provide the core
serializing instructions required when returning from the membarrier
IPI, and after the scheduler has updated the curr-&gt;mm pointer (before
going back to user-space). They should then select
ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on
their architecture.

Architectures selecting this feature need to either document that
they issue core serializing instructions when returning to user-space,
or implement their architecture-specific sync_core_before_usermode().

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Provide core serializing membarrier command to support memory reclaim
by JIT.

Each architecture needs to explicitly opt into that support by
documenting in their architecture code how they provide the core
serializing instructions required when returning from the membarrier
IPI, and after the scheduler has updated the curr-&gt;mm pointer (before
going back to user-space). They should then select
ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on
their architecture.

Architectures selecting this feature need to either document that
they issue core serializing instructions when returning to user-space,
or implement their architecture-specific sync_core_before_usermode().

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>membarrier: Provide GLOBAL_EXPEDITED command</title>
<updated>2018-02-05T20:34:31+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2018-01-29T20:20:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c5f58bd58f432be5d92df33c5458e0bcbee3aadf'/>
<id>c5f58bd58f432be5d92df33c5458e0bcbee3aadf</id>
<content type='text'>
Allow expedited membarrier to be used for data shared between processes
through shared memory.

Processes wishing to receive the membarriers register with
MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue
membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED.

This allows extremely simple kernel-level implementation: we have almost
everything we need with the PRIVATE_EXPEDITED barrier code. All we need
to do is to add a flag in the mm_struct that will be used to check
whether we need to send the IPI to the current thread of each CPU.

There is a slight downside to this approach compared to targeting
specific shared memory users: when performing a membarrier operation,
all registered "global" receivers will get the barrier, even if they
don't share a memory mapping with the sender issuing
MEMBARRIER_CMD_GLOBAL_EXPEDITED.

This registration approach seems to fit the requirement of not
disturbing processes that really deeply care about real-time: they
simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED.

In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED"
command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of
MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward
compatibility.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-5-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow expedited membarrier to be used for data shared between processes
through shared memory.

Processes wishing to receive the membarriers register with
MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue
membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED.

This allows extremely simple kernel-level implementation: we have almost
everything we need with the PRIVATE_EXPEDITED barrier code. All we need
to do is to add a flag in the mm_struct that will be used to check
whether we need to send the IPI to the current thread of each CPU.

There is a slight downside to this approach compared to targeting
specific shared memory users: when performing a membarrier operation,
all registered "global" receivers will get the barrier, even if they
don't share a memory mapping with the sender issuing
MEMBARRIER_CMD_GLOBAL_EXPEDITED.

This registration approach seems to fit the requirement of not
disturbing processes that really deeply care about real-time: they
simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED.

In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED"
command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of
MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward
compatibility.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-5-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>membarrier: Document scheduler barrier requirements</title>
<updated>2018-02-05T20:34:21+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2018-01-29T20:20:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=306e060435d7a3aef8f6f033e43b0f581638adce'/>
<id>306e060435d7a3aef8f6f033e43b0f581638adce</id>
<content type='text'>
Document the membarrier requirement on having a full memory barrier in
__schedule() after coming from user-space, before storing to rq-&gt;curr.
It is provided by smp_mb__after_spinlock() in __schedule().

Document that membarrier requires a full barrier on transition from
kernel thread to userspace thread. We currently have an implicit barrier
from atomic_dec_and_test() in mmdrop() that ensures this.

The x86 switch_mm_irqs_off() full barrier is currently provided by many
cpumask update operations as well as write_cr3(). Document that
write_cr3() provides this barrier.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-4-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Document the membarrier requirement on having a full memory barrier in
__schedule() after coming from user-space, before storing to rq-&gt;curr.
It is provided by smp_mb__after_spinlock() in __schedule().

Document that membarrier requires a full barrier on transition from
kernel thread to userspace thread. We currently have an implicit barrier
from atomic_dec_and_test() in mmdrop() that ensures this.

The x86 switch_mm_irqs_off() full barrier is currently provided by many
cpumask update operations as well as write_cr3(). Document that
write_cr3() provides this barrier.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-4-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc, membarrier: Skip memory barrier in switch_mm()</title>
<updated>2018-02-05T20:34:02+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2018-01-29T20:20:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3ccfebedd8cf54e291c809c838d8ad5cc00f5688'/>
<id>3ccfebedd8cf54e291c809c838d8ad5cc00f5688</id>
<content type='text'>
Allow PowerPC to skip the full memory barrier in switch_mm(), and
only issue the barrier when scheduling into a task belonging to a
process that has registered to use expedited private.

Threads targeting the same VM but which belong to different thread
groups is a tricky case. It has a few consequences:

It turns out that we cannot rely on get_nr_threads(p) to count the
number of threads using a VM. We can use
(atomic_read(&amp;mm-&gt;mm_users) == 1 &amp;&amp; get_nr_threads(p) == 1)
instead to skip the synchronize_sched() for cases where the VM only has
a single user, and that user only has a single thread.

It also turns out that we cannot use for_each_thread() to set
thread flags in all threads using a VM, as it only iterates on the
thread group.

Therefore, test the membarrier state variable directly rather than
relying on thread flags. This means
membarrier_register_private_expedited() needs to set the
MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and
only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows
private expedited membarrier commands to succeed.
membarrier_arch_switch_mm() now tests for the
MEMBARRIER_STATE_PRIVATE_EXPEDITED flag.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow PowerPC to skip the full memory barrier in switch_mm(), and
only issue the barrier when scheduling into a task belonging to a
process that has registered to use expedited private.

Threads targeting the same VM but which belong to different thread
groups is a tricky case. It has a few consequences:

It turns out that we cannot rely on get_nr_threads(p) to count the
number of threads using a VM. We can use
(atomic_read(&amp;mm-&gt;mm_users) == 1 &amp;&amp; get_nr_threads(p) == 1)
instead to skip the synchronize_sched() for cases where the VM only has
a single user, and that user only has a single thread.

It also turns out that we cannot use for_each_thread() to set
thread flags in all threads using a VM, as it only iterates on the
thread group.

Therefore, test the membarrier state variable directly rather than
relying on thread flags. This means
membarrier_register_private_expedited() needs to set the
MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and
only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows
private expedited membarrier commands to succeed.
membarrier_arch_switch_mm() now tests for the
MEMBARRIER_STATE_PRIVATE_EXPEDITED flag.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andrea Parri &lt;parri.andrea@gmail.com&gt;
Cc: Andrew Hunter &lt;ahh@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Dave Watson &lt;davejwatson@fb.com&gt;
Cc: David Sehr &lt;sehr@google.com&gt;
Cc: Greg Hackmann &lt;ghackmann@google.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Maged Michael &lt;maged.michael@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2018-02-04T00:25:42+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-02-04T00:25:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=617aebe6a97efa539cc4b8a52adccd89596e6be0'/>
<id>617aebe6a97efa539cc4b8a52adccd89596e6be0</id>
<content type='text'>
Pull hardened usercopy whitelisting from Kees Cook:
 "Currently, hardened usercopy performs dynamic bounds checking on slab
  cache objects. This is good, but still leaves a lot of kernel memory
  available to be copied to/from userspace in the face of bugs.

  To further restrict what memory is available for copying, this creates
  a way to whitelist specific areas of a given slab cache object for
  copying to/from userspace, allowing much finer granularity of access
  control.

  Slab caches that are never exposed to userspace can declare no
  whitelist for their objects, thereby keeping them unavailable to
  userspace via dynamic copy operations. (Note, an implicit form of
  whitelisting is the use of constant sizes in usercopy operations and
  get_user()/put_user(); these bypass all hardened usercopy checks since
  these sizes cannot change at runtime.)

  This new check is WARN-by-default, so any mistakes can be found over
  the next several releases without breaking anyone's system.

  The series has roughly the following sections:
   - remove %p and improve reporting with offset
   - prepare infrastructure and whitelist kmalloc
   - update VFS subsystem with whitelists
   - update SCSI subsystem with whitelists
   - update network subsystem with whitelists
   - update process memory with whitelists
   - update per-architecture thread_struct with whitelists
   - update KVM with whitelists and fix ioctl bug
   - mark all other allocations as not whitelisted
   - update lkdtm for more sensible test overage"

* tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
  lkdtm: Update usercopy tests for whitelisting
  usercopy: Restrict non-usercopy caches to size 0
  kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
  kvm: whitelist struct kvm_vcpu_arch
  arm: Implement thread_struct whitelist for hardened usercopy
  arm64: Implement thread_struct whitelist for hardened usercopy
  x86: Implement thread_struct whitelist for hardened usercopy
  fork: Provide usercopy whitelisting for task_struct
  fork: Define usercopy region in thread_stack slab caches
  fork: Define usercopy region in mm_struct slab caches
  net: Restrict unwhitelisted proto caches to size 0
  sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
  sctp: Define usercopy region in SCTP proto slab cache
  caif: Define usercopy region in caif proto slab cache
  ip: Define usercopy region in IP proto slab cache
  net: Define usercopy region in struct proto slab cache
  scsi: Define usercopy region in scsi_sense_cache slab cache
  cifs: Define usercopy region in cifs_request slab cache
  vxfs: Define usercopy region in vxfs_inode slab cache
  ufs: Define usercopy region in ufs_inode_cache slab cache
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull hardened usercopy whitelisting from Kees Cook:
 "Currently, hardened usercopy performs dynamic bounds checking on slab
  cache objects. This is good, but still leaves a lot of kernel memory
  available to be copied to/from userspace in the face of bugs.

  To further restrict what memory is available for copying, this creates
  a way to whitelist specific areas of a given slab cache object for
  copying to/from userspace, allowing much finer granularity of access
  control.

  Slab caches that are never exposed to userspace can declare no
  whitelist for their objects, thereby keeping them unavailable to
  userspace via dynamic copy operations. (Note, an implicit form of
  whitelisting is the use of constant sizes in usercopy operations and
  get_user()/put_user(); these bypass all hardened usercopy checks since
  these sizes cannot change at runtime.)

  This new check is WARN-by-default, so any mistakes can be found over
  the next several releases without breaking anyone's system.

  The series has roughly the following sections:
   - remove %p and improve reporting with offset
   - prepare infrastructure and whitelist kmalloc
   - update VFS subsystem with whitelists
   - update SCSI subsystem with whitelists
   - update network subsystem with whitelists
   - update process memory with whitelists
   - update per-architecture thread_struct with whitelists
   - update KVM with whitelists and fix ioctl bug
   - mark all other allocations as not whitelisted
   - update lkdtm for more sensible test overage"

* tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
  lkdtm: Update usercopy tests for whitelisting
  usercopy: Restrict non-usercopy caches to size 0
  kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
  kvm: whitelist struct kvm_vcpu_arch
  arm: Implement thread_struct whitelist for hardened usercopy
  arm64: Implement thread_struct whitelist for hardened usercopy
  x86: Implement thread_struct whitelist for hardened usercopy
  fork: Provide usercopy whitelisting for task_struct
  fork: Define usercopy region in thread_stack slab caches
  fork: Define usercopy region in mm_struct slab caches
  net: Restrict unwhitelisted proto caches to size 0
  sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
  sctp: Define usercopy region in SCTP proto slab cache
  caif: Define usercopy region in caif proto slab cache
  ip: Define usercopy region in IP proto slab cache
  net: Define usercopy region in struct proto slab cache
  scsi: Define usercopy region in scsi_sense_cache slab cache
  cifs: Define usercopy region in cifs_request slab cache
  vxfs: Define usercopy region in vxfs_inode slab cache
  ufs: Define usercopy region in ufs_inode_cache slab cache
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>include/linux/sched/mm.h: uninline mmdrop_async(), etc</title>
<updated>2018-02-01T01:18:36+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2018-02-01T00:15:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d70f2a14b72a4bc094cf3a92e4794644a7adc590'/>
<id>d70f2a14b72a4bc094cf3a92e4794644a7adc590</id>
<content type='text'>
mmdrop_async() is only used in fork.c.  Move that and its support
functions into fork.c, uninline it all.

Quite a lot of code gets moved around to avoid forward declarations.

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
mmdrop_async() is only used in fork.c.  Move that and its support
functions into fork.c, uninline it all.

Quite a lot of code gets moved around to avoid forward declarations.

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
