<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/tracehook.h, branch v5.1-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>x86/ptrace: Fix documentation for tracehook_report_syscall_entry()</title>
<updated>2018-11-12T03:53:27+00:00</updated>
<author>
<name>Elvira Khabirova</name>
<email>lineprinter@altlinux.org</email>
</author>
<published>2018-11-10T03:22:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d19f9130b814d33c03118493c17454f7d90075d1'/>
<id>d19f9130b814d33c03118493c17454f7d90075d1</id>
<content type='text'>
tracehook_report_syscall_entry() is called not only
if %TIF_SYSCALL_TRACE is set, but also if %TIF_SYSCALL_EMU is set,
as appears from x86's entry code.

Signed-off-by: Elvira Khabirova &lt;lineprinter@altlinux.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: ldv@altlinux.org
Cc: oleg@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20181110042209.26333972@akathisia
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tracehook_report_syscall_entry() is called not only
if %TIF_SYSCALL_TRACE is set, but also if %TIF_SYSCALL_EMU is set,
as appears from x86's entry code.

Signed-off-by: Elvira Khabirova &lt;lineprinter@altlinux.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: ldv@altlinux.org
Cc: oleg@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20181110042209.26333972@akathisia
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: Simplify tracehook_report_syscall_exit</title>
<updated>2018-09-19T13:45:42+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2018-04-16T19:18:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=efc463adbccf709c5dbaf6cfbf84b7e94b62810a'/>
<id>efc463adbccf709c5dbaf6cfbf84b7e94b62810a</id>
<content type='text'>
Replace user_single_step_siginfo with user_single_step_report
that allocates siginfo structure on the stack and sends it.

This allows tracehook_report_syscall_exit to become a simple
if statement that calls user_single_step_report or ptrace_report_syscall
depending on the value of step.

Update the default helper function now called user_single_step_report
to explicitly set si_code to SI_USER and to set si_uid and si_pid to 0.
The default helper has always been doing this (using memset) but it
was far from obvious.

The powerpc helper can now just call force_sig_fault.
The x86 helper can now just call send_sigtrap.

Unfortunately the default implementation of user_single_step_report
can not use force_sig_fault as it does not use a SIGTRAP si_code.
So it has to carefully setup the siginfo and use use force_sig_info.

The net result is code that is easier to understand and simpler
to maintain.

Ref: 85ec7fd9f8e5 ("ptrace: introduce user_single_step_siginfo() helper")
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace user_single_step_siginfo with user_single_step_report
that allocates siginfo structure on the stack and sends it.

This allows tracehook_report_syscall_exit to become a simple
if statement that calls user_single_step_report or ptrace_report_syscall
depending on the value of step.

Update the default helper function now called user_single_step_report
to explicitly set si_code to SI_USER and to set si_uid and si_pid to 0.
The default helper has always been doing this (using memset) but it
was far from obvious.

The powerpc helper can now just call force_sig_fault.
The x86 helper can now just call send_sigtrap.

Unfortunately the default implementation of user_single_step_report
can not use force_sig_fault as it does not use a SIGTRAP si_code.
So it has to carefully setup the siginfo and use use force_sig_info.

The net result is code that is easier to understand and simpler
to maintain.

Ref: 85ec7fd9f8e5 ("ptrace: introduce user_single_step_siginfo() helper")
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blkcg: add generic throttling mechanism</title>
<updated>2018-07-09T15:07:54+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>jbacik@fb.com</email>
</author>
<published>2018-07-03T15:14:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d09d8df3a29403693d9d20cc34ed101f2c558e2b'/>
<id>d09d8df3a29403693d9d20cc34ed101f2c558e2b</id>
<content type='text'>
Since IO can be issued from literally anywhere it's almost impossible to
do throttling without having some sort of adverse effect somewhere else
in the system because of locking or other dependencies.  The best way to
solve this is to do the throttling when we know we aren't holding any
other kernel resources.  Do this by tracking throttling in a per-blkg
basis, and if we require throttling flag the task that it needs to check
before it returns to user space and possibly sleep there.

This is to address the case where a process is doing work that is
generating IO that can't be throttled, whether that is directly with a
lot of REQ_META IO, or indirectly by allocating so much memory that it
is swamping the disk with REQ_SWAP.  We can't use task_add_work as we
don't want to induce a memory allocation in the IO path, so simply
saving the request queue in the task and flagging it to do the
notify_resume thing achieves the same result without the overhead of a
memory allocation.

Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since IO can be issued from literally anywhere it's almost impossible to
do throttling without having some sort of adverse effect somewhere else
in the system because of locking or other dependencies.  The best way to
solve this is to do the throttling when we know we aren't holding any
other kernel resources.  Do this by tracking throttling in a per-blkg
basis, and if we require throttling flag the task that it needs to check
before it returns to user space and possibly sleep there.

This is to address the case where a process is doing work that is
generating IO that can't be throttled, whether that is directly with a
lot of REQ_META IO, or indirectly by allocating so much memory that it
is swamping the disk with REQ_SWAP.  We can't use task_add_work as we
don't want to induce a memory allocation in the IO path, so simply
saving the request queue in the task and flagging it to do the
notify_resume thing achieves the same result without the overhead of a
memory allocation.

Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: Ensure every siginfo we send has all bits initialized</title>
<updated>2018-04-25T15:40:51+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2018-04-17T20:26:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3eb0f5193b497083391aa05d35210d5645211eef'/>
<id>3eb0f5193b497083391aa05d35210d5645211eef</id>
<content type='text'>
Call clear_siginfo to ensure every stack allocated siginfo is properly
initialized before being passed to the signal sending functions.

Note: It is not safe to depend on C initializers to initialize struct
siginfo on the stack because C is allowed to skip holes when
initializing a structure.

The initialization of struct siginfo in tracehook_report_syscall_exit
was moved from the helper user_single_step_siginfo into
tracehook_report_syscall_exit itself, to make it clear that the local
variable siginfo gets fully initialized.

In a few cases the scope of struct siginfo has been reduced to make it
clear that siginfo siginfo is not used on other paths in the function
in which it is declared.

Instances of using memset to initialize siginfo have been replaced
with calls clear_siginfo for clarity.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Call clear_siginfo to ensure every stack allocated siginfo is properly
initialized before being passed to the signal sending functions.

Note: It is not safe to depend on C initializers to initialize struct
siginfo on the stack because C is allowed to skip holes when
initializing a structure.

The initialization of struct siginfo in tracehook_report_syscall_exit
was moved from the helper user_single_step_siginfo into
tracehook_report_syscall_exit itself, to make it clear that the local
variable siginfo gets fully initialized.

In a few cases the scope of struct siginfo has been reduced to make it
clear that siginfo siginfo is not used on other paths in the function
in which it is declared.

Instances of using memset to initialize siginfo have been replaced
with calls clear_siginfo for clarity.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memcg: punt high overage reclaim to return-to-userland path</title>
<updated>2015-11-06T03:34:48+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-11-06T02:46:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b23afb93d317c65cef553b804f08dec8a7a0f7e1'/>
<id>b23afb93d317c65cef553b804f08dec8a7a0f7e1</id>
<content type='text'>
Currently, try_charge() tries to reclaim memory synchronously when the
high limit is breached; however, if the allocation doesn't have
__GFP_WAIT, synchronous reclaim is skipped.  If a process performs only
speculative allocations, it can blow way past the high limit.  This is
actually easily reproducible by simply doing "find /".  slab/slub
allocator tries speculative allocations first, so as long as there's
memory which can be consumed without blocking, it can keep allocating
memory regardless of the high limit.

This patch makes try_charge() always punt the over-high reclaim to the
return-to-userland path.  If try_charge() detects that high limit is
breached, it adds the overage to current-&gt;memcg_nr_pages_over_high and
schedules execution of mem_cgroup_handle_over_high() which performs
synchronous reclaim from the return-to-userland path.

As long as kernel doesn't have a run-away allocation spree, this should
provide enough protection while making kmemcg behave more consistently.
It also has the following benefits.

- All over-high reclaims can use GFP_KERNEL regardless of the specific
  gfp mask in use, e.g. GFP_NOFS, when the limit was breached.

- It copes with prio inversion.  Previously, a low-prio task with
  small memory.high might perform over-high reclaim with a bunch of
  locks held.  If a higher prio task needed any of these locks, it
  would have to wait until the low prio task finished reclaim and
  released the locks.  By handing over-high reclaim to the task exit
  path this issue can be avoided.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Reviewed-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, try_charge() tries to reclaim memory synchronously when the
high limit is breached; however, if the allocation doesn't have
__GFP_WAIT, synchronous reclaim is skipped.  If a process performs only
speculative allocations, it can blow way past the high limit.  This is
actually easily reproducible by simply doing "find /".  slab/slub
allocator tries speculative allocations first, so as long as there's
memory which can be consumed without blocking, it can keep allocating
memory regardless of the high limit.

This patch makes try_charge() always punt the over-high reclaim to the
return-to-userland path.  If try_charge() detects that high limit is
breached, it adds the overage to current-&gt;memcg_nr_pages_over_high and
schedules execution of mem_cgroup_handle_over_high() which performs
synchronous reclaim from the return-to-userland path.

As long as kernel doesn't have a run-away allocation spree, this should
provide enough protection while making kmemcg behave more consistently.
It also has the following benefits.

- All over-high reclaims can use GFP_KERNEL regardless of the specific
  gfp mask in use, e.g. GFP_NOFS, when the limit was breached.

- It copes with prio inversion.  Previously, a low-prio task with
  small memory.high might perform over-high reclaim with a bunch of
  locks held.  If a higher prio task needed any of these locks, it
  would have to wait until the low prio task finished reclaim and
  released the locks.  By handing over-high reclaim to the task exit
  path this issue can be avoided.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Reviewed-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracehook_signal_handler: Remove sig, info, ka and regs</title>
<updated>2014-08-06T11:03:43+00:00</updated>
<author>
<name>Richard Weinberger</name>
<email>richard@nod.at</email>
</author>
<published>2013-10-07T13:37:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=df5601f9c3d831b4c478b004a1ed90a18643adbe'/>
<id>df5601f9c3d831b4c478b004a1ed90a18643adbe</id>
<content type='text'>
These parameters are nowhere used, so we can remove them.

Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These parameters are nowhere used, so we can remove them.

Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arch: Mass conversion of smp_mb__*()</title>
<updated>2014-04-18T12:20:48+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-03-17T17:06:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4e857c58efeb99393cba5a5d0d8ec7117183137c'/>
<id>4e857c58efeb99393cba5a5d0d8ec7117183137c</id>
<content type='text'>
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>trim task_work: get rid of hlist</title>
<updated>2012-07-22T19:57:55+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2012-06-27T05:24:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=158e1645e07f3e9f7e4962d7a0997f5c3b98311b'/>
<id>158e1645e07f3e9f7e4962d7a0997f5c3b98311b</id>
<content type='text'>
layout based on Oleg's suggestion; single-linked list,
task-&gt;task_works points to the last element, forward pointer
from said last element points to head.  I'd still prefer
much more regular scheme with two pointers in task_work,
but...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
layout based on Oleg's suggestion; single-linked list,
task-&gt;task_works points to the last element, forward pointer
from said last element points to head.  I'd still prefer
much more regular scheme with two pointers in task_work,
but...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>keys: kill the dummy key_replace_session_keyring()</title>
<updated>2012-05-24T02:11:31+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dea649b8ac1861107c5d91e1a71121434fc64193'/>
<id>dea649b8ac1861107c5d91e1a71121434fc64193</id>
<content type='text'>
After the previouse change key_replace_session_keyring() becomes a nop.
Remove the dummy definition in key.h and update the callers in
arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the previouse change key_replace_session_keyring() becomes a nop.
Remove the dummy definition in key.h and update the callers in
arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>task_work_add: generic process-context callbacks</title>
<updated>2012-05-24T02:09:21+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e73f8959af0439d114847eab5a8a5ce48f1217c4'/>
<id>e73f8959af0439d114847eab5a8a5ce48f1217c4</id>
<content type='text'>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work-&gt;func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.

This allows us to kill the -&gt;replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses -&gt;pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work-&gt;func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.

This allows us to kill the -&gt;replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses -&gt;pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
