<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/tracehook.h, branch v4.6-rc6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>memcg: punt high overage reclaim to return-to-userland path</title>
<updated>2015-11-06T03:34:48+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-11-06T02:46:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b23afb93d317c65cef553b804f08dec8a7a0f7e1'/>
<id>b23afb93d317c65cef553b804f08dec8a7a0f7e1</id>
<content type='text'>
Currently, try_charge() tries to reclaim memory synchronously when the
high limit is breached; however, if the allocation doesn't have
__GFP_WAIT, synchronous reclaim is skipped.  If a process performs only
speculative allocations, it can blow way past the high limit.  This is
actually easily reproducible by simply doing "find /".  slab/slub
allocator tries speculative allocations first, so as long as there's
memory which can be consumed without blocking, it can keep allocating
memory regardless of the high limit.

This patch makes try_charge() always punt the over-high reclaim to the
return-to-userland path.  If try_charge() detects that high limit is
breached, it adds the overage to current-&gt;memcg_nr_pages_over_high and
schedules execution of mem_cgroup_handle_over_high() which performs
synchronous reclaim from the return-to-userland path.

As long as kernel doesn't have a run-away allocation spree, this should
provide enough protection while making kmemcg behave more consistently.
It also has the following benefits.

- All over-high reclaims can use GFP_KERNEL regardless of the specific
  gfp mask in use, e.g. GFP_NOFS, when the limit was breached.

- It copes with prio inversion.  Previously, a low-prio task with
  small memory.high might perform over-high reclaim with a bunch of
  locks held.  If a higher prio task needed any of these locks, it
  would have to wait until the low prio task finished reclaim and
  released the locks.  By handing over-high reclaim to the task exit
  path this issue can be avoided.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Reviewed-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, try_charge() tries to reclaim memory synchronously when the
high limit is breached; however, if the allocation doesn't have
__GFP_WAIT, synchronous reclaim is skipped.  If a process performs only
speculative allocations, it can blow way past the high limit.  This is
actually easily reproducible by simply doing "find /".  slab/slub
allocator tries speculative allocations first, so as long as there's
memory which can be consumed without blocking, it can keep allocating
memory regardless of the high limit.

This patch makes try_charge() always punt the over-high reclaim to the
return-to-userland path.  If try_charge() detects that high limit is
breached, it adds the overage to current-&gt;memcg_nr_pages_over_high and
schedules execution of mem_cgroup_handle_over_high() which performs
synchronous reclaim from the return-to-userland path.

As long as kernel doesn't have a run-away allocation spree, this should
provide enough protection while making kmemcg behave more consistently.
It also has the following benefits.

- All over-high reclaims can use GFP_KERNEL regardless of the specific
  gfp mask in use, e.g. GFP_NOFS, when the limit was breached.

- It copes with prio inversion.  Previously, a low-prio task with
  small memory.high might perform over-high reclaim with a bunch of
  locks held.  If a higher prio task needed any of these locks, it
  would have to wait until the low prio task finished reclaim and
  released the locks.  By handing over-high reclaim to the task exit
  path this issue can be avoided.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Reviewed-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracehook_signal_handler: Remove sig, info, ka and regs</title>
<updated>2014-08-06T11:03:43+00:00</updated>
<author>
<name>Richard Weinberger</name>
<email>richard@nod.at</email>
</author>
<published>2013-10-07T13:37:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=df5601f9c3d831b4c478b004a1ed90a18643adbe'/>
<id>df5601f9c3d831b4c478b004a1ed90a18643adbe</id>
<content type='text'>
These parameters are nowhere used, so we can remove them.

Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These parameters are nowhere used, so we can remove them.

Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arch: Mass conversion of smp_mb__*()</title>
<updated>2014-04-18T12:20:48+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-03-17T17:06:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4e857c58efeb99393cba5a5d0d8ec7117183137c'/>
<id>4e857c58efeb99393cba5a5d0d8ec7117183137c</id>
<content type='text'>
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>trim task_work: get rid of hlist</title>
<updated>2012-07-22T19:57:55+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2012-06-27T05:24:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=158e1645e07f3e9f7e4962d7a0997f5c3b98311b'/>
<id>158e1645e07f3e9f7e4962d7a0997f5c3b98311b</id>
<content type='text'>
layout based on Oleg's suggestion; single-linked list,
task-&gt;task_works points to the last element, forward pointer
from said last element points to head.  I'd still prefer
much more regular scheme with two pointers in task_work,
but...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
layout based on Oleg's suggestion; single-linked list,
task-&gt;task_works points to the last element, forward pointer
from said last element points to head.  I'd still prefer
much more regular scheme with two pointers in task_work,
but...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>keys: kill the dummy key_replace_session_keyring()</title>
<updated>2012-05-24T02:11:31+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dea649b8ac1861107c5d91e1a71121434fc64193'/>
<id>dea649b8ac1861107c5d91e1a71121434fc64193</id>
<content type='text'>
After the previouse change key_replace_session_keyring() becomes a nop.
Remove the dummy definition in key.h and update the callers in
arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the previouse change key_replace_session_keyring() becomes a nop.
Remove the dummy definition in key.h and update the callers in
arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>task_work_add: generic process-context callbacks</title>
<updated>2012-05-24T02:09:21+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e73f8959af0439d114847eab5a8a5ce48f1217c4'/>
<id>e73f8959af0439d114847eab5a8a5ce48f1217c4</id>
<content type='text'>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work-&gt;func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.

This allows us to kill the -&gt;replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses -&gt;pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work-&gt;func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.

This allows us to kill the -&gt;replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses -&gt;pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>move key_repace_session_keyring() into tracehook_notify_resume()</title>
<updated>2012-05-24T02:09:20+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2012-05-23T18:44:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a42c6ded827dbd396d2efde7530620be029a72d1'/>
<id>a42c6ded827dbd396d2efde7530620be029a72d1</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>TIF_NOTIFY_RESUME is defined on all targets now</title>
<updated>2012-05-24T02:09:19+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2012-04-24T06:44:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1227dd773d8d4e3983b4b751f9ffa0f41402fb7c'/>
<id>1227dd773d8d4e3983b4b751f9ffa0f41402fb7c</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: the killed tracee should not enter the syscall</title>
<updated>2012-03-23T23:58:40+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-03-23T22:02:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=15cab952139404d0e593cb1aaab0a3547ac0f95b'/>
<id>15cab952139404d0e593cb1aaab0a3547ac0f95b</id>
<content type='text'>
Another old/known problem.  If the tracee is killed after it reports
syscall_entry, it starts the syscall and debugger can't control this.
This confuses the users and this creates the security problems for
ptrace jailers.

Change tracehook_report_syscall_entry() to return non-zero if killed,
this instructs syscall_trace_enter() to abort the syscall.

Reported-by: Chris Evans &lt;scarybeasts@gmail.com&gt;
Tested-by: Indan Zupancic &lt;indan@nul.nu&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Denys Vlasenko &lt;vda.linux@googlemail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Pedro Alves &lt;palves@redhat.com&gt;
Cc: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Another old/known problem.  If the tracee is killed after it reports
syscall_entry, it starts the syscall and debugger can't control this.
This confuses the users and this creates the security problems for
ptrace jailers.

Change tracehook_report_syscall_entry() to return non-zero if killed,
this instructs syscall_trace_enter() to abort the syscall.

Reported-by: Chris Evans &lt;scarybeasts@gmail.com&gt;
Tested-by: Indan Zupancic &lt;indan@nul.nu&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Denys Vlasenko &lt;vda.linux@googlemail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Pedro Alves &lt;palves@redhat.com&gt;
Cc: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kill tracehook_notify_death()</title>
<updated>2011-06-27T18:30:08+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-06-23T17:06:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=45cdf5cc0703c537194588c63d53bad1f2539d36'/>
<id>45cdf5cc0703c537194588c63d53bad1f2539d36</id>
<content type='text'>
Kill tracehook_notify_death(), reimplement the logic in its caller,
exit_notify().

Also, change the exec_id's check to use thread_group_leader() instead
of task_detached(), this is more clear. This logic only applies to
the exiting leader, a sub-thread must never change its exit_signal.

Note: when the traced group leader exits the exit_signal-or-SIGCHLD
logic looks really strange:

	- we notify the tracer even if !thread_group_empty() but
	   do_wait(WEXITED) can't work until all threads exit

	- if the tracer is real_parent, it is not clear why can't
	  we use -&gt;exit_signal event if !thread_group_empty()

-v2: do not try to fix the 2nd oddity to avoid the subtle behavior
     change mixed with reorganization, suggested by Tejun.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kill tracehook_notify_death(), reimplement the logic in its caller,
exit_notify().

Also, change the exec_id's check to use thread_group_leader() instead
of task_detached(), this is more clear. This logic only applies to
the exiting leader, a sub-thread must never change its exit_signal.

Note: when the traced group leader exits the exit_signal-or-SIGCHLD
logic looks really strange:

	- we notify the tracer even if !thread_group_empty() but
	   do_wait(WEXITED) can't work until all threads exit

	- if the tracer is real_parent, it is not clear why can't
	  we use -&gt;exit_signal event if !thread_group_empty()

-v2: do not try to fix the 2nd oddity to avoid the subtle behavior
     change mixed with reorganization, suggested by Tejun.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
