linux-toradex.git/include/linux/tracehook.h, branch v4.6-rc6

memcg: punt high overage reclaim to return-to-userland path

2015-11-06T03:34:48+00:00

Currently, try_charge() tries to reclaim memory synchronously when the
high limit is breached; however, if the allocation doesn't have
__GFP_WAIT, synchronous reclaim is skipped.  If a process performs only
speculative allocations, it can blow way past the high limit.  This is
actually easily reproducible by simply doing "find /".  The slab/slub
allocator tries speculative allocations first, so as long as there's
memory which can be consumed without blocking, it can keep allocating
memory regardless of the high limit.

This patch makes try_charge() always punt the over-high reclaim to the
return-to-userland path.  If try_charge() detects that the high limit is
breached, it adds the overage to current->memcg_nr_pages_over_high and
schedules execution of mem_cgroup_handle_over_high() which performs
synchronous reclaim from the return-to-userland path.
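
The deferral described above can be sketched in user-space C.  This is a
minimal model, not the kernel code: struct memcg_model, struct task_model,
try_charge_model() and handle_over_high_model() are hypothetical names, the
boolean flag stands in for the notify-resume request, and "reclaim" simply
subtracts pages from the usage counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cgroup and task state (assumption). */
struct memcg_model {
    size_t usage;   /* pages currently charged */
    size_t high;    /* high limit, in pages */
};

struct task_model {
    struct memcg_model *memcg;
    size_t nr_pages_over_high;  /* mirrors current->memcg_nr_pages_over_high */
    bool notify_resume;         /* stands in for the notify-resume request */
};

/* Charge path: never reclaims; it records the overage and asks for a
 * callback on the way back to userland. */
static void try_charge_model(struct task_model *t, size_t nr_pages)
{
    assert(t != NULL && t->memcg != NULL);
    t->memcg->usage += nr_pages;
    if (t->memcg->usage > t->memcg->high) {
        t->nr_pages_over_high += nr_pages;
        t->notify_resume = true;
    }
}

/* Return-to-userland path: reclaim what was recorded, with no locks held
 * by the charging code.  Returns the number of pages "reclaimed". */
static size_t handle_over_high_model(struct task_model *t)
{
    size_t nr = t->nr_pages_over_high;

    if (!t->notify_resume)
        return 0;
    t->notify_resume = false;
    t->nr_pages_over_high = 0;
    if (nr > t->memcg->usage)
        nr = t->memcg->usage;
    t->memcg->usage -= nr;
    return nr;
}
```

In this model the charge always succeeds immediately, whatever context it
runs in, and the cost of reclaim is paid later by the same task at a point
where blocking is safe.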

As long as the kernel doesn't have a run-away allocation spree, this should
provide enough protection while making kmemcg behave more consistently.
It also has the following benefits.

- All over-high reclaims can use GFP_KERNEL regardless of the specific
  gfp mask in use, e.g. GFP_NOFS, when the limit was breached.

- It copes with prio inversion.  Previously, a low-prio task with
  small memory.high might perform over-high reclaim with a bunch of
  locks held.  If a higher prio task needed any of these locks, it
  would have to wait until the low prio task finished reclaim and
  released the locks.  By handing over-high reclaim to the
  return-to-userland path this issue can be avoided.

Signed-off-by: Tejun Heo 
Acked-by: Michal Hocko 
Reviewed-by: Vladimir Davydov 
Acked-by: Johannes Weiner 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

tracehook_signal_handler: Remove sig, info, ka and regs

2014-08-06T11:03:43+00:00

These parameters are not used anywhere, so we can remove them.

Signed-off-by: Richard Weinberger

arch: Mass conversion of smp_mb__*()

2014-04-18T12:20:48+00:00

Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra 
Acked-by: Paul E. McKenney 
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds 
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar

trim task_work: get rid of hlist

2012-07-22T19:57:55+00:00

layout based on Oleg's suggestion; single-linked list,
task->task_works points to the last element, forward pointer
from said last element points to head.  I'd still prefer
much more regular scheme with two pointers in task_work,
but...
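
The layout described above can be illustrated with a small user-space
sketch.  It is an assumption-laden model rather than the kernel code:
struct work_model, work_add() and work_take_all() are hypothetical names,
and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item; only the forward pointer matters here. */
struct work_model {
    struct work_model *next;
    int id;
};

/* *tail points to the last element, and the last element's forward
 * pointer points back to the head.  Appending is O(1). */
static void work_add(struct work_model **tail, struct work_model *w)
{
    struct work_model *last = *tail;

    assert(w != NULL);
    if (last) {
        w->next = last->next;   /* new last element points to the head */
        last->next = w;
    } else {
        w->next = w;            /* single element is its own head */
    }
    *tail = w;
}

/* Detach the whole list and return its head as a NULL-terminated chain
 * in insertion (FIFO) order. */
static struct work_model *work_take_all(struct work_model **tail)
{
    struct work_model *last = *tail;
    struct work_model *head;

    if (!last)
        return NULL;
    head = last->next;
    last->next = NULL;          /* break the cycle */
    *tail = NULL;
    return head;
}
```

A single pointer in the task is enough to reach both ends of the queue,
which is the trade-off the message alludes to: less space than a
two-pointer scheme, at the price of a less regular structure.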

Signed-off-by: Al Viro

keys: kill the dummy key_replace_session_keyring()

2012-05-24T02:11:31+00:00

After the previous change key_replace_session_keyring() becomes a nop.
Remove the dummy definition in key.h and update the callers in
arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov 
Acked-by: David Howells 
Cc: Thomas Gleixner 
Cc: Richard Kuo 
Cc: Linus Torvalds 
Cc: Alexander Gordeev 
Cc: Chris Zankel 
Cc: David Smith 
Cc: "Frank Ch. Eigler" 
Cc: Geert Uytterhoeven 
Cc: Larry Woodman 
Cc: Peter Zijlstra 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro

task_work_add: generic process-context callbacks

2012-05-24T02:09:21+00:00

Provide a simple mechanism that allows running code in the (nonatomic)
context of an arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work->func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.
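
A rough user-space sketch of this callback pattern follows.  All names
ending in _model are hypothetical, the exiting flag stands in for
PF_EXITING, the global current_model stands in for the running task, and
the locking discussed further down is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_work_model;
typedef void (*task_work_func_model)(struct task_work_model *);

/* Work item: a callback plus an opaque data pointer. */
struct task_work_model {
    struct task_work_model *next;
    task_work_func_model func;
    void *data;
};

struct task_ctx_model {
    struct task_work_model *works;  /* pending callbacks */
    bool exiting;                   /* stands in for PF_EXITING */
};

/* Stands in for the task whose callbacks are being run (assumption). */
static struct task_ctx_model *current_model;

/* Queue a callback to run later in the context of task t. */
static void task_work_add_model(struct task_ctx_model *t,
                                struct task_work_model *w)
{
    assert(w != NULL && w->func != NULL);
    w->next = t->works;
    t->works = w;
}

/* Called from the resume path or the exit path of task t. */
static void task_work_run_model(struct task_ctx_model *t)
{
    struct task_work_model *w = t->works;

    t->works = NULL;
    current_model = t;
    while (w) {
        struct task_work_model *next = w->next;
        w->func(w);             /* w may be reused after this call */
        w = next;
    }
}

/* Example callback: adds 1 on a normal resume, 100 when run at exit. */
static void count_cb_model(struct task_work_model *w)
{
    int *counter = w->data;

    *counter += current_model->exiting ? 100 : 1;
}
```

The callback only sees its own work item, so anything else it needs must
come through the data pointer or through the structure the item is
embedded in.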

This allows us to kill the ->replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses ->pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov 
Acked-by: David Howells 
Cc: Thomas Gleixner 
Cc: Richard Kuo 
Cc: Linus Torvalds 
Cc: Alexander Gordeev 
Cc: Chris Zankel 
Cc: David Smith 
Cc: "Frank Ch. Eigler" 
Cc: Geert Uytterhoeven 
Cc: Larry Woodman 
Cc: Peter Zijlstra 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro

move key_replace_session_keyring() into tracehook_notify_resume()

2012-05-24T02:09:20+00:00

Signed-off-by: Al Viro

TIF_NOTIFY_RESUME is defined on all targets now

2012-05-24T02:09:19+00:00

Signed-off-by: Al Viro

ptrace: the killed tracee should not enter the syscall

2012-03-23T23:58:40+00:00

Another old/known problem.  If the tracee is killed after it reports
syscall_entry, it still starts the syscall and the debugger can't
prevent this.  This confuses users and creates security problems for
ptrace jailers.

Change tracehook_report_syscall_entry() to return non-zero if killed,
this instructs syscall_trace_enter() to abort the syscall.
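
The control flow of this change can be modelled in user space as follows.
This is only a sketch: struct tracee_model and both functions are
hypothetical, and the -1 "skip the syscall" return value is an assumption
made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracee state (assumption, not a kernel structure). */
struct tracee_model {
    bool traced;    /* a tracer wants syscall-entry reports */
    bool killed;    /* a fatal signal arrived during the report stop */
};

/* Report syscall entry to the tracer.  Returns nonzero if the tracee
 * was killed while stopped, so the caller must not run the syscall. */
static int report_syscall_entry_model(const struct tracee_model *t)
{
    assert(t != NULL);
    if (!t->traced)
        return 0;
    /* The stop for the tracer would happen here. */
    return t->killed ? 1 : 0;
}

/* Returns the syscall number to execute, or -1 to skip the syscall. */
static long syscall_trace_enter_model(const struct tracee_model *t, long nr)
{
    if (report_syscall_entry_model(t))
        return -1L;
    return nr;
}
```

With the nonzero return propagated, a tracee killed during the entry stop
never reaches the syscall body in this model.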

Reported-by: Chris Evans 
Tested-by: Indan Zupancic 
Signed-off-by: Oleg Nesterov 
Cc: Denys Vlasenko 
Cc: Tejun Heo 
Cc: Pedro Alves 
Cc: Jan Kratochvil 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

kill tracehook_notify_death()

2011-06-27T18:30:08+00:00

Kill tracehook_notify_death(), reimplement the logic in its caller,
exit_notify().

Also, change the exec_id's check to use thread_group_leader() instead
of task_detached(), this is more clear. This logic only applies to
the exiting leader, a sub-thread must never change its exit_signal.

Note: when the traced group leader exits the exit_signal-or-SIGCHLD
logic looks really strange:

	- we notify the tracer even if !thread_group_empty() but
	   do_wait(WEXITED) can't work until all threads exit

	- if the tracer is real_parent, it is not clear why we can't
	  use ->exit_signal even if !thread_group_empty()

-v2: do not try to fix the 2nd oddity to avoid the subtle behavior
     change mixed with reorganization, suggested by Tejun.

Signed-off-by: Oleg Nesterov 
Reviewed-by: Tejun Heo