<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/ptrace.c, branch v3.0.16</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread</title>
<updated>2011-05-25T17:20:21+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-05-25T17:20:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0666fb51b1483f27506e212cc7f7b2645b5c7acc'/>
<id>0666fb51b1483f27506e212cc7f7b2645b5c7acc</id>
<content type='text'>
It is not clear why ptrace_resume() does wake_up_process(). Unless the
caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
not need the extra and potentionally spurious wakeup.

If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
The tracee can sleep in any state in any place, and if we have a buggy
code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
be used to exploit it. For example:

	int main(void)
	{
		int child, status;

		child = fork();
		if (!child) {
			int ret;

			assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);

			ret = pause();
			printf("pause: %d %m\n", ret);

			return 0x23;
		}

		sleep(1);
		assert(ptrace(PTRACE_KILL, child, 0,0) == 0);

		assert(child == wait(&amp;status));
		printf("wait: %x\n", status);

		return 0;
	}

prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
userland. In this case sys_pause() is buggy as well and should be
fixed.

I do not know what was the original rationality behind PTRACE_KILL.
The man page is simply wrong and afaics it was always wrong. Imho
it should be deprecated, or may be it should do send_sig(SIGKILL)
as Denys suggests, but in any case I do not think that the current
behaviour was intentional.

Note: there is another problem, ptrace_resume() changes -&gt;exit_code
and this can race with SIGKILL too. Eventually we should change ptrace
to not use -&gt;exit_code.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is not clear why ptrace_resume() does wake_up_process(). Unless the
caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
not need the extra and potentionally spurious wakeup.

If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
The tracee can sleep in any state in any place, and if we have a buggy
code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
be used to exploit it. For example:

	int main(void)
	{
		int child, status;

		child = fork();
		if (!child) {
			int ret;

			assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);

			ret = pause();
			printf("pause: %d %m\n", ret);

			return 0x23;
		}

		sleep(1);
		assert(ptrace(PTRACE_KILL, child, 0,0) == 0);

		assert(child == wait(&amp;status));
		printf("wait: %x\n", status);

		return 0;
	}

prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
userland. In this case sys_pause() is buggy as well and should be
fixed.

I do not know what was the original rationality behind PTRACE_KILL.
The man page is simply wrong and afaics it was always wrong. Imho
it should be deprecated, or may be it should do send_sig(SIGKILL)
as Denys suggests, but in any case I do not think that the current
behaviour was intentional.

Note: there is another problem, ptrace_resume() changes -&gt;exit_code
and this can race with SIGKILL too. Eventually we should change ptrace
to not use -&gt;exit_code.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc</title>
<updated>2011-05-20T20:33:21+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-05-20T20:33:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3ed4c0583daa34dedb568b26ff99e5a7b58db612'/>
<id>3ed4c0583daa34dedb568b26ff99e5a7b58db612</id>
<content type='text'>
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (41 commits)
  signal: trivial, fix the "timespec declared inside parameter list" warning
  job control: reorganize wait_task_stopped()
  ptrace: fix signal-&gt;wait_chldexit usage in task_clear_group_stop_trapping()
  signal: sys_sigprocmask() needs retarget_shared_pending()
  signal: cleanup sys_sigprocmask()
  signal: rename signandsets() to sigandnsets()
  signal: do_sigtimedwait() needs retarget_shared_pending()
  signal: introduce do_sigtimedwait() to factor out compat/native code
  signal: sys_rt_sigtimedwait: simplify the timeout logic
  signal: cleanup sys_rt_sigprocmask()
  x86: signal: sys_rt_sigreturn() should use set_current_blocked()
  x86: signal: handle_signal() should use set_current_blocked()
  signal: sigprocmask() should do retarget_shared_pending()
  signal: sigprocmask: narrow the scope of -&gt;siglock
  signal: retarget_shared_pending: optimize while_each_thread() loop
  signal: retarget_shared_pending: consider shared/unblocked signals only
  signal: introduce retarget_shared_pending()
  ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/
  signal: Turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED
  signal: do_signal_stop: Remove the unneeded task_clear_group_stop_pending()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (41 commits)
  signal: trivial, fix the "timespec declared inside parameter list" warning
  job control: reorganize wait_task_stopped()
  ptrace: fix signal-&gt;wait_chldexit usage in task_clear_group_stop_trapping()
  signal: sys_sigprocmask() needs retarget_shared_pending()
  signal: cleanup sys_sigprocmask()
  signal: rename signandsets() to sigandnsets()
  signal: do_sigtimedwait() needs retarget_shared_pending()
  signal: introduce do_sigtimedwait() to factor out compat/native code
  signal: sys_rt_sigtimedwait: simplify the timeout logic
  signal: cleanup sys_rt_sigprocmask()
  x86: signal: sys_rt_sigreturn() should use set_current_blocked()
  x86: signal: handle_signal() should use set_current_blocked()
  signal: sigprocmask() should do retarget_shared_pending()
  signal: sigprocmask: narrow the scope of -&gt;siglock
  signal: retarget_shared_pending: optimize while_each_thread() loop
  signal: retarget_shared_pending: consider shared/unblocked signals only
  signal: introduce retarget_shared_pending()
  ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/
  signal: Turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED
  signal: do_signal_stop: Remove the unneeded task_clear_group_stop_pending()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: Prepare to fix racy accesses on task breakpoints</title>
<updated>2011-04-25T15:28:24+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-04-07T14:53:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf26c018490c2fce7fe9b629083b96ce0e6ad019'/>
<id>bf26c018490c2fce7fe9b629083b96ce0e6ad019</id>
<content type='text'>
When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: v2.6.33.. &lt;stable@kernel.org&gt;
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: v2.6.33.. &lt;stable@kernel.org&gt;
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into ptrace</title>
<updated>2011-04-07T18:44:11+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-04-07T18:44:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e46bc9b6fd65bc9f406a4211fbf95683cc9c2937'/>
<id>e46bc9b6fd65bc9f406a4211fbf95683cc9c2937</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/</title>
<updated>2011-04-04T00:11:05+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-04-01T18:13:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=321fb561971ba0f10ce18c0f8a4b9fbfc7cef4b9'/>
<id>321fb561971ba0f10ce18c0f8a4b9fbfc7cef4b9</id>
<content type='text'>
After "ptrace: Clean transitions between TASK_STOPPED and TRACED"
d79fdd6d96f46fabb779d86332e3677c6f5c2a4f, ptrace_check_attach()
should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is
no longer legal. Add the warning.

Note: ptrace_check_attach() can be greatly simplified, in particular
it doesn't need tasklist. But I'd prefer another patch for that.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After "ptrace: Clean transitions between TASK_STOPPED and TRACED"
d79fdd6d96f46fabb779d86332e3677c6f5c2a4f, ptrace_check_attach()
should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is
no longer legal. Add the warning.

Note: ptrace_check_attach() can be greatly simplified, in particular
it doesn't need tasklist. But I'd prefer another patch for that.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>userns: allow ptrace from non-init user namespaces</title>
<updated>2011-03-24T02:47:05+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge@hallyn.com</email>
</author>
<published>2011-03-23T23:43:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8409cca7056113bee3236cb6a8e4d8d4d1eef102'/>
<id>8409cca7056113bee3236cb6a8e4d8d4d1eef102</id>
<content type='text'>
ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e.  the same rules as for two tasks in the init user
namespace).  ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.

Changelog:
	Dec 31: Address feedback by Eric:
		. Correct ptrace uid check
		. Rename may_ptrace_ns to ptrace_capable
		. Also fix the cap_ptrace checks.
	Jan  1: Use const cred struct
	Jan 11: use task_ns_capable() in place of ptrace_capable().
	Feb 23: same_or_ancestore_user_ns() was not an appropriate
		check to constrain cap_issubset.  Rather, cap_issubset()
		only is meaningful when both capsets are in the same
		user_ns.

Signed-off-by: Serge E. Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e.  the same rules as for two tasks in the init user
namespace).  ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.

Changelog:
	Dec 31: Address feedback by Eric:
		. Correct ptrace uid check
		. Rename may_ptrace_ns to ptrace_capable
		. Also fix the cap_ptrace checks.
	Jan  1: Use const cred struct
	Jan 11: use task_ns_capable() in place of ptrace_capable().
	Feb 23: same_or_ancestore_user_ns() was not an appropriate
		check to constrain cap_issubset.  Rather, cap_issubset()
		only is meaningful when both capsets are in the same
		user_ns.

Signed-off-by: Serge E. Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: Always put ptracee into appropriate execution state</title>
<updated>2011-03-23T09:37:01+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0e9f0a4abfd80f8adca624538d479d95159b16d8'/>
<id>0e9f0a4abfd80f8adca624538d479d95159b16d8</id>
<content type='text'>
Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED.  For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

  #include &lt;unistd.h&gt;
  #include &lt;time.h&gt;
  #include &lt;sys/types.h&gt;
  #include &lt;sys/ptrace.h&gt;
  #include &lt;sys/wait.h&gt;

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(void)
  {
	  pid_t tracee;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  nanosleep(&amp;ts1s, NULL);
			  write(1, ".", 1);
		  }
	  }

	  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
	  waitid(P_PID, tracee, &amp;si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  waitid(P_PID, tracee, &amp;si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  write(1, "exiting", 7);
	  return 0;
  }

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

  exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

  exiting

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED.  For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

  #include &lt;unistd.h&gt;
  #include &lt;time.h&gt;
  #include &lt;sys/types.h&gt;
  #include &lt;sys/ptrace.h&gt;
  #include &lt;sys/wait.h&gt;

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(void)
  {
	  pid_t tracee;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  nanosleep(&amp;ts1s, NULL);
			  write(1, ".", 1);
		  }
	  }

	  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
	  waitid(P_PID, tracee, &amp;si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  waitid(P_PID, tracee, &amp;si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  write(1, "exiting", 7);
	  return 0;
  }

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

  exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

  exiting

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: Collapse ptrace_untrace() into __ptrace_unlink()</title>
<updated>2011-03-23T09:37:01+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e3bd058f62896ec7a2c605ed62a0a811e9147947'/>
<id>e3bd058f62896ec7a2c605ed62a0a811e9147947</id>
<content type='text'>
Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink().  This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format.  Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink().  This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format.  Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: Clean transitions between TASK_STOPPED and TRACED</title>
<updated>2011-03-23T09:37:00+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d79fdd6d96f46fabb779d86332e3677c6f5c2a4f'/>
<id>d79fdd6d96f46fabb779d86332e3677c6f5c2a4f</id>
<content type='text'>
Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean.  The tracer can use the flag to tell the
tracee to retry stop on attach and detach.  On retry, the tracee will
enter the desired state in the correct way.  The lower 16bits of
task-&gt;group_stop is used to remember the signal number which caused
the last group stop.  This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop().  This
is recorded in the lower 16bits of task-&gt;group_stop.

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal-&gt;wait_chldexit.  Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -&gt; RUNNING -&gt; TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all.  Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
  proper bookkeeping.  Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
  different ways.  Most are fixed with GROUP_STOP_TRAPPING.  Unhandled
  corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
  task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
  TASK_STOPPED can race with group stop start logic and then confuse
  the TRAPPING wait in ptrace_check_attach().  ptrace_stop() is now
  called with TASK_RUNNING.

* Suggested using signal-&gt;wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
  TRAPPING.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean.  The tracer can use the flag to tell the
tracee to retry stop on attach and detach.  On retry, the tracee will
enter the desired state in the correct way.  The lower 16bits of
task-&gt;group_stop is used to remember the signal number which caused
the last group stop.  This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop().  This
is recorded in the lower 16bits of task-&gt;group_stop.

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal-&gt;wait_chldexit.  Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -&gt; RUNNING -&gt; TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all.  Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
  proper bookkeeping.  Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
  different ways.  Most are fixed with GROUP_STOP_TRAPPING.  Unhandled
  corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
  task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
  TASK_STOPPED can race with group stop start logic and then confuse
  the TRAPPING wait in ptrace_check_attach().  ptrace_stop() is now
  called with TASK_RUNNING.

* Suggested using signal-&gt;wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
  TRAPPING.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: Jan Kratochvil &lt;jan.kratochvil@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: Remove the extra wake_up_state() from ptrace_detach()</title>
<updated>2011-03-23T09:37:00+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f2bf6513a6cca0b00cbbf67ba6197017cfba548'/>
<id>9f2bf6513a6cca0b00cbbf67ba6197017cfba548</id>
<content type='text'>
This wake_up_state() has a turbulent history.  This is a remnant from
ancient ptrace implementation and patently wrong.  Commit 95a3540d
(ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed
it but the change was reverted later by commit edaba2c5 (ptrace:
revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic")
citing compatibility breakage and general brokeness of the whole group
stop / ptrace interaction.  Then, recently, it got converted from
wake_up_process() to wake_up_state() to make it less dangerous.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are being
cleaned up, it's high time to finally kill this silliness.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This wake_up_state() has a turbulent history.  This is a remnant from
ancient ptrace implementation and patently wrong.  Commit 95a3540d
(ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed
it but the change was reverted later by commit edaba2c5 (ptrace:
revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic")
citing compatibility breakage and general brokeness of the whole group
stop / ptrace interaction.  Then, recently, it got converted from
wake_up_process() to wake_up_state() to make it less dangerous.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are being
cleaned up, it's high time to finally kill this silliness.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
