<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/proc/base.c, branch v2.6.32.31</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>procfs: fix tid fdinfo</title>
<updated>2010-05-12T21:57:03+00:00</updated>
<author>
<name>Jerome Marchand</name>
<email>jmarchan@redhat.com</email>
</author>
<published>2010-04-27T20:13:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=08997c60a04eb38343ff7b5c2cca39468e76eebd'/>
<id>08997c60a04eb38343ff7b5c2cca39468e76eebd</id>
<content type='text'>
commit 3835541dd481091c4dbf5ef83c08aed12e50fd61 upstream.

Correct the file_operations struct in fdinfo entry of tid_base_stuff[].

Presently /proc/*/task/*/fdinfo contains symlinks to opened files like
/proc/*/fd/.

Signed-off-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3835541dd481091c4dbf5ef83c08aed12e50fd61 upstream.

Correct the file_operations struct in fdinfo entry of tid_base_stuff[].

Presently /proc/*/task/*/fdinfo contains symlinks to opened files like
/proc/*/fd/.

Signed-off-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>oom: fix the unsafe usage of badness() in proc_oom_score()</title>
<updated>2010-04-26T14:41:13+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2010-04-01T13:13:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e72f38d31b4a9a5e6c6207a815067be361e4b872'/>
<id>e72f38d31b4a9a5e6c6207a815067be361e4b872</id>
<content type='text'>
commit b95c35e76b29ba812e5dabdd91592e25ec640e93 upstream.

proc_oom_score(task) has a reference to task_struct, but that is all.
If this task was already released before we take tasklist_lock

	- we can't use task-&gt;group_leader, it points to nowhere

	- it is not safe to call badness() even if this task is
	  -&gt;group_leader, has_intersects_mems_allowed() assumes
	  it is safe to iterate over -&gt;thread_group list.

	- even worse, badness() can hit -&gt;signal == NULL

Add the pid_alive() check to ensure __unhash_process() was not called.

Also, use "task" instead of task-&gt;group_leader. badness() should return
the same result for any sub-thread. Currently this is not true, but
this should be changed anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b95c35e76b29ba812e5dabdd91592e25ec640e93 upstream.

proc_oom_score(task) has a reference to task_struct, but that is all.
If this task was already released before we take tasklist_lock

	- we can't use task-&gt;group_leader, it points to nowhere

	- it is not safe to call badness() even if this task is
	  -&gt;group_leader, has_intersects_mems_allowed() assumes
	  it is safe to iterate over -&gt;thread_group list.

	- even worse, badness() can hit -&gt;signal == NULL

Add the pid_alive() check to ensure __unhash_process() was not called.

Also, use "task" instead of task-&gt;group_leader. badness() should return
the same result for any sub-thread. Currently this is not true, but
this should be changed anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Switch proc/self to nd_set_link()</title>
<updated>2010-03-15T15:49:37+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2010-01-14T06:03:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6b3a4b9c49dfd933ca05ef9a052da8355143b849'/>
<id>6b3a4b9c49dfd933ca05ef9a052da8355143b849</id>
<content type='text'>
commit 7fee4868be91e71a3ee8e57289ebf5e10a12297e upstream.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7fee4868be91e71a3ee8e57289ebf5e10a12297e upstream.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>pidns: fix a leak in /proc dentries and inodes with pid namespaces.</title>
<updated>2009-11-12T15:25:57+00:00</updated>
<author>
<name>Sukadev Bhattiprolu</name>
<email>sukadev@linux.vnet.ibm.com</email>
</author>
<published>2009-11-11T22:26:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=29f12ca32122db98481150be09d35bd72b68045e'/>
<id>29f12ca32122db98481150be09d35bd72b68045e</id>
<content type='text'>
Daniel Lezcano reported a leak in 'struct pid' and 'struct pid_namespace'
that is discussed in:

	http://lkml.org/lkml/2009/10/2/159.

To summarize the thread, when container-init is terminated, it sets the
PF_EXITING flag, zaps other processes in the container and waits to reap
them.  As a part of reaping, the container-init should flush any /proc
dentries associated with the processes.  But because the container-init is
itself exiting and the following PF_EXITING check, the dentries are not
flushed, resulting in leak in /proc inodes and dentries.

This fix reverts the commit 7766755a2f249e7e0 ("Fix /proc dcache deadlock
in do_exit") which introduced the check for PF_EXITING.  At the time of
the commit, shrink_dcache_parent() flushed dentries from other filesystems
also and could have caused a deadlock which the commit fixed.  But as
pointed out by Eric Biederman, after commit 0feae5c47aabdde59,
shrink_dcache_parent() no longer affects other filesystems.  So reverting
the commit is now safe.

As pointed out by Jan Kara, the leak is not as critical since the
unclaimed space will be reclaimed under memory pressure or by:

	echo 3 &gt; /proc/sys/vm/drop_caches

But since this check is no longer required, its best to remove it.

Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Reported-by: Daniel Lezcano &lt;dlezcano@fr.ibm.com&gt;
Acked-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Acked-by: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Andrea Arcangeli &lt;andrea@cpushare.com&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Daniel Lezcano reported a leak in 'struct pid' and 'struct pid_namespace'
that is discussed in:

	http://lkml.org/lkml/2009/10/2/159.

To summarize the thread, when container-init is terminated, it sets the
PF_EXITING flag, zaps other processes in the container and waits to reap
them.  As a part of reaping, the container-init should flush any /proc
dentries associated with the processes.  But because the container-init is
itself exiting and the following PF_EXITING check, the dentries are not
flushed, resulting in leak in /proc inodes and dentries.

This fix reverts the commit 7766755a2f249e7e0 ("Fix /proc dcache deadlock
in do_exit") which introduced the check for PF_EXITING.  At the time of
the commit, shrink_dcache_parent() flushed dentries from other filesystems
also and could have caused a deadlock which the commit fixed.  But as
pointed out by Eric Biederman, after commit 0feae5c47aabdde59,
shrink_dcache_parent() no longer affects other filesystems.  So reverting
the commit is now safe.

As pointed out by Jan Kara, the leak is not as critical since the
unclaimed space will be reclaimed under memory pressure or by:

	echo 3 &gt; /proc/sys/vm/drop_caches

But since this check is no longer required, its best to remove it.

Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Reported-by: Daniel Lezcano &lt;dlezcano@fr.ibm.com&gt;
Acked-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Acked-by: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Andrea Arcangeli &lt;andrea@cpushare.com&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/proc/base.c: fix proc_fault_inject_write() input sanity check</title>
<updated>2009-09-23T14:39:40+00:00</updated>
<author>
<name>Vincent Li</name>
<email>macli@brc.ubc.ca</email>
</author>
<published>2009-09-22T23:45:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cba8aafe1e07dfc8bae5ba78be8e02883bd34d31'/>
<id>cba8aafe1e07dfc8bae5ba78be8e02883bd34d31</id>
<content type='text'>
Remove obfuscated zero-length input check and return -EINVAL instead of
-EIO error to make the error message clear to user.  Add whitespace
stripping.  No functionality changes.

The old code:

echo  1  &gt; /proc/pid/make-it-fail (ok)
echo 1foo &gt; /proc/pid/make-it-fail (-bash: echo: write error: Input/output error)

The new code:

echo  1  &gt; /proc/pid/make-it-fail (ok)
echo 1foo &gt; /proc/pid/make-it-fail (-bash: echo: write error: Invalid argument)

This patch is conservative in changes to not breaking existing
scripts/applications.

Signed-off-by: Vincent Li &lt;macli@brc.ubc.ca&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove obfuscated zero-length input check and return -EINVAL instead of
-EIO error to make the error message clear to user.  Add whitespace
stripping.  No functionality changes.

The old code:

echo  1  &gt; /proc/pid/make-it-fail (ok)
echo 1foo &gt; /proc/pid/make-it-fail (-bash: echo: write error: Input/output error)

The new code:

echo  1  &gt; /proc/pid/make-it-fail (ok)
echo 1foo &gt; /proc/pid/make-it-fail (-bash: echo: write error: Invalid argument)

This patch is conservative in changes to not breaking existing
scripts/applications.

Signed-off-by: Vincent Li &lt;macli@brc.ubc.ca&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc_flush_task: flush /proc/tid/task/pid when a sub-thread exits</title>
<updated>2009-09-23T14:39:40+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2009-09-22T23:45:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9b4d1cbef8f41aff2b3e4ca31f566c071fe601de'/>
<id>9b4d1cbef8f41aff2b3e4ca31f566c071fe601de</id>
<content type='text'>
The exiting sub-thread flushes /proc/pid only, but this doesn't buy too
much: ps and friends mostly use /proc/tid/task/pid.

Remove "if (thread_group_leader())" checks from proc_flush_task() path,
this means we always remove /proc/tid/task/pid dentry on exit, and this
actually matches the comment above proc_flush_task().

The test-case:

	static void* tfunc(void *arg)
	{
		char name[256];

		sprintf(name, "/proc/%d/task/%ld/status", getpid(), gettid());
		close(open(name, O_RDONLY));

		return NULL;
	}

	int main(void)
	{
		pthread_t t;

		for (;;) {
			if (!pthread_create(&amp;t, NULL, &amp;tfunc, NULL))
				pthread_join(t, NULL);
		}
	}

slabtop shows that pid/proc_inode_cache/etc grow quickly and
"indefinitely" until the task is killed or shrink_slab() is called, not
good.  And the main thread needs a lot of time to exit.

The same can happen if something like "ps -efL" runs continuously, while
some application spawns short-living threads.

Reported-by: "James M. Leddy" &lt;jleddy@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Dominic Duval &lt;dduval@redhat.com&gt;
Cc: Frank Hirtz &lt;fhirtz@redhat.com&gt;
Cc: "Fuller, Johnray" &lt;Johnray.Fuller@gs.com&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Paul Batkowski &lt;pbatkowski@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The exiting sub-thread flushes /proc/pid only, but this doesn't buy too
much: ps and friends mostly use /proc/tid/task/pid.

Remove "if (thread_group_leader())" checks from proc_flush_task() path,
this means we always remove /proc/tid/task/pid dentry on exit, and this
actually matches the comment above proc_flush_task().

The test-case:

	static void* tfunc(void *arg)
	{
		char name[256];

		sprintf(name, "/proc/%d/task/%ld/status", getpid(), gettid());
		close(open(name, O_RDONLY));

		return NULL;
	}

	int main(void)
	{
		pthread_t t;

		for (;;) {
			if (!pthread_create(&amp;t, NULL, &amp;tfunc, NULL))
				pthread_join(t, NULL);
		}
	}

slabtop shows that pid/proc_inode_cache/etc grow quickly and
"indefinitely" until the task is killed or shrink_slab() is called, not
good.  And the main thread needs a lot of time to exit.

The same can happen if something like "ps -efL" runs continuously, while
some application spawns short-living threads.

Reported-by: "James M. Leddy" &lt;jleddy@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Dominic Duval &lt;dduval@redhat.com&gt;
Cc: Frank Hirtz &lt;fhirtz@redhat.com&gt;
Cc: "Fuller, Johnray" &lt;Johnray.Fuller@gs.com&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Paul Batkowski &lt;pbatkowski@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: fix reported unit for RLIMIT_CPU</title>
<updated>2009-09-23T14:39:40+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees.cook@canonical.com</email>
</author>
<published>2009-09-22T23:45:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cff4edb591c153a779a27a3fd8e7bc1217f2f6b8'/>
<id>cff4edb591c153a779a27a3fd8e7bc1217f2f6b8</id>
<content type='text'>
/proc/$pid/limits should show RLIMIT_CPU as seconds, which is the unit
used in kernel/posix-cpu-timers.c:

        unsigned long psecs = cputime_to_secs(ptime);
        ...
        if (psecs &gt;= sig-&gt;rlim[RLIMIT_CPU].rlim_max) {
                ...
                __group_send_sig_info(SIGKILL, SEND_SIG_PRIV, tsk);

Signed-off-by: Kees Cook &lt;kees.cook@canonical.com&gt;
Acked-by: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
/proc/$pid/limits should show RLIMIT_CPU as seconds, which is the unit
used in kernel/posix-cpu-timers.c:

        unsigned long psecs = cputime_to_secs(ptime);
        ...
        if (psecs &gt;= sig-&gt;rlim[RLIMIT_CPU].rlim_max) {
                ...
                __group_send_sig_info(SIGKILL, SEND_SIG_PRIV, tsk);

Signed-off-by: Kees Cook &lt;kees.cook@canonical.com&gt;
Acked-by: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>oom: fix oom_adjust_write() input sanity check</title>
<updated>2009-09-22T14:17:39+00:00</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2009-09-22T00:03:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d863b89688e5811cd9e5bd0082cb38abe03adf3'/>
<id>5d863b89688e5811cd9e5bd0082cb38abe03adf3</id>
<content type='text'>
Andrew Morton pointed out oom_adjust_write() has very strange EIO
and new line handling. this patch fixes it.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Andrew Morton pointed out oom_adjust_write() has very strange EIO
and new line handling. this patch fixes it.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>oom: make oom_score to per-process value</title>
<updated>2009-09-22T14:17:39+00:00</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2009-09-22T00:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=495789a51a91cb8c015d8d77fecbac1caf20b186'/>
<id>495789a51a91cb8c015d8d77fecbac1caf20b186</id>
<content type='text'>
oom-killer kills a process, not task.  Then oom_score should be calculated
as per-process too.  it makes consistency more and makes speed up
select_bad_process().

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
oom-killer kills a process, not task.  Then oom_score should be calculated
as per-process too.  it makes consistency more and makes speed up
select_bad_process().

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>oom: move oom_adj value from task_struct to signal_struct</title>
<updated>2009-09-22T14:17:39+00:00</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2009-09-22T00:03:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=28b83c5193e7ab951e402252278f2cc79dc4d298'/>
<id>28b83c5193e7ab951e402252278f2cc79dc4d298</id>
<content type='text'>
Currently, OOM logic callflow is here.

    __out_of_memory()
        select_bad_process()            for each task
            badness()                   calculate badness of one task
                oom_kill_process()      search child
                    oom_kill_task()     kill target task and mm shared tasks with it

example, process-A have two thread, thread-A and thread-B and it have very
fat memory and each thread have following oom_adj and oom_score.

     thread-A: oom_adj = OOM_DISABLE, oom_score = 0
     thread-B: oom_adj = 0,           oom_score = very-high

Then, select_bad_process() select thread-B, but oom_kill_task() refuse
kill the task because thread-A have OOM_DISABLE.  Thus __out_of_memory()
call select_bad_process() again.  but select_bad_process() select the same
task.  It mean kernel fall in livelock.

The fact is, select_bad_process() must select killable task.  otherwise
OOM logic go into livelock.

And root cause is, oom_adj shouldn't be per-thread value.  it should be
per-process value because OOM-killer kill a process, not thread.  Thus
This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct signal_struct.  it naturally prevent
select_bad_process() choose wrong task.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, OOM logic callflow is here.

    __out_of_memory()
        select_bad_process()            for each task
            badness()                   calculate badness of one task
                oom_kill_process()      search child
                    oom_kill_task()     kill target task and mm shared tasks with it

example, process-A have two thread, thread-A and thread-B and it have very
fat memory and each thread have following oom_adj and oom_score.

     thread-A: oom_adj = OOM_DISABLE, oom_score = 0
     thread-B: oom_adj = 0,           oom_score = very-high

Then, select_bad_process() select thread-B, but oom_kill_task() refuse
kill the task because thread-A have OOM_DISABLE.  Thus __out_of_memory()
call select_bad_process() again.  but select_bad_process() select the same
task.  It mean kernel fall in livelock.

The fact is, select_bad_process() must select killable task.  otherwise
OOM logic go into livelock.

And root cause is, oom_adj shouldn't be per-thread value.  it should be
per-process value because OOM-killer kill a process, not thread.  Thus
This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct signal_struct.  it naturally prevent
select_bad_process() choose wrong task.

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
