<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/mm/memory.c, branch v2.6.25.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>memcg: when do_swap's do_wp_page fails</title>
<updated>2008-03-05T00:35:14+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hugh@veritas.com</email>
</author>
<published>2008-03-04T22:29:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=61469f1d51777fc3b6d8d70da8373ee77ee13349'/>
<id>61469f1d51777fc3b6d8d70da8373ee77ee13349</id>
<content type='text'>
Don't uncharge when do_swap_page's call to do_wp_page fails: the page which
was charged for is there in the pagetable, and will be correctly uncharged
when that area is unmapped - it was only its COWing which failed.

And while we're here, remove earlier XXX comment: yes, OR in do_wp_page's
return value (maybe VM_FAULT_WRITE) with do_swap_page's there; but if it
fails, mask out success bits, which might confuse some arches e.g.  sparc.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hirokazu Takahashi &lt;taka@valinux.co.jp&gt;
Cc: YAMAMOTO Takashi &lt;yamamoto@valinux.co.jp&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't uncharge when do_swap_page's call to do_wp_page fails: the page which
was charged for is there in the pagetable, and will be correctly uncharged
when that area is unmapped - it was only its COWing which failed.

And while we're here, remove earlier XXX comment: yes, OR in do_wp_page's
return value (maybe VM_FAULT_WRITE) with do_swap_page's there; but if it
fails, mask out success bits, which might confuse some arches e.g.  sparc.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hirokazu Takahashi &lt;taka@valinux.co.jp&gt;
Cc: YAMAMOTO Takashi &lt;yamamoto@valinux.co.jp&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memcg: page_cache_release not __free_page</title>
<updated>2008-03-05T00:35:14+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hugh@veritas.com</email>
</author>
<published>2008-03-04T22:29:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6dbf6d3bb955d5a92005b6ecd6ffad2c5b95b963'/>
<id>6dbf6d3bb955d5a92005b6ecd6ffad2c5b95b963</id>
<content type='text'>
There's nothing wrong with mem_cgroup_charge failure in do_wp_page and
do_anonymous page using __free_page, but it does look odd when nearby code
uses page_cache_release: use that instead (while turning a blind eye to
ancient inconsistencies of page_cache_release versus put_page).

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hirokazu Takahashi &lt;taka@valinux.co.jp&gt;
Cc: YAMAMOTO Takashi &lt;yamamoto@valinux.co.jp&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's nothing wrong with mem_cgroup_charge failure in do_wp_page and
do_anonymous page using __free_page, but it does look odd when nearby code
uses page_cache_release: use that instead (while turning a blind eye to
ancient inconsistencies of page_cache_release versus put_page).

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hirokazu Takahashi &lt;taka@valinux.co.jp&gt;
Cc: YAMAMOTO Takashi &lt;yamamoto@valinux.co.jp&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86</title>
<updated>2008-02-15T05:23:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@woody.linux-foundation.org</email>
</author>
<published>2008-02-15T05:23:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=664a1566df81b44f7e5e234d55e3bc8c6c0be211'/>
<id>664a1566df81b44f7e5e234d55e3bc8c6c0be211</id>
<content type='text'>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: cpa, fix out of date comment
  KVM is not seen under X86 config with latest git (32 bit compile)
  x86: cpa: ensure page alignment
  x86: include proper prototypes for rodata_test
  x86: fix gart_iommu_init()
  x86: EFI set_memory_x()/set_memory_uc() fixes
  x86: make dump_pagetable() static
  x86: fix "BUG: sleeping function called from invalid context" in print_vma_addr()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: cpa, fix out of date comment
  KVM is not seen under X86 config with latest git (32 bit compile)
  x86: cpa: ensure page alignment
  x86: include proper prototypes for rodata_test
  x86: fix gart_iommu_init()
  x86: EFI set_memory_x()/set_memory_uc() fixes
  x86: make dump_pagetable() static
  x86: fix "BUG: sleeping function called from invalid context" in print_vma_addr()
</pre>
</div>
</content>
</entry>
<entry>
<title>d_path: Make d_path() use a struct path</title>
<updated>2008-02-15T05:17:09+00:00</updated>
<author>
<name>Jan Blunck</name>
<email>jblunck@suse.de</email>
</author>
<published>2008-02-15T03:38:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cf28b4863f9ee8f122e8ff3ac0d403e07ba9c6d9'/>
<id>cf28b4863f9ee8f122e8ff3ac0d403e07ba9c6d9</id>
<content type='text'>
d_path() is used on a &lt;dentry,vfsmount&gt; pair.  Lets use a struct path to
reflect this.

[akpm@linux-foundation.org: fix build in mm/memory.c]
Signed-off-by: Jan Blunck &lt;jblunck@suse.de&gt;
Acked-by: Bryan Wu &lt;bryan.wu@analog.com&gt;
Acked-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Michael Halcrow &lt;mhalcrow@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
d_path() is used on a &lt;dentry,vfsmount&gt; pair.  Lets use a struct path to
reflect this.

[akpm@linux-foundation.org: fix build in mm/memory.c]
Signed-off-by: Jan Blunck &lt;jblunck@suse.de&gt;
Acked-by: Bryan Wu &lt;bryan.wu@analog.com&gt;
Acked-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Michael Halcrow &lt;mhalcrow@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: fix "BUG: sleeping function called from invalid context" in print_vma_addr()</title>
<updated>2008-02-14T22:30:19+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-02-13T19:21:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e8bff74afbdb4ad72bf6135c84289c47cf557892'/>
<id>e8bff74afbdb4ad72bf6135c84289c47cf557892</id>
<content type='text'>
Jiri Kosina reported the following deadlock scenario with
show_unhandled_signals enabled:

 [   68.379022] gnome-settings-[2941] trap int3 ip:3d2c840f34
 sp:7fff36f5d100 error:0&lt;3&gt;BUG: sleeping function called from invalid
 context at kernel/rwsem.c:21
 [   68.379039] in_atomic():1, irqs_disabled():0
 [   68.379044] no locks held by gnome-settings-/2941.
 [   68.379050] Pid: 2941, comm: gnome-settings- Not tainted 2.6.25-rc1 #30
 [   68.379054]
 [   68.379056] Call Trace:
 [   68.379061]  &lt;#DB&gt;  [&lt;ffffffff81064883&gt;] ? __debug_show_held_locks+0x13/0x30
 [   68.379109]  [&lt;ffffffff81036765&gt;] __might_sleep+0xe5/0x110
 [   68.379123]  [&lt;ffffffff812f2240&gt;] down_read+0x20/0x70
 [   68.379137]  [&lt;ffffffff8109cdca&gt;] print_vma_addr+0x3a/0x110
 [   68.379152]  [&lt;ffffffff8100f435&gt;] do_trap+0xf5/0x170
 [   68.379168]  [&lt;ffffffff8100f52b&gt;] do_int3+0x7b/0xe0
 [   68.379180]  [&lt;ffffffff812f4a6f&gt;] int3+0x9f/0xd0
 [   68.379203]  &lt;&lt;EOE&gt;&gt;
 [   68.379229]  in libglib-2.0.so.0.1505.0[3d2c800000+dc000]

and tracked it down to:

  commit 03252919b79891063cf99145612360efbdf9500b
  Author: Andi Kleen &lt;ak@suse.de&gt;
  Date:   Wed Jan 30 13:33:18 2008 +0100

      x86: print which shared library/executable faulted in segfault etc. messages

the problem is that we call down_read() from an atomic context.

Solve this by returning from print_vma_addr() if the preempt count is
elevated. Update preempt_conditional_sti / preempt_conditional_cli to
unconditionally lift the preempt count even on !CONFIG_PREEMPT.

Reported-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Jiri Kosina reported the following deadlock scenario with
show_unhandled_signals enabled:

 [   68.379022] gnome-settings-[2941] trap int3 ip:3d2c840f34
 sp:7fff36f5d100 error:0&lt;3&gt;BUG: sleeping function called from invalid
 context at kernel/rwsem.c:21
 [   68.379039] in_atomic():1, irqs_disabled():0
 [   68.379044] no locks held by gnome-settings-/2941.
 [   68.379050] Pid: 2941, comm: gnome-settings- Not tainted 2.6.25-rc1 #30
 [   68.379054]
 [   68.379056] Call Trace:
 [   68.379061]  &lt;#DB&gt;  [&lt;ffffffff81064883&gt;] ? __debug_show_held_locks+0x13/0x30
 [   68.379109]  [&lt;ffffffff81036765&gt;] __might_sleep+0xe5/0x110
 [   68.379123]  [&lt;ffffffff812f2240&gt;] down_read+0x20/0x70
 [   68.379137]  [&lt;ffffffff8109cdca&gt;] print_vma_addr+0x3a/0x110
 [   68.379152]  [&lt;ffffffff8100f435&gt;] do_trap+0xf5/0x170
 [   68.379168]  [&lt;ffffffff8100f52b&gt;] do_int3+0x7b/0xe0
 [   68.379180]  [&lt;ffffffff812f4a6f&gt;] int3+0x9f/0xd0
 [   68.379203]  &lt;&lt;EOE&gt;&gt;
 [   68.379229]  in libglib-2.0.so.0.1505.0[3d2c800000+dc000]

and tracked it down to:

  commit 03252919b79891063cf99145612360efbdf9500b
  Author: Andi Kleen &lt;ak@suse.de&gt;
  Date:   Wed Jan 30 13:33:18 2008 +0100

      x86: print which shared library/executable faulted in segfault etc. messages

the problem is that we call down_read() from an atomic context.

Solve this by returning from print_vma_addr() if the preempt count is
elevated. Update preempt_conditional_sti / preempt_conditional_cli to
unconditionally lift the preempt count even on !CONFIG_PREEMPT.

Reported-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Be more robust about bad arguments in get_user_pages()</title>
<updated>2008-02-12T04:44:44+00:00</updated>
<author>
<name>Jonathan Corbet</name>
<email>corbet@lwn.net</email>
</author>
<published>2008-02-11T23:17:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=900cf086fd2fbad07f72f4575449e0d0958f860f'/>
<id>900cf086fd2fbad07f72f4575449e0d0958f860f</id>
<content type='text'>
So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop.  So, if it is passed in as zero, the loop
will execute once and decrement len to -1.  At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do.  Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code.  I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop.  So, if it is passed in as zero, the loop
will execute once and decrement len to -1.  At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do.  Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code.  I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>CONFIG_HIGHPTE vs. sub-page page tables.</title>
<updated>2008-02-08T17:22:42+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2008-02-08T12:22:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2f569afd9ced9ebec9a6eb3dbf6f83429be0a7b4'/>
<id>2f569afd9ced9ebec9a6eb3dbf6f83429be0a7b4</id>
<content type='text'>
Background: I've implemented 1K/2K page tables for s390.  These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM.  The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste).  The pgstes are only required if the process is using the SIE
instruction.  The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since its not kmapped).

Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch.  For everybody else it will be a (struct page *).  The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor.  The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
 To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: &lt;linux-arch@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Background: I've implemented 1K/2K page tables for s390.  These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM.  The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste).  The pgstes are only required if the process is using the SIE
instruction.  The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since its not kmapped).

Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch.  For everybody else it will be a (struct page *).  The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor.  The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
 To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: &lt;linux-arch@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Memory controller: make charging gfp mask aware</title>
<updated>2008-02-07T16:42:19+00:00</updated>
<author>
<name>Balbir Singh</name>
<email>balbir@linux.vnet.ibm.com</email>
</author>
<published>2008-02-07T08:14:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e1a1cd590e3fcb0d2e230128daf2337ea55387dc'/>
<id>e1a1cd590e3fcb0d2e230128daf2337ea55387dc</id>
<content type='text'>
Nick Piggin pointed out that swap cache and page cache addition routines
could be called from non GFP_KERNEL contexts.  This patch makes the
charging routine aware of the gfp context.  Charging might fail if the
cgroup is over it's limit, in which case a suitable error is returned.

This patch was tested on a Powerpc box.  I am still looking at being able
to test the path, through which allocations happen in non GFP_KERNEL
contexts.

[kamezawa.hiroyu@jp.fujitsu.com: problem with ZONE_MOVABLE]
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Nick Piggin pointed out that swap cache and page cache addition routines
could be called from non GFP_KERNEL contexts.  This patch makes the
charging routine aware of the gfp context.  Charging might fail if the
cgroup is over it's limit, in which case a suitable error is returned.

This patch was tested on a Powerpc box.  I am still looking at being able
to test the path, through which allocations happen in non GFP_KERNEL
contexts.

[kamezawa.hiroyu@jp.fujitsu.com: problem with ZONE_MOVABLE]
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Memory controller: memory accounting</title>
<updated>2008-02-07T16:42:18+00:00</updated>
<author>
<name>Balbir Singh</name>
<email>balbir@linux.vnet.ibm.com</email>
</author>
<published>2008-02-07T08:13:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8a9f3ccd24741b50200c3f33d62534c7271f3dfc'/>
<id>8a9f3ccd24741b50200c3f33d62534c7271f3dfc</id>
<content type='text'>
Add the accounting hooks.  The accounting is carried out for RSS and Page
Cache (unmapped) pages.  There is now a common limit and accounting for both.
The RSS accounting is accounted at page_add_*_rmap() and page_remove_rmap()
time.  Page cache is accounted at add_to_page_cache(),
__delete_from_page_cache().  Swap cache is also accounted for.

Each page's page_cgroup is protected with the last bit of the
page_cgroup pointer, this makes handling of race conditions involving
simultaneous mappings of a page easier.  A reference count is kept in the
page_cgroup to deal with cases where a page might be unmapped from the RSS
of all tasks, but still lives in the page cache.

Credits go to Vaidyanathan Srinivasan for helping with reference counting work
of the page cgroup.  Almost all of the page cache accounting code has help
from Vaidyanathan Srinivasan.

[hugh@veritas.com: fix swapoff breakage]
[akpm@linux-foundation.org: fix locking]
Signed-off-by: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: &lt;Valdis.Kletnieks@vt.edu&gt;
Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the accounting hooks.  The accounting is carried out for RSS and Page
Cache (unmapped) pages.  There is now a common limit and accounting for both.
The RSS accounting is accounted at page_add_*_rmap() and page_remove_rmap()
time.  Page cache is accounted at add_to_page_cache(),
__delete_from_page_cache().  Swap cache is also accounted for.

Each page's page_cgroup is protected with the last bit of the
page_cgroup pointer, this makes handling of race conditions involving
simultaneous mappings of a page easier.  A reference count is kept in the
page_cgroup to deal with cases where a page might be unmapped from the RSS
of all tasks, but still lives in the page cache.

Credits go to Vaidyanathan Srinivasan for helping with reference counting work
of the page cgroup.  Almost all of the page cache accounting code has help
from Vaidyanathan Srinivasan.

[hugh@veritas.com: fix swapoff breakage]
[akpm@linux-foundation.org: fix locking]
Signed-off-by: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: &lt;Valdis.Kletnieks@vt.edu&gt;
Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>brk randomization: introduce CONFIG_COMPAT_BRK</title>
<updated>2008-02-06T21:39:44+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-02-06T21:39:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=32a932332c8bad842804842eaf9651ad6268e637'/>
<id>32a932332c8bad842804842eaf9651ad6268e637</id>
<content type='text'>
based on similar patch from: Pavel Machek &lt;pavel@ucw.cz&gt;

Introduce CONFIG_COMPAT_BRK. If disabled then the kernel is free
(but not obliged to) randomize the brk area.

Heap randomization breaks ancient binaries, so we keep COMPAT_BRK
enabled by default.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
based on similar patch from: Pavel Machek &lt;pavel@ucw.cz&gt;

Introduce CONFIG_COMPAT_BRK. If disabled then the kernel is free
(but not obliged to) randomize the brk area.

Heap randomization breaks ancient binaries, so we keep COMPAT_BRK
enabled by default.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
