<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/gfp.h, branch v2.6.22.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Move remote node draining out of slab allocators</title>
<updated>2007-05-09T19:30:56+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2007-05-09T09:35:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4037d452202e34214e8a939fa5621b2b3bbb45b7'/>
<id>4037d452202e34214e8a939fa5621b2b3bbb45b7</id>
<content type='text'>
Currently the slab allocators contain callbacks into the page allocator to
perform the draining of pagesets on remote nodes.  This requires SLUB to have
a whole subsystem in order to be compatible with SLAB.  Moving node draining
out of the slab allocators avoids a section of code in SLUB.

Move the node draining so that is is done when the vm statistics are updated.
At that point we are already touching all the cachelines with the pagesets of
a processor.

Add a expire counter there.  If we have to update per zone or global vm
statistics then assume that the pageset will require subsequent draining.

The expire counter will be decremented on each vm stats update pass until it
reaches zero.  Then we will drain one batch from the pageset.  The draining
will cause vm counter updates which will then cause another expiration until
the pcp is empty.  So we will drain a batch every 3 seconds.

Note that remote node draining is a somewhat esoteric feature that is required
on large NUMA systems because otherwise significant portions of system memory
can become trapped in pcp queues.  The number of pcp is determined by the
number of processors and nodes in a system.  A system with 4 processors and 2
nodes has 8 pcps which is okay.  But a system with 1024 processors and 512
nodes has 512k pcps with a high potential for large amount of memory being
caught in them.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the slab allocators contain callbacks into the page allocator to
perform the draining of pagesets on remote nodes.  This requires SLUB to have
a whole subsystem in order to be compatible with SLAB.  Moving node draining
out of the slab allocators avoids a section of code in SLUB.

Move the node draining so that is is done when the vm statistics are updated.
At that point we are already touching all the cachelines with the pagesets of
a processor.

Add a expire counter there.  If we have to update per zone or global vm
statistics then assume that the pageset will require subsequent draining.

The expire counter will be decremented on each vm stats update pass until it
reaches zero.  Then we will drain one batch from the pageset.  The draining
will cause vm counter updates which will then cause another expiration until
the pcp is empty.  So we will drain a batch every 3 seconds.

Note that remote node draining is a somewhat esoteric feature that is required
on large NUMA systems because otherwise significant portions of system memory
can become trapped in pcp queues.  The number of pcp is determined by the
number of processors and nodes in a system.  A system with 4 processors and 2
nodes has 8 pcps which is okay.  But a system with 1024 processors and 512
nodes has 512k pcps with a high potential for large amount of memory being
caught in them.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Slab allocators: remove useless __GFP_NO_GROW flag</title>
<updated>2007-05-07T19:12:57+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2007-05-06T21:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cfce66047f1893cb7d3abb0d53e65cbbd8d605f0'/>
<id>cfce66047f1893cb7d3abb0d53e65cbbd8d605f0</id>
<content type='text'>
There is no user remaining and I have never seen any use of that flag.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no user remaining and I have never seen any use of that flag.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM</title>
<updated>2007-02-11T18:51:18+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2007-02-10T09:43:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4b51d66989218aad731a721b5b28c79bf5388c09'/>
<id>4b51d66989218aad731a721b5b28c79bf5388c09</id>
<content type='text'>
Make ZONE_DMA optional in core code.

- ifdef all code for ZONE_DMA and related definitions following the example
  for ZONE_DMA32 and ZONE_HIGHMEM.

- Without ZONE_DMA, ZONE_HIGHMEM and ZONE_DMA32 we get to a ZONES_SHIFT of
  0.

- Modify the VM statistics to work correctly without a DMA zone.

- Modify slab to not create DMA slabs if there is no ZONE_DMA.

[akpm@osdl.org: cleanup]
[jdike@addtoit.com: build fix]
[apw@shadowen.org: Simplify calculation of the number of bits we need for ZONES_SHIFT]
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Matthew Wilcox &lt;willy@debian.org&gt;
Cc: James Bottomley &lt;James.Bottomley@steeleye.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Signed-off-by: Andy Whitcroft &lt;apw@shadowen.org&gt;
Signed-off-by: Jeff Dike &lt;jdike@addtoit.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make ZONE_DMA optional in core code.

- ifdef all code for ZONE_DMA and related definitions following the example
  for ZONE_DMA32 and ZONE_HIGHMEM.

- Without ZONE_DMA, ZONE_HIGHMEM and ZONE_DMA32 we get to a ZONES_SHIFT of
  0.

- Modify the VM statistics to work correctly without a DMA zone.

- Modify slab to not create DMA slabs if there is no ZONE_DMA.

[akpm@osdl.org: cleanup]
[jdike@addtoit.com: build fix]
[apw@shadowen.org: Simplify calculation of the number of bits we need for ZONES_SHIFT]
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Matthew Wilcox &lt;willy@debian.org&gt;
Cc: James Bottomley &lt;James.Bottomley@steeleye.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Signed-off-by: Andy Whitcroft &lt;apw@shadowen.org&gt;
Signed-off-by: Jeff Dike &lt;jdike@addtoit.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] in non-NUMA case mark GFP_THISNODE gfp_t</title>
<updated>2007-02-09T17:14:06+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@ftp.linux.org.uk</email>
</author>
<published>2007-02-09T16:38:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f2e97df669d32f74152336f46e4e0e328b993c57'/>
<id>f2e97df669d32f74152336f46e4e0e328b993c57</id>
<content type='text'>
... operations with it are OK as is, but flags &amp; ~0 will have no idea that
this ~0 is meant to be ~gfp_t.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... operations with it are OK as is, but flags &amp; ~0 will have no idea that
this ~0 is meant to be ~gfp_t.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] mm: add arch_alloc_page</title>
<updated>2006-12-07T16:39:21+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2006-12-07T04:32:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cc102509074bba0316f2b5deebd7ef4447da295e'/>
<id>cc102509074bba0316f2b5deebd7ef4447da295e</id>
<content type='text'>
Add an arch_alloc_page to match arch_free_page.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add an arch_alloc_page to match arch_free_page.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Disable GFP_THISNODE in the non-NUMA case</title>
<updated>2006-09-27T15:26:12+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-09-27T08:50:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=77f700dab4c05f8ee17584ec869672796d7bcb87'/>
<id>77f700dab4c05f8ee17584ec869672796d7bcb87</id>
<content type='text'>
GFP_THISNODE must be set to 0 in the non numa case otherwise we disable retry
and warnings for failing allocations in the SMP and UP case.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GFP_THISNODE must be set to 0 in the non numa case otherwise we disable retry
and warnings for failing allocations in the SMP and UP case.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Define easier to handle GFP_THISNODE</title>
<updated>2006-09-26T15:48:50+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-09-26T06:31:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=980128f223fa3c75e3ebdde650c9f1bcabd4c0a2'/>
<id>980128f223fa3c75e3ebdde650c9f1bcabd4c0a2</id>
<content type='text'>
In many places we will need to use the same combination of flags.  Specify
a single GFP_THISNODE definition for ease of use in gfp.h.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In many places we will need to use the same combination of flags.  Specify
a single GFP_THISNODE definition for ease of use in gfp.h.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Add __GFP_THISNODE to avoid fallback to other nodes and ignore cpuset/memory policy restrictions</title>
<updated>2006-09-26T15:48:50+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-09-26T06:31:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9b819d204cf602eab1a53a9ec4b8d2ca51e02a1d'/>
<id>9b819d204cf602eab1a53a9ec4b8d2ca51e02a1d</id>
<content type='text'>
Add a new gfp flag __GFP_THISNODE to avoid fallback to other nodes.  This
flag is essential if a kernel component requires memory to be located on a
certain node.  It will be needed for alloc_pages_node() to force allocation
on the indicated node and for alloc_pages() to force allocation on the
current node.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Andy Whitcroft &lt;apw@shadowen.org&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new gfp flag __GFP_THISNODE to avoid fallback to other nodes.  This
flag is essential if a kernel component requires memory to be located on a
certain node.  It will be needed for alloc_pages_node() to force allocation
on the indicated node and for alloc_pages() to force allocation on the
current node.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Andy Whitcroft &lt;apw@shadowen.org&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] linearly index zone-&gt;node_zonelists[]</title>
<updated>2006-09-26T15:48:47+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-09-26T06:31:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=19655d3487001d7df0e10e9cbfc27c758b77c2b5'/>
<id>19655d3487001d7df0e10e9cbfc27c758b77c2b5</id>
<content type='text'>
I wonder why we need this bitmask indexing into zone-&gt;node_zonelists[]?

We always start with the highest zone and then include all lower zones
if we build zonelists.

Are there really cases where we need allocation from ZONE_DMA or
ZONE_HIGHMEM but not ZONE_NORMAL? It seems that the current implementation
of highest_zone() makes that already impossible.

If we go linear on the index then gfp_zone() == highest_zone() and a lot
of definitions fall by the wayside.

We can now revert back to the use of gfp_zone() in mempolicy.c ;-)

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I wonder why we need this bitmask indexing into zone-&gt;node_zonelists[]?

We always start with the highest zone and then include all lower zones
if we build zonelists.

Are there really cases where we need allocation from ZONE_DMA or
ZONE_HIGHMEM but not ZONE_NORMAL? It seems that the current implementation
of highest_zone() makes that already impossible.

If we go linear on the index then gfp_zone() == highest_zone() and a lot
of definitions fall by the wayside.

We can now revert back to the use of gfp_zone() in mempolicy.c ;-)

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] mempolicies: fix policy_zone check</title>
<updated>2006-09-26T15:48:47+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-09-26T06:31:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4e4785bcf0c8503224fa6c17d8e0228de781bff6'/>
<id>4e4785bcf0c8503224fa6c17d8e0228de781bff6</id>
<content type='text'>
There is a check in zonelist_policy that compares pieces of the bitmap
obtained from a gfp mask via GFP_ZONETYPES with a zone number in function
zonelist_policy().

The bitmap is an ORed mask of __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM.
The policy_zone is a zone number with the possible values of ZONE_DMA,
ZONE_DMA32, ZONE_HIGHMEM and ZONE_NORMAL. These are two different domains
of values.

For some reason seemed to work before the zone reduction patchset (It
definitely works on SGI boxes since we just have one zone and the check
cannot fail).

With the zone reduction patchset this check definitely fails on systems
with two zones if the system actually has memory in both zones.

This is because ZONE_NORMAL is selected using no __GFP flag at
all and thus gfp_zone(gfpmask) == 0. ZONE_DMA is selected when __GFP_DMA
is set. __GFP_DMA is 0x01.  So gfp_zone(gfpmask) == 1.

policy_zone is set to ZONE_NORMAL (==1) if ZONE_NORMAL and ZONE_DMA are
populated.

For ZONE_NORMAL gfp_zone(&lt;no _GFP_DMA&gt;) yields 0 which is &lt;
policy_zone(ZONE_NORMAL) and so policy is not applied to regular memory
allocations!

Instead gfp_zone(__GFP_DMA) == 1 which results in policy being applied
to DMA allocations!

What we realy want in that place is to establish the highest allowable
zone for a given gfp_mask. If the highest zone is higher or equal to the
policy_zone then memory policies need to be applied. We have such
a highest_zone() function in page_alloc.c.

So move the highest_zone() function from mm/page_alloc.c into
include/linux/gfp.h.  On the way we simplify the function and use the new
zone_type that was also introduced with the zone reduction patchset plus we
also specify the right type for the gfp flags parameter.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a check in zonelist_policy that compares pieces of the bitmap
obtained from a gfp mask via GFP_ZONETYPES with a zone number in function
zonelist_policy().

The bitmap is an ORed mask of __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM.
The policy_zone is a zone number with the possible values of ZONE_DMA,
ZONE_DMA32, ZONE_HIGHMEM and ZONE_NORMAL. These are two different domains
of values.

For some reason seemed to work before the zone reduction patchset (It
definitely works on SGI boxes since we just have one zone and the check
cannot fail).

With the zone reduction patchset this check definitely fails on systems
with two zones if the system actually has memory in both zones.

This is because ZONE_NORMAL is selected using no __GFP flag at
all and thus gfp_zone(gfpmask) == 0. ZONE_DMA is selected when __GFP_DMA
is set. __GFP_DMA is 0x01.  So gfp_zone(gfpmask) == 1.

policy_zone is set to ZONE_NORMAL (==1) if ZONE_NORMAL and ZONE_DMA are
populated.

For ZONE_NORMAL gfp_zone(&lt;no _GFP_DMA&gt;) yields 0 which is &lt;
policy_zone(ZONE_NORMAL) and so policy is not applied to regular memory
allocations!

Instead gfp_zone(__GFP_DMA) == 1 which results in policy being applied
to DMA allocations!

What we realy want in that place is to establish the highest allowable
zone for a given gfp_mask. If the highest zone is higher or equal to the
policy_zone then memory policies need to be applied. We have such
a highest_zone() function in page_alloc.c.

So move the highest_zone() function from mm/page_alloc.c into
include/linux/gfp.h.  On the way we simplify the function and use the new
zone_type that was also introduced with the zone reduction patchset plus we
also specify the right type for the gfp flags parameter.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
