<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/hugetlb.h, branch v2.6.23.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>hugepage: fix broken check for offset alignment in hugepage mappings</title>
<updated>2007-08-31T08:42:23+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2007-08-31T06:56:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dec4ad86c2fbea062e9ef9caa6d6e79f7c5e0b12'/>
<id>dec4ad86c2fbea062e9ef9caa6d6e79f7c5e0b12</id>
<content type='text'>
For hugepage mappings, the file offset, like the address and size, needs to
be aligned to the size of a hugepage.

In commit 68589bc353037f233fe510ad9ff432338c95db66, the check for this was
moved into prepare_hugepage_range() along with the address and size checks.
 But since BenH's rework of the get_unmapped_area() paths leading up to
commit 4b1d89290b62bb2db476c94c82cf7442aab440c8, prepare_hugepage_range()
is only called for MAP_FIXED mappings, not for other mappings.  This means
we're no longer ever checking for an aligned offset - I've confirmed that
mmap() will (apparently) succeed with a misaligned offset on both powerpc
and i386 at least.

This patch restores the check, removing it from prepare_hugepage_range()
and putting it back into hugetlbfs_file_mmap().  I'm putting it there,
rather than in the get_unmapped_area() path so it only needs to go in one
place, than separately in the half-dozen or so arch-specific
implementations of hugetlb_get_unmapped_area().

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For hugepage mappings, the file offset, like the address and size, needs to
be aligned to the size of a hugepage.

In commit 68589bc353037f233fe510ad9ff432338c95db66, the check for this was
moved into prepare_hugepage_range() along with the address and size checks.
 But since BenH's rework of the get_unmapped_area() paths leading up to
commit 4b1d89290b62bb2db476c94c82cf7442aab440c8, prepare_hugepage_range()
is only called for MAP_FIXED mappings, not for other mappings.  This means
we're no longer ever checking for an aligned offset - I've confirmed that
mmap() will (apparently) succeed with a misaligned offset on both powerpc
and i386 at least.

This patch restores the check, removing it from prepare_hugepage_range()
and putting it back into hugetlbfs_file_mmap().  I'm putting it there,
rather than in the get_unmapped_area() path so it only needs to go in one
place, than separately in the half-dozen or so arch-specific
implementations of hugetlb_get_unmapped_area().

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove fs.h from mm.h</title>
<updated>2007-07-30T00:09:29+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2007-07-29T22:36:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4e950f6f0189f65f8bf069cf2272649ef418f5e4'/>
<id>4e950f6f0189f65f8bf069cf2272649ef418f5e4</id>
<content type='text'>
Remove fs.h from mm.h. For this,
 1) Uninline vma_wants_writenotify(). It's pretty huge anyway.
 2) Add back fs.h or less bloated headers (err.h) to files that need it.

As result, on x86_64 allyesconfig, fs.h dependencies cut down from 3929 files
rebuilt down to 3444 (-12.3%).

Cross-compile tested without regressions on my two usual configs and (sigh):

alpha              arm-mx1ads        mips-bigsur          powerpc-ebony
alpha-allnoconfig  arm-neponset      mips-capcella        powerpc-g5
alpha-defconfig    arm-netwinder     mips-cobalt          powerpc-holly
alpha-up           arm-netx          mips-db1000          powerpc-iseries
arm                arm-ns9xxx        mips-db1100          powerpc-linkstation
arm-assabet        arm-omap_h2_1610  mips-db1200          powerpc-lite5200
arm-at91rm9200dk   arm-onearm        mips-db1500          powerpc-maple
arm-at91rm9200ek   arm-picotux200    mips-db1550          powerpc-mpc7448_hpc2
arm-at91sam9260ek  arm-pleb          mips-ddb5477         powerpc-mpc8272_ads
arm-at91sam9261ek  arm-pnx4008       mips-decstation      powerpc-mpc8313_rdb
arm-at91sam9263ek  arm-pxa255-idp    mips-e55             powerpc-mpc832x_mds
arm-at91sam9rlek   arm-realview      mips-emma2rh         powerpc-mpc832x_rdb
arm-ateb9200       arm-realview-smp  mips-excite          powerpc-mpc834x_itx
arm-badge4         arm-rpc           mips-fulong          powerpc-mpc834x_itxgp
arm-carmeva        arm-s3c2410       mips-ip22            powerpc-mpc834x_mds
arm-cerfcube       arm-shannon       mips-ip27            powerpc-mpc836x_mds
arm-clps7500       arm-shark         mips-ip32            powerpc-mpc8540_ads
arm-collie         arm-simpad        mips-jazz            powerpc-mpc8544_ds
arm-corgi          arm-spitz         mips-jmr3927         powerpc-mpc8560_ads
arm-csb337         arm-trizeps4      mips-malta           powerpc-mpc8568mds
arm-csb637         arm-versatile     mips-mipssim         powerpc-mpc85xx_cds
arm-ebsa110        i386              mips-mpc30x          powerpc-mpc8641_hpcn
arm-edb7211        i386-allnoconfig  mips-msp71xx         powerpc-mpc866_ads
arm-em_x270        i386-defconfig    mips-ocelot          powerpc-mpc885_ads
arm-ep93xx         i386-up           mips-pb1100          powerpc-pasemi
arm-footbridge     ia64              mips-pb1500          powerpc-pmac32
arm-fortunet       ia64-allnoconfig  mips-pb1550          powerpc-ppc64
arm-h3600          ia64-bigsur       mips-pnx8550-jbs     powerpc-prpmc2800
arm-h7201          ia64-defconfig    mips-pnx8550-stb810  powerpc-ps3
arm-h7202          ia64-gensparse    mips-qemu            powerpc-pseries
arm-hackkit        ia64-sim          mips-rbhma4200       powerpc-up
arm-integrator     ia64-sn2          mips-rbhma4500       s390
arm-iop13xx        ia64-tiger        mips-rm200           s390-allnoconfig
arm-iop32x         ia64-up           mips-sb1250-swarm    s390-defconfig
arm-iop33x         ia64-zx1          mips-sead            s390-up
arm-ixp2000        m68k              mips-tb0219          sparc
arm-ixp23xx        m68k-amiga        mips-tb0226          sparc-allnoconfig
arm-ixp4xx         m68k-apollo       mips-tb0287          sparc-defconfig
arm-jornada720     m68k-atari        mips-workpad         sparc-up
arm-kafa           m68k-bvme6000     mips-wrppmc          sparc64
arm-kb9202         m68k-hp300        mips-yosemite        sparc64-allnoconfig
arm-ks8695         m68k-mac          parisc               sparc64-defconfig
arm-lart           m68k-mvme147      parisc-allnoconfig   sparc64-up
arm-lpd270         m68k-mvme16x      parisc-defconfig     um-x86_64
arm-lpd7a400       m68k-q40          parisc-up            x86_64
arm-lpd7a404       m68k-sun3         powerpc              x86_64-allnoconfig
arm-lubbock        m68k-sun3x        powerpc-cell         x86_64-defconfig
arm-lusl7200       mips              powerpc-celleb       x86_64-up
arm-mainstone      mips-atlas        powerpc-chrp32

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove fs.h from mm.h. For this,
 1) Uninline vma_wants_writenotify(). It's pretty huge anyway.
 2) Add back fs.h or less bloated headers (err.h) to files that need it.

As result, on x86_64 allyesconfig, fs.h dependencies cut down from 3929 files
rebuilt down to 3444 (-12.3%).

Cross-compile tested without regressions on my two usual configs and (sigh):

alpha              arm-mx1ads        mips-bigsur          powerpc-ebony
alpha-allnoconfig  arm-neponset      mips-capcella        powerpc-g5
alpha-defconfig    arm-netwinder     mips-cobalt          powerpc-holly
alpha-up           arm-netx          mips-db1000          powerpc-iseries
arm                arm-ns9xxx        mips-db1100          powerpc-linkstation
arm-assabet        arm-omap_h2_1610  mips-db1200          powerpc-lite5200
arm-at91rm9200dk   arm-onearm        mips-db1500          powerpc-maple
arm-at91rm9200ek   arm-picotux200    mips-db1550          powerpc-mpc7448_hpc2
arm-at91sam9260ek  arm-pleb          mips-ddb5477         powerpc-mpc8272_ads
arm-at91sam9261ek  arm-pnx4008       mips-decstation      powerpc-mpc8313_rdb
arm-at91sam9263ek  arm-pxa255-idp    mips-e55             powerpc-mpc832x_mds
arm-at91sam9rlek   arm-realview      mips-emma2rh         powerpc-mpc832x_rdb
arm-ateb9200       arm-realview-smp  mips-excite          powerpc-mpc834x_itx
arm-badge4         arm-rpc           mips-fulong          powerpc-mpc834x_itxgp
arm-carmeva        arm-s3c2410       mips-ip22            powerpc-mpc834x_mds
arm-cerfcube       arm-shannon       mips-ip27            powerpc-mpc836x_mds
arm-clps7500       arm-shark         mips-ip32            powerpc-mpc8540_ads
arm-collie         arm-simpad        mips-jazz            powerpc-mpc8544_ds
arm-corgi          arm-spitz         mips-jmr3927         powerpc-mpc8560_ads
arm-csb337         arm-trizeps4      mips-malta           powerpc-mpc8568mds
arm-csb637         arm-versatile     mips-mipssim         powerpc-mpc85xx_cds
arm-ebsa110        i386              mips-mpc30x          powerpc-mpc8641_hpcn
arm-edb7211        i386-allnoconfig  mips-msp71xx         powerpc-mpc866_ads
arm-em_x270        i386-defconfig    mips-ocelot          powerpc-mpc885_ads
arm-ep93xx         i386-up           mips-pb1100          powerpc-pasemi
arm-footbridge     ia64              mips-pb1500          powerpc-pmac32
arm-fortunet       ia64-allnoconfig  mips-pb1550          powerpc-ppc64
arm-h3600          ia64-bigsur       mips-pnx8550-jbs     powerpc-prpmc2800
arm-h7201          ia64-defconfig    mips-pnx8550-stb810  powerpc-ps3
arm-h7202          ia64-gensparse    mips-qemu            powerpc-pseries
arm-hackkit        ia64-sim          mips-rbhma4200       powerpc-up
arm-integrator     ia64-sn2          mips-rbhma4500       s390
arm-iop13xx        ia64-tiger        mips-rm200           s390-allnoconfig
arm-iop32x         ia64-up           mips-sb1250-swarm    s390-defconfig
arm-iop33x         ia64-zx1          mips-sead            s390-up
arm-ixp2000        m68k              mips-tb0219          sparc
arm-ixp23xx        m68k-amiga        mips-tb0226          sparc-allnoconfig
arm-ixp4xx         m68k-apollo       mips-tb0287          sparc-defconfig
arm-jornada720     m68k-atari        mips-workpad         sparc-up
arm-kafa           m68k-bvme6000     mips-wrppmc          sparc64
arm-kb9202         m68k-hp300        mips-yosemite        sparc64-allnoconfig
arm-ks8695         m68k-mac          parisc               sparc64-defconfig
arm-lart           m68k-mvme147      parisc-allnoconfig   sparc64-up
arm-lpd270         m68k-mvme16x      parisc-defconfig     um-x86_64
arm-lpd7a400       m68k-q40          parisc-up            x86_64
arm-lpd7a404       m68k-sun3         powerpc              x86_64-allnoconfig
arm-lubbock        m68k-sun3x        powerpc-cell         x86_64-defconfig
arm-lusl7200       mips              powerpc-celleb       x86_64-up
arm-mainstone      mips-atlas        powerpc-chrp32

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Allow huge page allocations to use GFP_HIGH_MOVABLE</title>
<updated>2007-07-17T17:22:59+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2007-07-17T11:03:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=396faf0303d273219db5d7eb4a2879ad977ed185'/>
<id>396faf0303d273219db5d7eb4a2879ad977ed185</id>
<content type='text'>
Huge pages are not movable so are not allocated from ZONE_MOVABLE.  However,
as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it
can be used to satisfy hugepage allocations even when the system has been
running a long time.  This allows an administrator to resize the hugepage pool
at runtime depending on the size of ZONE_MOVABLE.

This patch adds a new sysctl called hugepages_treat_as_movable.  When a
non-zero value is written to it, future allocations for the huge page pool
will use ZONE_MOVABLE.  Despite huge pages being non-movable, we do not
introduce additional external fragmentation of note as huge pages are always
the largest contiguous block we care about.

[akpm@linux-foundation.org: various fixes]
Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Huge pages are not movable so are not allocated from ZONE_MOVABLE.  However,
as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it
can be used to satisfy hugepage allocations even when the system has been
running a long time.  This allows an administrator to resize the hugepage pool
at runtime depending on the size of ZONE_MOVABLE.

This patch adds a new sysctl called hugepages_treat_as_movable.  When a
non-zero value is written to it, future allocations for the huge page pool
will use ZONE_MOVABLE.  Despite huge pages being non-movable, we do not
introduce additional external fragmentation of note as huge pages are always
the largest contiguous block we care about.

[akpm@linux-foundation.org: various fixes]
Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>shm: fix the filename of hugetlb sysv shared memory</title>
<updated>2007-06-16T20:16:16+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2007-06-16T17:16:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9d66586f7723b73c5925c7c7819c260484627851'/>
<id>9d66586f7723b73c5925c7c7819c260484627851</id>
<content type='text'>
Some user space tools need to identify SYSV shared memory when examining
/proc/&lt;pid&gt;/maps.  To do so they look for a block device with major zero, a
dentry named SYSV&lt;sysv key&gt;, and having the minor of the internal sysv
shared memory kernel mount.

To help these tools and to make it easier for people just browsing
/proc/&lt;pid&gt;/maps this patch modifies hugetlb sysv shared memory to use the
SYSV&lt;key&gt; dentry naming convention.

User space tools will still have to be aware that hugetlb sysv shared
memory lives on a different internal kernel mount and so has a different
block device minor number from the rest of sysv shared memory.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Cc: Albert Cahalan &lt;acahalan@gmail.com&gt;
Cc: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some user space tools need to identify SYSV shared memory when examining
/proc/&lt;pid&gt;/maps.  To do so they look for a block device with major zero, a
dentry named SYSV&lt;sysv key&gt;, and having the minor of the internal sysv
shared memory kernel mount.

To help these tools and to make it easier for people just browsing
/proc/&lt;pid&gt;/maps this patch modifies hugetlb sysv shared memory to use the
SYSV&lt;key&gt; dentry naming convention.

User space tools will still have to be aware that hugetlb sysv shared
memory lives on a different internal kernel mount and so has a different
block device minor number from the rest of sysv shared memory.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Cc: Albert Cahalan &lt;acahalan@gmail.com&gt;
Cc: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proper prototype for hugetlb_get_unmapped_area()</title>
<updated>2007-05-07T19:12:51+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@stusta.de</email>
</author>
<published>2007-05-06T21:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d2ba27e8007b35d24764c0877ab2428e00a5c5ab'/>
<id>d2ba27e8007b35d24764c0877ab2428e00a5c5ab</id>
<content type='text'>
Add a proper prototype for hugetlb_get_unmapped_area() in
include/linux/hugetlb.h.

Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
Acked-by: William Irwin &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a proper prototype for hugetlb_get_unmapped_area() in
include/linux/hugetlb.h.

Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
Acked-by: William Irwin &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Fix get_unmapped_area and fsync for hugetlb shm segments</title>
<updated>2007-03-02T01:18:39+00:00</updated>
<author>
<name>Adam Litke</name>
<email>agl@us.ibm.com</email>
</author>
<published>2007-03-01T23:46:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=516dffdcd8827a40532798602830dfcfc672294c'/>
<id>516dffdcd8827a40532798602830dfcfc672294c</id>
<content type='text'>
This patch provides the following hugetlb-related fixes to the recent stacked
shm files changes:
 - Update is_file_hugepages() so it will reconize hugetlb shm segments.
 - get_unmapped_area must be called with the nested file struct to handle
   the sfd-&gt;file-&gt;f_ops-&gt;get_unmapped_area == NULL case.
 - The fsync f_op must be wrapped since it is specified in the hugetlbfs
   f_ops.

This is based on proposed fixes from Eric Biederman that were debugged and
tested by me.  Without it, attempting to use hugetlb shared memory segments
on powerpc (and likely ia64) will kill your box.

Signed-off-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: William Irwin &lt;bill.irwin@oracle.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch provides the following hugetlb-related fixes to the recent stacked
shm files changes:
 - Update is_file_hugepages() so it will reconize hugetlb shm segments.
 - get_unmapped_area must be called with the nested file struct to handle
   the sfd-&gt;file-&gt;f_ops-&gt;get_unmapped_area == NULL case.
 - The fsync f_op must be wrapped since it is specified in the hugetlbfs
   f_ops.

This is based on proposed fixes from Eric Biederman that were debugged and
tested by me.  Without it, attempting to use hugetlb shared memory segments
on powerpc (and likely ia64) will kill your box.

Signed-off-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: William Irwin &lt;bill.irwin@oracle.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] shared page table for hugetlb page</title>
<updated>2006-12-07T16:39:21+00:00</updated>
<author>
<name>Chen, Kenneth W</name>
<email>kenneth.w.chen@intel.com</email>
</author>
<published>2006-12-07T04:32:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=39dde65c9940c97fcd178a3d2b1c57ed8b7b68aa'/>
<id>39dde65c9940c97fcd178a3d2b1c57ed8b7b68aa</id>
<content type='text'>
Following up with the work on shared page table done by Dave McCracken.  This
set of patch target shared page table for hugetlb memory only.

The shared page table is particular useful in the situation of large number of
independent processes sharing large shared memory segments.  In the normal
page case, the amount of memory saved from process' page table is quite
significant.  For hugetlb, the saving on page table memory is not the primary
objective (as hugetlb itself already cuts down page table overhead
significantly), instead, the purpose of using shared page table on hugetlb is
to allow faster TLB refill and smaller cache pollution upon TLB miss.

With PT sharing, pte entries are shared among hundreds of processes, the cache
consumption used by all the page table is smaller and in return, application
gets much higher cache hit ratio.  One other effect is that cache hit ratio
with hardware page walker hitting on pte in cache will be higher and this
helps to reduce tlb miss latency.  These two effects contribute to higher
application performance.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Acked-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: Dave McCracken &lt;dmccr@us.ibm.com&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following up with the work on shared page table done by Dave McCracken.  This
set of patch target shared page table for hugetlb memory only.

The shared page table is particular useful in the situation of large number of
independent processes sharing large shared memory segments.  In the normal
page case, the amount of memory saved from process' page table is quite
significant.  For hugetlb, the saving on page table memory is not the primary
objective (as hugetlb itself already cuts down page table overhead
significantly), instead, the purpose of using shared page table on hugetlb is
to allow faster TLB refill and smaller cache pollution upon TLB miss.

With PT sharing, pte entries are shared among hundreds of processes, the cache
consumption used by all the page table is smaller and in return, application
gets much higher cache hit ratio.  One other effect is that cache hit ratio
with hardware page walker hitting on pte in cache will be higher and this
helps to reduce tlb miss latency.  These two effects contribute to higher
application performance.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Acked-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: Dave McCracken &lt;dmccr@us.ibm.com&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] hugetlb: prepare_hugepage_range check offset too</title>
<updated>2006-11-14T17:09:27+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hugh@veritas.com</email>
</author>
<published>2006-11-14T10:03:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=68589bc353037f233fe510ad9ff432338c95db66'/>
<id>68589bc353037f233fe510ad9ff432338c95db66</id>
<content type='text'>
(David:)

If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
because the given file offset is not hugepage aligned - then do_mmap_pgoff
will go to the unmap_and_free_vma backout path.

But at this stage the vma hasn't been marked as hugepage, and the backout path
will call unmap_region() on it.  That will eventually call down to the
non-hugepage version of unmap_page_range().  On ppc64, at least, that will
cause serious problems if there are any existing hugepage pagetable entries in
the vicinity - for example if there are any other hugepage mappings under the
same PUD.  unmap_page_range() will trigger a bad_pud() on the hugepage pud
entries.  I suspect this will also cause bad problems on ia64, though I don't
have a machine to test it on.

(Hugh:)

prepare_hugepage_range() should check file offset alignment when it checks
virtual address and length, to stop MAP_FIXED with a bad huge offset from
unmapping before it fails further down.  PowerPC should apply the same
prepare_hugepage_range alignment checks as ia64 and all the others do.

Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
is the check for too small a mapping); but even so, move up setting of
VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
when unwinding from error will go the non-huge way, which may cause bad
behaviour on architectures (powerpc and ia64) which segregate their huge
mappings into a separate region of the address space.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Acked-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
(David:)

If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
because the given file offset is not hugepage aligned - then do_mmap_pgoff
will go to the unmap_and_free_vma backout path.

But at this stage the vma hasn't been marked as hugepage, and the backout path
will call unmap_region() on it.  That will eventually call down to the
non-hugepage version of unmap_page_range().  On ppc64, at least, that will
cause serious problems if there are any existing hugepage pagetable entries in
the vicinity - for example if there are any other hugepage mappings under the
same PUD.  unmap_page_range() will trigger a bad_pud() on the hugepage pud
entries.  I suspect this will also cause bad problems on ia64, though I don't
have a machine to test it on.

(Hugh:)

prepare_hugepage_range() should check file offset alignment when it checks
virtual address and length, to stop MAP_FIXED with a bad huge offset from
unmapping before it fails further down.  PowerPC should apply the same
prepare_hugepage_range alignment checks as ia64 and all the others do.

Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
is the check for too small a mapping); but even so, move up setting of
VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
when unwinding from error will go the non-huge way, which may cause bad
behaviour on architectures (powerpc and ia64) which segregate their huge
mappings into a separate region of the address space.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Acked-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] hugetlb: fix linked list corruption in unmap_hugepage_range()</title>
<updated>2006-10-11T18:14:15+00:00</updated>
<author>
<name>Chen, Kenneth W</name>
<email>kenneth.w.chen@intel.com</email>
</author>
<published>2006-10-11T08:20:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=502717f4e112b18d9c37753a32f675bec9f2838b'/>
<id>502717f4e112b18d9c37753a32f675bec9f2838b</id>
<content type='text'>
commit fe1668ae5bf0145014c71797febd9ad5670d5d05 causes kernel to oops with
libhugetlbfs test suite.  The problem is that hugetlb pages can be shared
by multiple mappings.  Multiple threads can fight over page-&gt;lru in the
unmap path and bad things happen.  We now serialize __unmap_hugepage_range
to void concurrent linked list manipulation.  Such serialization is also
needed for shared page table page on hugetlb area.  This patch will fixed
the bug and also serve as a prepatch for shared page table.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Cc: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fe1668ae5bf0145014c71797febd9ad5670d5d05 causes kernel to oops with
libhugetlbfs test suite.  The problem is that hugetlb pages can be shared
by multiple mappings.  Multiple threads can fight over page-&gt;lru in the
unmap path and bad things happen.  We now serialize __unmap_hugepage_range
to void concurrent linked list manipulation.  Such serialization is also
needed for shared page table page on hugetlb area.  This patch will fixed
the bug and also serve as a prepatch for shared page table.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Cc: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] tightening hugetlb strict accounting</title>
<updated>2006-06-23T14:42:48+00:00</updated>
<author>
<name>Chen, Kenneth W</name>
<email>kenneth.w.chen@intel.com</email>
</author>
<published>2006-06-23T09:03:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a43a8c39bbb493c9e93f6764b350de2e33e18e92'/>
<id>a43a8c39bbb493c9e93f6764b350de2e33e18e92</id>
<content type='text'>
Current hugetlb strict accounting for shared mapping always assume mapping
starts at zero file offset and reserves pages between zero and size of the
file.  This assumption often reserves (or lock down) a lot more pages then
necessary if application maps at none zero file offset.  libhugetlbfs is
one example that requires proper reservation on shared mapping starts at
none zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside file mapping.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current hugetlb strict accounting for shared mapping always assume mapping
starts at zero file offset and reserves pages between zero and size of the
file.  This assumption often reserves (or lock down) a lot more pages then
necessary if application maps at none zero file offset.  libhugetlbfs is
one example that requires proper reservation on shared mapping starts at
none zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside file mapping.

Signed-off-by: Ken Chen &lt;kenneth.w.chen@intel.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
