linux-toradex.git/mm, branch v2.6.23.5

revert "x86_64: allocate sparsemem memmap above 4G"

2007-11-16T16:22:59+00:00

Reverted upstream by commit 6a22c57b8d2a62dea7280a6b2ac807a539ef0716

Revert this commit:

	commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
	Author: Zou Nan hai 
	Date:   Fri Jun 1 00:46:28 2007 -0700
	
	x86_64: allocate sparsemem memmap above 4G

This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jones.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

Reported-by: Dave Jones 
Tested-by: Martin Ebourne 
Cc: Zou Nan hai 
Cc: Suresh Siddha 
Cc: Andrew Morton 
Acked-by: Andy Whitcroft 
Signed-off-by: Linus Torvalds 
Cc: Chuck Ebbert 
Signed-off-by: Greg Kroah-Hartman

fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE

2007-11-16T16:12:44+00:00

patch 487e9bf25cbae11b131d6a14bdbb3a6a77380837 in mainline.

It's possible to provoke unionfs (not yet in mainline, though in mm and
some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
2.6.24 ecryptfs no longer calls lower level's ->writepage).

This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
already a fix (e423003028183df54f039dfda8b58c49e78c89d7 - writeback: don't
propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
far as it goes; but insufficient because it doesn't address the underlying
issue, that shmem_writepage expects to be called only by vmscan (relying on
backing_dev_info capabilities to prevent the normal writeback path from
ever approaching it).

That's an increasingly fragile assumption, and ramdisk_writepage (the other
source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
wbc->for_reclaim before returning it.  Make the same check in
shmem_writepage, thereby sidestepping the page_mapped BUG also.

Signed-off-by: Hugh Dickins 
Cc: Erez Zadok 
Reviewed-by: Pekka Enberg 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman
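
A minimal sketch of the check described in the commit above, written as a
standalone C model rather than kernel code. The types and names
(sketch_page, wb_control, sketch_writepage, WRITEPAGE_ACTIVATE) are
hypothetical stand-ins, and the reclaim branch is simplified to always
return the activate marker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel return codes and types. */
enum { WRITEPAGE_OK = 0, WRITEPAGE_ACTIVATE = 1 };

struct wb_control {
    bool for_reclaim;   /* true only when the caller is page reclaim */
};

struct sketch_page {
    bool mapped;        /* still mapped into some address space */
    bool dirty;
};

/*
 * Model of a memory-backed ->writepage.  Any caller other than reclaim
 * gets the page redirtied and a plain success return, so the internal
 * "activate" marker never reaches the ordinary writeback path and the
 * page_mapped check below is never reached with a mapped page.
 */
static int sketch_writepage(struct sketch_page *page,
                            const struct wb_control *wbc)
{
    if (!wbc->for_reclaim) {
        page->dirty = true;     /* nothing was written; keep the data */
        return WRITEPAGE_OK;
    }

    /* Reclaim is assumed to hand over only pages it already unmapped. */
    assert(!page->mapped);

    /* Simplification: pretend the page could not be moved to swap. */
    page->dirty = true;
    return WRITEPAGE_ACTIVATE;
}
```

In this model a mapped page arriving through normal writeback is simply
left dirty, while only a reclaim caller can ever see WRITEPAGE_ACTIVATE.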

writeback: don't propagate AOP_WRITEPAGE_ACTIVATE

2007-11-16T16:12:43+00:00

patch e423003028183df54f039dfda8b58c49e78c89d7 in mainline.

This is a writeback-internal marker but we're propagating it all the way back
to userspace!

Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

SLUB: Fix memory leak by not reusing cpu_slab

2007-11-16T16:12:43+00:00

patch 05aa345034de6ae9c77fb93f6a796013641d57d5 in mainline.

Fix the memory leak that may occur when we attempt to reuse a cpu_slab
that was allocated while we reenabled interrupts in order to be able to
grow a slab cache. The per cpu freelist may contain objects, and in that
situation we may overwrite the per cpu freelist pointer, losing objects.
This only occurs if we find that the concurrently allocated slab fits
our allocation needs.

If we simply always deactivate the slab then the freelist will be properly
reintegrated and the memory leak will go away.

Signed-off-by: Christoph Lameter 
Cc: Hugh Dickins 
Signed-off-by: Greg Kroah-Hartman
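
The leak described in the SLUB commit above can be modelled with two
singly linked free lists. This is a sketch under stated assumptions, not
the SLUB code: struct obj, struct slab, struct cpu_slot and the helper
names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct obj  { struct obj *next; };
struct slab { struct obj *freelist; };              /* objects free in the slab */
struct cpu_slot {
    struct slab *slab;                              /* slab owned by this cpu */
    struct obj *freelist;                           /* per cpu free objects */
};

static size_t list_len(const struct obj *o)
{
    size_t n = 0;
    for (; o; o = o->next)
        n++;
    return n;
}

/* Leaky shape: the per cpu pointer is overwritten with the slab's list. */
static void reuse_in_place(struct cpu_slot *c)
{
    c->freelist = c->slab->freelist;    /* old per cpu objects become unreachable */
    c->slab->freelist = NULL;
}

/* Fixed shape: hand every per cpu object back to the slab, then let go. */
static void deactivate(struct cpu_slot *c)
{
    assert(c->slab != NULL);
    while (c->freelist) {
        struct obj *o = c->freelist;
        c->freelist = o->next;
        o->next = c->slab->freelist;
        c->slab->freelist = o;
    }
    c->slab = NULL;
}

/* Free objects still reachable from either list. */
static size_t reachable(const struct cpu_slot *c, const struct slab *s)
{
    return list_len(c->freelist) + list_len(s->freelist);
}
```

With two objects on each list, reuse_in_place() leaves only two of the
four reachable, while deactivate() keeps all four on the slab's list.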

Remove broken ptrace() special-case code from file mapping

2007-11-16T16:12:42+00:00

The kernel has, for random historical reasons, allowed ptrace() to
access (and insert) pages in the page cache beyond the size of the
file.

However, Nick broke that by mistake when doing the new fault handling in
commit 54cb8821de07f2ffcd28c380ce9b93d5784b40d7 ("mm: merge populate and
nopage into fault (fixes nonlinear)").  The breakage caused a hang with
gdb when trying to access the invalid page.

The ptrace "feature" really isn't worth resurrecting, since it really is
wrong both from a portability _and_ from an internal page cache validity
standpoint.  So this removes those old broken remnants, and fixes the
ptrace() hang in the process.

Noticed and bisected by Duane Griffin, who also supplied a test-case
(quoth Nick: "Well that's probably the best bug report I've ever had,
thanks Duane!").

Cc: Duane Griffin 
Acked-by: Nick Piggin 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

fix page release issue in filemap_fault

2007-10-08T19:58:14+00:00

find_lock_page increases the page's usage count, so we should decrease
it before returning VM_FAULT_SIGBUS.

Signed-off-by: Yan Zheng
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
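
A small model of the reference-count rule in the filemap_fault commit
above: a lookup that takes a reference must be paired with a release on
the error return. The names (sketch_page, lookup_and_get, put_ref,
sketch_fault, FAULT_SIGBUS) are hypothetical, and page locking is left
out of the sketch.

```c
#include <assert.h>

enum { FAULT_OK = 0, FAULT_SIGBUS = 1 };

struct sketch_page {
    int refcount;
    unsigned long index;    /* page offset within the file */
};

/* Like a find-and-lock style lookup: returns the page with an extra ref. */
static struct sketch_page *lookup_and_get(struct sketch_page *p)
{
    p->refcount++;
    return p;
}

static void put_ref(struct sketch_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/* file_pages is the file size in pages; an index at or past it is invalid. */
static int sketch_fault(struct sketch_page *cached, unsigned long file_pages)
{
    struct sketch_page *page = lookup_and_get(cached);

    if (page->index >= file_pages) {
        put_ref(page);          /* drop the reference the lookup took */
        return FAULT_SIGBUS;
    }
    return FAULT_OK;            /* the caller now owns the reference */
}
```

Without the put_ref() call on the SIGBUS path, every failed fault would
leave the count one higher and the page could never be freed.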

fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08T19:58:14+00:00

The test for VM_CAN_NONLINEAR always fails.

Signed-off-by: Yan Zheng
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
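
The commit message above does not say why the test always fails. One
common way a flag test ends up constant in C is operator precedence,
where `!flags & FLAG` parses as `(!flags) & FLAG`. The sketch below
assumes that shape purely for illustration; the flag value and function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_CAN_NONLINEAR 0x08000000UL   /* hypothetical flag bit */

/*
 * Shape of a precedence slip: "!vm_flags & FLAG" is parsed like this.
 * (!vm_flags) is 0 or 1, so ANDing it with a high bit is always 0.
 */
static bool rejects_buggy(unsigned long vm_flags)
{
    return ((!vm_flags) & SKETCH_CAN_NONLINEAR) != 0;
}

/* Intended test: reject a mapping that lacks the flag. */
static bool rejects_fixed(unsigned long vm_flags)
{
    return !(vm_flags & SKETCH_CAN_NONLINEAR);
}

/* Small self-check helper showing the two shapes disagree on flag-less input. */
static bool shapes_disagree_without_flag(void)
{
    bool buggy = rejects_buggy(0);
    bool fixed = rejects_fixed(0);
    assert(!buggy);     /* buggy shape never rejects */
    return buggy != fixed;
}
```

Under this assumption the buggy form never rejects anything, whatever the
flags are, while the parenthesised form rejects exactly the mappings that
lack the bit.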

mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08T19:58:14+00:00

All the current page_mkwrite() implementations also set the page dirty,
which results in the set_page_dirty_balance() call _not_ calling balance,
because the page is already found dirty.

This allows us to dirty a _lot_ of pages without ever hitting
balance_dirty_pages().  Not good (tm).

Force a balance call if ->page_mkwrite() was successful.

Signed-off-by: Peter Zijlstra 
Signed-off-by: Linus Torvalds
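
The set_page_dirty_balance() commit above can be sketched as a counter
model: balancing normally runs only when a page goes from clean to dirty,
so a page that ->page_mkwrite() already dirtied needs an explicit
override. The names (sketch_page, sketch_set_dirty, sketch_dirty_balance,
balance_calls) are hypothetical and the balancing work is reduced to a
counter.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_page { bool dirty; };

static int balance_calls;   /* how often the balancing step ran */

/* Returns true only when the page was clean before this call. */
static bool sketch_set_dirty(struct sketch_page *p)
{
    bool newly_dirty = !p->dirty;
    p->dirty = true;
    return newly_dirty;
}

/*
 * page_mkwrite_ran: the filesystem callback succeeded and may already
 * have dirtied the page, so balancing is forced in that case.
 */
static void sketch_dirty_balance(struct sketch_page *p, bool page_mkwrite_ran)
{
    if (sketch_set_dirty(p) || page_mkwrite_ran) {
        assert(p->dirty);
        balance_calls++;
    }
}
```

Calling sketch_dirty_balance() on an already dirty page with the flag
clear leaves balance_calls unchanged, which is the unthrottled case the
commit describes; passing true restores the balance call.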

xen: disable split pte locks for now

2007-10-06T16:31:30+00:00

When pinning and unpinning pagetables, we must protect them against
being used by other CPUs, lest they see the pagetable in an
intermediate read-only-but-not-pinned state.

When using split pte locks, doing this properly would require taking
all the pte locks for the pagetable while pinning, but this may overflow
the PREEMPT_BITS part of the preempt counter if the process has mapped
more than about 512M of memory.

However, failing to take the pte locks causes write-protect faults when
the pageout code is trying to clear the Access bit on a pte which is part
of a freshly created and still being pinned process after fork.

This is a short-term fix until the problem is solved properly.

Signed-off-by: Jeremy Fitzhardinge 
Acked-by: Rik van Riel 
Acked-by: Hugh Dickins 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Andi Kleen 
Cc: Keir Fraser 
Cc: Jan Beulich 
Signed-off-by: Linus Torvalds
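
A worked version of the "about 512M" figure from the xen commit above.
The commit gives only the result; the three constants below (an 8-bit
preempt nesting field, 4 KiB pages, 512 entries per pte page) are
assumptions chosen for this sketch, as is the premise that every held
lock bumps the nesting count by one.

```c
#include <assert.h>

/* All three constants are assumptions for this sketch. */
#define SKETCH_PREEMPT_BITS   8         /* width of the preempt nesting field */
#define SKETCH_PAGE_SIZE      4096UL    /* bytes per page */
#define SKETCH_PTES_PER_PAGE  512UL     /* entries in one pte page */

/* One split pte lock covers one pte page, i.e. this many mapped bytes. */
static unsigned long bytes_per_pte_lock(void)
{
    return SKETCH_PTES_PER_PAGE * SKETCH_PAGE_SIZE;
}

/* Number of simultaneously held locks at which the nesting field wraps. */
static unsigned long locks_until_overflow(void)
{
    return 1UL << SKETCH_PREEMPT_BITS;
}

/* Mapped memory at which taking every pte lock would wrap the field. */
static unsigned long mapped_bytes_at_overflow(void)
{
    unsigned long bytes = locks_until_overflow() * bytes_per_pte_lock();

    /* Guard against the sketch's own arithmetic wrapping. */
    assert(bytes / bytes_per_pte_lock() == locks_until_overflow());
    return bytes;
}
```

Under these assumptions each lock covers 2 MiB and 256 held locks wrap
the field, giving 256 x 2 MiB = 512 MiB of mappings.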

Fix sys_remap_file_pages BUG at highmem.c:15!

2007-10-04T17:13:09+00:00

Gurudas Pai reports kernel BUG at arch/i386/mm/highmem.c:15! below
sys_remap_file_pages, while running Oracle database test on x86 in 6GB
RAM: kunmap thinks we're in_interrupt because the preempt count has
wrapped.

That's because __do_fault expected to unmap page_table, but one of its
two callers do_nonlinear_fault already unmapped it: let do_linear_fault
unmap it first too, and then there's no need to pass the page_table arg
down.

Why have we been so slow to notice this? Probably through forgetting
that the mapping_cap_account_dirty test means that sys_remap_file_pages
nowadays only goes the full nonlinear vma route on a few memory-backed
filesystems like ramfs, tmpfs and hugetlbfs.

[ It also depends on CONFIG_HIGHPTE, so it becomes even harder to
  trigger in practice. Many who have need of large memory have probably
  migrated to x86-64.

  Problem introduced by commit d0217ac04ca6591841e5665f518e38064f4e65bd
  ("mm: fault feedback #1")                -- Linus ]

Signed-off-by: Hugh Dickins 
Cc: gurudas pai 
Cc: Nick Piggin 
Cc: Andrew Morton 
Signed-off-by: Linus Torvalds
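
The double unmap described in the last commit can be modelled with a
nesting counter that each map increments and each unmap decrements. This
is a sketch only: the counter layout, the 8-bit nesting field and all
names (sketch_count, sketch_map, sketch_unmap, fault_callee,
nonlinear_fault, looks_like_interrupt) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int sketch_count;   /* stand-in for the preempt count */

static void sketch_map(void)   { sketch_count++; }
static void sketch_unmap(void) { sketch_count--; }  /* unsigned: wraps below zero */

/* Assumption: the low 8 bits are nesting depth, anything above means "interrupt". */
static bool looks_like_interrupt(void)
{
    return (sketch_count & ~0xffu) != 0;
}

/* Shared fault helper; the old shape unmapped the page table itself. */
static void fault_callee(bool callee_unmaps)
{
    if (callee_unmaps)
        sketch_unmap();
}

/* A caller that has already unmapped before calling the helper. */
static void nonlinear_fault(bool callee_unmaps)
{
    sketch_map();
    sketch_unmap();                 /* caller's own unmap */
    fault_callee(callee_unmaps);    /* a second unmap here is unbalanced */
    assert(callee_unmaps || sketch_count == 0);
}
```

Starting from zero, nonlinear_fault(true) leaves the counter wrapped and
looks_like_interrupt() true, while nonlinear_fault(false), where only
the caller unmaps, leaves it balanced at zero.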