linux-toradex.git/mm/msync.c, branch v2.6.18-rc7

[PATCH] Kill PF_SYNCWRITE flag

2006-06-23T15:10:39+00:00

A process flag to indicate whether we are doing sync io is incredibly
ugly. It also causes performance problems when one does a lot of async
io and then proceeds to sync it. Part of the io will go out as async,
and the other part as sync. This causes a disconnect between the
previously submitted io and the synced io. For io schedulers such as CFQ,
this costs us merges and leads to suboptimal scheduling behaviour.

Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
the O_DIRECT path just directly indicate that the writes are sync
by using WRITE_SYNC instead.

Signed-off-by: Jens Axboe
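
The lost-merge effect described above can be pictured with a toy model. This is only a sketch and not CFQ's actual logic: struct toy_req and toy_count_after_merge() are hypothetical names, and the rule that requests merge only when they are contiguous and share a sync/async class is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: a contiguous run of sectors plus a class. */
struct toy_req {
	unsigned long sector;	/* first sector */
	unsigned long nr;	/* number of sectors */
	bool sync;		/* class assigned at submission time */
};

/*
 * Count the requests left after merging neighbours in submission order.
 * Assumption: two requests merge only if the second starts where the
 * first ends AND both carry the same sync/async class.
 */
static size_t toy_count_after_merge(const struct toy_req *reqs, size_t n)
{
	size_t out = 0;
	size_t i;
	unsigned long run_end = 0;
	bool run_sync = false;

	for (i = 0; i < n; i++) {
		bool contiguous = out > 0 && reqs[i].sector == run_end;

		if (!contiguous || reqs[i].sync != run_sync)
			out++;	/* cannot merge: starts a new request */
		run_end = reqs[i].sector + reqs[i].nr;
		run_sync = reqs[i].sync;
	}
	return out;
}
```

In this model, four contiguous writes that are half async and half sync end up as two requests, while the same four writes with a uniform class collapse into one.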

The comment describing how MS_ASYNC works in msync.c is confusing

2006-03-24T17:30:53+00:00

The confusion comes from a typo: this patch just changes "my" to "by",
which I believe was the original intent.

Signed-off-by: Adrian Bunk

[PATCH] msync(): use do_fsync()

2006-03-24T15:33:27+00:00

No need to duplicate all that code.

Cc: Hugh Dickins 
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] msync: fix return value

2006-03-24T15:33:26+00:00

msync() does a strange thing.  Essentially:

	vma = find_vma();
	for ( ; ; ) {
		if (!vma)
			return -ENOMEM;
		...
		vma = vma->vm_next;
	}

so an msync() request which starts within or before a valid VMA and which ends
within or beyond the final VMA will incorrectly return -ENOMEM.

Fix.

Cc: Hugh Dickins 
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
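
As a rough userspace model of the loop shape discussed above (not the kernel code and not necessarily what the patch does), the walk below stops as soon as the requested range is exhausted, so running off the end of the list is only an error while some of the range is still left. struct toy_vma, toy_find_vma() and toy_msync_walk() are hypothetical names invented for this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: covers [start, end), sorted by address. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	struct toy_vma *next;
};

/* Return the first area whose end lies above addr, or NULL. */
static struct toy_vma *toy_find_vma(struct toy_vma *head, unsigned long addr)
{
	while (head && head->end <= addr)
		head = head->next;
	return head;
}

/*
 * Walk [start, end).  Returns 0 if the whole range is mapped and
 * -ENOMEM if any part of it is not.
 */
static int toy_msync_walk(struct toy_vma *head, unsigned long start,
			  unsigned long end)
{
	struct toy_vma *vma = toy_find_vma(head, start);
	int unmapped_error = 0;

	while (start < end) {
		if (!vma)
			return -ENOMEM;	/* range runs past the last area */
		if (start < vma->start) {
			/* hole in front of this area: remember it, skip it */
			unmapped_error = -ENOMEM;
			start = vma->start;
			continue;
		}
		/* ... sync the part of this area inside the range here ... */
		start = vma->end;
		vma = vma->next;
	}
	return unmapped_error;
}
```

With this shape, a request that ends exactly at the end of the final area returns 0, because the loop condition is checked before the NULL test.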

[PATCH] msync(MS_SYNC): don't hold mmap_sem while syncing

2006-03-24T15:33:26+00:00

It seems bad to hold mmap_sem while performing synchronous disk I/O.  Alter
the msync(MS_SYNC) code so that the lock is released while we sync the file.

Cc: Hugh Dickins 
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] msync(): perform dirty page levelling

2006-03-24T15:33:26+00:00

It seems sensible to perform dirty page throttling in msync: as the application
dirties pages we can kick off pdflush early, force the msync() caller to
perform writeout, or throttle the msync() caller.

The main effect of this is to start disk writeback earlier if we've just
discovered that a large amount of pagecache has been dirtied.  (Otherwise it
wouldn't happen for up to five seconds, until the next time pdflush wakes up.)

It will also cause the page-dirtying process to get penalised for dirtying
those pages rather than whacking someone else with the problem.

We should do this for munmap() and possibly even exit(), too.

We drop the mmap_sem while performing the dirty page balancing.  It doesn't
seem right to hold mmap_sem for that long.

Note that this patch only affects MS_ASYNC.  MS_SYNC will be syncing all the
dirty pages anyway.

We note that msync(MS_SYNC) does a full-file-sync inside mmap_sem, and always
has.  We can fix that up...

The patch also tightens up the mmap_sem coverage in sys_msync(): no point in
taking it while we perform the incoming arg checking.

Cc: Hugh Dickins 
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
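
For readers less familiar with the two flags mentioned above, here is a minimal userspace sketch of calling msync() with MS_ASYNC and then MS_SYNC on a shared file mapping. The helper name write_and_msync() is made up for this example, and the flag comments describe the POSIX semantics, not what any particular kernel version does internally.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy msg into a one-page shared mapping of fd, then flush it.
 * Returns 0 on success, -1 on failure.
 */
static int write_and_msync(int fd, const char *msg)
{
	size_t len = strlen(msg);
	long page = sysconf(_SC_PAGESIZE);
	char *map;
	int ret;

	if (page <= 0 || len > (size_t)page)
		return -1;
	if (ftruncate(fd, (off_t)page) != 0)	/* make the file one page long */
		return -1;

	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
		   fd, 0);
	if (map == MAP_FAILED)
		return -1;

	memcpy(map, msg, len);	/* dirties the page */

	/* MS_ASYNC: request writeback without waiting for it. */
	ret = msync(map, (size_t)page, MS_ASYNC);
	/* MS_SYNC: return only once the write has completed. */
	if (ret == 0)
		ret = msync(map, (size_t)page, MS_SYNC);

	if (munmap(map, (size_t)page) != 0)
		ret = -1;
	return ret;
}
```

Only one of MS_ASYNC and MS_SYNC may be given per call, which is why the sketch issues two separate calls.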

[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem

2006-01-09T23:59:24+00:00

This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as is practical on ia64. Your luck with it
might be different.

Modified-by: Ingo Molnar 

(finished the conversion)

Signed-off-by: Jes Sorensen 
Signed-off-by: Ingo Molnar

mm: re-architect the VM_UNPAGED logic

2005-11-28T22:34:23+00:00

This replaces the (in my opinion horrible) VM_UNPAGED logic with very
explicit support for a "remapped page range" aka VM_PFNMAP.  It allows a
VM area to contain an arbitrary range of page table entries that the VM
never touches, and never considers to be normal pages.

Any user of "remap_pfn_range()" automatically gets this new
functionality, and doesn't even have to mark the pages reserved or
indeed mark them any other way.  It just works.  As a side effect, doing
mmap() on /dev/mem works for arbitrary ranges.

Sparc update from David in the next commit.

Signed-off-by: Linus Torvalds

[PATCH] unpaged: VM_UNPAGED

2005-11-22T17:13:42+00:00

Although we tend to associate VM_RESERVED with remap_pfn_range, quite a few
drivers set VM_RESERVED on areas which are then populated by nopage.  The
PageReserved removal in 2.6.15-rc1 changed VM_RESERVED not to free pages in
zap_pte_range, without changing those drivers not to set it: so their pages
just leak away.

Let's not change miscellaneous drivers now: introduce VM_UNPAGED at the core,
to flag the special areas where the ptes may have no struct page, or if they
have then it's not to be touched.  Replace most instances of VM_RESERVED in
core mm by VM_UNPAGED.  Force it on in remap_pfn_range, and the sparc and
sparc64 io_remap_pfn_range.

Revert addition of VM_RESERVED to powerpc vdso, it's not needed there.  Is it
needed anywhere?  It still governs the mm->reserved_vm statistic, and special
vmas not to be merged, and areas not to be core dumped; but could probably be
eliminated later (the drivers are probably specifying it because in 2.4 it
kept swapout off the vma, but in 2.6 we work from the LRU, which these pages
don't get on).

Use the VM_SHM slot for VM_UNPAGED, and define VM_SHM to 0: it serves no
purpose whatsoever, and should be removed from drivers when we clean up.

Signed-off-by: Hugh Dickins 
Acked-by: William Irwin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] mm: pte_offset_map_lock loops

2005-10-30T04:40:40+00:00

Convert those common loops using page_table_lock on the outside and
pte_offset_map within to use just pte_offset_map_lock within instead.

These all hold mmap_sem (some exclusively, some not), so at no level can a
page table be whipped away from beneath them.  But whereas pte_alloc loops
tested with the "atomic" pmd_present, these loops are testing with pmd_none,
which on i386 PAE tests both lower and upper halves.

That's now unsafe, so add a cast into pmd_none to test only the vital lower
half: we lose a little sensitivity to a corrupt middle directory, but not
enough to worry about.  It appears that i386 and UML were the only
architectures vulnerable in this way, and that pgd and pud pose no problem.

Signed-off-by: Hugh Dickins 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
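
The difference between testing both halves of a 64-bit entry and testing only the lower half can be shown with a small userspace model. It is a sketch under assumptions: toy_pmd_t and the two helpers are hypothetical names, and the "torn" value below is simply an illustration of an entry whose upper half is non-zero while the lower half (which holds the present bit) is still zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-bit directory entry, as on a 32-bit machine with PAE. */
typedef uint64_t toy_pmd_t;

/* Tests both 32-bit halves of the entry. */
static bool toy_pmd_none_full(toy_pmd_t pmd)
{
	return pmd == 0;
}

/* Tests only the lower half; the cast discards the upper 32 bits. */
static bool toy_pmd_none_low(toy_pmd_t pmd)
{
	return (uint32_t)pmd == 0;
}
```

For an entry with only upper bits set, the full test reports "not none" while the lower-half test reports "none". That also illustrates the lost sensitivity the message mentions: junk confined to the upper half is no longer noticed by this check.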