linux-toradex.git/fs/cachefiles, branch v3.10.78

lift sb_start_write() out of ->write()

2013-04-09T18:12:56+00:00

Signed-off-by: Al Viro

FS-Cache: Mark cancellation of in-progress operation

2012-12-20T22:34:00+00:00

Mark as cancelled an operation that is in progress rather than pending at the
time it is cancelled, and call fscache_complete_op() to cancel an operation so
that blocked ops can be started.

Signed-off-by: David Howells

FS-Cache: Don't mask off the object event mask when printing it

2012-12-20T22:08:53+00:00

Don't mask off the object event mask when printing it.  That way it can be seen
if threre are bits set that shouldn't be.

Signed-off-by: David Howells

CacheFiles: Add missing retrieval completions

2012-12-20T22:07:40+00:00

CacheFiles is missing some calls to fscache_retrieval_complete() in the error
handling/collision paths of its reader functions.

This can be seen by the following assertion tripping in fscache_put_operation()
whereby the operation being destroyed is still in the in-progress state and has
not been cancelled or completed:

FS-Cache: Assertion failed
3 == 5 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:408!
invalid opcode: 0000 [#1] SMP
CPU 2
Modules linked in: xfs ioatdma dca loop joydev evdev
psmouse dcdbas pcspkr serio_raw i5000_edac edac_core i5k_amb shpchp
pci_hotplug sg sr_mod]

Pid: 8062, comm: httpd Not tainted 3.1.0-rc8 #1 Dell Inc. PowerEdge 1950/0DT097
RIP: 0010:[]  [] fscache_put_operation+0x304/0x330
RSP: 0018:ffff880062f739d8  EFLAGS: 00010296
RAX: 0000000000000025 RBX: ffff8800c5122e84 RCX: ffffffff81ddf040
RDX: 00000000ffffffff RSI: 0000000000000082 RDI: ffffffff81ddef30
RBP: ffff880062f739f8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000003 R12: ffff8800c5122e40
R13: ffff880037a2cd20 R14: ffff880087c7a058 R15: ffff880087c7a000
FS:  00007f63dcf636e0(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0c0a91f000 CR3: 0000000062ec2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process httpd (pid: 8062, threadinfo ffff880062f72000, task ffff880087e58000)
Stack:
 ffff880062f73bf8 0000000000000000 ffff880062f73bf8 ffff880037a2cd20
 ffff880062f73a68 ffffffff8119aa7e ffff88006540e000 ffff880062f73ad4
 ffff88008e9a4308 ffff880037a2cd20 ffff880062f73a48 ffff8800c5122e40
Call Trace:
 [] __fscache_read_or_alloc_pages+0x1fe/0x530
 [] __nfs_readpages_from_fscache+0x70/0x1c0
 [] nfs_readpages+0xca/0x1e0
 [] ? rpc_do_put_task+0x36/0x50
 [] ? alloc_nfs_open_context+0x4b/0x110
 [] ? rpc_call_sync+0x5a/0x70
 [] __do_page_cache_readahead+0x1ca/0x270
 [] ra_submit+0x21/0x30
 [] ondemand_readahead+0x11d/0x250
 [] page_cache_sync_readahead+0x36/0x60
 [] generic_file_aio_read+0x454/0x770
 [] nfs_file_read+0xe1/0x130
 [] do_sync_read+0xd9/0x120
 [] ? mntput+0x1f/0x40
 [] ? fput+0x1cb/0x260
 [] vfs_read+0xc8/0x180
 [] sys_read+0x55/0x90

Reported-by: Mark Moseley 
Signed-off-by: David Howells

CacheFiles: Implement invalidation

2012-12-20T22:06:08+00:00

Implement invalidation for CacheFiles.  This is in two parts:

 (1) Provide an invalidation method (which just truncates the backing file).

 (2) Abort attempts to copy anything read from the backing file whilst
     invalidation is in progress.

Question: CacheFiles uses truncation in a couple of places.  It has been using
notify_change() rather than sys_truncate() or something similar.  This means
it bypasses a bunch of checks and suchlike that it possibly should be making
(security, file locking, lease breaking, vfsmount write).  Should it be using
vfs_truncate() as added by a preceding patch or should it use notify_write()
and assume that anyone poking around in the cache files on disk gets
everything they deserve?

Signed-off-by: David Howells

FS-Cache: Fix operation state management and accounting

2012-12-20T21:58:26+00:00

Fix the state management of internal fscache operations and the accounting of
what operations are in what states.

This is done by:

 (1) Give struct fscache_operation a enum variable that directly represents the
     state it's currently in, rather than spreading this knowledge over a bunch
     of flags, who's processing the operation at the moment and whether it is
     queued or not.

     This makes it easier to write assertions to check the state at various
     points and to prevent invalid state transitions.

 (2) Add an 'operation complete' state and supply a function to indicate the
     completion of an operation (fscache_op_complete()) and make things call
     it.  The final call to fscache_put_operation() can then check that an op
     in the appropriate state (complete or cancelled).

 (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
     govern the state of an object:

	(a) The ->n_ops is now the number of extant operations on the object
	    and is now decremented by fscache_put_operation() only.

	(b) The ->n_in_progress is simply the number of objects that have been
	    taken off of the object's pending queue for the purposes of being
	    run.  This is decremented by fscache_op_complete() only.

	(c) The ->n_exclusive is the number of exclusive ops that have been
	    submitted and queued or are in progress.  It is decremented by
	    fscache_op_complete() and by fscache_cancel_op().

     fscache_put_operation() and fscache_operation_gc() now no longer try to
     clean up ->n_exclusive and ->n_in_progress.  That was leading to double
     decrements against fscache_cancel_op().

     fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
     double decrements against fscache_put_operation().

     fscache_submit_exclusive_op() now decides whether it has to queue an op
     based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
     will persist in being true even after all preceding operations have been
     cancelled or completed.  Furthermore, if an object is active and there are
     runnable ops against it, there must be at least one op running.

 (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
     provide a function to record completion of the pages as they complete.

     When n_pages reaches 0, the operation is deemed to be complete and
     fscache_op_complete() is called.

     Add calls to fscache_retrieval_complete() anywhere we've finished with a
     page we've been given to read or allocate for.  This includes places where
     we just return pages to the netfs for reading from the server and where
     accessing the cache fails and we discard the proposed netfs page.

The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs.  This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.


FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
RIP: 0010:[]  [] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
 ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
 [] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
 [] nfs_clear_inode+0x3c/0x41 [nfs]
 [] nfs4_evict_inode+0x2f/0x33 [nfs]
 [] evict+0xa1/0x15c
 [] dispose_list+0x2c/0x38
 [] prune_icache_sb+0x28c/0x29b
 [] prune_super+0xd5/0x140
 [] shrink_slab+0x102/0x1ab
 [] balance_pgdat+0x2f2/0x595
 [] ? process_timeout+0xb/0xb
 [] kswapd+0x270/0x289
 [] ? __init_waitqueue_head+0x46/0x46
 [] ? balance_pgdat+0x595/0x595
 [] kthread+0x7f/0x87
 [] kernel_thread_helper+0x4/0x10
 [] ? finish_task_switch+0x45/0xc0
 [] ? retint_restore_args+0xe/0xe
 [] ? __init_kthread_worker+0x53/0x53
 [] ? gs_change+0xb/0xb

Signed-off-by: David Howells

CacheFiles: Make some debugging statements conditional

2012-12-20T21:58:25+00:00

Downgrade some debugging statements to not unconditionally print stuff, but
rather be conditional on the appropriate module parameter setting.

Signed-off-by: David Howells

CacheFiles: Downgrade the requirements passed to the allocator

2012-12-20T21:58:25+00:00

Downgrade the requirements passed to the allocator in the gfp flags parameter.
FS-Cache/CacheFiles can handle OOM conditions simply by aborting the attempt to
store an object or a page in the cache.

Signed-off-by: David Howells

CacheFiles: Fix the marking of cached pages

2012-12-20T21:54:30+00:00

Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.

There are, however, two problems with this:

 (1) It can lead to the PG_fscache mark being applied _after_ the page is set
     PG_uptodate and unlocked (by the call to fscache_end_io()).

 (2) CacheFiles's ref on the page is dropped immediately following
     fscache_end_io() - and so may not still be held when the mark is applied.
     This can lead to the page being passed back to the allocator before the
     mark is applied.

Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page.  This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.

The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:

BUG: Bad page state in process tar  pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping:          (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G        W   3.1.0-rc4-fsdevel+ #1064
Call Trace:
 [] ? dump_page+0xb9/0xbe
 [] bad_page+0xd5/0xea
 [] get_page_from_freelist+0x35b/0x46a
 [] __alloc_pages_nodemask+0x362/0x662
 [] __do_page_cache_readahead+0x13a/0x267
 [] ? __do_page_cache_readahead+0xa2/0x267
 [] ra_submit+0x1c/0x20
 [] ondemand_readahead+0x28b/0x29a
 [] ? ondemand_readahead+0x163/0x29a
 [] page_cache_sync_readahead+0x38/0x3a
 [] generic_file_aio_read+0x2ab/0x67e
 [] nfs_file_read+0xa4/0xc9 [nfs]
 [] do_sync_read+0xba/0xfa
 [] ? security_file_permission+0x7b/0x84
 [] ? rw_verify_area+0xab/0xc8
 [] vfs_read+0xaa/0x13a
 [] sys_read+0x45/0x6c
 [] system_call_fastpath+0x16/0x1b

As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.

Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't.  This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.

Reported-by: M. Stevens 
Signed-off-by: David Howells 
Reviewed-by: Jeff Layton

fs: cachefiles: add support for large files in filesystem caching

2012-07-31T00:25:21+00:00

Support the caching of large files.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=31182

Signed-off-by: Justin Lecher 
Signed-off-by: Suresh Jayaraman 
Tested-by: Suresh Jayaraman 
Acked-by: David Howells 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds