summaryrefslogtreecommitdiff
path: root/fs/ext3/inode.c
AgeCommit message (Collapse)Author
2012-09-14ext3: Fix fdatasync() for files with only i_size changesJan Kara
commit 156bddd8e505b295540f3ca0e27dda68cb0d49aa upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen <knielsen@knielsen-hq.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-31ext3: move headers to fs/ext3/Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-02-29ext3: Update ctime in ext3_splice_branch() only when neededKazuya Mio
Currently ext3 updates ctime in ext3_splice_branch() which is called whenever we allocate one block. But it is wasteful because ext3 doesn't support nanosecond timestamp. This leads to a performance loss. Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-01-09Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2/3/4: delete unneeded includes of module.h ext{3,4}: Fix potential race when setversion ioctl updates inode udf: Mark LVID buffer as uptodate before marking it dirty ext3: Don't warn from writepage when readonly inode is spotted after error jbd: Remove j_barrier mutex reiserfs: Force inode evictions before umount to avoid crash reiserfs: Fix quota mount option parsing udf: Treat symlink component of type 2 as / udf: Fix deadlock when converting file from in-ICB one to normal one udf: Cleanup calling convention of inode_getblk() ext2: Fix error handling on inode bitmap corruption ext3: Fix error handling on inode bitmap corruption ext3: replace ll_rw_block with other functions ext3: NULL dereference in ext3_evict_inode() jbd: clear revoked flag on buffers before a new transaction started ext3: call ext3_mark_recovery_complete() when recovery is really needed
2012-01-09ext2/3/4: delete unneeded includes of module.hPaul Gortmaker
Delete any instances of include module.h that were not strictly required. In the case of ext2, the declaration of MODULE_LICENSE etc. were in inode.c but the module_init/exit were in super.c, so relocate the MODULE_LICENCE/AUTHOR block to super.c which makes it consistent with ext3 and ext4 at the same time. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-01-09ext3: Don't warn from writepage when readonly inode is spotted after errorJan Kara
WARN_ON_ONCE(IS_RDONLY(inode)) tends to trip when filesystem hits error and is remounted read-only. This unnecessarily scares users (well, they should be scared because of filesystem error, but the stack trace distracts them from the right source of their fear ;-). We could as well just remove the WARN_ON but it's not hard to fix it to not trip on filesystem with errors and not use more cycles in the common case so that's what we do. CC: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2012-01-09ext3: replace ll_rw_block with other functionsZheng Liu
ll_rw_block() is deprecated. Thus we replace it with other functions. CC: Jan Kara <jack@suse.cz> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-12-02treewide: Fix typos in various parts of the kernel, and fix some comments.Justin P. Mattock
The below patch fixes some typos in various parts of the kernel, as well as fixes some comments. Please let me know if I missed anything, and I will try to get it changed and resent. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-11-22ext3: NULL dereference in ext3_evict_inode()Dan Carpenter
This is an fsfuzzer bug. ->s_journal is set at the end of ext3_load_journal() but we try to use it in the error handling from ext3_get_journal() while it's still NULL. [ 337.039041] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024 [ 337.040380] IP: [<ffffffff816e6539>] _raw_spin_lock+0x9/0x30 [ 337.041687] PGD 0 [ 337.043118] Oops: 0002 [#1] SMP [ 337.044483] CPU 3 [ 337.044495] Modules linked in: ecb md4 cifs fuse kvm_intel kvm brcmsmac brcmutil crc8 cordic r8169 [last unloaded: scsi_wait_scan] [ 337.047633] [ 337.049259] Pid: 8308, comm: mount Not tainted 3.2.0-rc2-next-20111121+ #24 SAMSUNG ELECTRONICS CO., LTD. RV411/RV511/E3511/S3511 /RV411/RV511/E3511/S3511 [ 337.051064] RIP: 0010:[<ffffffff816e6539>] [<ffffffff816e6539>] _raw_spin_lock+0x9/0x30 [ 337.052879] RSP: 0018:ffff8800b1d11ae8 EFLAGS: 00010282 [ 337.054668] RAX: 0000000000000100 RBX: 0000000000000000 RCX: ffff8800b77c2000 [ 337.056400] RDX: ffff8800a97b5c00 RSI: 0000000000000000 RDI: 0000000000000024 [ 337.058099] RBP: ffff8800b1d11ae8 R08: 6000000000000000 R09: e018000000000000 [ 337.059841] R10: ff67366cc2607c03 R11: 00000000110688e6 R12: 0000000000000000 [ 337.061607] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8800a78f06e8 [ 337.063385] FS: 00007f9d95652800(0000) GS:ffff8800b7180000(0000) knlGS:0000000000000000 [ 337.065110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 337.066801] CR2: 0000000000000024 CR3: 00000000aef2c000 CR4: 00000000000006e0 [ 337.068581] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 337.070321] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 337.072105] Process mount (pid: 8308, threadinfo ffff8800b1d10000, task ffff8800b1d02be0) [ 337.073800] Stack: [ 337.075487] ffff8800b1d11b08 ffffffff811f48cf ffff88007ac9b158 0000000000000000 [ 337.077255] ffff8800b1d11b38 ffffffff8119405d ffff88007ac9b158 ffff88007ac9b250 [ 337.078851] ffffffff8181bda0 ffffffff8181bda0 ffff8800b1d11b68 ffffffff81131e31 [ 337.080284] Call Trace: [ 337.081706] [<ffffffff811f48cf>] log_start_commit+0x1f/0x40 [ 337.083107] [<ffffffff8119405d>] ext3_evict_inode+0x1fd/0x2a0 [ 337.084490] [<ffffffff81131e31>] evict+0xa1/0x1a0 [ 337.085857] [<ffffffff81132031>] iput+0x101/0x210 [ 337.087220] [<ffffffff811339d1>] iget_failed+0x21/0x30 [ 337.088581] [<ffffffff811905fc>] ext3_iget+0x15c/0x450 [ 337.089936] [<ffffffff8118b0c1>] ? ext3_rsv_window_add+0x81/0x100 [ 337.091284] [<ffffffff816df9a4>] ext3_get_journal+0x15/0xde [ 337.092641] [<ffffffff811a2e9b>] ext3_fill_super+0xf2b/0x1c30 [ 337.093991] [<ffffffff810ddf7d>] ? register_shrinker+0x4d/0x60 [ 337.095332] [<ffffffff8111c112>] mount_bdev+0x1a2/0x1e0 [ 337.096680] [<ffffffff811a1f70>] ? ext3_setup_super+0x210/0x210 [ 337.098026] [<ffffffff8119a770>] ext3_mount+0x10/0x20 [ 337.099362] [<ffffffff8111cbee>] mount_fs+0x3e/0x1b0 [ 337.100759] [<ffffffff810eda1b>] ? __alloc_percpu+0xb/0x10 [ 337.102330] [<ffffffff81135385>] vfs_kern_mount+0x65/0xc0 [ 337.103889] [<ffffffff8113611f>] do_kern_mount+0x4f/0x100 [ 337.105442] [<ffffffff811378fc>] do_mount+0x19c/0x890 [ 337.106989] [<ffffffff810e8456>] ? memdup_user+0x46/0x90 [ 337.108572] [<ffffffff810e84f3>] ? strndup_user+0x53/0x70 [ 337.110114] [<ffffffff811383fb>] sys_mount+0x8b/0xe0 [ 337.111617] [<ffffffff816ed93b>] system_call_fastpath+0x16/0x1b [ 337.113133] Code: 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f b6 03 38 c2 75 f7 48 83 c4 08 5b 5d c3 0f 1f 84 00 00 00 00 00 55 b8 00 01 00 00 48 89 e5 <f0> 66 0f c1 07 0f b6 d4 38 c2 74 0c 0f 1f 00 f3 90 0f b6 07 38 [ 337.116588] RIP [<ffffffff816e6539>] _raw_spin_lock+0x9/0x30 [ 337.118260] RSP <ffff8800b1d11ae8> [ 337.119998] CR2: 0000000000000024 [ 337.188701] ---[ end trace c36d790becac1615 ]--- Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-11-02filesystems: add set_nlink()Miklos Szeredi
Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-08-23block: separate priority boosting from REQ_METAChristoph Hellwig
Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule, and lave REQ_META purely for marking requests as metadata in blktrace. All existing callers of REQ_META except for XFS are updated to also set REQ_PRIO for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23block: remove READ_META and WRITE_METAChristoph Hellwig
Replace all occurnanced of the undocumented READ_META with READ | REQ_META and remove the unused WRITE_META define. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-26Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: jbd: change the field "b_cow_tid" of struct journal_head from type unsigned to tid_t ext3.txt: update the links in the section "useful links" to the latest ones ext3: Fix data corruption in inodes with journalled data ext2: check xattr name_len before acquiring xattr_sem in ext2_xattr_get ext3: Fix compilation with -DDX_DEBUG quota: Remove unused declaration jbd: Use WRITE_SYNC in journal checkpoint. jbd: Fix oops in journal_remove_journal_head() ext3: Return -EINVAL when start is beyond the end of fs in ext3_trim_fs() ext3/ioctl.c: silence sparse warnings about different address spaces ext3/ext4 Documentation: remove bh/nobh since it has been deprecated ext3: Improve truncate error handling ext3: use proper little-endian bitops ext2: include fs.h into ext2_fs.h ext3: Fix oops in ext3_try_to_allocate_with_rsv() jbd: fix a bug of leaking jh->b_jcount jbd: remove dependency on __GFP_NOFAIL ext3: Convert ext3 to new truncate calling convention jbd: Add fixed tracepoints ext3: Add fixed tracepoints Resolve conflicts in fs/ext3/fsync.c due to fsync locking push-down and new fixed tracepoints.
2011-07-23ext3: Fix data corruption in inodes with journalled dataJan Kara
When journalling data for an inode (either because it is a symlink or because the filesystem is mounted in data=journal mode), ext3_evict_inode() can discard unwritten data by calling truncate_inode_pages(). This is because we don't mark the buffer / page dirty when journalling data but only add the buffer to the running transaction and thus mm does not know there are still unwritten data. Fix the problem by carefully tracking transaction containing inode's data, committing this transaction, and writing uncheckpointed buffers when inode should be reaped. Signed-off-by: Jan Kara <jack@suse.cz>
2011-07-20fs: simplify the blockdev_direct_IO prototypeChristoph Hellwig
Simple filesystems always pass inode->i_sb_bdev as the block device argument, and never need a end_io handler. Let's simply things for them and for my grepping activity by dropping these arguments. The only thing not falling into that scheme is ext4, which passes and end_io handler without needing special flags (yet), but given how messy the direct I/O code there is use of __blockdev_direct_IO in one instead of two out of three cases isn't going to make a large difference anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20fs: move inode_dio_wait calls into ->setattrChristoph Hellwig
Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio referenes from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-25ext3: Improve truncate error handlingJan Kara
New truncate calling convention allows us to handle errors from ext3_block_truncate_page(). So reorganize the code so that ext3_block_truncate_page() is called before we change inode size. This also removes unnecessary block zeroing from error recovery after failed buffered writes (zeroing isn't needed because we could have never written non-zero data to disk). We have to be careful and keep zeroing in direct IO write error recovery because there we might have already overwritten end of the last file block. Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-25ext3: Convert ext3 to new truncate calling conventionJan Kara
Mostly trivial conversion. We fix a bug that IS_IMMUTABLE and IS_APPEND files could not be truncated during failed writes as we change the code. In fact the test is not needed at all because both IS_IMMUTABLE and IS_APPEND is tested in upper layers in do_sys_[f]truncate(), may_write(), etc. Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-25ext3: Add fixed tracepointsLukas Czerner
This commit adds fixed tracepoints to the ext3 code. It is based on ext4 tracepoints, however due to the differences of both file systems, there are some tracepoints missing (those for delaloc and for multi-block allocator) and there are some ext3 specific as well (for reservation windows). Here is a list: ext3_free_inode ext3_request_inode ext3_allocate_inode ext3_evict_inode ext3_drop_inode ext3_mark_inode_dirty ext3_write_begin ext3_ordered_write_end ext3_writeback_write_end ext3_journalled_write_end ext3_ordered_writepage ext3_writeback_writepage ext3_journalled_writepage ext3_readpage ext3_releasepage ext3_invalidatepage ext3_discard_blocks ext3_request_blocks ext3_allocate_blocks ext3_free_blocks ext3_sync_file_enter ext3_sync_file_exit ext3_sync_fs ext3_rsv_window_add ext3_discard_reservation ext3_alloc_new_reservation ext3_reserved ext3_forget ext3_read_block_bitmap ext3_direct_IO_enter ext3_direct_IO_exit ext3_unlink_enter ext3_unlink_exit ext3_truncate_enter ext3_truncate_exit ext3_get_blocks_enter ext3_get_blocks_exit ext3_load_inode Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz>
2011-05-27fs: pass exact type of data dirties to ->dirty_inodeChristoph Hellwig
Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or anything else, so that the filesystem can track internally if it needs to push out a transaction for fdatasync or not. This is just the prototype change with no user for it yet. I plan to push large XFS changes for the next merge window, and getting this trivial infrastructure in this window would help a lot to avoid tree interdependencies. Also remove incorrect comments that ->dirty_inode can't block. That has been changed a long time ago, and many implementations rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-04-08Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: quota: Don't write quota info in dquot_commit() ext3: Fix writepage credits computation for ordered mode
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-24ext3: Fix writepage credits computation for ordered modeYongqiang Yang
Original computation forgets to count writes of indirect block themselves (it only counts with blocks necessary for their allocation) in ordered mode. Acked-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by:Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-03-10block: remove per-queue pluggingJens Axboe
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-10ext3: Add more journal error checkNamhyung Kim
Check return value of ext3_journal_get_write_acccess() and ext3_journal_dirty_metadata(). Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-27Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (24 commits) quota: Fix possible oops in __dquot_initialize() ext3: Update kernel-doc comments jbd/2: fixed typos ext2: fixed typo. ext3: Fix debug messages in ext3_group_extend() jbd: Convert atomic_inc() to get_bh() ext3: Remove misplaced BUFFER_TRACE() in ext3_truncate() jbd: Fix debug message in do_get_write_access() jbd: Check return value of __getblk() ext3: Use DIV_ROUND_UP() on group desc block counting ext3: Return proper error code on ext3_fill_super() ext3: Remove unnecessary casts on bh->b_data ext3: Cleanup ext3_setup_super() quota: Fix issuing of warnings from dquot_transfer quota: fix dquot_disable vs dquot_transfer race v2 jbd: Convert bitops to buffer fns ext3/jbd: Avoid WARN() messages when failing to write the superblock jbd: Use offset_in_page() instead of manual calculation jbd: Remove unnecessary goto statement jbd: Use printk_ratelimited() in journal_alloc_journal_head() ...
2010-10-28ext3: Update kernel-doc commentsNamhyung Kim
Update missing/broken argument descriptions and fix formatting. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28ext3: Remove misplaced BUFFER_TRACE() in ext3_truncate()Namhyung Kim
Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-25fs: kill block_prepare_writeChristoph Hellwig
__block_write_begin and block_prepare_write are identical except for slightly different calling conventions. Convert all callers to the __block_write_begin calling conventions and drop block_prepare_write. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c
2010-08-09convert ext3 to ->evict_inode()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09remove inode_setattrChristoph Hellwig
Replace inode_setattr with opencoded variants of it in all callers. This moves the remaining call to vmtruncate into the filesystem methods where it can be replaced with the proper truncate sequence. In a few cases it was obvious that we would never end up calling vmtruncate so it was left out in the opencoded variant: spufs: explicitly checks for ATTR_SIZE earlier btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier ufs: contains an opencoded simple_seattr + truncate that sets the filesize just above In addition to that ncpfs called inode_setattr with handcrafted iattrs, which allowed to trim down the opencoded variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09introduce __block_write_beginChristoph Hellwig
Split up the block_write_begin implementation - __block_write_begin is a new trivial wrapper for block_prepare_write that always takes an already allocated page and can be either called from block_write_begin or filesystem code that already has a page allocated. Remove the handling of already allocated pages from block_write_begin after switching all callers that do it to __block_write_begin. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09sort out blockdev_direct_IO variantsChristoph Hellwig
Move the call to vmtruncate to get rid of accessive blocks to the callers in prepearation of the new truncate calling sequence. This was only done for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant was not needed anyway. Get rid of blockdev_direct_IO_no_locking and its _newtrunc variant while at it as just opencoding the two additional paramters is shorted than the name suffix. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-05ext3: Fix dirtying of journalled buffers in data=journal modeJan Kara
In data=journal mode, we still use block_write_begin() to prepare page for writing. This function can occasionally mark buffer dirty which violates journalling assumptions - when a buffer is part of a transaction, it should be dirty and a buffer can be already part of a forget list of some transaction when block_write_begin() gets called. This violation of journalling assumptions then results in "JBD: Spotted dirty metadata buffer..." warnings. In fact, temporary dirtying the buffer while the page is still locked does not really cause problems to the journalling because we won't write the buffer until the page gets unlocked. So we just have to make sure to clear dirty bits before unlocking the page. Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>
2010-07-21ext3: Avoid filesystem corruption after a crash under heavy delete loadJan Kara
It can happen that ext3_free_branches calls ext3_forget() for an indirect block in an earlier transaction than a transaction in which we clear pointer to this indirect block. Thus if we crash before a transaction clearing the block pointer is committed, we will see indirect block pointing to already freed blocks and complain during orphan list cleanup. The fix is simple: Make sure ext3_forget() is called in the transaction doing block pointer clearing. This is a backport of an ext4 fix by Amir G. <amir73il@users.sourceforge.net> Signed-off-by: Jan Kara <jack@suse.cz>
2010-07-21ext3: remove vestiges of nobh supportChristoph Hellwig
The nobh option was only supported for writeback mode, but given that all write paths (except mmapped writed) actually create buffer heads, it effectively was a no-op already. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21quota: unify quota init condition in setattrDmitry Monakhov
Quota must being initialized if size or uid/git changes requested. But initialization performed in two different places: in case of i_size file system is responsible for dquot init , but in case of uid/gid init will be called internally in dquot_transfer(). This ambiguity makes code harder to understand. Let's move this logic to one common helper function. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-29ext3: fix broken handling of EXT3_STATE_NEWLinus Torvalds
In commit 9df93939b735 ("ext3: Use bitops to read/modify EXT3_I(inode)->i_state") ext3 changed its internal 'i_state' variable to use bitops for its state handling. However, unline the same ext4 change, it didn't actually change the name of the field when it changed the semantics of it. As a result, an old use of 'i_state' remained in fs/ext3/ialloc.c that initialized the field to EXT3_STATE_NEW. And that does not work _at_all_ when we're now working with individually named bits rather than values that get masked. So the code tried to mark the state to be new, but in actual fact set the field to EXT3_STATE_JDATA. Which makes no sense at all, and screws up all the code that checks whether the inode was newly allocated. In particular, it made the xattr code unhappy, and caused various random behavior, like apparently https://bugzilla.redhat.com/show_bug.cgi?id=577911 So fix the initialization, and rename the field to match ext4 so that we don't have this happen again. Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Daniel J Walsh <dwalsh@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-05Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits) quota: stop using QUOTA_OK / NO_QUOTA dquot: cleanup dquot initialize routine dquot: move dquot initialization responsibility into the filesystem dquot: cleanup dquot drop routine dquot: move dquot drop responsibility into the filesystem dquot: cleanup dquot transfer routine dquot: move dquot transfer responsibility into the filesystem dquot: cleanup inode allocation / freeing routines dquot: cleanup space allocation / freeing routines ext3: add writepage sanity checks ext3: Truncate allocated blocks if direct IO write fails to update i_size quota: Properly invalidate caches even for filesystems with blocksize < pagesize quota: generalize quota transfer interface quota: sb_quota state flags cleanup jbd: Delay discarding buffers in journal_unmap_buffer ext3: quota_write cross block boundary behaviour quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota quota: split out compat_sys_quotactl support from quota.c quota: split out netlink notification support from quota.c quota: remove invalid optimization from quota_sync_all ... Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
2010-03-05pass writeback_control to ->write_inodeChristoph Hellwig
This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-05dquot: cleanup dquot initialize routineChristoph Hellwig
Get rid of the initialize dquot operation - it is now always called from the filesystem and if a filesystem really needs it's own (which none currently does) it can just call into it's own routine directly. Rename the now static low-level dquot_initialize helper to __dquot_initialize and vfs_dq_init to dquot_initialize to have a consistent namespace. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05dquot: move dquot initialization responsibility into the filesystemChristoph Hellwig
Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05dquot: cleanup dquot transfer routineChristoph Hellwig
Get rid of the transfer dquot operation - it is now always called from the filesystem and if a filesystem really needs it's own (which none currently does) it can just call into it's own routine directly. Rename the now static low-level dquot_transfer helper to __dquot_transfer and vfs_dq_transfer to dquot_transfer to have a consistent namespace, and make the new dquot_transfer return a normal negative errno value which all callers expect. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05dquot: cleanup space allocation / freeing routinesChristoph Hellwig
Get rid of the alloc_space, free_space, reserve_space, claim_space and release_rsv dquot operations - they are always called from the filesystem and if a filesystem really needs their own (which none currently does) it can just call into it's own routine directly. Move shared logic into the common __dquot_alloc_space, dquot_claim_space_nodirty and __dquot_free_space low-level methods, and rationalize the wrappers around it to move as much as possible code into the common block for CONFIG_QUOTA vs not. Also rename all these helpers to be named dquot_* instead of vfs_dq_*. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05ext3: add writepage sanity checksDmitry Monakhov
- There is theoretical possibility to perform writepage on RO superblock. Add explicit check for what case. - Page must being locked before writepage. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05ext3: Truncate allocated blocks if direct IO write fails to update i_sizeJan Kara
We have to truncate blocks allocated to file during direct IO when we fail to update i_size properly. Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05ext3: Use bitops to read/modify EXT3_I(inode)->i_stateJan Kara
At several places we modify EXT3_I(inode)->i_state without holding i_mutex (ext3_release_file, ext3_bmap, ext3_journalled_writepage, ext3_do_update_inode, ...). These modifications are racy and we can lose updates to i_state. So convert handling of i_state to use bitops which are atomic. Signed-off-by: Jan Kara <jack@suse.cz>
2009-12-23ext3: quota macros cleanup [V2]Dmitry Monakhov
Currently all quota block reservation macros contains hardcoded "2" aka MAXQUOTAS value. This is no good because in some places it is not obvious to understand what does this digit represent. Let's introduce new macro with self descriptive name. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
2009-12-10ext3: Fix data / filesystem corruption when write fails to copy dataJan Kara
When ext3_write_begin fails after allocating some blocks or generic_perform_write fails to copy data to write, we truncate blocks already instantiated beyond i_size. Although these blocks were never inside i_size, we have to truncate pagecache of these blocks so that corresponding buffers get unmapped. Otherwise subsequent __block_prepare_write (called because we are retrying the write) will find the buffers mapped, not call ->get_block, and thus the page will be backed by already freed blocks leading to filesystem and data corruption. Reported-by: James Y Knight <foom@fuhm.net> Signed-off-by: Jan Kara <jack@suse.cz>