summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-03-04ext4: use ext4_get_block_write in buffer writeJiaying Zhang
Allocate uninitialized extent before ext4 buffer write and convert the extent to initialized after io completes. The purpose is to make sure an extent can only be marked initialized after it has been written with new data so we can safely drop the i_mutex lock in ext4 DIO read without exposing stale data. This helps to improve multi-thread DIO read performance on high-speed disks. Skip the nobh and data=journal mount cases to make things simple for now. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-02ext4: mechanical rename some of the direct I/O get_block's identifiersJiaying Zhang
This commit renames some of the direct I/O's block allocation flags, variables, and functions introduced in Mingming's "Direct IO for holes and fallocate" patches so that they can be used by ext4's buffered write path as well. Also changed the related function comments accordingly to cover both direct write and buffered write cases. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-02ext4: make "offset" consistent in ext4_check_dir_entry()Toshiyuki Okajima
The callers of ext4_check_dir_entry() usually pass in the "file offset" (ext4_readdir, htree_dirblock_to_tree, search_dirblock, ext4_dx_find_entry, empty_dir), but a few callers (add_dirent_to_buf, ext4_delete_entry) only pass in the buffer offset. To accomodate those last two (which would be hard to fix otherwise), this patch changes ext4_check_dir_entry() to print the physical block number and the relative offset as well as the passed-in offset. Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: Handle non empty on-disk orphan linkDmitry Monakhov
In case of truncate errors we explicitly remove inode from in-core orphan list via orphan_del(NULL, inode) without modifying the on-disk list. But later on, the same inode may be inserted in the orphan list again which will result the on-disk linked list getting corrupted. If inode i_dtime contains valid value, then skip on-disk list modification. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: explicitly remove inode from orphan list after failed direct ioDmitry Monakhov
Otherwise non-empty orphan list will be triggered on umount. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: fix error handling in migrateDmitry Monakhov
Set i_nlink to zero for temporary inode from very beginning. otherwise we may fail to start new journal handle and this inode will be unreferenced but with i_nlink == 1 Since we hold inode reference it can not be pruned. Also add missed journal_start retval check. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: deprecate obsoleted mount optionsDmitry Monakhov
Declare following list of mount options as deprecated: - bsddf, miniddf - grpid, bsdgroups, nogrpid, sysvgroups Declare following list of default mount options as deprecated: - bsdgroups Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: Fix fencepost error in chosing choosing group vs file preallocation.Tao Ma
The ext4 multiblock allocator decides whether to use group or file preallocation based on the file size. When the file size reaches s_mb_stream_request (default is 16 blocks), it changes to use a file-specific preallocation. This is cool, but it has a tiny problem. See a simple script: mkfs.ext4 -b 1024 /dev/sda8 1000000 mount -t ext4 -o nodelalloc /dev/sda8 /mnt/ext4 for((i=0;i<5;i++)) do cat /mnt/4096>>/mnt/ext4/a #4096 is a file with 4096 characters. cat /mnt/4096>>/mnt/ext4/b done debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1 And you get BLOCKS: (0-14):8705-8719, (15):2356, (16-19):8465-8468 So there are 3 extents, a bit strange for the lonely 15th logical block. As we write to the 16 blocks, we choose file preallocation in ext4_mb_group_or_file, but in ext4_mb_normalize_request, we meet with the 16*1024 range, so no preallocation will be carried. file b then reserves the space after '2356', so when when write 16, we start from another part. This patch just change the check in ext4_mb_group_or_file, so that for the lonely 15 we will still use group preallocation. After the patch, we will get: debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1 BLOCKS: (0-15):8705-8720, (16-19):8465-8468 Looks more sane. Thanks. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-24jbd2: clean up an assertion in jbd2_journal_commit_transaction()dingdinghua
commit_transaction has the same value as journal->j_running_transaction, so we can simplify the assert statement. Signed-off-by: dingdinghua <dingdinghua@nrchpc.ac.cn> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-01ext4: trivial quota cleanupDmitry Monakhov
The patch is aimed to reorganize and simplify quota code a bit. Quota code is itself complex enough, but we can make it more readable in some places: - Move quota option parsing to separate functions. - Simplify old-quota and journaled-quota mix check. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-24ext4: mount flags manipulation cleanupDmitry Monakhov
Replace intermediate EXT4_MOUNT_XXX flags manipulation to corresponding macro. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-24ext4: Add flag to files with blocks intentionally past EOFJiaying Zhang
fallocate() may potentially instantiate blocks past EOF, depending on the flags used when it is called. e2fsck currently has a test for blocks past i_size, and it sometimes trips up - noticeably on xfstests 013 which runs fsstress. This patch from Jiayang does fix it up - it (along with e2fsprogs updates and other patches recently from Aneesh) has survived many fsstress runs in a row. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-16ext4: Fix BUG_ON at fs/buffer.c:652 in no journal modeCurt Wohlgemuth
Calls to ext4_handle_dirty_metadata should only pass in an inode pointer for inode-specific metadata, and not for shared metadata blocks such as inode table blocks, block group descriptors, the superblock, etc. The BUG_ON can get tripped when updating a special device (such as a block device) that is opened (so that i_mapping is set in fs/block_dev.c) and the file system is mounted in no journal mode. Addresses-Google-Bug: #2404870 Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-15jbd2: delay discarding buffers in journal_unmap_bufferdingdinghua
Delay discarding buffers in journal_unmap_buffer until we know that "add to orphan" operation has definitely been committed, otherwise the log space of committing transation may be freed and reused before truncate get committed, updates may get lost if crash happens. Signed-off-by: dingdinghua <dingdinghua@nrchpc.ac.cn> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-04ext4: correctly calculate number of blocks for fiemapLeonard Michlmayr
ext4_fiemap() rounds the length of the requested range down to blocksize, which is is not the true number of blocks that cover the requested region. This problem is especially impressive if the user requests only the first byte of a file: not a single extent will be reported. We fix this by calculating the last block of the region and then subtract to find the number of blocks in the extents. Signed-off-by: Leonard Michlmayr <leonard.michlmayr@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-15ext4: add missing error checking to ext4_expand_extra_isize_ea()Roel Kluin
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
2010-02-15ext4: move __func__ into a macro for ext4_warning, ext4_errorEric Sandeen
Just a pet peeve of mine; we had a mishash of calls with either __func__ or "function_name" and the latter tends to get out of sync. I think it's easier to just hide the __func__ in a macro, and it'll be consistent from then on. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-01-25ext4: Reserve INCOMPAT_EA_INODE and INCOMPAT_DIRDATA feature codepointsTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-01-24ext4: Use bitops to read/modify EXT4_I(inode)->i_stateTheodore Ts'o
At several places we modify EXT4_I(inode)->i_state without holding i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage, ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_state. So convert handling of i_state to use bitops which are atomic. Cc: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-12-07ext4: Use slab allocator for sub-page sized allocationsTheodore Ts'o
Now that the SLUB seems to be fixed so that it respects the requested alignment, use kmem_cache_alloc() to allocator if the block size of the buffer heads to be allocated is less than the page size. Previously, we were using 16k page on a Power system for each buffer, even when the file system was using 1k or 4k block size. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-01-01ext4: Add new tracepoints to debug delayed allocation space functionsTheodore Ts'o
Add tracepoints for ext4_da_reserve_space(), ext4_da_update_reserve_space(), and ext4_da_release_space(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-12-23ext4: Add new tracepoint for jbd2_cleanup_journal_tailTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-01-22ext4: Add block validity check when truncating indirect block mapped inodesTheodore Ts'o
Add checks to ext4_free_branches() to make sure a block number found in an indirect block are valid before trying to free it. If a bad block number is found, stop freeing the indirect block immediately, since the file system is corrupt and we will need to run fsck anyway. This also avoids spamming the logs, and specifically avoids driver-level "attempt to access beyond end of device" errors obscure what is really going on. If you get *really*, *really*, *really* unlucky, without this patch, a supposed indirect block containing garbage might contain a reference to a primary block group descriptor, in which case ext4_free_branches() could end up zero'ing out a block group descriptor block, and if then one of the block bitmaps for a block group described by that bg descriptor block is not in memory, and is read in by ext4_read_block_bitmap(). This function calls ext4_valid_block_bitmap(), which assumes that bg_inode_table() was validated at mount time and hasn't been modified since. Since this assumption is no longer valid, it's possible for the value (ext4_inode_table(sb, desc) - group_first_block) to go negative, which will cause ext4_find_next_zero_bit() to trigger a kernel GPF. Addresses-Google-Bug: #2220436 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-15ext4: Fix optional-arg mount optionsEric Sandeen
We have 2 mount options, "barrier" and "auto_da_alloc" which may or may not take a 1/0 argument. This causes the ext4 superblock mount code to subtract uninitialized pointers and pass the result to kmalloc, which results in very noisy failures. Per Ted's suggestion, initialize the args struct so that we know whether match_token() found an argument for the option, and skip match_int() if not. Also, return error (0) from parse_options if we thought we found an argument, but match_int() Fails. Reported-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-02-04ext4: fix async i/o writes beyond 4GB to a sparse fileEric Sandeen
The "offset" member in ext4_io_end holds bytes, not blocks, so ext4_lblk_t is wrong - and too small (u32). This caused the async i/o writes to sparse files beyond 4GB to fail when they wrapped around to 0. Also fix up the type of arguments to ext4_convert_unwritten_extents(), it gets ssize_t from ext4_end_aio_dio_nolock() and ext4_ext_direct_IO(). Reported-by: Giel de Nijs <giel@vectorwise.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
2010-02-12Linux 2.6.33-rc8v2.6.33-rc8Linus Torvalds
2010-02-12Merge branch 'fix/hda' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - use WARN_ON_ONCE() for zero-division detection
2010-02-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: drm/i915: hold ref on flip object until it completes drm/i915: Fix crash while aborting hibernation drm/i915: Correctly return -ENOMEM on allocation failure in cmdbuf ioctls. drm/i915: fix pipe source image setting in flip command drm/i915: fix flip done interrupt on Ironlake drm/i915: untangle page flip completion drm/i915: handle FBC and self-refresh better drm/i915: Increase fb alignment to 64k drm/i915: Update write_domains on active list after flush. drm/i915: Rework DPLL calculation parameters for Ironlake
2010-02-12ALSA: hda - use WARN_ON_ONCE() for zero-division detectionTakashi Iwai
Replace the zero-division warning message with WARN_ON_ONCE() per the advice by Linus. This shouldn't happen, but if it happens, it's possible that the bug happens often due to buggy IRQs. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-02-12parisc: fix tracing of signalsKyle McMartin
Mike Frysinger pointed out that calling tracehook_signal_handler with stepping=0 missed testing the thread flags, resulting in not calling ptrace_notify. Fix this by testing if we're single stepping or branch stepping and setting the flag accordingly. Tested, seems to work. Reported-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-12Merge branch 'fix/hda' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda-intel: Avoid divide by zero crash
2010-02-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator/lp3971: vol_map out of bounds in lp3971_{ldo,dcdc}_set_voltage() regulator: Fix display of null constraints for regulators
2010-02-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixesLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Fix bmap allocation corner-case bug GFS2: Fix error code
2010-02-12regulator/lp3971: vol_map out of bounds in lp3971_{ldo,dcdc}_set_voltage()Roel Kluin
After `for (val = LDO_VOL_MIN_IDX; val <= LDO_VOL_MAX_IDX; val++)', if no break occurs, val reaches LDO_VOL_MIN_IDX + 1, which is out of bounds for ldo45_voltage_map[] and ldo123_voltage_map[]. Similarly BUCK_TARGET_VOL_MAX_IDX + 1 is out of bounds for buck_voltage_map[]. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2010-02-12regulator: Fix display of null constraints for regulatorsMark Brown
If the regulator constraints are empty and there is no voltage reported then nothing will be added to the text displayed for the constraints, leading to random stack data being printed. This is unlikely to happen for practical regulators since most will at least report a voltage but should still be fixed. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: stable@kernel.org Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2010-02-12GFS2: Fix bmap allocation corner-case bugSteven Whitehouse
This patch solves a corner case during allocation which occurs if both metadata (indirect) and data blocks are required but there is an obstacle in the filesystem (e.g. a resource group header or another allocated block) such that when the allocation is requested only enough blocks for the metadata are returned. By changing the exit condition of this loop, we ensure that a minimum of one data block will always be returned. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-02-12GFS2: Fix error codeAbhijith Das
We need this one-liner to signal the mount helper of the 'insufficient journals' condition. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-02-11Merge branch 'omap-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: OMAP: hsmmc: fix memory leak
2010-02-11Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Do not falsely trigger kerneloops
2010-02-11Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cciss: Make cciss_seq_show handle holes in the h->drv[] array cfq-iosched: split seeky coop queues after one slice
2010-02-11Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix the mapping of the NFSERR_SERVERFAULT error NFS: Remove a redundant check for PageFsCache in nfs_migrate_page() NFS: Fix a bug in nfs_fscache_release_page()
2010-02-11Merge git://git.infradead.org/users/cbou/battery-2.6.33Linus Torvalds
* git://git.infradead.org/users/cbou/battery-2.6.33: wm97xx_battery: Handle missing platform data gracefully
2010-02-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] qla2xxx: Obtain proper host structure during response-queue processing. [SCSI] compat_ioct: fix bsg SG_IO [SCSI] qla2xxx: make msix interrupt handler safe for irq [SCSI] zfcp: Report FC BSG errors in correct field [SCSI] mptfusion : mptscsih_abort return value should be SUCCESS instead of value 0.
2010-02-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: psmouse - make sure we don't schedule reconnects after cleanup
2010-02-11Merge branch 'drm-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (30 commits) vgaarb: fix incorrect dereference of userspace pointer. drm/radeon/kms: retry auxch on 0x20 timeout value. drm/radeon: Skip dma copy test in benchmark if card doesn't have dma engine. drm/vmwgfx: Fix a circular locking dependency bug. drm/vmwgfx: Drop scanout flag compat and add execbuf ioctl parameter members. Bumps major. drm/vmwgfx: Report propper framebuffer_{max|min}_{width|height} drm/vmwgfx: Update the user-space interface. drm/radeon/kms: fix screen clearing before fbcon. nouveau: fix state detection with switchable graphics drm/nouveau: move dereferences after null checks drm/nv50: make the pgraph irq handler loop like the pre-nv50 version drm/nv50: delete ramfc object after disabling fifo, not before drm/nv50: avoid unloading pgraph context when ctxprog is running drm/nv50: align size of buffer object to the right boundaries. drm/nv50: disregard dac outputs in nv50_sor_dpms() drm/nv50: prevent multiple init tables being parsed at the same time drm/nouveau: make dp auxch xfer len check for reads only drm/nv40: make INIT_COMPUTE_MEM a NOP, just like nv50 drm/nouveau: Add proper vgaarb support. drm/nouveau: Fix fbcon on mixed pre-NV50 + NV50 multicard. ...
2010-02-11Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: drivers/dma: Correct NULL test async-tx: fix buffer submission error handling in ipu_idma.c dmaengine: correct onstack wait_queue_head declaration ioat: fix infinite timeout checking in ioat2_quiesce dmaengine: fix memleak in dma_async_device_unregister
2010-02-11Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: MIPS: Don't probe reserved EntryHi bits. MIPS: SNI: Correct NULL test MIPS: Fix __devinit __cpuinit confusion in cpu_cache_init MIPS: IP27: Make defconfig useful again. MIPS: Fixup of the r4k timer
2010-02-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cm: Revert association of an RDMA device when binding to loopback
2010-02-11Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, apic: Don't use logical-flat mode when CPU hotplug may exceed 8 CPUs x86-32: Make AT_VECTOR_SIZE_ARCH=2 x86/agp: Fix amd64-agp module initialization regression x86, doc: Fix minor spelling error in arch/x86/mm/gup.c
2010-02-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc32: Fix thinko in previous change. sparc: Align clone and signal stacks to 16 bytes.