summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2007-10-17[PATCH] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopsesEric Sandeen
Backport of ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch For regular files in sysfs, sysfs_readdir wants to traverse sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number. But, the dentry can be reclaimed under memory pressure, and there is no synchronization with readdir. This patch follows Tejun's scheme of allocating and storing an inode number in the new s_ino member of a sysfs_dirent, when dirents are created, and retrieving it from there for readdir, so that the pointer chain doesn't have to be traversed. Tejun's upstream patch uses a new-ish "ida" allocator which brings along some extra complexity; this -stable patch has a brain-dead incrementing counter which does not guarantee uniqueness, but because sysfs doesn't hash inodes as iunique expects, uniqueness wasn't guaranteed today anyway. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-17[PATCH] dir_index: error out instead of BUG on corrupt dx dirsEric Sandeen
commit 3d82abae9523c33d4a16fdfdfd2bdde316d7b56a in mainline. Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable errors with helpful warnings. With help catching other asserts from Duane Griffin <duaneg@dghda.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Duane Griffin <duaneg@dghda.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-17[PATCH] nfs: fix oops re sysctls and V4 supportAlexey Dobriyan
commit 49af7ee181f4f516ac99eba85d3f70ed42cabe76 in mainline. NFS unregisters sysctls only if V4 support is compiled in. However, sysctl table is not V4 specific, so unregister it always. Steps to reproduce: [build nfs.ko with CONFIG_NFS_V4=n] modrobe nfs rmmod nfs ls /proc/sys Unable to handle kernel paging request at ffffffff880661c0 RIP: [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 PGD 203067 PUD 207063 PMD 7e216067 PTE 0 Oops: 0000 [1] SMP CPU 1 Modules linked in: lockd nfs_acl sunrpc Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2 RIP: 0010:[<ffffffff802af8e3>] [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 RSP: 0018:ffff81007fd93e78 EFLAGS: 00010286 RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0 RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40 RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0 R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280 FS: 00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000) Stack: ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40 ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a 2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640 Call Trace: [<ffffffff80283f30>] filldir+0x0/0xf0 [<ffffffff80283f30>] filldir+0x0/0xf0 [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0 [<ffffffff80284376>] sys_getdents+0x96/0xe0 [<ffffffff8020bb3e>] system_call+0x7e/0x83 Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b RIP [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 RSP <ffff81007fd93e78> CR2: ffffffff880661c0 Kernel panic - not syncing: Fatal exception Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-17[PATCH] Leases can be hidden by flocksPavel Emelyanov
commit 0e2f6db88a6900bc9db576d6b478b12ee60d61f7 in mainline. The inode->i_flock list contains the leases, flocks and posix locks in the specified order. However, the flocks are added in the head of this list thus hiding the leases from F_GETLEASE command, from time_out_leases() and other code that expects the leases to come first. The following example will demonstrate this: #define _GNU_SOURCE #include <unistd.h> #include <fcntl.h> #include <stdio.h> #include <sys/file.h> static void show_lease(int fd) { int res; res = fcntl(fd, F_GETLEASE); switch (res) { case F_RDLCK: printf("Read lease\n"); break; case F_WRLCK: printf("Write lease\n"); break; case F_UNLCK: printf("No leases\n"); break; default: printf("Some shit\n"); break; } } int main(int argc, char **argv) { int fd, res; fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("Can't open file"); return 1; } res = fcntl(fd, F_SETLEASE, F_WRLCK); if (res == -1) { perror("Can't set lease"); return 1; } show_lease(fd); if (flock(fd, LOCK_SH) == -1) { perror("Can't flock shared"); return 1; } show_lease(fd); return 0; } The first call to show_lease() will show the write lease set, but the second will show no leases. Fix the flock adding so that the leases always stay in the head of this list. Found during making the flocks pid-namespaces aware. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-25[PATCH] Reset current->pdeath_signal on SUID binary execution (CVE-2007-3848)Marcel Holtmann
This fixes a vulnerability in the "parent process death signal" implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd. and iSEC Security Research. http://marc.info/?l=bugtraq&m=118711306802632&w=2 Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] direct-io: fix error-path crashesBadari Pulavarty
Need to initialize map_bh.b_state to zero. Otherwise, in case of a faulty user-buffer its possible to go into dio_zero_block() and submit a page by mistake - since it checks for buffer_new(). http://marc.info/?l=linux-kernel&m=118551339032528&w=2 akpm: Linus had a (better) patch to just do a kzalloc() in there, but it got lost. Probably this version is better for -stable anwyay. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Joe Jin <joe.jin@oracle.com> Acked-by: Zach Brown <zach.brown@oracle.com> Cc: gurudas pai <gurudas.pai@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] jbd2 commit: fix transaction droppingJan Kara
We have to check that also the second checkpoint list is non-empty before dropping the transaction. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Chuck Ebbert <cebbert@redhat.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] jbd commit: fix transaction droppingJan Kara
We have to check that also the second checkpoint list is non-empty before dropping the transaction. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Chuck Ebbert <cebbert@redhat.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] fs: 9p/conv.c error path fixMariusz Kozlowski
When buf_check_overflow() returns != 0 we will hit kfree(ERR_PTR(err)) and it will not be happy about it. Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-08-25[PATCH] nfsd: fix possible read-ahead cache and export table corruptionJ. Bruce Fields
The value of nperbucket calculated here is too small--we should be rounding up instead of down--with the result that the index j in the following loop can overflow the raparm_hash array. At least in my case, the next thing in memory turns out to be export_table, so the symptoms I see are crashes caused by the appearance of four zeroed-out export entries in the first bucket of the hash table of exports (which were actually entries in the readahead cache, a pointer to which had been written to the export table in this initialization code). It looks like the bug was probably introduced with commit fce1456a19f5c08b688c29f00ef90fdfa074c79b ("knfsd: make the readahead params cache SMP-friendly"). Cc: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] "ext4_ext_put_in_cache" uses __u32 to receive physical block numberMingming Cao
Yan Zheng wrote: > I think I found a bug in ext4/extents.c, "ext4_ext_put_in_cache" uses > "__u32" to receive physical block number. "ext4_ext_put_in_cache" is > used in "ext4_ext_get_blocks", it sets ext4 inode's extent cache > according most recently tree lookup (higher 16 bits of saved physical > block number are always zero). when serving a mapping request, > "ext4_ext_get_blocks" first check whether the logical block is in > inode's extent cache. if the logical block is in the cache and the > cached region isn't a gap, "ext4_ext_get_blocks" gets physical block > number by using cached region's physical block number and offset in > the cached region. as described above, "ext4_ext_get_blocks" may > return wrong result when there are physical block numbers bigger than > 0xffffffff. > You are right. Thanks for reporting this! Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: Yan Zheng <yanzheng@21cn.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-08-25[PATCH] splice: fix double page unlockJens Axboe
If add_to_page_cache_lru() fails, the page will not be locked. But splice jumps to an error path that does a page release and unlock, causing a BUG() in unlock_page(). Fix this by adding one more label that just releases the page. This bug was actually triggered on EL5 by gurudas pai <gurudas.pai@oracle.com> using fio. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2007-06-11[PATCH] ntfs_init_locked_inode(): fix array indexingAndrew Morton
Local variable `i' is a byte-counter. Don't use it as an index into an array of le32's. Reported-by: "young dave" <hidave.darkstar@gmail.com> Cc: "Christoph Lameter" <clameter@sgi.com> Acked-by: Anton Altaparmakov <aia21@cantab.net> Cc: <stable@kernel.org> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-06-11[PATCH] fuse: fix mknod of regular fileMiklos Szeredi
The wrong lookup flag was tested in ->create() causing havoc (error or Oops) when a regular file was created with mknod() in a fuse filesystem. Thanks to J. Cameijo Cerdeira for the report. Kernels 2.6.18 onward are affected. Please apply to -stable as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-11[PATCH] JFS: Fix race waking up jfsIO kernel threadDave Kleikamp
It's possible for a journal I/O request to be added to the log_redrive queue and the jfsIO thread to be awakened after the thread releases log_redrive_lock but before it sets its state to TASK_INTERRUPTIBLE. The jfsIO thread should set the state before giving up the spinlock, so the waking thread will really wake it. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-01reiserfs: fix xattr root locking/refcount bugJeff Mahoney
The listxattr() and getxattr() operations are only protected by a read lock. As a result, if either of these operations run in parallel, a race condition exists where the xattr_root will end up being cached twice, which results in the leaking of a reference and a BUG() on umount. This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(), into one get_xa_root() function that takes the appropriate locking around the entire critical section. Reported, diagnosed and tested by Andrea Righi <a.righi@cineca.it> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Andrea Righi <a.righi@cineca.it> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Edward Shishkin <edward@namesys.com> Cc: Alex Zarochentsev <zam@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-01NFS: Fix an Oops in nfs_setattr()Trond Myklebust
NFS: Fix an Oops in nfs_setattr() It looks like nfs_setattr() and nfs_rename() also need to test whether the target is a regular file before calling nfs_wb_all()... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-01exec.c: fix coredump to pipe problem and obscure "security hole"Alan Cox
exec.c: fix coredump to pipe problem and obscure "security hole" The patch checks for "|" in the pattern not the output and doesn't nail a pid on to a piped name (as it is a program name not a file) Also fixes a very very obscure security corner case. If you happen to have decided on a core pattern that starts with the program name then the user can run a program called "|myevilhack" as it stands. I doubt anyone does this. Signed-off-by: Alan Cox <alan@redhat.com> Confirmed-by: Christopher S. Aker <caker@theshore.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-13fix page leak during core dumpBrian Pomerantz
When the dump cannot occur most likely because of a full file system and the page to be written is the zero page, the call to page_cache_release() is missed. Signed-off-by: Brian Pomerantz <bapper@mvista.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Howells <dhowells@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-13revert "retries in ext4_prepare_write() violate ordering requirements"Andrew Morton
Revert b46be05004abb419e303e66e143eed9f8a6e9f3f. Same reasoning as for ext3. Cc: Kirill Korotaev <dev@openvz.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ken Chen <kenneth.w.chen@intel.com> Cc: Andrey Savochkin <saw@sw.ru> Cc: <linux-ext4@vger.kernel.org> Cc: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-13revert "retries in ext3_prepare_write() violate ordering requirements"Andrew Morton
Revert e92a4d595b464c4aae64be39ca61a9ffe9c8b278. Dmitry points out "When we block_prepare_write() failed while ext3_prepare_write() we jump to "failure" label and call ext3_prepare_failure() witch search last mapped bh and invoke commit_write untill it. This is wrong!! because some bh from begining to the last mapped bh may be not uptodate. As a result we commit to disk not uptodate page content witch contains garbage from previous usage." and "Unexpected file size increasing." Call trace the same as it was in first issue but result is different. For example we have file with i_size is zero. we want write two blocks , but fs has only one free block. ->ext3_prepare_write(...from == 0, to == 2048) retry: ->block_prepare_write() == -ENOSPC# we failed but allocated one block here. ->ext3_prepare_failure() ->commit_write( from == 0, to == 1024) # after this i_size becomes 1024 :) if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries)) goto retry; Finally when all retries will be spended ext3_prepare_failure return -ENOSPC, but i_size was increased and later block trimm procedures can't help here. We don't appear to have the horsepower to fix these issues, so let's put things back the way they were for now. Cc: Kirill Korotaev <dev@openvz.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ken Chen <kenneth.w.chen@intel.com> Cc: Andrey Savochkin <saw@sw.ru> Cc: <linux-ext4@vger.kernel.org> Cc: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-13knfsd: allow nfsd READDIR to return 64bit cookiesNeil Brown
From Neil Brown <neilb@suse.de> [PATCH] knfsd: allow nfsd READDIR to return 64bit cookies ->readdir passes lofft_t offsets (used as nfs cookies) to nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it becomes an 'off_t', which isn't good. So filesystems that returned 64bit offsets would lose. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-06CIFS: reset mode when client notices that ATTR_READONLY is no longer setAlan Tyson
[CIFS] reset mode when client notices that ATTR_READONLY is no longer set [<cebbert@redhat.com>: removed changelog part of patch] Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Alan Tyso <atyson@hp.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-06CIFS: Allow reset of file to ATTR_NORMAL when archive bit not setSteve French
[CIFS] Allow reset of file to ATTR_NORMAL when archive bit not set When a file had a dos attribute of 0x1 (readonly - but dos attribute of archive was not set) - doing chmod 0777 or equivalent would try to set a dos attribute of 0 (which some servers ignore) rather than ATTR_NORMAL (0x20) which most servers accept. Does not affect servers which support the CIFS Unix Extensions. [<cebbert@redhat.com>: removed changelog part of patch] Cc: Chuck Ebbert <cebbert@redhat.com> Acked-by: Prasad Potluri <pvp@us.ibm.com> Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-23nfs: nfs_getattr() can't call nfs_sync_mapping_range() for non-regular filesTrond Myklebust
Looks like we need a check in nfs_getattr() for a regular file. It makes no sense to call nfs_sync_mapping_range() on anything else. I think that should fix your problem: it will stop the NFS client from interfering with dirty pages on that inode's mapping. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09gfs2: fix locking mistakeJosef Whiter
Fix a locking mistake in the quota code, we do a mutex_lock instead of a mutex_unlock. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09buffer: memorder fixNick Piggin
unlock_buffer(), like unlock_page(), must not clear the lock without ensuring that the critical section is closed. Mingming later sent the same patch, saying: We are running SDET benchmark and saw double free issue for ext3 extended attributes block, which complains the same xattr block already being freed (in ext3_xattr_release_block()). The problem could also been triggered by multiple threads loop untar/rm a kernel tree. The race is caused by missing a memory barrier at unlock_buffer() before the lock bit being cleared, resulting in possible concurrent h_refcounter update. That causes a reference counter leak, then later leads to the double free that we have seen. Inside unlock_buffer(), there is a memory barrier is placed *after* the lock bit is being cleared, however, there is no memory barrier *before* the bit is cleared. On some arch the h_refcount update instruction and the clear bit instruction could be reordered, thus leave the critical section re-entered. The race is like this: For example, if the h_refcount is initialized as 1, cpu 0: cpu1
2007-03-09hugetlb: preserve hugetlb pte dirty stateKen Chen
__unmap_hugepage_range() is buggy that it does not preserve dirty state of huge_pte when unmapping hugepage range. It causes data corruption in the event of dop_caches being used by sys admin. For example, an application creates a hugetlb file, modify pages, then unmap it. While leaving the hugetlb file alive, comes along sys admin doing a "echo 3 > /proc/sys/vm/drop_caches". drop_pagecache_sb() will happily free all pages that aren't marked dirty if there are no active mapping. Later when application remaps the hugetlb file back and all data are gone, triggering catastrophic flip over on application. Not only that, the internal resv_huge_pages count will also get all messed up. Fix it up by marking page dirty appropriately. Signed-off-by: Ken Chen <kenchen@google.com> Cc: "Nish Aravamudan" <nish.aravamudan@gmail.com> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09v9fs_vfs_mkdir(): fix a double freeAdrian Bunk
Fix a double free of "dfid" introduced by commit da977b2c7eb4d6312f063a7b486f2aad99809710 and spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09ufs: restore back support of openstepEvgeniy Dushistov
This is a fix of regression, which triggered by ~2.6.16. Patch with name ufs-directory-and-page-cache-from-blocks-to-pages.patch: in additional to conversation from block to page cache mechanism added new checks of directory integrity, one of them that directory entry do not across directory chunks. But some kinds of UFS: OpenStep UFS and Apple UFS (looks like these are the same filesystems) have different directory chunk size, then common UFSes(BSD and Solaris UFS). So this patch adds ability to works with variable size of directory chunks, and set it for ufstype=openstep to right size. Tested on darwin ufs. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09NLM: Fix double free in __nlm_async_callTrond Myklebust
rpc_call_async() will always call rpc_release_calldata(), so it is an error for __nlm_async_call() to do so as well. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=7923 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jan "Yenya" Kasprzak <kas@fi.muni.cz> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09fix umask when noACL kernel meets extN tuned for ACLsHugh Dickins
Fix insecure default behaviour reported by Tigran Aivazian: if an ext2 or ext3 or ext4 filesystem is tuned to mount with "acl", but mounted by a kernel built without ACL support, then umask was ignored when creating inodes - though root or user has umask 022, touch creates files as 0666, and mkdir creates directories as 0777. This appears to have worked right until 2.6.11, when a fix to the default mode on symlinks (always 0777) assumed VFS applies umask: which it does, unless the mount is marked for ACLs; but ext[234] set MS_POSIXACL in s_flags according to s_mount_opt set according to def_mount_opts. We could revert to the 2.6.10 ext[234]_init_acl (adding an S_ISLNK test); but other filesystems only set MS_POSIXACL when ACLs are configured. We could fix this at another level; but it seems most robust to avoid setting the s_mount_opt flag in the first place (at the expense of more ifdefs). Likewise don't set the XATTR_USER flag when built without XATTR support. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Andreas Gruenbacher <agruen@suse.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09x86: Don't require the vDSO for handling a.out signalsAndi Kleen
x86: Don't require the vDSO for handling a.out signals and in other strange binfmts. vDSO is not necessarily mapped there. This fixes signals in a.out programs Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09ocfs2: ocfs2_link() journal credits updateMark Fasheh
Commit 592282cf2eaa33409c6511ddd3f3ecaa57daeaaa fixed some missing directory c/mtime updates in part by introducing a dinode update in ocfs2_add_entry(). Unfortunately, ocfs2_link() (which didn't update the directory inode before) is now missing a single journal credit. Fix this by doubling the number of inode updates expected during hard link creation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-19[PATCH] Fix a free-wrong-pointer bug in nfs/acl server (CVE-2007-0772)Greg Banks
Due to type confusion, when an nfsacl verison 2 'ACCESS' request finishes and tries to clean up, it calls fh_put on entiredly the wrong thing and this can cause an oops. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-03[PATCH] revert blockdev direct io back to 2.6.19 versionAndrew Morton
Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-03[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2Ken Chen
An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-01[PATCH] procfs: Fix listing of /proc/NOT_A_TGID/taskGuillaume Chazarain
Listing /proc/PID/task were PID is not a TGID should not result in duplicated entries. [g ~]$ pidof thunderbird-bin 2751 [g ~]$ ls /proc/2751/task 2751 2770 2771 2824 2826 2834 2835 2851 2853 [g ~]$ ls /proc/2770/task 2751 2770 2771 2824 2826 2834 2835 2851 2853 2770 2771 2824 2826 2834 2835 2851 2853 [g ~]$ Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-01[PATCH] endianness bug: ntohl() misspelled as >> 24 in fh_verify().Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] ntfs: kmap_atomic() atomicity fixAndrew Morton
The KM_BIO_SRC_IRQ kmap slot requires local irq protection. Acked-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] Remove warning: VFS is out of sync with lock managerNeil Brown
But keep it as a dprintk The message can be generated in a quite normal situation: If a 'lock' request is interrupted, then the lock client needs to record that the server has the lock, incase it does. When we come the unlock, the server might say it doesn't, even though we think it does (or might) and this generates the message. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] ufs: reallocation fixEvgeniy Dushistov
In blocks reallocation function sometimes does not update some of buffer_head::b_blocknr, which may and cause data damage. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] ufs: truncate negative to unsigned fixEvgeniy Dushistov
During ufs_trunc_direct which is subroutine of ufs::truncate, we try the first of all free parts of block and then whole blocks. But we calculate size of block's part to free in the wrong way. This may cause bad update of used blocks and fragments statistic, and you can got report that you have free 32T on 1Gb partition. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] ufs: alloc metadata null page fixEvgeniy Dushistov
These series of patches result of UFS1 write support stress testing, like running fsx-linux, untar and build linux kernel etc We pass from ufs::get_block_t to levels below: pointer to the current page, to make possible things like reallocation of blocks on the fly, and we also uses this pointer for indication, what actually we allocate data block or meta data block, but currently we make decision about what we allocate on the wrong level, this may and cause oops if we allocate blocks in some special order. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] fuse: fix bug in control filesystem mountMiklos Szeredi
The BUG in fuse_ctl_add_dentry() could be triggered if the control filesystem was unmounted and mounted again while one or more fuse filesystems were present. The fix is to reset the dentry counter in fuse_ctl_kill_sb(). Bug reported by Florent Mertens. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] knfsd: ratelimit some nfsd messages that are triggered by external ↵NeilBrown
events Also remove {NFSD,RPC}_PARANOIA as having the defines doesn't really add anything. The printks covered by RPC_PARANOIA were triggered by badly formatted packets and so should be ratelimited. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] fs/lockd/clntlock.c: add missing newlines to dprintk'sAdrian Bunk
This patch adds missing newlines to dprintk's. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] uml: fix mknodJohannes Stezenbach
Fix UML hostfs mknod(): userspace has differernt dev_t size and encoding than kernel, so extract major/minor and reencode using glibc makedev() macro. Signed-off-by: Johannes Stezenbach <js@linuxtv.org> Acked-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-29[PATCH] Fix try_to_free_buffer() lockingNick Piggin
Fix commit ecdfc9787fe527491baefc22dce8b2dbd5b2908d Not to put too fine a point on it, but in a nutshell... __set_page_dirty_buffers() | try_to_free_buffers() ---------------------------+--------------------------- | spin_lock(private_lock); | drop_bufers() | spin_unlock(private_lock); spin_lock(private_lock) | !page_has_buffers() | spin_unlock(private_lock) | SetPageDirty() | | cancel_dirty_page() oops! Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] ocfs2: fix thinko in ocfs2_backup_super_blkno()Mark Fasheh
Fix a bug which was introduced when I synced up ocfs2_fs.h with ocfs2-tools. We can't do u64/u32 in kernel. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>