From 58e59854a3dfa9ab1fa1bcfc6324646cd77046a7 Mon Sep 17 00:00:00 2001 From: Jie Liu Date: Tue, 16 Jul 2013 13:11:16 +0800 Subject: xfs: fix assertion failure in xfs_vm_write_failed() In xfs_vm_write_failed(), we evaluate the block_offset of pos with PAGE_MASK which is an unsigned long. That is fine on 64-bit platforms regardless of whether the request pos is 32-bit or 64-bit. However, on 32-bit platforms the value is 0xfffff000 and so the high 32 bits in it will be masked off with (pos & PAGE_MASK) for a 64-bit pos. As a result, the evaluated block_offset is incorrect which will cause this failure ASSERT(block_offset + from == pos); and potentially pass the wrong block to xfs_vm_kill_delalloc_range(). In this case, we can get a kernel panic if CONFIG_XFS_DEBUG is enabled: XFS: Assertion failed: block_offset + from == pos, file: fs/xfs/xfs_aops.c, line: 1504 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:100! invalid opcode: 0000 [#1] SMP ........ Pid: 4057, comm: mkfs.xfs Tainted: G O 3.9.0-rc2 #1 EIP: 0060:[] EFLAGS: 00010282 CPU: 0 EIP is at assfail+0x2b/0x30 [xfs] EAX: 00000056 EBX: f6ef28a0 ECX: 00000007 EDX: f57d22a4 ESI: 1c2fb000 EDI: 00000000 EBP: ea6b5d30 ESP: ea6b5d1c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 094f3ff4 CR3: 2bcb4000 CR4: 000006f0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process mkfs.xfs (pid: 4057, ti=ea6b4000 task=ea5799e0 task.ti=ea6b4000) Stack: 00000000 f9525c48 f951fa80 f951f96b 000005e4 ea6b5d7c f9494b34 c19b0ea2 00000066 f3d6c620 c19b0ea2 00000000 e9a91458 00001000 00000000 00000000 00000000 c15c7e89 00000000 1c2fb000 00000000 00000000 1c2fb000 00000080 Call Trace: [] xfs_vm_write_failed+0x74/0x1b0 [xfs] [] ? printk+0x4d/0x4f [] xfs_vm_write_begin+0x10d/0x170 [xfs] [] generic_file_buffered_write+0xdc/0x210 [] xfs_file_buffered_aio_write+0xf9/0x190 [xfs] [] xfs_file_aio_write+0xf3/0x160 [xfs] [] do_sync_write+0x94/0xd0 [] vfs_write+0x8f/0x160 [] ? wait_on_retry_sync_kiocb+0x50/0x50 [] sys_write+0x47/0x80 [] sysenter_do_call+0x12/0x28 ............. EIP: [] assfail+0x2b/0x30 [xfs] SS:ESP 0068:ea6b5d1c ---[ end trace cdd9af4f4ecab42f ]--- Kernel panic - not syncing: Fatal exception In order to avoid this, we can evaluate the block_offset of the start of the page by using shifts rather than masks the mismatch problem. Thanks Dave Chinner for help finding and fixing this bug. Reported-by: Michael L. Semon Reviewed-by: Dave Chinner Reviewed-by: Mark Tinguely Signed-off-by: Jie Liu Signed-off-by: Ben Myers --- fs/xfs/xfs_aops.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) (limited to 'fs/xfs/xfs_aops.c') diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 596ec71da00e..79670869d436 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -1516,13 +1516,26 @@ xfs_vm_write_failed( loff_t pos, unsigned len) { - loff_t block_offset = pos & PAGE_MASK; + loff_t block_offset; loff_t block_start; loff_t block_end; loff_t from = pos & (PAGE_CACHE_SIZE - 1); loff_t to = from + len; struct buffer_head *bh, *head; + /* + * The request pos offset might be 32 or 64 bit, this is all fine + * on 64-bit platform. However, for 64-bit pos request on 32-bit + * platform, the high 32-bit will be masked off if we evaluate the + * block_offset via (pos & PAGE_MASK) because the PAGE_MASK is + * 0xfffff000 as an unsigned long, hence the result is incorrect + * which could cause the following ASSERT failed in most cases. + * In order to avoid this, we can evaluate the block_offset of the + * start of the page by using shifts rather than masks the mismatch + * problem. + */ + block_offset = (pos >> PAGE_CACHE_SHIFT) << PAGE_CACHE_SHIFT; + ASSERT(block_offset + from == pos); head = page_buffers(page); -- cgit v1.2.3 From 6898811459ff523ed256c07d1d5c54ba915d51ac Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Mon, 12 Aug 2013 20:49:42 +1000 Subject: xfs: create xfs_bmap_util.[ch] There is a bunch of code in xfs_bmap.c that is kernel specific and not shared with userspace. To minimise the difference between the kernel and userspace code, shift this unshared code to xfs_bmap_util.c, and the declarations to xfs_bmap_util.h. The biggest issue here is xfs_bmap_finish() - userspace has it's own definition of this function, and so we need to move it out of xfs_bmap.[ch]. This means several other files need to include xfs_bmap_util.h as well. It also introduces and interesting dance for the stack switching code in xfs_bmapi_allocate(). The stack switching/workqueue code is actually moved to xfs_bmap_util.c, so that userspace can simply use a #define in a header file to connect the dots without needing to know about the stack switch code at all. Signed-off-by: Dave Chinner Reviewed-by: Mark Tinguely Signed-off-by: Ben Myers --- fs/xfs/xfs_aops.c | 1 + 1 file changed, 1 insertion(+) (limited to 'fs/xfs/xfs_aops.c') diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 79670869d436..709347d6f468 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -31,6 +31,7 @@ #include "xfs_vnodeops.h" #include "xfs_trace.h" #include "xfs_bmap.h" +#include "xfs_bmap_util.h" #include #include #include -- cgit v1.2.3 From c24b5dfadc4a4f7a13af373067871479c74455e6 Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Mon, 12 Aug 2013 20:49:45 +1000 Subject: xfs: kill xfs_vnodeops.[ch] Now we have xfs_inode.c for holding kernel-only XFS inode operations, move all the inode operations from xfs_vnodeops.c to this new file as it holds another set of kernel-only inode operations. The name of this file traces back to the days of Irix and it's vnodes which we don't have anymore. Essentially this move consolidates the inode locking functions and a bunch of XFS inode operations into the one file. Eventually the high level functions will be merged into the VFS interface functions in xfs_iops.c. This leaves only internal preallocation, EOF block manipulation and hole punching functions in vnodeops.c. Move these to xfs_bmap_util.c where we are already consolidating various in-kernel physical extent manipulation and querying functions. Signed-off-by: Dave Chinner Reviewed-by: Mark Tinguely Signed-off-by: Ben Myers --- fs/xfs/xfs_aops.c | 1 - 1 file changed, 1 deletion(-) (limited to 'fs/xfs/xfs_aops.c') diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 709347d6f468..a82c83707324 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -28,7 +28,6 @@ #include "xfs_alloc.h" #include "xfs_error.h" #include "xfs_iomap.h" -#include "xfs_vnodeops.h" #include "xfs_trace.h" #include "xfs_bmap.h" #include "xfs_bmap_util.h" -- cgit v1.2.3 From 3d3c8b5222b92447bffaa4127ee18c757f32a460 Mon Sep 17 00:00:00 2001 From: Jie Liu Date: Mon, 12 Aug 2013 20:49:59 +1000 Subject: xfs: refactor xfs_trans_reserve() interface With the new xfs_trans_res structure has been introduced, the log reservation size, log count as well as log flags are pre-initialized at mount time. So it's time to refine xfs_trans_reserve() interface to be more neat. Also, introduce a new helper M_RES() to return a pointer to the mp->m_resv structure to simplify the input. Signed-off-by: Jie Liu Signed-off-by: Dave Chinner Reviewed-by: Mark Tinguely Signed-off-by: Ben Myers --- fs/xfs/xfs_aops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/xfs/xfs_aops.c') diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index a82c83707324..1269434ceec1 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -116,7 +116,7 @@ xfs_setfilesize_trans_alloc( tp = xfs_trans_alloc(mp, XFS_TRANS_FSYNC_TS); - error = xfs_trans_reserve(tp, 0, XFS_FSYNC_TS_LOG_RES(mp), 0, 0, 0); + error = xfs_trans_reserve(tp, &M_RES(mp)->tr_fsyncts, 0, 0); if (error) { xfs_trans_cancel(tp, 0); return error; -- cgit v1.2.3 From c7c1a7d8bb45479e01ae47a8e8dbf4e769028f07 Mon Sep 17 00:00:00 2001 From: Zhi Yong Wu Date: Wed, 7 Aug 2013 10:11:09 +0000 Subject: xfs: rename bio_add_buffer() to xfs_bio_add_buffer() Follow up with xfs naming style. Signed-off-by: Zhi Yong Wu Reviewed-by: Mark Tinguely Signed-off-by: Ben Myers --- fs/xfs/xfs_aops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'fs/xfs/xfs_aops.c') diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 1269434ceec1..2f9fb421915a 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -451,7 +451,7 @@ xfs_start_page_writeback( end_page_writeback(page); } -static inline int bio_add_buffer(struct bio *bio, struct buffer_head *bh) +static inline int xfs_bio_add_buffer(struct bio *bio, struct buffer_head *bh) { return bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh)); } @@ -525,7 +525,7 @@ xfs_submit_ioend( goto retry; } - if (bio_add_buffer(bio, bh) != bh->b_size) { + if (xfs_bio_add_buffer(bio, bh) != bh->b_size) { xfs_submit_ioend_bio(wbc, ioend, bio); goto retry; } -- cgit v1.2.3