linux-toradex.git/fs/xfs/libxfs, branch v4.3

Merge branch 'xfs-misc-fixes-for-4.3-4' into for-next

2015-09-01T00:30:11+00:00

libxfs: bad magic number should set da block buffer error

2015-08-28T04:50:03+00:00

If xfs_da3_node_read_verify() doesn't recognize the magic number of a
buffer it's just read, set the buffer error to -EFSCORRUPTED so that
the error can be sent up to userspace.  Without this patch we'll
notice the bad magic eventually while trying to traverse or change
the block, but we really ought to fail early in the verifier.

Signed-off-by: Darrick J. Wong 
Reviewed-by: Dave Chinner 
Signed-off-by: Dave Chinner
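
The fail-early behaviour described in this commit can be sketched as
follows. This is an illustrative sketch, not the kernel code: struct
toy_buf, the TOY_* constants and toy_node_read_verify() are hypothetical
names, and the magic and errno values are placeholders. Only the idea of
recording a corruption error in the buffer when the magic number is not
recognized comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names or values are the kernel's. */
#define TOY_EFSCORRUPTED 117      /* placeholder errno value */
#define TOY_MAGIC_NODE   0x3ebeu  /* placeholder magic numbers */
#define TOY_MAGIC_LEAF   0x3df1u

struct toy_buf {
	uint16_t magic;  /* magic number as read from disk */
	int      error;  /* 0 on success, negative errno on failure */
};

/* Read verifier: accept known block types, flag everything else. */
void toy_node_read_verify(struct toy_buf *bp)
{
	assert(bp != NULL);

	switch (bp->magic) {
	case TOY_MAGIC_NODE:
	case TOY_MAGIC_LEAF:
		/* Recognized; a real verifier would go on to check contents. */
		bp->error = 0;
		break;
	default:
		/* Fail early rather than leaving the bad block for later code. */
		bp->error = -TOY_EFSCORRUPTED;
		break;
	}
}
```

A caller that checks bp->error right after the read sees the failure
immediately, instead of tripping over the bad block during a later
traversal.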

Merge branch 'xfs-misc-fixes-for-4.3-3' into for-next

2015-08-25T00:13:35+00:00

xfs: Fix file type directory corruption for btree directories

2015-08-25T00:05:13+00:00

Users have occasionally reported that the file type of some directory
entries is wrong. This mostly happened after updating some libraries.
After some debugging, the problem was traced to xfs_dir2_node_replace().
The function uses args->filetype as the file type to store in the
replaced directory entry; however, it also calls
xfs_da3_node_lookup_int(), which stores the file type of the current
directory entry in args->filetype. Thus we fail to change the file type
of a directory entry to the proper type.

Fix the problem by storing the new file type in a local variable before
calling xfs_da3_node_lookup_int().

cc:  # 3.16 - 4.x
Reported-by: Giacomo Comes 
Signed-off-by: Jan Kara 
Reviewed-by: Dave Chinner 
Signed-off-by: Dave Chinner
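
The clobbering and the fix can be illustrated with a small model. This
is a sketch under stated assumptions, not the XFS code: struct
toy_entry, struct toy_args, toy_lookup() and toy_replace() are
hypothetical, and the directory is modelled as a flat array rather than
a btree. It shows only the pattern of saving the requested file type in
a local variable before a lookup that overwrites the shared argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a directory entry and the lookup arguments. */
struct toy_entry { const char *name; int filetype; };
struct toy_args  { const char *name; int filetype; int index; };

/*
 * Like the lookup described above, this overwrites args->filetype with
 * the type of the entry it finds. Returns 0 on success, -1 if not found.
 */
int toy_lookup(const struct toy_entry *dir, int n, struct toy_args *args)
{
	for (int i = 0; i < n; i++) {
		if (strcmp(dir[i].name, args->name) == 0) {
			args->index = i;
			args->filetype = dir[i].filetype; /* side effect */
			return 0;
		}
	}
	return -1;
}

/* Replace the file type of an existing entry with args->filetype. */
int toy_replace(struct toy_entry *dir, int n, struct toy_args *args)
{
	int ftype;

	assert(dir != NULL && args != NULL);
	ftype = args->filetype;  /* save before the lookup clobbers it */

	if (toy_lookup(dir, n, args) != 0)
		return -1;

	dir[args->index].filetype = ftype;  /* not args->filetype */
	return 0;
}
```

Had toy_replace() stored args->filetype after the lookup, it would have
written the old type back and the entry would never change.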

xfs: Fix uninitialized return value in xfs_alloc_fix_freelist()

2015-08-25T00:05:13+00:00

xfs_alloc_fix_freelist() can sometimes jump to out_agbp_relse
without ever setting the 'error' variable, which is then returned.
This can happen, e.g., when pag->pagf_init is set but the AG is
for metadata and we want to allocate user data.

Fix the problem by initializing 'error' to 0, which is the desired
return value when we decide to skip this group.

CC: xfs@oss.sgi.com
Coverity-id: 1309714
Signed-off-by: Jan Kara 
Reviewed-by: Brian Foster 
Signed-off-by: Dave Chinner
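
The shape of the bug and the fix can be sketched as below. This is a
simplified illustration, not the real function: toy_fix_freelist(), its
parameters and the failing step are hypothetical. The point is only that
a goto to the common exit label taken before any assignment returns
whatever 'error' was initialized to.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified shape of a routine with a shared exit. */
int toy_fix_freelist(bool skip_this_ag, bool later_step_fails)
{
	int error = 0;  /* the fix: skipping the AG must report success */

	if (skip_this_ag)
		goto out_relse;  /* early exit; nothing has assigned 'error' */

	if (later_step_fails) {
		error = -EIO;    /* made-up failure for illustration */
		goto out_relse;
	}

	/* ... the real work would happen here ... */

out_relse:
	/* shared cleanup would run here before returning */
	return error;
}
```

Without the initializer, the skip path would return an indeterminate
value, which is the kind of defect a static analyzer such as Coverity
reports.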

Merge branch 'xfs-misc-fixes-for-4.3-2' into for-next

2015-08-19T23:28:45+00:00

xfs: Fix xfs_attr_leafblock definition

2015-08-19T00:34:32+00:00

struct xfs_attr_leafblock contains an 'entries' array which is declared
with size 1 although it can in fact contain many more entries. Since this
array is followed by further struct members, gcc (at least in version
4.8.3) thinks that the array has a fixed size of 1 element and thus
may optimize away all accesses beyond the end of the array, resulting in
non-working code. This problem was only observed with userspace code in
xfsprogs; however, it's better to be safe in the kernel as well and have
matching kernel and xfsprogs definitions.

cc: 
Signed-off-by: Jan Kara 
Reviewed-by: Dave Chinner 
Signed-off-by: Dave Chinner
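
For illustration, one common way to declare a variable-length trailing
array so that the compiler does not assume a fixed bound is a C99
flexible array member placed last in the struct. This is a generic
sketch and an assumption on our part: the commit message does not say
which form the fix took, and struct toy_leafblock and its helpers are
hypothetical names unrelated to the on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_entry {
	uint32_t hashval;
	uint16_t nameidx;
};

struct toy_leafblock {
	uint16_t count;              /* number of valid entries */
	struct toy_entry entries[];  /* flexible array member: must be last */
};

/* Allocate a zeroed block with room for 'count' entries. */
struct toy_leafblock *toy_leaf_alloc(uint16_t count)
{
	struct toy_leafblock *leaf =
		calloc(1, sizeof(*leaf) + (size_t)count * sizeof(leaf->entries[0]));

	if (leaf)
		leaf->count = count;
	return leaf;
}

/* Walk every entry; indexes above 0 are legitimate accesses here. */
uint32_t toy_leaf_hash_sum(const struct toy_leafblock *leaf)
{
	uint32_t sum = 0;

	assert(leaf != NULL);
	for (uint16_t i = 0; i < leaf->count; i++)
		sum += leaf->entries[i].hashval;
	return sum;
}
```

Because nothing follows 'entries' in the struct, the compiler has no
basis for treating it as a one-element array.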

libxfs: readahead of dir3 data blocks should use the read verifier

2015-08-19T00:33:58+00:00

In the dir3 data block readahead function, use the regular read
verifier to check the block's CRC and spot-check the block contents
instead of directly calling only the spot-checking routine.  This
prevents corrupted directory data blocks from being read into the
kernel, which can lead to garbage ls output and directory loops (if
say one of the entries contains slashes and other junk).

cc:  # 3.12 - 4.2
Signed-off-by: Darrick J. Wong 
Reviewed-by: Dave Chinner 
Signed-off-by: Dave Chinner
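
The difference between spot-checking only and running the full read
verifier can be sketched as follows. This is a toy model, not the dir3
code: struct toy_block, the magic value and all toy_* functions are
hypothetical, and a simple multiplicative checksum stands in for the
real CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DATA_MAGIC 0x58444433u  /* placeholder magic value */

struct toy_block {
	uint32_t magic;
	uint32_t stored_sum;  /* checksum recorded when the block was written */
	uint8_t  payload[8];
};

/* Toy checksum standing in for a real CRC. */
uint32_t toy_checksum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = sum * 31u + p[i];
	return sum;
}

/* Spot check: looks at structure only, never at the checksum. */
int toy_data_check(const struct toy_block *b)
{
	return b->magic == TOY_DATA_MAGIC ? 0 : -1;
}

/* Regular read verifier: checksum first, then the spot check. */
int toy_data_read_verify(const struct toy_block *b)
{
	assert(b != NULL);
	if (toy_checksum(b->payload, sizeof(b->payload)) != b->stored_sum)
		return -2;
	return toy_data_check(b);
}

/* Readahead verifier: delegate instead of calling only the spot check. */
int toy_data_reada_verify(const struct toy_block *b)
{
	return toy_data_read_verify(b);
}
```

A block whose payload is damaged but whose magic is intact passes
toy_data_check() and is rejected by toy_data_reada_verify().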

xfs: stop holding ILOCK over filldir callbacks

2015-08-19T00:33:00+00:00

The recent change to the readdir locking made in 40194ec ("xfs:
reinstate the ilock in xfs_readdir") for CXFS directory sanity was
probably the wrong thing to do. Deep in the readdir code we
can take page faults in the filldir callback, and so taking a page
fault while holding an inode ilock creates a new set of locking
issues that lockdep warns all over the place about.

The locking order for regular inodes w.r.t. page faults is io_lock
-> pagefault -> mmap_sem -> ilock. The directory readdir code now
triggers ilock -> page fault -> mmap_sem. While we cannot deadlock
at this point, it inverts all the locking patterns that lockdep
normally sees on XFS inodes, and so triggers lockdep. We worked
around this with commit 93a8614 ("xfs: fix directory inode iolock
lockdep false positive"), but that then just moved the lockdep
warning to deeper in the page fault path and triggered on security
inode locks. Fixing the shmem issue there just moved the lockdep
reports somewhere else, and now we are getting false positives from
filesystem freezing annotations getting confused.

Further, if we enter memory reclaim in a readdir path, we now get
lockdep warnings about potential deadlocks because the ilock is held
when we enter reclaim. This, again, is different to a regular file
in that we never allow memory reclaim to run while holding the ilock
for regular files. Hence lockdep now throws
ilock->kmalloc->reclaim->ilock warnings.

Basically, the problem is that the ilock is being used to protect
the directory data and the inode metadata, whereas for a regular
file the iolock protects the data and the ilock protects the
metadata. From the VFS perspective, the i_mutex serialises all
accesses to the directory data, and so not holding the ilock for
readdir doesn't matter. The issue is that CXFS doesn't access
directory data via the VFS, so it has no "data serialisation"
mechanism. Hence we need to hold the IOLOCK in the correct places to
provide this low level directory data access serialisation.

The ilock can then be used just when the extent list needs to be
read, just like we do for regular files. The directory modification
code can take the iolock exclusive when the ilock is also taken,
and this then ensures that readdir is correctly excluded while
modifications are in progress.

Signed-off-by: Dave Chinner 
Reviewed-by: Brian Foster 
Signed-off-by: Dave Chinner
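
The resulting lock scope in the readdir path can be modelled roughly as
below. This is a single-threaded sketch that uses counters in place of
real locks, purely to show which lock is held at which point. struct
toy_inode, toy_readdir() and the callback type are hypothetical and do
not reflect the kernel's locking API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters standing in for shared holds of the two inode locks. */
struct toy_inode {
	int iolock_shared;
	int ilock_shared;
};

/* Stand-in for the filldir callback, which may take a page fault. */
typedef bool (*toy_filldir_t)(const struct toy_inode *ip, void *ctx);

bool toy_readdir(struct toy_inode *ip, toy_filldir_t emit, void *ctx)
{
	bool ok;

	assert(ip != NULL && emit != NULL);

	ip->iolock_shared++;  /* serialises against directory modification */

	ip->ilock_shared++;   /* ilock only while the extent list is read */
	/* ... read the extent list and map directory blocks ... */
	ip->ilock_shared--;

	ok = emit(ip, ctx);   /* runs with the iolock held, ilock dropped */

	ip->iolock_shared--;
	return ok;
}

/* Example callback: reports whether it ran without the ilock held. */
bool toy_emit_checks_locks(const struct toy_inode *ip, void *ctx)
{
	(void)ctx;
	return ip->ilock_shared == 0 && ip->iolock_shared > 0;
}
```

In this model a modification path would take both locks exclusively, so
a reader holding the iolock shared is excluded for the whole update even
though it holds the ilock only briefly.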

xfs: swap leaf buffer into path struct atomically during path shift

2015-08-19T00:32:33+00:00

The node directory lookup code uses a state structure that tracks the
path of buffers used to search for the hash of a filename through the
leaf blocks. When the lookup encounters a block that ends with the
requested hash, but the entry has not yet been found, it must shift over
to the next block and continue looking for the entry (i.e., duplicate
hashes could continue over into the next block). This shift mechanism
involves walking back up and down the state structure, replacing buffers
at the appropriate btree levels as necessary.

When a buffer is replaced, the old buffer is released and the new buffer
read into the active slot in the path structure. Because the buffer is
read directly into the path slot, a buffer read failure can result in
setting a NULL buffer pointer in an active slot. This throws off the
state cleanup code in xfs_dir2_node_lookup(), which expects to release a
buffer from each active slot. Instead, a BUG occurs due to a NULL
pointer dereference:

  BUG: unable to handle kernel NULL pointer dereference at 00000000000001e8
  IP: [] xfs_trans_brelse+0x2a3/0x3c0 [xfs]
  ...
  RIP: 0010:[]  [] xfs_trans_brelse+0x2a3/0x3c0 [xfs]
  ...
  Call Trace:
   [] xfs_dir2_node_lookup+0xa6/0x2c0 [xfs]
   [] xfs_dir_lookup+0x1ac/0x1c0 [xfs]
   [] xfs_lookup+0x91/0x290 [xfs]
   [] xfs_vn_lookup+0x73/0xb0 [xfs]
   [] lookup_real+0x1d/0x50
   [] path_openat+0x91e/0x1490
   [] do_filp_open+0x89/0x100
   ...

This has been reproduced via a parallel fsstress and filesystem shutdown
workload in a loop. The shutdown triggers the read error in the
aforementioned codepath and causes the BUG in xfs_dir2_node_lookup().

Update xfs_da3_path_shift() to update the active path slot atomically
with respect to the caller when a buffer is replaced. This ensures that
the caller always sees the old or new buffer in the slot and prevents
the NULL pointer dereference.

Signed-off-by: Brian Foster 
Reviewed-by: Dave Chinner 
Signed-off-by: Dave Chinner
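
The "read into a local, then swap" pattern described above can be
sketched as follows. This is an illustration under assumptions, not the
body of xfs_da3_path_shift(): the toy_* types and functions are
hypothetical, buffers come from a small array, and a counter stands in
for releasing the old buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_buf  { int blkno; };
struct toy_slot { struct toy_buf *bp; };  /* one active path slot */

/* Hypothetical read: on failure *bpp is left untouched. */
int toy_read_buf(struct toy_buf *pool, int npool, int blkno,
		 struct toy_buf **bpp)
{
	if (blkno < 0 || blkno >= npool)
		return -EIO;
	*bpp = &pool[blkno];
	return 0;
}

/* Replace the buffer in 'slot' with block 'next_blkno'. */
int toy_path_shift(struct toy_slot *slot, struct toy_buf *pool, int npool,
		   int next_blkno, int *released)
{
	struct toy_buf *bp;
	int error;

	assert(slot != NULL && slot->bp != NULL);

	/* Read into a local first; the slot is not modified yet. */
	error = toy_read_buf(pool, npool, next_blkno, &bp);
	if (error)
		return error;  /* slot still holds the old, valid buffer */

	(*released)++;         /* stands in for releasing the old buffer */
	slot->bp = bp;         /* swap only after a successful read */
	return 0;
}
```

On a failed read the caller's cleanup code still finds a non-NULL buffer
in the slot, which is the property the commit relies on.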