linux-toradex.git/fs, branch v3.2.12

block: Fix NULL pointer dereference in sd_revalidate_disk

2012-03-19T16:02:34+00:00

commit fe316bf2d5847bc5dd975668671a7b1067603bc7 upstream.

Since 2.6.39 (1196f8b), when a driver returns -ENOMEDIUM for open(),
__blkdev_get() calls rescan_partitions() to remove
in-kernel partition structures and raise KOBJ_CHANGE uevent.

However it ends up calling driver's revalidate_disk without open
and could cause oops.

In the case of SCSI:

  process A                  process B
  ----------------------------------------------
  sys_open
    __blkdev_get
      sd_open
        returns -ENOMEDIUM
                             scsi_remove_device
                               
      rescan_partitions
        sd_revalidate_disk
          
Oopses are reported here:
http://marc.info/?l=linux-scsi&m=132388619710052

This patch separates the partition invalidation from rescan_partitions()
and use it for -ENOMEDIUM case.

Reported-by: Huajun Li 
Signed-off-by: Jun'ichi Nomura 
Acked-by: Tejun Heo 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

vfs: fix double put after complete_walk()

2012-03-19T16:02:20+00:00

commit 097b180ca09b581ef0dc24fbcfc1b227de3875df upstream.

complete_walk() already puts nd->path, no need to do it again at cleanup time.

This would result in Oopses if triggered, apparently the codepath is not too
well exercised.

Signed-off-by: Miklos Szeredi 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman

vfs: fix return value from do_last()

2012-03-19T16:02:20+00:00

commit 7f6c7e62fcc123e6bd9206da99a2163fe3facc31 upstream.

complete_walk() returns either ECHILD or ESTALE.  do_last() turns this into
ECHILD unconditionally.  If not in RCU mode, this error will reach userspace
which is complete nonsense.

Signed-off-by: Miklos Szeredi 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman

CIFS: Do not kmalloc under the flocks spinlock

2012-03-19T16:02:20+00:00

commit d5751469f210d2149cc2159ffff66cbeef6da3f2 upstream.

Reorganize the code to make the memory already allocated before
spinlock'ed loop.

Reviewed-by: Jeff Layton 
Signed-off-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Greg Kroah-Hartman

aio: fix the "too late munmap()" race

2012-03-19T16:02:19+00:00

commit c7b285550544c22bc005ec20978472c9ac7138c6 upstream.

Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Signed-off-by: Al Viro 
Reviewed-by: Jeff Moyer 
Acked-by: Benjamin LaHaise 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

aio: fix io_setup/io_destroy race

2012-03-19T16:02:18+00:00

commit 86b62a2cb4fc09037bbce2959d2992962396fd7f upstream.

Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Signed-off-by: Al Viro 
Acked-by: Benjamin LaHaise 
Reviewed-by: Jeff Moyer 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

cifs: fix dentry refcount leak when opening a FIFO on lookup

2012-03-12T19:31:26+00:00

commit 5bccda0ebc7c0331b81ac47d39e4b920b198b2cd upstream.

The cifs code will attempt to open files on lookup under certain
circumstances. What happens though if we find that the file we opened
was actually a FIFO or other special file?

Currently, the open filehandle just ends up being leaked leading to
a dentry refcount mismatch and oops on umount. Fix this by having the
code close the filehandle on the server if it turns out not to be a
regular file. While we're at it, change this spaghetti if statement
into a switch too.

Reported-by: CAI Qian 
Tested-by: CAI Qian 
Reviewed-by: Shirish Pargaonkar 
Signed-off-by: Jeff Layton 
Signed-off-by: Steve French 
Signed-off-by: Greg Kroah-Hartman

aio: wake up waiters when freeing unused kiocbs

2012-03-12T19:31:25+00:00

commit 880641bb9da2473e9ecf6c708d993b29928c1b3c upstream.

Bart Van Assche reported a hung fio process when either hot-removing
storage or when interrupting the fio process itself.  The (pruned) call
trace for the latter looks like so:

  fio             D 0000000000000001     0  6849   6848 0x00000004
   ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0
   ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8
   ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d
  Call Trace:
    schedule+0x3f/0x60
    io_schedule+0x8f/0xd0
    wait_for_all_aios+0xc0/0x100
    exit_aio+0x55/0xc0
    mmput+0x2d/0x110
    exit_mm+0x10d/0x130
    do_exit+0x671/0x860
    do_group_exit+0x44/0xb0
    get_signal_to_deliver+0x218/0x5a0
    do_signal+0x65/0x700
    do_notify_resume+0x65/0x80
    int_signal+0x12/0x17

The problem lies with the allocation batching code.  It will
opportunistically allocate kiocbs, and then trim back the list of iocbs
when there is not enough room in the completion ring to hold all of the
events.

In the case above, what happens is that the pruning back of events ends
up freeing up the last active request and the context is marked as dead,
so it is thus responsible for waking up waiters.  Unfortunately, the
code does not check for this condition, so we end up with a hung task.

Signed-off-by: Jeff Moyer 
Reported-by: Bart Van Assche 
Tested-by: Bart Van Assche 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

regset: Prevent null pointer reference on readonly regsets

2012-03-12T19:31:24+00:00

commit c8e252586f8d5de906385d8cf6385fee289a825e upstream.

The regset common infrastructure assumed that regsets would always
have .get and .set methods, but not necessarily .active methods.
Unfortunately people have since written regsets without .set methods.

Rather than putting in stub functions everywhere, handle regsets with
null .get or .set methods explicitly.

Signed-off-by: H. Peter Anvin 
Reviewed-by: Oleg Nesterov 
Acked-by: Roland McGrath 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

autofs: work around unhappy compat problem on x86-64

2012-03-12T19:31:21+00:00

commit a32744d4abae24572eff7269bc17895c41bd0085 upstream.

When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.

However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.

And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.

End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.

As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.

Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.

So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task.  That way we
will write the properly sized packet that user mode expects.

Not pretty.  Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.

Reported-and-tested-by: Thomas Meyer 
Signed-off-by: Ian Kent 
Signed-off-by: Linus Torvalds 
Cc: Jonathan Nieder 
Signed-off-by: Greg Kroah-Hartman