linux-toradex.git/fs, branch v3.0.95

SCSI: sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-09-08T04:49:32+00:00

commit 35dc248383bbab0a7203fca4d722875bc81ef091 upstream.

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

 - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
   underlying SCSI command will transfer data from the SCSI device to
   the buffer provided in the ioctl)

 - Before the command finishes, a signal is sent to the process waiting
   in the ioctl.  This will end up waking up the sg_ioctl() code:

		result = wait_event_interruptible(sfp->read_wait,
			(srp_done(sfp, srp) || sdp->detached));

   but neither srp_done() nor sdp->detached is true, so we end up just
   setting srp->orphan and returning to userspace:

		srp->orphan = 1;
		write_unlock_irq(&sfp->rq_list_lock);
		return result;	/* -ERESTARTSYS because signal hit process */

   At this point the original process is done with the ioctl and
   blithely goes ahead handling the signal, reissuing the ioctl, etc.

 - Eventually, the SCSI command issued by the first ioctl finishes and
   ends up in sg_rq_end_io().  At the end of that function, we run through:

	write_lock_irqsave(&sfp->rq_list_lock, iflags);
	if (unlikely(srp->orphan)) {
		if (sfp->keep_orphan)
			srp->sg_io_owned = 0;
		else
			done = 0;
	}
	srp->done = done;
	write_unlock_irqrestore(&sfp->rq_list_lock, iflags);

	if (likely(done)) {
		/* Now wake up any sg_read() that is waiting for this
		 * packet.
		 */
		wake_up_interruptible(&sfp->read_wait);
		kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN);
		kref_put(&sfp->f_ref, sg_remove_sfp);
	} else {
		INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext);
		schedule_work(&srp->ew.work);
	}

   Since srp->orphan *is* set, we set done to 0 (assuming the
   userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
   ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
   to run in a workqueue.

 - In workqueue context we go through sg_rq_end_io_usercontext() ->
   sg_finish_rem_req() -> blk_rq_unmap_user() -> ... ->
   bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user().

   The key point here is that we are doing copy_to_user() on a
   workqueue -- that is, we're on a kernel thread with current->mm
   equal to whatever random previous user process was scheduled before
   this kernel thread.  So we end up copying whatever data the SCSI
   command returned to the virtual address of the buffer passed into
   the original ioctl, but it's quite likely we do this copying into a
   different address space!

As suggested by James Bottomley ,
add a check for current->mm (which is NULL if we're on a kernel thread
without a real userspace address space) in bio_uncopy_user(), and skip
the copy if we're on a kernel thread.

There's no reason that I can think of for any caller of bio_uncopy_user()
to want to do copying on a kernel thread with a random active userspace
address space.

Huge thanks to Costa Sapuntzakis  for the
original pointer to this bug in the sg code.

Signed-off-by: Roland Dreier 
Tested-by: David Milburn 
Cc: Jens Axboe 
Signed-off-by: James Bottomley 
[lizf: backported to 3.4:
 - Use __bio_for_each_segment() instead of bio_for_each_segment_all()]
Signed-off-by: Li Zefan 
Signed-off-by: Greg Kroah-Hartman

jfs: fix readdir cookie incompatibility with NFSv4

2013-09-08T04:49:32+00:00

commit 44512449c0ab368889dd13ae0031fba74ee7e1d2 upstream.

NFSv4 reserves readdir cookie values 0-2 for special entries (. and ..),
but jfs allows a value of 2 for a non-special entry. This incompatibility
can result in the nfs client reporting a readdir loop.

This patch doesn't change the value stored internally, but adds one to
the value exposed to the iterate method.

Signed-off-by: Dave Kleikamp 
[bwh: Backported to 3.2:
 - Adjust context
 - s/ctx->pos/filp->f_pos/]
Tested-by: Christian Kujau 
Signed-off-by: Ben Hutchings 
Signed-off-by: Greg Kroah-Hartman

nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection

2013-08-29T16:42:12+00:00

commit 4bf93b50fd04118ac7f33a3c2b8a0a1f9fa80bc9 upstream.

Fix the issue with improper counting number of flying bio requests for
BIO_EOPNOTSUPP error detection case.

The sb_nbio must be incremented exactly the same number of times as
complete() function was called (or will be called) because
nilfs_segbuf_wait() will call wail_for_completion() for the number of
times set to sb_nbio:

  do {
      wait_for_completion(&segbuf->sb_bio_event);
  } while (--segbuf->sb_nbio > 0);

Two functions complete() and wait_for_completion() must be called the
same number of times for the same sb_bio_event.  Otherwise,
wait_for_completion() will hang or leak.

Signed-off-by: Vyacheslav Dubeyko 
Cc: Dan Carpenter 
Acked-by: Ryusuke Konishi 
Tested-by: Ryusuke Konishi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error

2013-08-29T16:42:12+00:00

commit 2df37a19c686c2d7c4e9b4ce1505b5141e3e5552 upstream.

Remove double call of bio_put() in nilfs_end_bio_write() for the case of
BIO_EOPNOTSUPP error detection.  The issue was found by Dan Carpenter
and he suggests first version of the fix too.

Signed-off-by: Vyacheslav Dubeyko 
Reported-by: Dan Carpenter 
Acked-by: Ryusuke Konishi 
Tested-by: Ryusuke Konishi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

vfs: d_obtain_alias() needs to use "/" as default name.

2013-08-15T05:55:17+00:00

commit b911a6bdeef5848c468597d040e3407e0aee04ce upstream.

NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root.  This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted.  e.g.  if
"/mnt" is an NFS mount then

 { cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }

will cause a WARN message like
   WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
   ...
   Root dentry has weird name <>

to appear in kernel logs.

So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.

Signed-off-by: NeilBrown 
Cc: Trond Myklebust 
Cc: Al Viro 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro 
[bwh: Backported to 3.2: use named initialisers instead of QSTR_INIT()]
Signed-off-by: Ben Hutchings 
Signed-off-by: Greg Kroah-Hartman

cifs: silence compiler warnings showing up with gcc-4.7.0

2013-08-15T05:55:17+00:00

commit b2a3ad9ca502169fc4c11296fa20f56059c7c031 upstream.

gcc-4.7.0 has started throwing these warnings when building cifs.ko.

  CC [M]  fs/cifs/cifssmb.o
fs/cifs/cifssmb.c: In function ‘CIFSSMBSetCIFSACL’:
fs/cifs/cifssmb.c:3905:9: warning: array subscript is above array bounds [-Warray-bounds]
fs/cifs/cifssmb.c: In function ‘CIFSSMBSetFileInfo’:
fs/cifs/cifssmb.c:5711:8: warning: array subscript is above array bounds [-Warray-bounds]
fs/cifs/cifssmb.c: In function ‘CIFSSMBUnixSetFileInfo’:
fs/cifs/cifssmb.c:6001:25: warning: array subscript is above array bounds [-Warray-bounds]

This patch cleans up the code a bit by using the offsetof macro instead
of the funky "&pSMB->hdr.Protocol" construct.

Signed-off-by: Jeff Layton 
Signed-off-by: Steve French 
Signed-off-by: Greg Kroah-Hartman

debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)

2013-08-15T05:55:17+00:00

commit 776164c1faac4966ab14418bb0922e1820da1d19 upstream.

debugfs_remove_recursive() is wrong,

1. it wrongly assumes that !list_empty(d_subdirs) means that this
   dir should be removed.

   This is not that bad by itself, but:

2. if d_subdirs does not becomes empty after __debugfs_remove()
   it gives up and silently fails, it doesn't even try to remove
   other entries.

   However ->d_subdirs can be non-empty because it still has the
   already deleted !debugfs_positive() entries.

3. simple_release_fs() is called even if __debugfs_remove() fails.

Suppose we have

	dir1/
		dir2/
			file2
		file1

and someone opens dir1/dir2/file2.

Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes
away.

But debugfs_remove_recursive(dir1) silently fails and doesn't remove
this directory. Because it tries to delete (the already deleted)
dir1/dir2/file2 again and then fails due to "Avoid infinite loop"
logic.

Test-case:

	#!/bin/sh

	cd /sys/kernel/debug/tracing
	echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events
	sleep 1000 < events/probe/sigprocmask/id &
	echo -n >| kprobe_events

	[ -d events/probe ] && echo "ERR!! failed to rm probe"

And after that it is not possible to create another probe entry.

With this patch debugfs_remove_recursive() skips !debugfs_positive()
files although this is not strictly needed. The most important change
is that it does not try to make ->d_subdirs empty, it simply scans
the whole list(s) recursively and removes as much as possible.

Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com

Acked-by: Greg Kroah-Hartman 
Signed-off-by: Oleg Nesterov 
Signed-off-by: Steven Rostedt 
Signed-off-by: Greg Kroah-Hartman

fanotify: info leak in copy_event_to_user()

2013-08-11T22:38:22+00:00

commit de1e0c40aceb9d5bff09c3a3b97b2f1b178af53f upstream.

The ->reserved field isn't cleared so we leak one byte of stack
information to userspace.

Signed-off-by: Dan Carpenter 
Cc: Eric Paris 
Cc: Al Viro 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Cc: Luis Henriques 
Signed-off-by: Greg Kroah-Hartman

livelock avoidance in sget()

2013-08-04T07:43:40+00:00

commit acfec9a5a892f98461f52ed5770de99a3e571ae2 upstream.

Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail.  The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1.  Along comes two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount.  ->s_active is 3 now.  Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still ->fs_supers because shutdown will *not* happen until
->s_active hits 0.  ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock.  And bumps it ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B switched places.

The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN.  Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down.  Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.

The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten.  The right way to deal with that is obviously to fix sget()...

Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman

lockd: protect nlm_blocked access in nlmsvc_retry_blocked

2013-07-28T23:18:48+00:00

commit 1c327d962fc420aea046c16215a552710bde8231 upstream.

In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring
the pointer of the first entry is unprotected by any lock.  This allows a rare
race condition when there is only one entry on the list.  A function such as
nlmsvc_grant_callback() can be called, which will temporarily remove the entry
from the list.  Between the list_empty() and list_entry(),the list may become
empty, causing an invalid pointer to be used as an nlm_block, leading to a
possible crash.

This patch adds the nlm_block_lock around these calls to prevent concurrent
use of the nlm_blocked list.

This was a regression introduced by
f904be9cc77f361d37d71468b13ff3d1a1823dea  "lockd: Mostly remove BKL from
the server".

Signed-off-by: David Jeffery 
Cc: Bryan Schumaker 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman