linux-toradex.git/block/genhd.c, branch v3.2.59

block: do not pass disk names as format strings

2013-07-27T04:34:28+00:00

commit ffc8b30866879ed9ba62bd0a86fecdbd51cd3d19 upstream.

Disk names may contain arbitrary strings, so they must not be
interpreted as format strings.  It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.
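
As a userspace illustration of the rule above (a minimal sketch, not the
kernel change itself): snprintf() stands in for any helper that takes a
format string, and set_name_safe() is a hypothetical name. The
caller-supplied name is passed as an argument to a constant "%s" format,
so any '%' sequences inside it are copied literally instead of being
interpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy a caller-supplied disk name into buf.
 * The name is only ever an argument to the constant "%s" format, so
 * conversion specifiers embedded in the name have no effect.
 * Returns the length of the name, as snprintf() does.
 */
int set_name_safe(char *buf, size_t len, const char *disk_name)
{
	return snprintf(buf, len, "%s", disk_name);
}
```

The unsafe form would be snprintf(buf, len, disk_name), where a name such
as "md_%n%s" would be parsed as a format.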

CVE-2013-2851

Signed-off-by: Kees Cook 
Cc: Jens Axboe 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[bwh: Backported to 3.2: adjust device pointer name in nbd.c]
Signed-off-by: Ben Hutchings

block: fix synchronization and limit check in blk_alloc_devt()

2013-03-06T03:24:16+00:00

commit ce23bba842aee98092225d9576dba47c82352521 upstream.

idr allocation in blk_alloc_devt() wasn't synchronized against lookup
and removal, and its limit check was off by one - 1 << MINORBITS is
the number of minors allowed, not the maximum allowed minor.

Add locking and rename MAX_EXT_DEVT to NR_EXT_DEVT and fix limit
checking.
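
The off-by-one can be sketched in isolation as follows. This is an
illustration under stated assumptions, not the patched function:
minor_ok_buggy() and minor_ok_fixed() are hypothetical names, and only
MINORBITS and NR_EXT_DEVT come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define MINORBITS   20                /* as in the kernel's kdev_t.h */
#define NR_EXT_DEVT (1 << MINORBITS)  /* a count: valid minors are 0..NR_EXT_DEVT-1 */

/* Off-by-one form: treats the count as if it were the largest valid minor. */
bool minor_ok_buggy(int idx)
{
	return idx >= 0 && idx <= NR_EXT_DEVT;
}

/* Corrected form: the count itself is already out of range. */
bool minor_ok_fixed(int idx)
{
	return idx >= 0 && idx < NR_EXT_DEVT;
}
```

The two checks differ only for idx == NR_EXT_DEVT, which does not fit in
MINORBITS bits.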

Signed-off-by: Tejun Heo 
Acked-by: Jens Axboe 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Ben Hutchings

block: fix ext_devt_idr handling

2013-03-06T03:24:15+00:00

commit 7b74e912785a11572da43292786ed07ada7e3e0c upstream.

While adding and removing a lot of disks and partitions, this
sometimes shows up:

  WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
  Hardware name:
  sysfs: cannot create duplicate filename '/dev/block/259:751'
  Modules linked in: raid1 autofs4 bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc scsi_tgt garp stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table mperf ipv6 dm_mirror dm_region_hash dm_log power_meter microcode dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core k10temp bnx2 sg ixgbe dca mdio ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi pata_atiixp ahci mptsas mptscsih mptbase scsi_transport_sas dm_multipath dm_mod [last unloaded: scsi_wait_scan]
  Pid: 44103, comm: async/16 Not tainted 2.6.32-195.el6.x86_64 #1
  Call Trace:
    warn_slowpath_common+0x87/0xc0
    warn_slowpath_fmt+0x46/0x50
    sysfs_add_one+0xc9/0x130
    sysfs_do_create_link+0x12b/0x170
    sysfs_create_link+0x13/0x20
    device_add+0x317/0x650
    idr_get_new+0x13/0x50
    add_partition+0x21c/0x390
    rescan_partitions+0x32b/0x470
    sd_open+0x81/0x1f0 [sd_mod]
    __blkdev_get+0x1b6/0x3c0
    blkdev_get+0x10/0x20
    register_disk+0x155/0x170
    add_disk+0xa6/0x160
    sd_probe_async+0x13b/0x210 [sd_mod]
    add_wait_queue+0x46/0x60
    async_thread+0x102/0x250
    default_wake_function+0x0/0x20
    async_thread+0x0/0x250
    kthread+0x96/0xa0
    child_rip+0xa/0x20
    kthread+0x0/0xa0
    child_rip+0x0/0x20

This most likely happens because dev_t is freed while the number is
still used and idr_get_new() is not protected on every use.  The fix
adds a mutex where it wasn't before and moves the dev_t free function so
it is called after device del.
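
The locking rule can be sketched in userspace as below. This is a
simplified model under stated assumptions: a small array and a pthread
mutex stand in for the idr and the kernel mutex, and id_alloc(),
id_free() and NR_IDS are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_IDS 8	/* hypothetical, deliberately tiny ID space */

static bool id_used[NR_IDS];
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate the lowest free ID under the lock, or return -1 if none is left. */
int id_alloc(void)
{
	int id = -1;

	pthread_mutex_lock(&id_lock);
	for (int i = 0; i < NR_IDS; i++) {
		if (!id_used[i]) {
			id_used[i] = true;
			id = i;
			break;
		}
	}
	pthread_mutex_unlock(&id_lock);
	return id;
}

/*
 * Release an ID under the same lock. Callers are expected to do this
 * only after the object that used the ID has been removed, so the
 * number cannot be handed out again while it is still visible.
 */
void id_free(int id)
{
	pthread_mutex_lock(&id_lock);
	id_used[id] = false;
	pthread_mutex_unlock(&id_lock);
}
```

Every path that touches the table takes the same lock, and the free
happens last in the teardown order.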

Signed-off-by: Tomas Henzl 
Cc: Jens Axboe 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings

block: fix buffer overflow when printing partition UUIDs

2012-05-30T23:43:15+00:00

commit 05c69d298c96703741cac9a5cbbf6c53bd55a6e2 upstream.

6d1d8050b4bc8 "block, partition: add partition_meta_info to hd_struct"
added part_unpack_uuid() which assumes that the passed in buffer has
enough space for sprintfing "%pU" - 37 characters including '\0'.

Unfortunately, b5af921ec0233 "init: add support for root devices
specified by partition UUID" supplied 33 bytes buffer to the function
leading to the following panic with stackprotector enabled.

  Kernel panic - not syncing: stack-protector: Kernel stack corrupted in: ffffffff81b14c7e

  [] panic+0xba/0x1c6
  [] ? printk_all_partitions+0x259/0x26xb
  [] __stack_chk_fail+0x1b/0x20
  [] printk_all_partitions+0x259/0x26xb
  [] mount_block_root+0x1bc/0x27f
  [] mount_root+0x57/0x5b
  [] prepare_namespace+0x13d/0x176
  [] ? release_tgcred.isra.4+0x330/0x30
  [] kernel_init+0x155/0x15a
  [] ? schedule_tail+0x27/0xb0
  [] kernel_thread_helper+0x5/0x10
  [] ? start_kernel+0x3c5/0x3c5
  [] ? gs_change+0x13/0x13

Increase the buffer size, remove the dangerous part_unpack_uuid() and
use snprintf() directly from printk_all_partitions().
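
The buffer arithmetic can be checked with a userspace sketch. The "%pU"
specifier exists only in the kernel, so the hypothetical format_uuid()
below spells out the 8-4-4-4-12 layout by hand, in plain byte order;
UUID_STRING_BUF is likewise an assumed name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define UUID_STRING_BUF 37	/* 32 hex digits + 4 hyphens + '\0' */

/*
 * Format 16 raw bytes as a UUID string. snprintf() never writes more
 * than len bytes, so a short buffer yields a truncated string instead
 * of an overflow. Returns the untruncated length (36).
 */
int format_uuid(char *buf, size_t len, const unsigned char u[16])
{
	return snprintf(buf, len,
		"%02x%02x%02x%02x-%02x%02x-%02x%02x-"
		"%02x%02x-%02x%02x%02x%02x%02x%02x",
		u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
		u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```

With a 33-byte buffer the bounded call keeps only the first 32
characters; an unbounded sprintf() would have written 37 bytes into it.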

Signed-off-by: Tejun Heo 
Reported-by: Szymon Gruszczynski 
Cc: Will Drewry 
Signed-off-by: Jens Axboe 
Signed-off-by: Ben Hutchings

Block: use a freezable workqueue for disk-event polling

2012-03-19T16:02:34+00:00

commit 62d3c5439c534b0e6c653fc63e6d8c67be3a57b1 upstream.

This patch (as1519) fixes a bug in the block layer's disk-events
polling.  The polling is done by a work routine queued on the
system_nrt_wq workqueue.  Since that workqueue isn't freezable, the
polling continues even in the middle of a system sleep transition.

Obviously, polling a suspended drive for media changes and such isn't
a good thing to do; in the case of USB mass-storage devices it can
lead to real problems requiring device resets and even re-enumeration.

The patch fixes things by creating a new system-wide, non-reentrant,
freezable workqueue and using it for disk-events polling.

Signed-off-by: Alan Stern 
Acked-by: Tejun Heo 
Acked-by: Rafael J. Wysocki 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

block: fix __blkdev_get and add_disk race condition

2012-03-19T16:02:34+00:00

commit 9f53d2fe815b4011ff930a7b6db98385d45faa68 upstream.

The following situation might occur:

__blkdev_get:			add_disk:

				register_disk()
get_gendisk()

disk_block_events()
	disk->ev == NULL

				disk_add_events()

__disk_unblock_events()
	disk->ev != NULL
	--ev->block

We then unblock events when they are supposed to be blocked. This can
trigger event-related warnings in block/genhd.c, and can also cause
crashes in sd_check_events() or other places.
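
The interleaving can be replayed deterministically in a small model.
This is a sketch under stated assumptions, not kernel code: the structs
are stand-ins, replay_race() is a hypothetical name, and the block
count is assumed to start at 1 when the events structure is published.

```c
#include <assert.h>
#include <stddef.h>

struct events { int block; };		/* stand-in for disk->ev */
struct disk   { struct events *ev; };

/* Both helpers silently do nothing while ev has not been published. */
void block_events(struct disk *d)   { if (d->ev) d->ev->block++; }
void unblock_events(struct disk *d) { if (d->ev) d->ev->block--; }

/* Replay the sequence from the diagram and return the final block count. */
int replay_race(void)
{
	struct events ev = { .block = 1 };	/* assumed initial block */
	struct disk d = { .ev = NULL };

	block_events(&d);	/* __blkdev_get: ev is NULL, nothing counted */
	d.ev = &ev;		/* add_disk: disk_add_events() publishes ev */
	unblock_events(&d);	/* __blkdev_get: ev is set, count drops */

	return ev.block;
}
```

In this model the count ends at 0 although the initial block was never
released by its owner, which is the unbalanced unblock described above.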

I am able to reproduce crashes with the following script (with a USB
dongle connected as the sdb disk).


#!/bin/bash

DEV=/dev/sdb
ENABLE=/sys/bus/usb/devices/1-2/bConfigurationValue

# Kill all background fdisk loops before exiting.
function stop_me()
{
	for i in `jobs -p` ; do kill $i 2> /dev/null ; done
	exit
}

trap stop_me SIGHUP SIGINT SIGTERM

# Keep ten readers opening the disk in parallel.
for ((i = 0; i < 10; i++)) ; do
	while true; do fdisk -l $DEV > /dev/null 2>&1 ; done &
done

# Repeatedly enable and disable the USB device.
while true ; do
	echo 1 > $ENABLE
	sleep 1
	echo 0 > $ENABLE
done


I used the script to verify the patch fixing an oops in
sd_revalidate_disk:
http://marc.info/?l=linux-scsi&m=132935572512352&w=2
Without either Jun'ichi Nomura's patch titled "Fix NULL pointer
dereference in sd_revalidate_disk" or this one, the script easily
crashes the kernel within a few seconds. With both patches applied I do
not observe a crash. Unfortunately, after some time (a dozen minutes or
so), the script will hang in:

[ 1563.906432]  [] schedule_timeout_uninterruptible+0x15/0x20
[ 1563.906437]  [] msleep+0x15/0x20
[ 1563.906443]  [] blk_drain_queue+0x32/0xd0
[ 1563.906447]  [] blk_cleanup_queue+0xd0/0x170
[ 1563.906454]  [] scsi_free_queue+0x3f/0x60
[ 1563.906459]  [] __scsi_remove_device+0x6e/0xb0
[ 1563.906463]  [] scsi_forget_host+0x4f/0x60
[ 1563.906468]  [] scsi_remove_host+0x5a/0xf0
[ 1563.906482]  [] quiesce_and_remove_host+0x5b/0xa0 [usb_storage]
[ 1563.906490]  [] usb_stor_disconnect+0x13/0x20 [usb_storage]

Anyway, I think this patch is a step forward.

As a drawback, I do not tear down on a sysfs file creation error,
because I do not know how to nullify disk->ev (since it may be in use).
However, add_disk error handling is practically nonexistent anyway, and
things will work without this sysfs file, except that events will not
be exported to user space.

Signed-off-by: Stanislaw Gruszka 
Acked-by: Tejun Heo 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman

block: Revert "[SCSI] genhd: add a new attribute "alias" in gendisk"

2011-11-10T08:03:55+00:00

This reverts commit a72c5e5eb738033938ab30d6a634b74d1d060f10.

The commit introduced an alias for block devices, intended to be used
during logging, although actual usage hasn't been committed yet.
This approach adds very limited benefit (the raw log might be easier
to follow), which can be trivially implemented in userland, but has a
lot of problems.

It is much worse than netif renames because it doesn't rename the
actual device but just adds a convenience name which isn't used
universally or enforced.  Everything internal, including device lookup
and sysfs, still uses the internal name, and nothing prevents two
devices from using conflicting aliases, i.e. sda can have sdb as its
alias.

This has been nacked by people working on device driver core, block
layer and kernel-userland interface and shouldn't have been
upstreamed.  Revert it.

 http://thread.gmane.org/gmane.linux.kernel/1155104
 http://thread.gmane.org/gmane.linux.scsi/68632
 http://thread.gmane.org/gmane.linux.scsi/69776

Signed-off-by: Tejun Heo 
Acked-by: Greg Kroah-Hartman 
Acked-by: Kay Sievers 
Cc: "James E.J. Bottomley" 
Cc: Nao Nishijima 
Cc: Alan Cox 
Cc: Al Viro 
Signed-off-by: Jens Axboe

Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block

2011-11-05T00:22:14+00:00

* 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits)
  virtio-blk: use ida to allocate disk index
  hpsa: add small delay when using PCI Power Management to reset for kump
  cciss: add small delay when using PCI Power Management to reset for kump
  xen/blkback: Fix two races in the handling of barrier requests.
  xen/blkback: Check for proper operation.
  xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
  xen/blkback: Report VBD_WSECT (wr_sect) properly.
  xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
  xen-blkfront: plug device number leak in xlblk_init() error path
  xen-blkfront: If no barrier or flush is supported, use invalid operation.
  xen-blkback: use kzalloc() in favor of kmalloc()+memset()
  xen-blkback: fixed indentation and comments
  xen-blkfront: fix a deadlock while handling discard response
  xen-blkfront: Handle discard requests.
  xen-blkback: Implement discard requests ('feature-discard')
  xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
  drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
  drivers/block/loop.c: emit uevent on auto release
  drivers/block/cpqarray.c: use pci_dev->revision
  loop: always allow userspace partitions and optionally support automatic scanning
  ...

Fix up trivial header file inclusion conflict in drivers/block/loop.c

Merge branch 'for-3.2/core' of git://git.kernel.dk/linux-block

2011-11-05T00:06:58+00:00

* 'for-3.2/core' of git://git.kernel.dk/linux-block: (29 commits)
  block: don't call blk_drain_queue() if elevator is not up
  blk-throttle: use queue_is_locked() instead of lockdep_is_held()
  blk-throttle: Take blkcg->lock while traversing blkcg->policy_list
  blk-throttle: Free up policy node associated with deleted rule
  block: warn if tag is greater than real_max_depth.
  block: make gendisk hold a reference to its queue
  blk-flush: move the queue kick into
  blk-flush: fix invalid BUG_ON in blk_insert_flush
  block: Remove the control of complete cpu from bio.
  block: fix a typo in the blk-cgroup.h file
  block: initialize the bounce pool if high memory may be added later
  block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown
  block: drop @tsk from attempt_plug_merge() and explain sync rules
  block: make get_request[_wait]() fail if queue is dead
  block: reorganize throtl_get_tg() and blk_throtl_bio()
  block: reorganize queue draining
  block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg()
  block: pass around REQ_* flags instead of broken down booleans during request alloc/free
  block: move blk_throtl prototypes to block/blk.h
  block: fix genhd refcounting in blkio_policy_parse_and_set()
  ...

Fix up trivial conflicts due to "mddev_t" -> "struct mddev" conversion
and making the request functions be of type "void" instead of "int" in
 - drivers/md/{faulty.c,linear.c,md.c,md.h,multipath.c,raid0.c,raid1.c,raid10.c,raid5.c}
 - drivers/staging/zram/zram_drv.c

block: make gendisk hold a reference to its queue

2011-10-19T12:31:07+00:00

The following command sequence triggers an oops.

# mount /dev/sdb1 /mnt
# echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete
# umount /mnt

 general protection fault: 0000 [#1] PREEMPT SMP
 CPU 2
 Modules linked in:

 Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
 RIP: 0010:[]  [] __lock_acquire+0x389/0x1d60
...
 Call Trace:
  [] lock_acquire+0x95/0x140
  [] _raw_spin_lock+0x3b/0x50
  [] bdi_lock_two+0x5c/0x70
  [] bdev_inode_switch_bdi+0x4c/0xf0
  [] __blkdev_put+0x11b/0x1d0
  [] __blkdev_put+0x160/0x1d0
  [] blkdev_put+0x5f/0x190
  [] kill_block_super+0x4d/0x80
  [] deactivate_locked_super+0x45/0x70
  [] deactivate_super+0x4a/0x70
  [] mntput_no_expire+0xed/0x130
  [] sys_umount+0x7e/0x3a0
  [] system_call_fastpath+0x16/0x1b

This is because bdev holds on to disk but disk doesn't pin the
associated queue.  If a SCSI device is removed while the device is
still open, the sdev puts the base reference to the queue on release.
When the bdev is finally released, the associated queue is already
gone along with the bdi, and bdev_inode_switch_bdi() ends up
dereferencing an already freed bdi.

Even if it were not for this bug, disk not holding onto the associated
queue is very unusual and error-prone.

Fix it by making add_disk() take an extra reference to its queue and
put it on disk_release() and ensuring that disk and its fops owner are
put in that order after all accesses to the disk and queue are
complete.
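
The ownership rule can be modelled with a plain counter. This is a
minimal sketch under stated assumptions: the structs are stand-ins, the
helper names are hypothetical, and a "freed" flag takes the place of
actually freeing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct queue { int refs; bool freed; };	/* stand-in for the request queue */
struct disk  { struct queue *q; };	/* stand-in for the gendisk */

void queue_get(struct queue *q) { q->refs++; }

/* Drop one reference; the queue is torn down when the last one goes. */
void queue_put(struct queue *q)
{
	if (--q->refs == 0)
		q->freed = true;
}

/* Adding the disk pins the queue for as long as the disk exists. */
void disk_add(struct disk *d, struct queue *q)
{
	queue_get(q);
	d->q = q;
}

/* Releasing the disk drops the reference taken in disk_add(). */
void disk_release(struct disk *d)
{
	queue_put(d->q);
	d->q = NULL;
}
```

With the extra reference, the device dropping its base reference no
longer frees the queue while the disk is still held open.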

Signed-off-by: Tejun Heo 
Cc: stable@kernel.org
Signed-off-by: Jens Axboe