Age | Commit message (Collapse) | Author |
|
commit 5370019dc2d2c2ff90e95d181468071362934f3a upstream.
bd_mutex and lo_ctl_mutex can be held in different order.
Path #1:
blkdev_open
blkdev_get
__blkdev_get (hold bd_mutex)
lo_open (hold lo_ctl_mutex)
Path #2:
blkdev_ioctl
lo_ioctl (hold lo_ctl_mutex)
lo_set_capacity (hold bd_mutex)
Lockdep does not report it, because path #2 actually holds a subclass of
lo_ctl_mutex. This subclass seems creep into the code by mistake. The
patch author actually just mentioned it in the changelog, see commit
f028f3b2 ("loop: fix circular locking in loop_clr_fd()"), also see:
http://marc.info/?l=linux-kernel&m=123806169129727&w=2
Path #2 hold bd_mutex to call bd_set_size(), I've protected it
with i_mutex in a previous patch, so drop bd_mutex at this site.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9d092603cc306ee6edfe917bf9ab8beb5f32d7bc upstream.
"be->mode" is obtained from xenbus_read(), which does a kmalloc() for
the message body. The short string is never released, so do it along
with freeing "be" itself, and make sure the string isn't kept when
backend_changed() doesn't complete successfully (which made it
desirable to slightly re-structure that function, so that the error
cleanup can be done in one place).
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f4d9605434c0fd4cc8639bf25cfc043418c52362 ]
The 'operations' bitmap corresponds one-for-one with the operation
codes, no adjustment is necessary.
Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 72585d2428fa3a0daab02ebad1f41e5ef517dbaa upstream.
Without this, iostat frequently sees bogus svctime and >= 100% "utilization".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Raoul Bhatia <raoul@bhatia.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0a41409c518083133e79015092585d68915865be upstream, but doesn't
apply, so this version is different for older kernels than 3.7.x
blk_alloc_queue has already done a bdi_init, so do not bdi_init
again in aoeblk_gdalloc. The extra call causes list corruption
in the per-CPU backing dev info stats lists.
Affected users see console WARNINGs about list_del corruption on
percpu_counter_destroy when doing "rmmod aoe" or "aoeflush -a"
when AoE targets have been detected and initialized by the
system.
The patch below applies to v3.6.11, with its v47 aoe driver. It
is expected to apply to all currently maintained stable kernels
except 3.7.y. A related but different fix has been posted for
3.7.y.
References:
RedHat bugzilla ticket with original report
https://bugzilla.redhat.com/show_bug.cgi?id=853064
LKML discussion of bug and fix
http://thread.gmane.org/gmane.linux.kernel/1416336/focus=1416497
Reported-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cecd353a02fb1405c8a72a324b26b5acf97e7411 upstream.
Set CommandMailbox with memset before use it. Fix for:
drivers/block/DAC960.c: In function ‘DAC960_V1_EnableMemoryMailboxInterface’:
arch/x86/include/asm/io.h:61:1: warning: ‘CommandMailbox.Bytes[12]’
may be used uninitialized in this function [-Wuninitialized]
drivers/block/DAC960.c:1175:30: note: ‘CommandMailbox.Bytes[12]’
was declared here
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bca505f1097c725708ddc055cf8055e922b0904b upstream.
Fixed compiler warning:
comparison between ‘DAC960_V2_IOCTL_Opcode_T’ and ‘enum <anonymous>’
Renamed enum, added a new enum for SCSI_10.CommandOpcode in
DAC960_V2_ProcessCompletedCommand().
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f6365201d8a21fb347260f89d6e9b3e718d63c70 upstream.
The X86_32-only disable_hlt/enable_hlt mechanism was used by the
32-bit floppy driver. Its effect was to replace the use of the
HLT instruction inside default_idle() with cpu_relax() - essentially
it turned off the use of HLT.
This workaround was commented in the code as:
"disable hlt during certain critical i/o operations"
"This halt magic was a workaround for ancient floppy DMA
wreckage. It should be safe to remove."
H. Peter Anvin additionally adds:
"To the best of my knowledge, no-hlt only existed because of
flaky power distributions on 386/486 systems which were sold to
run DOS. Since DOS did no power management of any kind,
including HLT, the power draw was fairly uniform; when exposed
to the much hhigher noise levels you got when Linux used HLT
caused some of these systems to fail.
They were by far in the minority even back then."
Alan Cox further says:
"Also for the Cyrix 5510 which tended to go castors up if a HLT
occurred during a DMA cycle and on a few other boxes HLT during
DMA tended to go astray.
Do we care ? I doubt it. The 5510 was pretty obscure, the 5520
fixed it, the 5530 is probably the oldest still in any kind of
use."
So, let's finally drop this.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephen Hemminger <shemminger@vyatta.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org
[ If anyone cares then alternative instruction patching could be
used to replace HLT with a one-byte NOP instruction. Much simpler. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 238ab78469c6ab7845b43d5061cd3c92331b2452 upstream.
If blk_init_queue fails, we do not call put_disk on the current dr
(dr is decremented first in the error handling loop).
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8babe8cc6570ed896b7b596337eb8fe730c3ff45 ]
In order for the network layer to see that AoE requires
no checksumming in a generic way, the packets must be
marked as requiring no checksum, so we make this requirement
explicit with the assertion.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2453f5f992717251cfadab6184fbb3ec2f2e8b40 upstream.
If a command completes with a status of CMD_PROTOCOL_ERR, this
information should be conveyed to the SCSI mid layer, not dropped
on the floor. Unlike a similar bug in the hpsa driver, this bug
only affects tape drives and CD and DVD ROM drives in the cciss
driver, and to induce it, you have to disconnect (or damage) a
cable, so it is not a very likely scenario (which would explain
why the bug has gone undetected for the last 10 years.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b0cf0b118c90477d1a6811f2cd2307f6a5578362 upstream.
Delete code which sets SCSI status incorrectly as it's already been set
correctly above this incorrect code. The bug was introduced in 2009 by
commit b0e15f6db111 ("cciss: fix typo that causes scsi status to be
lost.")
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Roel van Meer <roel.vanmeer@bokxing.nl>
Tested-by: Roel van Meer <roel.vanmeer@bokxing.nl>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 32587371ad3db2f9d335de10dbd8cffd4fff5669 upstream.
Fix a regression introduced by 7eaceaccab5f40 ("block: remove per-queue
plugging"). In that patch, Jens removed the whole mm_unplug_device()
function, which used to be the trigger to make umem start to work.
We need to implement unplugging to make umem start to work, or I/O will
never be triggered.
Signed-off-by: Tao Guo <Tao.Guo@emc.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bc67f63650fad6b3478d9ddfd5406d45a95987c9 upstream.
The total number of scatter gather elements in the CISS command
used by the scsi tape code was being cast to a u8, which can hold
at most 255 scatter gather elements. It should have been cast to
a u16. Without this patch the command gets rejected by the controller
since the total scatter gather count did not add up to the right
value resulting in an i/o error.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 395d287526bb60411ff37b19ad9dd38b58ba8732 upstream.
The default is too small (1024 blocks), use h->cciss_max_sectors (8192 blocks)
Without this change, if you try to set the block size of a tape drive above
512*1024, via "mt -f /dev/st0 setblk nnn" where nnn is greater than 524288,
it won't work right.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ea5f4db8ece896c2ab9eafa0924148a2596c52e4 upstream.
"mem" is type u8. We need parenthesis here or it screws up the pointer
math probably leading to an oops.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 577ebb374c78314ac4617242f509e2f5e7156649 upstream.
Introduce a wrapper around scsi_cmd_ioctl that takes a block device.
The function will then be enhanced to detect partition block devices
and, in that case, subject the ioctls to whitelisting.
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 5c62cb48602dba95159c81ffeca179d3852e25be upstream.
We did not increment the amount of sectors written to disk
b/c we tested for the == WRITE which is incorrect - as the
operations are more of WRITE_FLUSH, WRITE_ODIRECT. This patch
fixes it by doing a & WRITE check.
Reported-by: Andy Burns <xen.lists@burns.me.uk>
Suggested-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ab5dbebe33e0c353e8545f09c34553ac3351dad6 upstream.
The P600 requires a small delay when changing states. Otherwise we may think
the board did not reset and we bail. This for kdump only and is particular
to the P600.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 6c4867f6469964e34c5f4ee229a2a7f71a34c7ff upstream.
When no floppy is found the module code can be released while a timer
function is pending or about to be executed.
CPU0 CPU1
floppy_init()
timer_softirq()
spin_lock_irq(&base->lock);
detach_timer();
spin_unlock_irq(&base->lock);
-> Interrupt
del_timer();
return -ENODEV;
module_cleanup();
<- EOI
call_timer_fn();
OOPS
Use del_timer_sync() to prevent this.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 89153b5cae9f40c224a5d321665a97bf14220c2c upstream.
Avoid telling users to use xvde and onwards when using xvde.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 196cfe2ae8fcdc03b3c7d627e7dfe8c0ce7229f9 upstream.
These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests but should be by now expected.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 05eb0f252b04aa94ace0794f73d56c6a02351d80 upstream.
LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
waits for show() to finish to remove the sysfs file.
cat /sys/class/block/loop0/loop/backing_file
mutex_lock_nested+0x176/0x350
? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
dev_attr_show+0x1b/0x60
? sysfs_read_file+0x86/0x1a0
? __get_free_pages+0x12/0x50
sysfs_read_file+0xaf/0x1a0
ioctl(LOOP_CLR_FD):
wait_for_common+0x12c/0x180
? try_to_wake_up+0x2a0/0x2a0
wait_for_completion+0x18/0x20
sysfs_deactivate+0x178/0x180
? sysfs_addrm_finish+0x43/0x70
? sysfs_addrm_start+0x1d/0x20
sysfs_addrm_finish+0x43/0x70
sysfs_hash_and_remove+0x85/0xa0
sysfs_remove_group+0x59/0x100
loop_clr_fd+0x1dc/0x3f0 [loop]
lo_ioctl+0x223/0x7a0 [loop]
Instead of taking the lo_ctl_mutex from sysfs code, take the inner
lo->lo_lock, to protect the access to the backing_file data.
Thanks to Tejun for help debugging and finding a solution.
Cc: Milan Broz <mbroz@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 07d0c38e7d84f911c72058a124c7f17b3c779a65 upstream.
Most smartarrays will tolerate it, but some new ones don't.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Note: this is a regression caused by commit 1ddd5049
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
for-linus
|
|
We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless
disabled in the configuration). The correct semantic now would be to
write with FLUSH/FUA.
For example, with activity log transactions, FUA alone is not enough, we
need the corresponding bitmap update (and all related application
updates) on stable storage as well.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
data socket
If we have an asymetrically congested network, we may send P_PING,
but due to congestion, the corresponding P_PING_ACK would time out,
and we would drop a (congested, but otherwise) healthy connection
("PingAck did not arrive in time.")
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
If we have a good resync rate, we will frequently update the on-disk
bitmap, which, if not accounted for as resync io, may let an otherwise
idle device appear to be "busy", and cause us to throttle resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
The last commit, drbd: add missing spinlock to bitmap receive,
introduced a cond_resched_lock(), where the lock in question is taken
with irqs disabled.
As we must not schedule with IRQs disabled,
and cond_resched_lock_irq() does not exist, yet,
we re-aquire the spin_lock_irq() for each bitmap page processed in turn.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
During bitmap exchange, when using the RLE bitmap compression scheme,
we have a code path that can set the whole bitmap at once.
To avoid holding spin_lock_irq() for too long, we used to lock out other
bitmap modifications during bitmap exchange by other means, and then,
knowing we have exclusive access to the bitmap, modify it without
the spinlock, and with IRQs enabled.
Since we now allow local IO to continue, potentially setting additional
bits during the bitmap receive phase, this is no longer true, and we get
uncoordinated updates of bitmap members, causing bm_set to no longer
accurately reflect the total number of set bits.
To actually see this, you'd need to have a large bitmap, use RLE bitmap
compression, and have busy IO during sync handshake and bitmap exchange.
Fix this by taking the spin_lock_irq() in this code path as well, but
calling cond_resched_lock() after each page worth of bits processed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
|
|
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Use hlist_entry() for io_context.cic_list.first
cfq-iosched: Remove bogus check in queue_fail path
xen/blkback: potential null dereference in error handling
xen/blkback: don't call vbd_size() if bd_disk is NULL
block: blkdev_get() should access ->bd_disk only after success
CFQ: Fix typo and remove unnecessary semicolon
block: remove unwanted semicolons
Revert "block: Remove extra discard_alignment from hd_struct."
nbd: adjust 'max_part' according to part_shift
nbd: limit module parameters to a sane value
nbd: pass MSG_* flags to kernel_recvmsg()
block: improve the bio_add_page() and bio_add_pc_page() descriptions
|
|
Jens' back-merge commit 698567f3fa79 ("Merge commit 'v2.6.39' into
for-2.6.40/core") was incorrectly done, and re-introduced the
DISK_EVENT_MEDIA_CHANGE lines that had been removed earlier in commits
- 9fd097b14918 ("block: unexport DISK_EVENT_MEDIA_CHANGE for
legacy/fringe drivers")
- 7eec77a1816a ("ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd
and ide-cd")
because of conflicts with the "g->flags" updates near-by by commit
d4dc210f69bc ("block: don't block events on excl write for non-optical
devices")
As a result, we re-introduced the hanging behavior due to infinite disk
media change reports.
Tssk, tssk, people! Don't do back-merges at all, and *definitely* don't
do them to hide merge conflicts from me - especially as I'm likely
better at merging them than you are, since I do so many merges.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
blkbk->pending_pages can be NULL here so I added a check for it.
Signed-off-by: Dan Carpenter <error27@gmail.com>
[v1: Redid the loop a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
...because vbd_size() dereferences bd_disk if bd_part is NULL.
Signed-off-by: Laszlo Ersek<lersek@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
It is easier to figure out the context by reading SCSI_SENSE_BUFFERSIZE
instead of plain '96'.
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Wire up the virtio_driver config_changed method to get notified about
config changes raised by the host. For now we just re-read the device
size to support online resizing of devices, but once we add more
attributes that might be changeable they could be added as well.
Note that the config_changed method is called from irq context, so
we'll have to use the workqueue infrastructure to provide us a proper
user context for our changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
x86 idle: deprecate "no-hlt" cmdline param
x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
x86 idle floppy: deprecate disable_hlt()
x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
x86 idle: clarify AMD erratum 400 workaround
idle governor: Avoid lock acquisition to read pm_qos before entering idle
cpuidle: menu: fixed wrapping timers at 4.294 seconds
|
|
Plan to remove floppy_disable_hlt in 2012, an ancient
workaround with comments that it should be removed.
This allows us to remove clutter and a run-time branch
from the idle code.
WARN_ONCE() on invocation until it is removed.
cc: x86@kernel.org
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
The 'max_part' parameter determines how many partitions are supported
on each nbd device. However the actual number can be changed to the
power of 2 minus 1 form during the module initialization as
alloc_disk() is called with (1 << part_shift) for some reason.
So adjust 'max_part' also at least for consistency with loop and brd.
It is exported via sysfs already, and a user should check this value
after module loading if [s]he wants to use that number correctly
(i.e. fdisk or something).
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The 'max_part' parameter controls the number of maximum partition
a nbd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel oops (or, at least, produce invalid device
nodes in some cases).
In addition, specifying large 'nbds_max' value causes same
problem for the same reason.
On my desktop, following command results to the kernel bug:
$ sudo modprobe nbd max_part=100000
kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/nbd4/range
CPU 1
Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom
Pid: 2522, comm: modprobe Tainted: G W 2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
RIP: 0010:[<ffffffff8115aa08>] [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
RSP: 0018:ffff8801009f1de8 EFLAGS: 00010246
RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
FS: 00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
Stack:
ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
Call Trace:
[<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
[<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
[<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
[<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
[<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
[<ffffffff811f3bdf>] add_disk+0xe4/0x29c
[<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
[<ffffffffa007e000>] ? 0xffffffffa007dfff
[<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
[<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
[<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
RIP [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
RSP <ffff8801009f1de8>
---[ end trace 753285ffbf72c57c ]---
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Unlike kernel_sendmsg(), kernel_recvmsg() requires passing flags explicitly
via last parameter instead of struct msghdr.msg_flags. Therefore calls to
sock_xmit(lo, 0, ..., MSG_WAITALL) have not been processed properly by tcp
layer wrt. the flag. Fix it.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Export 'max_loop' and 'max_part' parameters to sysfs so user can know
that how many devices are allowed and how many partitions are supported.
If 'max_loop' is 0, there is no restriction on the number of loop devices.
User can create/use the devices as many as minor numbers available. If
'max_part' is 0, it means simply the device doesn't support partitioning.
Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
needed. User should check this value after the module loading if he/she
want to use that number correctly (i.e. fdisk, mknod, etc.).
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Export 'rd_nr', 'rd_size' and 'max_part' parameters to sysfs so user can
know that how many devices are allowed, how big each device is and how
many partitions are supported. If 'max_part' is 0, it means simply the
device doesn't support partitioning.
Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
needed. User should check this value after the module loading if he/she
want to use that number correctly (i.e. fdisk, mknod, etc.).
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
If 'rd_nr' param was not specified, 16 (can be adjusted via
CONFIG_BLK_DEV_RAM_COUNT) devices would be created by default
but comment said 1. Fix it.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
When finding or allocating a ram disk device, brd_probe() did not take
partition numbers into account so that it can result to a different
device. Consider following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
for simplicity) :
$ sudo modprobe brd max_part=15
$ ls -l /dev/ram*
brw-rw---- 1 root disk 1, 0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
$ sudo mknod /dev/ram4 b 1 64
$ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
256+0 records in
256+0 records out
1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
namhyung@leonhard:linux$ ls -l /dev/ram*
brw-rw---- 1 root disk 1, 0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
brw-r--r-- 1 root root 1, 64 2011-05-25 15:45 /dev/ram4
brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64
After this patch, /dev/ram4 - instead of /dev/ram64 - was
accessed correctly.
In addition, 'range' passed to blk_register_region() should
include all range of dev_t that RAMDISK_MAJOR can address.
It does not need to be limited by partition numbers unless
'rd_nr' param was specified.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The 'max_part' parameter controls the number of maximum partition
a brd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel panic (or, at least, produce invalid device
nodes in some cases).
On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:
$ sudo modprobe brd max_part=100000
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
PGD 7af1067 PUD 7b19067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in: brd(+)
Pid: 44, comm: insmod Tainted: G W 2.6.39-qemu+ #158 Bochs Bochs
RIP: 0010:[<ffffffff81110a9a>] [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
RSP: 0018:ffff880007b15d78 EFLAGS: 00000286
RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8
RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760
RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000
R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000
R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063
FS: 0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980)
Stack:
ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe
ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a
0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000
Call Trace:
[<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0
[<ffffffff81143da1>] kobject_add_varg+0x41/0x50
[<ffffffff81143e6b>] kobject_add+0x64/0x66
[<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8
[<ffffffff81140f72>] add_disk+0xdf/0x289
[<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd]
[<ffffffffa0004000>] ? 0xffffffffa0003fff
[<ffffffffa0004000>] ? 0xffffffffa0003fff
[<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
[<ffffffff8108516c>] sys_init_module+0x9c/0x1dc
[<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b
Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30
8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89
RIP [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
RSP <ffff880007b15d78>
CR2: 0000000000000058
---[ end trace aebb1175ce1f6739 ]---
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
brd_refcnt, brd_offset, brd_sizelimit and brd_blocksize in struct
brd_device seem to be copied from struct loop_device but they're
not used anywhere. Let get rid of them.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
ceph: fix cap flush race reentrancy
libceph: subscribe to osdmap when cluster is full
libceph: handle new osdmap down/state change encoding
rbd: handle online resize of underlying rbd image
ceph: avoid inode lookup on nfs fh reconnect
ceph: use LOOKUPINO to make unconnected nfs fh more reliable
rbd: use snprintf for disk->disk_name
rbd: cleanup: make kfree match kmalloc
rbd: warn on update_snaps failure on notify
ceph: check return value for start_request in writepages
ceph: remove useless check
libceph: add missing breaks in addr_set_port
libceph: fix TAG_WAIT case
ceph: fix broken comparison in readdir loop
libceph: fix osdmap timestamp assignment
ceph: fix rare potential cap leak
libceph: use snprintf for unknown addrs
libceph: use snprintf for formatting object name
ceph: use snprintf for dirstat content
libceph: fix uninitialized value when no get_authorizer method is set
...
|
|
* 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block: (110 commits)
loop: handle on-demand devices correctly
loop: limit 'max_part' module param to DISK_MAX_PARTS
drbd: fix warning
drbd: fix warning
drbd: Fix spelling
drbd: fix schedule in atomic
drbd: Take a more conservative approach when deciding max_bio_size
drbd: Fixed state transitions after async outdate-peer-handler returned
drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
drbd: Fix for the connection problems on high latency links
drbd: fix potential activity log refcount imbalance in error path
drbd: Only downgrade the disk state in case of disk failures
drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int
drbd: fix potential distributed deadlock
lru_cache.h: fix comments referring to ts_ instead of lc_
drbd: Fix for application IO with the on-io-error=pass-on policy
xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions.
xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.
xen/blkback: don't fail empty barrier requests
xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end()
...
|