Age | Commit message (Collapse) | Author |
|
commit 4274215d24633df7302069e51426659d4759c5ed upstream.
If a device fails in a way that causes pending request to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.
We should check if the device is faulty before considering it to be a
spare. This will avoid trying to start a recovery that cannot
proceed.
This bug was introduced in 2.6.26 so that patch is suitable for any
kernel since then.
Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b062962edb086011e94ec4d9eb3f6a6d814f2a8f upstream.
Commit e9c7469bb4f5 ("md: implment REQ_FLUSH/FUA support")
introduced R5_WantFUA flag and set rw to WRITE_FUA in that case.
However remaining code still checks whether rw is exactly same
as WRITE or not, so FUAed-write ends up with being treated as
READ. Fix it.
This bug has been present since 2.6.37 and the fix is suitable for any
-stable kernel since then. It is not clear why this has not caused
more problems.
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 9b2dc8b665932a8e681a7ab3237f60475e75e161 upstream.
The @bio->bi_phys_segments consists of active stripes count in the
lower 16 bits and processed stripes count in the upper 16 bits. So
logical-OR operator should be bitwise one.
This bug has been present since 2.6.27 and the fix is suitable for any
-stable kernel since then. Fortunately the bad code is only used on
error paths and is relatively unlikely to be hit.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 01393f3d5836b7d62e925e6f4658a7eb22b83a11 upstream.
Check pers->hot_remove_disk instead of pers->hot_add_disk in slot_store()
during disk removal. The linear personality only has ->hot_add_disk and
no ->hot_remove_disk, so that removing disk in the array resulted to
following kernel bug:
$ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3]
$ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
PGD c9f5d067 PUD 8575a067 PMD 0
Oops: 0010 [#1] SMP
CPU 2
Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg
Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO
RIP: 0010:[<0000000000000000>] [< (null)>] (null)
RSP: 0018:ffff880085757df0 EFLAGS: 00010282
RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e
RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000
RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a
R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff
R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000
FS: 00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000)
Stack:
ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000
ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90
ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98
Call Trace:
[<ffffffff8138496a>] ? slot_store+0xaa/0x265
[<ffffffff81384bae>] rdev_attr_store+0x89/0xa8
[<ffffffff8115a96a>] sysfs_write_file+0x108/0x144
[<ffffffff81106b87>] vfs_write+0xb1/0x10d
[<ffffffff8106e6c0>] ? trace_hardirqs_on_caller+0x111/0x135
[<ffffffff81106cac>] sys_write+0x4d/0x77
[<ffffffff814fe702>] system_call_fastpath+0x16/0x1b
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP <ffff880085757df0>
CR2: 0000000000000000
---[ end trace ba5fc64319a826fb ]---
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 6f13f6fba76edc7d0e7580c5deee829d59a41b2f upstream.
Integrity errors need to be passed to the owner of the integrity
metadata for processing. Consequently EILSEQ should be passed up the
stack.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit f4808ca99a203f20b4475601748e44b25a65bdec upstream.
This patch adds a check that a block device has a request function
defined before it is used. Otherwise, misconfiguration can cause an oops.
Because we are allowing devices with zero size e.g. an offline multipath
device as in commit 2cd54d9bedb79a97f014e86c0da393416b264eb3
("dm: allow offline devices") there needs to be an additional check
to ensure devices are initialised. Some block devices, like a loop
device without a backing file, exist but have no request function.
Reproducer is trivial: dm-mirror on unbound loop device
(no backing file on loop devices)
dmsetup create x --table "0 8 mirror core 2 8 sync 2 /dev/loop0 0 /dev/loop1 0"
and mirror resync will immediatelly cause OOps.
BUG: unable to handle kernel NULL pointer dereference at (null)
? generic_make_request+0x2bd/0x590
? kmem_cache_alloc+0xad/0x190
submit_bio+0x53/0xe0
? bio_add_page+0x3b/0x50
dispatch_io+0x1ca/0x210 [dm_mod]
? read_callback+0x0/0xd0 [dm_mirror]
dm_io+0xbb/0x290 [dm_mod]
do_mirror+0x1e0/0x748 [dm_mirror]
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8258c53208d7a9b7207e7d4dae36d2ea384cb278 upstream.
If a bitmap is found to be 'stale' the events_cleared value
is set to match 'events'.
However if the array is degraded this does not get stored on disk.
This can subsequently lead to incorrect behaviour.
So change bitmap_update_sb to always update events_cleared in the
superblock from the known events_cleared.
For neatness also set ->state from ->flags.
This requires updating ->state whenever we update ->flags, which makes
sense anyway.
This is suitable for any active -stable release.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b0140891a8cea36469f58d23859e599b1122bd37 upstream.
There is a race when creating an md device by opening /dev/mdXX.
If two processes do this at much the same time they will follow the
call path
__blkdev_get -> get_gendisk -> kobj_lookup
The first will call
-> md_probe -> md_alloc -> add_disk -> blk_register_region
and the race happens when the second gets to kobj_lookup after
add_disk has called blk_register_region but before it returns to
md_alloc.
In the case the second will not call md_probe (as the probe is already
done) but will get a handle on the gendisk, return to __blkdev_get
which will then call md_open (via the ->open) pointer.
As mddev->gendisk hasn't been set yet, md_open will think something is
wrong an return with ERESTARTSYS.
This can loop endlessly while the first thread makes no progress
through add_disk. Nothing is blocking it, but due to scheduler
behaviour it doesn't get a turn.
So this is essentially a live-lock.
We fix this by simply moving the assignment to mddev->gendisk before
the call the add_disk() so md_open doesn't get confused.
Also move blk_queue_flush earlier because add_disk should be as late
as possible.
To make sure that md_open doesn't complete until md_alloc has done all
that is needed, we take mddev->open_mutex during the last part of
md_alloc. md_open will wait for this.
This can cause a lock-up on boot so Cc:ing for stable.
For 2.6.36 and earlier a different patch will be needed as the
'blk_queue_flush' call isn't there.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Change <sectors> from unsigned long long to sector_t.
This matches its source field.
ERROR: "__udivdi3" [drivers/md/raid456.ko] undefined!
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Problem:
After raid4->raid0 takeover operation, another takeover operation
(e.g raid0->raid10) results "kernel oops".
Root cause:
Variables 'degraded' in mddev structure is not cleared
on raid45->raid0 takeover.
This patch reset this variable.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
A raid0 array doesn't set 'dev_sectors' as each device might
contribute a different number of sectors.
So when converting to a RAID4 or RAID5 we need to set dev_sectors
as they need the number.
We have already verified that in fact all devices do contribute
the same number of sectors, so use that number.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
We previously needed to set ->queue_lock to match the raid5
device_lock so we could safely use queue_flag_* operations (e.g. for
plugging). which test the ->queue_lock is in fact locked.
However that need has completely gone away and is unlikely to come
back to remove this now-pointless setting.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
We just need to make sure that an unplug event wakes up the md
thread, which is exactly what mddev_check_plugged does.
Also remove some plug-related code that is no longer needed.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
In raid5 plugging is used for 2 things:
1/ collecting writes that require a bitmap update
2/ collecting writes in the hope that we can create full
stripes - or at least more-full.
We now release these different sets of stripes when plug_cnt
is zero.
Also in make_request, we call mddev_check_plug to hopefully increase
plug_cnt, and wake up the thread at the end if plugging wasn't
achieved for some reason.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
When an md device adds a request to a queue, it can call
mddev_check_plugged.
If this succeeds then we know that the md thread will be woken up
shortly, and ->plug_cnt will be non-zero until then, so some
processing can be delayed.
If it fails, then no unplug callback is expected and the make_request
function needs to do whatever is required to make the request happen.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
md has some plugging infrastructure for RAID5 to use because the
normal plugging infrastructure required a 'request_queue', and when
called from dm, RAID5 doesn't have one of those available.
This relied on the ->unplug_fn callback which doesn't exist any more.
So remove all of that code, both in md and raid5. Subsequent patches
with restore the plugging functionality.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
Now that unplugging is done differently, the unplug_fn callback is
never called, so it can be completely discarded.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
md/raid submits a lot of IO from the various raid threads.
So adding start/finish plug calls to those so that some
plugging happens.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
Fix common misspellings
|
|
The current block integrity (DIF/DIX) support in DM is verifying that
all devices' integrity profiles match during DM device resume (which
is past the point of no return). To some degree that is unavoidable
(stacked DM devices force this late checking). But for most DM
devices (which aren't stacking on other DM devices) the ideal time to
verify all integrity profiles match is during table load.
Introduce the notion of an "initialized" integrity profile: a profile
that was blk_integrity_register()'d with a non-NULL 'blk_integrity'
template. Add blk_integrity_is_initialized() to allow checking if a
profile was initialized.
Update DM integrity support to:
- check all devices with _initialized_ integrity profiles match
during table load; uninitialized profiles (e.g. for underlying DM
device(s) of a stacked DM device) are ignored.
- disallow a table load that would result in an integrity profile that
conflicts with a DM device's existing (in-use) integrity profile
- avoid clearing an existing integrity profile
- validate all integrity profiles match during resume; but if they
don't all we can do is report the mismatch (during resume we're past
the point of no return)
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
|
We incorrectly returned -EINVAL when none of the devices in the array
had an integrity profile. This in turn prevented mdadm from starting
the metadevice. Fix this so we only return errors on mismatched
profiles and memory allocation failures.
Reported-by: Giacomo Catenazzi <cate@cateee.net>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm stripe: implement merge method
dm mpath: allow table load with no priority groups
dm mpath: fail message ioctl if specified path is not valid
dm ioctl: add flag to wipe buffers for secure data
dm ioctl: prepare for crypt key wiping
dm crypt: wipe keys string immediately after key is set
dm: add flakey target
dm: fix opening log and cow devices for read only tables
|
|
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
|
|
Implement a merge function in the striped target.
When the striped target's underlying devices provide a merge_bvec_fn
(like all DM devices do via dm_merge_bvec) it is important to call down
to them when building a biovec that doesn't span a stripe boundary.
Without the merge method, a striped DM device stacked on DM devices
causes bios with a single page to be submitted which results
in unnecessary overhead that hurts performance.
This change really helps filesystems (e.g. XFS and now ext4) which take
care to assemble larger bios. By implementing stripe_merge(), DM and the
stripe target no longer undermine the filesystem's work by only allowing
a single page per bio. Buffered IO sees the biggest improvement
(particularly uncached reads, buffered writes to a lesser degree). This
is especially so for more capable "enterprise" storage LUNs.
The performance improvement has been measured to be ~12-35% -- when a
reasonable chunk_size is used (e.g. 64K) in conjunction with a stripe
count that is a power of 2.
In contrast, the performance penalty is ~5-7% for the pathological worst
case stripe configuration (small chunk_size with a stripe count that is
not a power of 2). The reason for this is that stripe_map_sector() is
now called once for every call to dm_merge_bvec(). stripe_map_sector()
will use slower division if stripe count isn't a power of 2.
Signed-off-by: Mustafa Mesanovic <mume@linux.vnet.ibm.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
This patch adjusts the multipath target to allow a table with both 0
priority groups and 0 for the initial priority group number.
If any mpath device is held open when all paths in the last priority
group have failed, userspace multipathd will attempt to reload the
associated DM table to reflect the fact that the device no longer has
any priority groups. But the reload attempt always failed because the
multipath target did not allow 0 priority groups.
All multipath target messages related to priority group (enable_group,
disable_group, switch_group) will handle a priority group of 0 (will
cause error).
When reloading a multipath table with 0 priority groups, userspace
multipathd must be updated to specify an initial priority group number
of 0 (rather than 1).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Babu Moger <babu.moger@lsi.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Fail the reinstate_path and fail_path message ioctl if the specified
path is not valid.
The message ioctl would succeed for the 'reinistate_path' and
'fail_path' messages even if action was not taken because the
specified device was not a valid path of the multipath device.
Before, when /dev/vdb is not a path of mpathb:
$ dmsetup message mpathb 0 reinstate_path /dev/vdb
$ echo $?
0
After:
$ dmsetup message mpathb 0 reinstate_path /dev/vdb
device-mapper: message ioctl failed: Invalid argument
Command failed
$ echo $?
1
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Add DM_SECURE_DATA_FLAG which userspace can use to ensure
that all buffers allocated for dm-ioctl are wiped
immediately after use.
The user buffer is wiped as well (we do not want to keep
and return sensitive data back to userspace if the flag is set).
Wiping is useful for cryptsetup to ensure that the key
is present in memory only in defined places and only
for the time needed.
(For crypt, key can be present in table during load or table
status, wait and message commands).
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Prepare code for implementing buffer wipe flag.
No functional change in this patch.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Always wipe the original copy of the key after processing it
in crypt_set_key().
Signed-off-by: Milan Broz <mbroz@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
This target is the same as the linear target except that it returns I/O
errors periodically. It's been found useful in simulating failing
devices for testing purposes.
I needed a dm target to do some failure testing on btrfs's raid code, and
Mike pointed me at this.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
If a table is read-only, also open any log and cow devices it uses read-only.
Previously, even read-only devices were opened read-write internally.
After patch 75f1dc0d076d1c1168f2115f1941ea627d38bd5a
block: check bdev_read_only() from blkdev_get()
was applied, loading such tables began to fail. The patch
was reverted by e51900f7d38cbcfb481d84567fd92540e7e1d23a
block: revert block_dev read-only check
but this patch fixes this part of the code to work with the original patch.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
After the stack plugging introduction, these are called lockless.
Ensure that the counters are updated atomically.
Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (170 commits)
[SCSI] scsi_dh_rdac: Add MD36xxf into device list
[SCSI] scsi_debug: add consecutive medium errors
[SCSI] libsas: fix ata list corruption issue
[SCSI] hpsa: export resettable host attribute
[SCSI] hpsa: move device attributes to avoid forward declarations
[SCSI] scsi_debug: Logical Block Provisioning (SBC3r26)
[SCSI] sd: Logical Block Provisioning update
[SCSI] Include protection operation in SCSI command trace
[SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try)
[SCSI] target: Fix volume size misreporting for volumes > 2TB
[SCSI] bnx2fc: Broadcom FCoE offload driver
[SCSI] fcoe: fix broken fcoe interface reset
[SCSI] fcoe: precedence bug in fcoe_filter_frames()
[SCSI] libfcoe: Remove stale fcoe-netdev entries
[SCSI] libfcoe: Move FCOE_MTU definition from fcoe.h to libfcoe.h
[SCSI] libfc: introduce __fc_fill_fc_hdr that accepts fc_hdr as an argument
[SCSI] fcoe, libfc: initialize EM anchors list and then update npiv EMs
[SCSI] Revert "[SCSI] libfc: fix exchange being deleted when the abort itself is timed out"
[SCSI] libfc: Fixing a memory leak when destroying an interface
[SCSI] megaraid_sas: Version and Changelog update
...
Fix up trivial conflicts due to whitespace differences in
drivers/scsi/libsas/{sas_ata.c,sas_scsi_host.c}
|
|
MD and DM create a new bio_set for every metadevice. Each bio_set has an
integrity mempool attached regardless of whether the metadevice is
capable of passing integrity metadata. This is a waste of memory.
Instead we defer the allocation decision to MD and DM since we know at
metadevice creation time whether integrity passthrough is needed or not.
Automatic integrity mempool allocation can then be removed from
bioset_create() and we make an explicit integrity allocation for the
fs_bio_set.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Acked-by: Mike Snitzer <snizer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
bonding: enable netpoll without checking link status
xfrm: Refcount destination entry on xfrm_lookup
net: introduce rx_handler results and logic around that
bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
bonding: wrap slave state work
net: get rid of multiple bond-related netdevice->priv_flags
bonding: register slave pointer for rx_handler
be2net: Bump up the version number
be2net: Copyright notice change. Update to Emulex instead of ServerEngines
e1000e: fix kconfig for crc32 dependency
netfilter ebtables: fix xt_AUDIT to work with ebtables
xen network backend driver
bonding: Improve syslog message at device creation time
bonding: Call netif_carrier_off after register_netdevice
bonding: Incorrect TX queue offset
net_sched: fix ip_tos2prio
xfrm: fix __xfrm_route_forward()
be2net: Fix UDP packet detected status in RX compl
Phonet: fix aligned-mode pipe socket buffer header reserve
netxen: support for GbE port settings
...
Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
|
|
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix build failure introduced by s/freezeable/freezable/
workqueue: add system_freezeable_wq
rds/ib: use system_wq instead of rds_ib_fmr_wq
net/9p: replace p9_poll_task with a work
net/9p: use system_wq instead of p9_mux_wq
xfs: convert to alloc_workqueue()
reiserfs: make commit_wq use the default concurrency level
ocfs2: use system_wq instead of ocfs2_quota_wq
ext4: convert to alloc_workqueue()
scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
misc/iwmc3200top: use system_wq instead of dedicated workqueues
i2o: use alloc_workqueue() instead of create_workqueue()
acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
fs/aio: aio_wq isn't used in memory reclaim path
input/tps6507x-ts: use system_wq instead of dedicated workqueue
cpufreq: use system_wq instead of dedicated workqueues
wireless/ipw2x00: use system_wq instead of dedicated workqueues
arm/omap: use system_wq in mailbox
workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
|
|
Conflicts:
block/blk-core.c
block/blk-flush.c
drivers/md/raid1.c
drivers/md/raid10.c
drivers/md/raid5.c
fs/nilfs2/btnode.c
fs/nilfs2/mdt.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/bnx2x/bnx2x.h
|
|
Netlink message processing in the kernel is synchronous these days,
capabilities can be checked directly in security_netlink_recv() from
the current process.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: James Morris <jmorris@namei.org>
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Revert
b821eaa572fd737faaf6928ba046e571526c36c6
and
f3b99be19ded511a1bf05a148276239d9f13eefa
When I wrote the first of these I had a wrong idea about the
lifetime of 'struct block_device'. It can disappear at any time that
the block device is not open if it falls out of the inode cache.
So relying on the 'size' recorded with it to detect when the
device size has changed and so we need to revalidate, is wrong.
Rather, we really do need the 'changed' attribute stored directly in
the mddev and set/tested as appropriate.
Without this patch, a sequence of:
mknod / open / close / unlink
(which can cause a block_device to be created and then destroyed)
will result in a rescan of the partition table and consequence removal
and addition of partitions.
Several of these in a row can get udev racing to create and unlink and
other code can get confused.
With the patch, the rescan is only performed when needed and so there
are no races.
This is suitable for any stable kernel from 2.6.35.
Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
|
|
|
|
blk_throtl_exit assumes that ->queue_lock still exists,
so make sure that it does.
To do this, we stop redirecting ->queue_lock to conf->device_lock
and leave it pointing where it is initialised - __queue_lock.
As the blk_plug functions check the ->queue_lock is held, we now
take that spin_lock explicitly around the plug functions. We don't
need the locking, just the warning removal.
This is needed for any kernel with the blk_throtl code, which is
which is 2.6.37 and later.
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
'mdp' devices are md devices with preallocated device numbers
for partitions. As such it is possible to mknod and open a partition
before opening the whole device.
this causes md_probe() to be called with a device number of a
partition, which in-turn calls mddev_find with such a number.
However mddev_find expects the number of a 'whole device' and
does the wrong thing with partition numbers.
So add code to mddev_find to remove the 'partition' part of
a device number and just work with the 'whole device'.
This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=28652
Reported-by: hkmaly@bigfoot.com
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: <stable@kernel.org>
|
|
If the desired size of an array is set (via sysfs) before the array is
active (which is the normal sequence), we currrently call set_capacity
immediately.
This means that a subsequent 'open' (as can be caused by some
udev-triggers program) will notice the new size and try to probe for
partitions. However as the array isn't quite ready yet the read will
fail. Then when the array is read, as the size doesn't change again
we don't try to re-probe.
So when setting array size via sysfs, only call set_capacity if the
array is already active.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
Takeover raid1->raid0 not succeded. Kernel message is shown:
"md/raid0:md126: too few disks (1 of 2) - aborting!"
Problem was that we weren't updating ->raid_disks for that
takeover, unlike all the others.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|