|
If two fsync() calls race on different subvols, one will do all of its work
to get into the log_tree, wait on its outstanding IO, and then allow the
log_tree to finish its commit. The problem is that we were just freeing that
subvol's logged extents instead of waiting on them, so whoever lost the race
wouldn't really have their data on disk. Fix this by waiting properly instead
of freeing the logged extents. Thanks,
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
The sizes that are obtained from space infos are in raw units and have
to be adjusted according to the raid factor. This was missing for
f_bavail, and df reported a doubled size for raid1.
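A minimal user-space sketch of the adjustment (illustrative only, not the
btrfs code): profiles that store two copies of every byte, such as raid1 or
dup, must have their raw free bytes divided by that factor before being
reported in f_bavail.

    #include <stdint.h>
    #include <stdio.h>

    enum profile { PROFILE_SINGLE, PROFILE_DUP, PROFILE_RAID1 };

    /* number of raw copies stored per logical byte */
    static unsigned int raid_factor(enum profile p)
    {
            switch (p) {
            case PROFILE_DUP:
            case PROFILE_RAID1:
                    return 2;
            default:
                    return 1;
            }
    }

    int main(void)
    {
            uint64_t raw_free = 200ULL << 30;  /* 200 GiB of raw free space */

            /* raid1: only half of the raw bytes are usable logical space */
            printf("f_bavail bytes: %llu\n",
                   (unsigned long long)(raw_free / raid_factor(PROFILE_RAID1)));
            return 0;
    }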
Reported-by: Martin Steigerwald <Martin@lichtvoll.de>
Fixes: ba7b6e62f420 ("btrfs: adjust statfs calculations according to raid profiles")
CC: stable@vger.kernel.org
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
This can be reproduced by fstests: btrfs/070
The scenario is like the following:
replace worker thread                 defrag thread
---------------------                 -------------
copy_nocow_pages_worker               btrfs_defrag_file
  copy_nocow_pages_for_inode            ...
                                        btrfs_writepages
  |A| lock_extent_bits                    extent_write_cache_pages
                                      |B|   lock_page
                                              __extent_writepage
        ...                                     writepage_delalloc
                                                  find_lock_delalloc_range
                                      |B|           lock_extent_bits
  find_or_create_page
    pagecache_get_page
  |A|   lock_page
This leads to an ABBA pattern deadlock. To fix it,
o we just change it to an AABB pattern, which means to @unlock_extent_bits()
  before we @lock_page(); this way @extent_read_full_page_nolock() is no
  longer called in a locked context, so change it back to
  @extent_read_full_page() to regain protection.
o Since we @unlock_extent_bits() earlier, the extent may no longer point at
  the physical block we want by the time we reach @write_page_nocow(), so we
  have to check it before writing.
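For illustration, a small user-space sketch of the locking change, with
hypothetical pthread locks standing in for the extent lock (A) and the page
lock (B): the ABBA version holds A while acquiring B and can deadlock against
a thread that takes B then A; the AABB version drops A first and re-checks
state afterwards.

    #include <pthread.h>

    static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* "extent" lock */
    static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* "page" lock   */

    /* Deadlock-prone ABBA version: holds A while acquiring B. */
    static void copy_nocow_abba(void)
    {
            pthread_mutex_lock(&lock_a);
            pthread_mutex_lock(&lock_b);  /* may deadlock against a B->A taker */
            /* ... work under both locks ... */
            pthread_mutex_unlock(&lock_b);
            pthread_mutex_unlock(&lock_a);
    }

    /* Fixed AABB version: never holds both locks at once. */
    static void copy_nocow_aabb(void)
    {
            pthread_mutex_lock(&lock_a);
            /* ... record what we need while A is held ... */
            pthread_mutex_unlock(&lock_a);

            pthread_mutex_lock(&lock_b);
            /* state may have changed while unlocked: re-check before writing */
            pthread_mutex_unlock(&lock_b);
    }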
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Replacing an xattr consists of doing a lookup for its existing value, deleting
the current value from the respective leaf, releasing the search path and then
finally inserting the new value. This leaves a time window where readers
(getxattr, listxattrs) won't see any value for the xattr. Xattrs are used to
store ACLs, so this has security implications.
This change also fixes 2 other existing issues:
*) Deleting the old xattr value without verifying first if the new xattr will
   fit in the existing leaf item (in case multiple xattrs are packed in the
   same item due to name hash collision);
*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't
   exist but we have an existing item that packs multiple xattrs with
   the same name hash as the input xattr. In this case we should return ENOSPC.
A test case for xfstests follows soon.
Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace
implementation.
Reported-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
We try to allocate an extent state structure before acquiring the extent
state tree's spinlock, as we might need a new one later and we want to avoid
doing an atomic allocation later while holding the tree's spinlock. However
we returned -ENOMEM if that initial non-atomic allocation failed, which is
a bit excessive since we might end up not needing the pre-allocated extent
state at all - for the case where the tree doesn't have any extent state
that covers the input range and extends beyond it (and would therefore need
to be split). Therefore don't return -ENOMEM if that pre-allocation fails.
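A generic user-space sketch of the pattern being adjusted (illustrative names
only): pre-allocate outside the lock, treat a failed pre-allocation as
non-fatal, and only report an error if the object actually turns out to be
needed.

    #include <pthread.h>
    #include <stdlib.h>

    struct extent_state { int dummy; };

    static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

    static int clear_range(int need_split)
    {
            /* best-effort pre-allocation; do NOT fail here on NULL */
            struct extent_state *prealloc = malloc(sizeof(*prealloc));

            pthread_mutex_lock(&tree_lock);
            if (need_split) {
                    if (!prealloc) {
                            /* only now is the allocation actually required */
                            pthread_mutex_unlock(&tree_lock);
                            return -1;  /* or retry with an atomic allocation */
                    }
                    /* ... use prealloc to split an existing state ... */
                    prealloc = NULL;
            }
            pthread_mutex_unlock(&tree_lock);

            free(prealloc);  /* unused pre-allocation, if any */
            return 0;
    }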
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Our gluster boxes get several thousand statfs() calls per second, which begins
to suck hardcore with all of the lock contention on the chunk mutex and dev list
mutex. We don't really need to hold these locks; if we get transient
weirdness from statfs() because of the chunk allocator, we don't care, so remove
this locking.
We still need the dev_list lock if you mount with -o alloc_start however, which
is a good argument for nuking that thing from orbit, but that's a patch for
another day. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Our gluster boxes were spending lots of time in statfs because our fs'es are
huge. The problem is that statfs loops through all of the block groups looking
for read-only block groups, and when you have several terabytes' worth of data
that ends up being a lot of block groups. Move the read-only block groups onto
a read-only list and only process that list in
btrfs_account_ro_block_groups_free_space to reduce the amount of churn. Thanks,
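A plain-C sketch of the data-structure change (hypothetical types): read-only
block groups go onto their own list when they are marked read-only, so the
statfs-style accounting only has to walk that short list.

    #include <stdint.h>
    #include <stddef.h>

    struct block_group {
            uint64_t free_bytes;
            int read_only;
            struct block_group *next_all; /* list of all block groups     */
            struct block_group *next_ro;  /* list of read-only groups only */
    };

    struct fs_info {
            struct block_group *all_groups;
            struct block_group *ro_groups;
    };

    /* called when a block group becomes read-only */
    static void mark_read_only(struct fs_info *fs, struct block_group *bg)
    {
            bg->read_only = 1;
            bg->next_ro = fs->ro_groups;
            fs->ro_groups = bg;
    }

    /* statfs path: O(number of RO groups) instead of O(all groups) */
    static uint64_t ro_free_space(const struct fs_info *fs)
    {
            uint64_t total = 0;

            for (const struct block_group *bg = fs->ro_groups; bg; bg = bg->next_ro)
                    total += bg->free_bytes;
            return total;
    }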
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Fix copy & paste errors in some messages and add a few more missing macro
accessors.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
The xfstest btrfs/014, which tests the balance operation, caused the
check_int module to complain that known blocks changed their physical
location. Since this is not an error in this case, only print such
messages if verbose mode is enabled.
Reported-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Tested-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
The xfstest btrfs/014, which tests the balance operation, caused issues with
the check_int module. The attempt was made to use btrfs_map_block() to
find the physical location for a written block. However, this was not
needed at all, since the location of the written block was already known
because a hook to submit_bio() was the reason for entering the check_int
module. Additionally, after a block relocation it happened that
btrfs_map_block() failed, causing misleading error messages afterwards.
This patch changes the check_int module to use the known information of
the physical location from the bio.
Reported-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Tested-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
We try to allocate an extent state before acquiring the tree's spinlock
just in case we end up needing to split an existing extent state into two.
If that allocation failed, we would return -ENOMEM.
However, our only caller (the transaction/log commit code) passes in
an extent state that was cached from a call to find_first_extent_bit() and
that has a very high chance to match exactly the input range (always true
for a transaction commit and very often, but not always, true for a log
commit) - in this case we end up not needing that initial extent state used
for an eventual split at all. Therefore just don't return -ENOMEM if we
can't allocate the temporary extent state, since we might not need it
at all, and if we do end up needing one, we'll allocate it later anyway.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Right now the only caller of find_first_extent_bit() that is interested
in caching extent states (transaction or log commit) never gets an extent
state cached. This is because find_first_extent_bit() only caches states
that have at least one of the flags EXTENT_IOBITS or EXTENT_BOUNDARY, and
the transaction/log commit caller always passes a tree that never has
extent states with any of those flags (they can only have one of the
following flags: EXTENT_DIRTY, EXTENT_NEW or EXTENT_NEED_WAIT).
This change, together with the following one in the patch series (titled
"Btrfs: avoid returning -ENOMEM in convert_extent_bit() too early"), will
significantly reduce the chances of calls to convert_extent_bit() failing
with -ENOMEM when called from the transaction/log commit code.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
When committing a transaction or a log, we look for btree extents that
need to be durably persisted by searching for ranges in an io tree that
have some bits set (EXTENT_DIRTY or EXTENT_NEW). We then attempt to clear
those bits and set the EXTENT_NEED_WAIT bit, with calls to the function
convert_extent_bit, and then start writeback for the extents.
That function however can return an error (at the moment only -ENOMEM
is possible, especially when it does GFP_ATOMIC allocation requests
through alloc_extent_state_atomic) - that means the ranges didn't get
the EXTENT_NEED_WAIT bit set (or at least not for the whole range),
which in turn means a call to btrfs_wait_marked_extents() won't find
those ranges for which we started writeback, causing a transaction
commit or a log commit to persist a new superblock without waiting
for the writeback of extents in that range to finish first.
Therefore if a crash happens after persisting the new superblock and
before writeback finishes, we have a superblock pointing to roots that
weren't fully persisted or roots that point to nodes or leaves that weren't
fully persisted, causing all sorts of unexpected/bad behaviour as we end up
reading garbage from disk or the content of some node/leaf from a past
generation that got cowed or deleted and is no longer valid (for this latter
case we end up getting error messages like "parent transid verify failed on
X wanted Y found Z" when reading btree nodes/leaves from disk).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Device replace could fail due to another running scrub process or any
other errors btrfs_scrub_dev() may hit, but this failure doesn't get
returned to userspace.
The following steps can reproduce this issue:
  mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
  mount /dev/sdb1 /mnt/btrfs
  while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
  btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
  # if this replace succeeded, do the following and repeat until
  # you see this log in dmesg
  #   BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
  # btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs
  # once you see the error log in dmesg, check the return value of
  # the replace
  echo $?
Introduce a new dev replace result
BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS
to catch -EINPROGRESS explicitly and return other errors directly to
userspace.
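A rough sketch of the error translation (illustrative only; the numeric value
shown is made up, and only the BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS
name comes from the patch):

    #include <errno.h>

    #define DEV_REPLACE_RESULT_NO_ERROR          0
    #define DEV_REPLACE_RESULT_SCRUB_INPROGRESS  4  /* hypothetical value */

    /* translate the scrub return code into an ioctl result for userspace */
    static int map_scrub_result(int scrub_ret, unsigned long *result)
    {
            if (scrub_ret == -EINPROGRESS) {
                    /* another scrub is running: report it via the result field */
                    *result = DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
                    return 0;
            }
            *result = DEV_REPLACE_RESULT_NO_ERROR;
            return scrub_ret;  /* any other error goes straight back */
    }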
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
The size of @btrfsic_state is more than 2M, so allocating it with kzalloc()
is very likely to fail. See the following message:
[91428.902148] Call Trace:
[<ffffffff816f6e0f>] dump_stack+0x4d/0x66
[<ffffffff811b1c7f>] warn_alloc_failed+0xff/0x170
[<ffffffff811b66e1>] __alloc_pages_nodemask+0x951/0xc30
[<ffffffff811fd9da>] alloc_pages_current+0x11a/0x1f0
[<ffffffff811b1e0b>] ? alloc_kmem_pages+0x3b/0xf0
[<ffffffff811b1e0b>] alloc_kmem_pages+0x3b/0xf0
[<ffffffff811d1018>] kmalloc_order+0x18/0x50
[<ffffffff811d1074>] kmalloc_order_trace+0x24/0x140
[<ffffffffa06c097b>] btrfsic_mount+0x8b/0xae0 [btrfs]
[<ffffffff810af555>] ? check_preempt_curr+0x85/0xa0
[<ffffffff810b2de3>] ? try_to_wake_up+0x103/0x430
[<ffffffffa063d200>] open_ctree+0x1bd0/0x2130 [btrfs]
[<ffffffffa060fdde>] btrfs_mount+0x62e/0x8b0 [btrfs]
[<ffffffff811fd9da>] ? alloc_pages_current+0x11a/0x1f0
[<ffffffff811b0a5e>] ? __get_free_pages+0xe/0x50
[<ffffffff81230429>] mount_fs+0x39/0x1b0
[<ffffffff812509fb>] vfs_kern_mount+0x6b/0x150
[<ffffffff812537fb>] do_mount+0x27b/0xc30
[<ffffffff811b0a5e>] ? __get_free_pages+0xe/0x50
[<ffffffff812544f6>] SyS_mount+0x96/0xf0
[<ffffffff81701970>] system_call_fastpath+0x16/0x1b
Since we are allocating memory for a hash table array, it is still
preferable to get contiguous pages.
Fix this problem by first trying kzalloc() and, if that fails,
using vzalloc() instead.
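The fallback is essentially the following kernel-style sketch (assumptions: a
3.18-era kernel with <linux/slab.h> and <linux/vmalloc.h>; kvfree() frees
either kind of allocation, and later kernels added kvzalloc() for exactly
this pattern):

    static void *state_alloc_sketch(size_t size)
    {
            void *p;

            /* prefer physically contiguous memory, but don't warn on failure */
            p = kzalloc(size, GFP_KERNEL | __GFP_NOWARN);
            if (!p)
                    p = vzalloc(size);  /* fall back to virtually contiguous */
            return p;
    }

    static void state_free_sketch(void *p)
    {
            kvfree(p);  /* handles both kzalloc()'d and vzalloc()'d memory */
    }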
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
If cow_file_range_inline() failed when called from compress_file_range(),
we were tagging the locked page for writeback, ending its writeback and
unlocking it, but not marking it with an error nor setting AS_EIO in the
inode's mapping flags.
This made it impossible for a caller of filemap_fdatawrite_range (writepages)
or filemap_fdatawait_range() to know that an error happened. And the return
value of compress_file_range() is useless because it's returned to a workqueue
task and not to the task calling filemap_fdatawrite_range (writepages).
This change applies on top of the previous patchset starting at the patch
titled:
"[1/5] Btrfs: set page and mapping error on compressed write failure"
Which changed extent_clear_unlock_delalloc() to use SetPageError and
mapping_set_error().
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
This avoids duplicating the double filemap_fdatawrite_range() call for
inodes with async extents (compressed writes) in so many places.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
For compressed writes, after doing the first filemap_fdatawrite_range() we
don't get the pages tagged for writeback immediately. Instead we create
a workqueue task, which is run by another kthread, and keep the pages locked.
That other kthread compresses data, creates the respective ordered extent/s,
tags the pages for writeback and unlocks them. Therefore we need a second
call to filemap_fdatawrite_range() if we have compressed writes, as this
second call will wait for the pages to become unlocked, see that they became
tagged for writeback and finally wait for the writeback to finish.
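Sketched as a kernel-style helper (the name and the boolean parameter are
illustrative, not the actual btrfs helper):

    static int fdatawrite_range_sketch(struct inode *inode, loff_t start,
                                       loff_t end, bool may_have_async_extents)
    {
            int ret;

            ret = filemap_fdatawrite_range(inode->i_mapping, start, end);
            /* a second pass catches pages the compression worker tagged for
             * writeback only after the first pass had already returned */
            if (!ret && may_have_async_extents)
                    ret = filemap_fdatawrite_range(inode->i_mapping, start, end);
            return ret;
    }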
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Its return value is useless; its single caller ignores it and can't do
anything with it anyway, since it's a workqueue task and not the task
calling filemap_fdatawrite_range (writepages) nor filemap_fdatawait_range().
Failure is communicated to such functions via the start and end of writeback,
with the respective pages tagged with an error and the AS_EIO flag set in the
inode's mapping.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Steps to reproduce:
# mkfs.btrfs -f /dev/sdb
# mount -t btrfs /dev/sdb /mnt -o compress=lzo
# dd if=/dev/zero of=/mnt/data bs=$((33*4096)) count=1
After the previous steps, the inode will be detected as having a bad
compression ratio, and the NOCOMPRESS flag will be set for that inode.
The reason is that compression has a maximum limit of pages per round (128K),
so a 132K write will be split into two writes (128K + 4K). This bug is a
leftover from commit 68bb462d42a (Btrfs: don't compress for a small write).
Fix this problem by checking before each compression round: if it is a
small write (<= blocksize), we bail out and fall into no-compression directly.
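A tiny sketch of the check (hypothetical helper name): the tail piece of such
a split write is not evidence of a bad compression ratio, so anything not
larger than one block simply skips compression for that round.

    #include <stdbool.h>
    #include <stdint.h>

    /* decide, per compression round, whether a range is worth compressing */
    static bool worth_compressing(uint64_t start, uint64_t end, uint32_t blocksize)
    {
            uint64_t len = end - start + 1;

            /* e.g. the trailing 4K of a 132K write: too small, don't compress */
            return len > blocksize;
    }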
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Our compressed bio write end callback was essentially ignoring the error
parameter. When a write error happens, it must pass a value of 0 to the
inode's write_page_end_io_hook callback, SetPageError on the respective
pages and set AS_EIO in the inode's mapping flags, so that a call to
filemap_fdatawait_range() / filemap_fdatawait() can find out that errors
happened (we surely don't want silent failures on fsync for example).
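The error-propagation pattern boils down to something like this kernel-style
sketch (not the actual btrfs callback; it only shows which generic
page/mapping helpers from <linux/pagemap.h> are involved):

    static void compressed_write_endio_sketch(struct page **pages, int nr_pages,
                                              struct address_space *mapping,
                                              int err)
    {
            int i;

            for (i = 0; i < nr_pages; i++) {
                    struct page *page = pages[i];

                    if (err) {
                            SetPageError(page);
                            mapping_set_error(mapping, err);  /* sets AS_EIO */
                    }
                    end_page_writeback(page);
            }
    }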
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Its return value is completely ignored by its single caller and it's
useless anyway, since errors are indicated through SetPageError and
the bit AS_EIO set in the flags of the inode's mapping. The caller
can't do anything with the value, as it's invoked from a workqueue
task and not by the task calling filemap_fdatawrite_range (which calls
the writepages address space callback, which in turn calls the inode's
fill_delalloc callback).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
If we had an error when processing one of the async extents from our list,
we were not processing the remaining async extents, meaning we would leak
those async_extent structs, never release the pages with the compressed
data and never unlock and clear the dirty flag from the inode's pages (those
that correspond to the uncompressed content).
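A generic user-space sketch of the fix (hypothetical types): remember the
first error but keep walking the list so every remaining entry is still
released.

    #include <stdlib.h>

    struct async_extent {
            struct async_extent *next;
            void **pages;
            int nr_pages;
    };

    static void free_async_extent(struct async_extent *ae)
    {
            for (int i = 0; i < ae->nr_pages; i++)
                    free(ae->pages[i]);
            free(ae->pages);
            free(ae);
    }

    static int process_all(struct async_extent *list,
                           int (*submit)(struct async_extent *))
    {
            int err = 0;

            while (list) {
                    struct async_extent *ae = list;

                    list = ae->next;
                    if (!err)
                            err = submit(ae);  /* remember the first error... */
                    free_async_extent(ae);     /* ...but always clean up      */
            }
            return err;
    }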
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
In inode.c:submit_compressed_extents(), if we fail before calling
btrfs_submit_compressed_write(), or when that function fails, we
were freeing the async_extent structure without releasing its pages
and freeing the pages array.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
In inode.c:submit_compressed_extents(), before calling btrfs_submit_compressed_write()
we start writeback for all pages, clear their dirty flag, unlock them, etc, but if
btrfs_submit_compressed_write() fails (at the moment it can only fail with -ENOMEM),
we never end the writeback on the pages, so any filemap_fdatawait_range() call will
hang forever. We were also not calling the writepage end io hook, which means the
corresponding ordered extent will never complete and all its waiters will block
forever, such as a full fsync (via btrfs_wait_ordered_range()).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
If we fail in submit_compressed_extents() before calling btrfs_submit_compressed_write(),
we start and end the writeback for the pages (clear their dirty flag, unlock them, etc)
but we don't tag the pages, nor the inode's mapping, with an error. This makes it
impossible for a caller of filemap_fdatawait_range() (fsync, or transaction commit,
for example) to know that there was an error.
Note that the return value of submit_compressed_extents() is useless, as that function
is executed by a workqueue task and not directly by the fill_delalloc callback. This
means the writepage/s callbacks of the inode's address space operations don't get that
return value.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another small set of fixes:
- some DT compatible typo fixes
- irq setup fix dealing with irq storms on orion
- i2c quirk generalization for mvebu
- a handful of smaller fixes for OMAP
- a couple of added file patterns for OMAP entries in MAINTAINERS"
* tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: at91/dt: Fix sama5d3x typos
pinctrl: dra: dt-bindings: Fix output pull up/down
MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
MAINTAINERS: add more files under OMAP SUPPORT
ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
ARM: dts: am335x-evm: Fix 5th NAND partition's name
ARM: orion: Fix for certain sequence of request_irq can cause irq storm
ARM: mvebu: armada xp: Generalize use of i2c quirk
|
|
Pull sparc fixes from David Miller:
1) Fix NULL oops in Schizo PCI controller error handler.
2) Fix race between xchg and other operations on 32-bit sparc, from
Andreas Larsson.
3) swab*() helpers need a dummy memory input operand to show data flow
on 64-bit sparc.
4) Fix RCU warnings due to missing irq_{enter,exit}() around
generic_smp_call_function*() calls.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix constraints on swab helpers.
sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks
sparc64: Do irq_{enter,exit}() around generic_smp_call_function*().
sparc64: Fix crashes in schizo_pcierr_intr_other().
|
|
Pull md bugfix from Neil Brown:
"One fix for md for 3.18.
This fixes a regression introduced in 3.13"
* tag 'md/3.18-fix' of git://neil.brown.name/md:
md: Always set RECOVERY_NEEDED when clearing RECOVERY_FROZEN
|
|
Some DT files had a typo: a missing "5" in the first sama5d3x compatible string.
Signed-off-by: Peter Rosin <peda@axentia.se>
[nicolas.ferre@atmel.com: modify commit log]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge "omap fixes against v3.18-rc4" from Tony Lindgren:
A few omap fixes for hangs and wrong pinctrl defines, and an update to the
MAINTAINERS file to avoid missing PMIC and SoC related patches:
- Fix random hangs on am437x because of incorrect default
value for the DDR regulator
- Fix wrong partition name for NAND on am335x-evm
- Fix wrong pinctrl defines for dra7xx
- Update maintainers entries for PMICs and SoCs
* tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
pinctrl: dra: dt-bindings: Fix output pull up/down
MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
MAINTAINERS: add more files under OMAP SUPPORT
ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
ARM: dts: am335x-evm: Fix 5th NAND partition's name
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Merge "mvebu fixes for v3.18" from Jason Cooper:
 - Armada XP
    - Generalize i2c quirk
 - orion
    - Fix irq storm caused by specific sequence of request_irq
* tag 'mvebu-fixes-3.18' of git://git.infradead.org/linux-mvebu:
ARM: orion: Fix for certain sequence of request_irq can cause irq storm
ARM: mvebu: armada xp: Generalize use of i2c quirk
|
|
md_check_recovery will skip any recovery and also clear
MD_RECOVERY_NEEDED if MD_RECOVERY_FROZEN is set.
So when we clear _FROZEN, we must set _NEEDED and ensure that
md_check_recovery gets run.
Otherwise we could miss out on something that is needed.
In particular, this can make it impossible to remove a
failed device from an array if the 'recovery-needed' processing
didn't happen.
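In code terms the rule is roughly the following kernel-style sketch (it uses
the real md flag names but is not a verbatim copy of the patch):

    static void unfreeze_recovery_sketch(struct mddev *mddev)
    {
            clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
            /* whenever _FROZEN is cleared, _NEEDED must be set and the md
             * thread woken so md_check_recovery() actually runs */
            set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
            md_wakeup_thread(mddev->thread);
    }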
Suitable for stable kernels since 3.13.
Cc: stable@vger.kernel.org (3.13+)
Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Fixes: 30b8feb730f9b9b3c5de02580897da03f59b6b16
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
We are reading the memory location, so we have to have a memory
constraint in there purely for the sake of showing the data flow
to the compiler.
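As a generic illustration (x86-64 GNU C rather than the sparc code), the dummy
"m" input operand is what tells the compiler that the asm actually reads the
pointed-to data, so stores to it cannot be reordered past or dropped before
the asm:

    #include <stdint.h>
    #include <stdio.h>

    static inline uint64_t asm_load64(const uint64_t *p)
    {
            uint64_t v;

            asm volatile("movq %1, %0"
                         : "=r" (v)
                         : "m" (*p));  /* shows the data flow from *p */
            return v;
    }

    int main(void)
    {
            uint64_t x = 0x1122334455667788ULL;

            printf("%llx\n", (unsigned long long)asm_load64(&x));
            return 0;
    }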
Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of six fixes and a MAINTAINER update.
The fixes are two multipath (one in Test Unit Ready handling for the
path checkers and one in the section of code that sends a start unit
after failover; both of these were perturbed by the scsi-mq update), a
CD-ROM door locking fix that was likewise introduced by scsi-mq and
three driver fixes for a previous code update in cxgb4i, megaraid_sas
and bnx2fc"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
bnx2fc: fix tgt spinlock locking
megaraid_sas: fix bug in handling return value of pci_enable_msix_range()
cxgb4i: send abort_rpl correctly
cxgbi: add maintainer for cxgb3i/cxgb4i
scsi: TUR path is down after adapter gets reset with multipath
scsi: call device handler for failed TUR command
scsi: only re-lock door after EH on devices that were reset
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Microcode fixes, a Xen fix and a KASLR boot loading fix with certain
memory layouts"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode, AMD: Fix ucode patch stashing on 32-bit
x86/core, x86/xen/smp: Use 'die_complete' completion when taking CPU down
x86, microcode: Fix accessing dis_ucode_ldr on 32-bit
x86, kaslr: Prevent .bss from overlaping initrd
x86, microcode, AMD: Fix early ucode loading on 32-bit
|
|
Al Viro pointed out that the x86-64 csum_partial_copy_from_user() is
somewhat confused about what it should do on errors, notably it mostly
clears the uncopied end result buffer, but misses that for the initial
alignment case.
All users should check for errors, so it's dubious whether the clearing
is even necessary, and Al also points out that we should probably clean
up the calling conventions, but regardless of any future changes to this
function, the fact that it is inconsistent is just annoying.
So make the __get_user() failure path use the same error exit as all the
other errors do.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull ARM fixes from Russell King:
"Two fixes this time, one to ensure that the kuser helper option
depends on MMU as they aren't available for noMMU targets (and if the
option is selected, we end up oopsing.)
The second fix plugs a corner case with the decompressor, ensuring
that the instruction stream can see the relocated code in every case
on ARMv7 CPUs"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8198/1: make kuser helpers depend on MMU
ARM: 8191/1: decompressor: ensure I-side picks up relocated code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Changes include:
- wire up the bpf syscall
- remove CONFIG_64BIT usage from some userspace-exported header files
- use compat functions for msgctl, shmat, shmctl and semtimedop
syscalls"
* 'parisc-3.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Avoid using CONFIG_64BIT in userspace exported headers
parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls
parisc: Use BUILD_BUG() instead of undefined functions
parisc: Wire up bpf syscall
|
|
Pull power supply updates from Sebastian Reichel:
"Power supply and reset changes for the v3.18-rc:
- misc. charger-manager fixes
- year 2038 fix in ab8500_fg
- fix error handling of bq2415x_charger"
* tag 'for-v3.18-rc' of git://git.infradead.org/battery-2.6:
power: charger-manager: Fix accessing invalidated power supply after charger unbind
power: charger-manager: Fix accessing invalidated power supply after fuel gauge unbind
power: charger-manager: Avoid recursive thermal get_temp call
power_supply: Add no_thermal property to prevent recursive get_temp calls
power: bq2415x_charger: Fix memory leak on DTS parsing error
power: bq2415x_charger: Properly handle ENODEV from power_supply_get_by_phandle
power: ab8500_fg.c: use 64-bit time types
|
|
Pull drm fixes from Dave Airlie:
- exynos: infinite loop regressions fixed
- i915: one regression
- radeon: one race condition on monitor probing
- nouveau: two regressions
- tegra: one vblank regression fix
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/tegra: dc: Add missing call to drm_vblank_on()
drm/nouveau/nv50/disp: Fix modeset on G94
drm/gk20a/fb: fix setting of large page size bit
drm/radeon: add locking around atombios scratch space usage
drm/i915: Fix obj->map_and_fenceable across tiling changes
drm/exynos: fix possible infinite loop issue
drm/exynos: g2d: fix null pointer dereference
drm/exynos: resolve infinite loop issue on non multi-platform
drm/exynos: resolve infinite loop issue on multi-platform
|
|
Sasha Levin reports:
"gcc5 changes the default standard to c11, which makes kernel build
unhappy
Explicitly define the kernel standard to be gnu89 which should keep
everything working exactly like it was before gcc5"
There are multiple small issues with the new default, but the biggest
issue seems to be that the old - and very useful - GNU extension to
allow a cast in front of an initializer has gone away.
Patch updated by Kirill:
"I'm pretty sure all gcc versions you can build kernel with supports
-std=gnu89. cc-option is redunrant.
We also need to adjust HOSTCFLAGS otherwise allmodconfig fails for me"
Note by Andrew Pinski:
"Yes it was reported and both problems relating to this extension has
been added to gnu99 and gnu11. Though there are other issues with the
kernel dealing with extern inline have different semantics between
gnu89 and gnu99/11"
End result: we may be able to move up to a newer stdc model eventually,
but right now the newer models have some annoying deficiencies, so the
traditional "gnu89" model ends up being the preferred one.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- stable patches to fix NFSv4.x delegation reclaim error paths
- fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2
features
- fix a use-after-free problem with pNFS block layouts
- fix a memory leak in the pNFS files O_DIRECT code
- replace an intrusive and Oops-prone performance fix in the NFSv4
atomic open code with a safer one-line version and revert the two
original patches"
* tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
NFS: Don't try to reclaim delegation open state if recovery failed
NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked
NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE
NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
NFS: SEEK is an NFS v4.2 feature
nfs: Fix use of uninitialized variable in nfs_getattr()
nfs: Remove bogus assignment
nfs: remove spurious WARN_ON_ONCE in write path
pnfs/blocklayout: serialize GETDEVICEINFO calls
nfs: fix pnfs direct write memory leak
Revert "NFS: nfs4_do_open should add negative results to the dcache."
Revert "NFS: remove BUG possibility in nfs4_open_and_get_state"
NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:
"Mostly small fixups to PS/2 tochpad drivers (ALPS, Elantech,
Synaptics) to better deal with specific hardware"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - update the documentation
Input: elantech - provide a sysfs knob for crc_enabled
Input: elantech - report the middle button of the touchpad
Input: alps - ignore bad data on Dell Latitudes E6440 and E7440
Input: alps - allow up to 2 invalid packets without resetting device
Input: alps - ignore potential bare packets when device is out of sync
Input: elantech - fix crc_enabled for Fujitsu H730
Input: elantech - use elantech_report_trackpoint for hardware v4 too
Input: twl4030-pwrbutton - ensure a wakeup event is recorded.
Input: synaptics - add min/max quirk for Lenovo T440s
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- fix EFI stub cache maintenance causing aborts during boot on certain
platforms
- handle byte stores in __clear_user without panicking
- fix race condition in aarch64_insn_patch_text_sync() (instruction
patching)
- Couple of type fixes
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ARCH_PFN_OFFSET should be unsigned long
Correct the race condition in aarch64_insn_patch_text_sync()
arm64: __clear_user: handle exceptions on strb
arm64: Fix data type for physical address
arm64: efi: Fix stub cache maintenance
|
|
git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform drivers fixlets from Darren Hart:
"Just two patches to remove hp_accel events from the keyboard bus
stream via an i8042 filter"
* tag 'platform-drivers-x86-v3.18-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
platform: hp_accel: Add SERIO_I8042 as a dependency since it now includes i8042.h/serio.h
platform: hp_accel: add a i8042 filter to remove HPQ6000 data from kb bus stream
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"The most notable is the revert of lock splitting optimization in ahci.
This also made the IRQ handling threaded even when there's only one
IRQ in use. The conversion missed IRQF_SHARED, leading to screaming
IRQs problem in some cases and the threaded IRQ handling showed
performance regression in some LKP test cases. The changes are
reverted for now. It'll probably be retried once threaded IRQ
handling is removed from ahci.
Other than that, there's one fix for ahci and several patches adding
device IDs"
* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: fix AHCI parameters not taken into account
ata: sata_rcar: Add r8a7793 device support
ahci: Add Device IDs for Intel Sunrise Point PCH
ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks
Revert "AHCI: Optimize single IRQ interrupt processing"
Revert "AHCI: Do not acquire ata_host::lock from single IRQ handler"
ata: sata_rcar: Disable DIPM mode for r8a7790 ES1
|
|
Pull block layer fixes from Jens Axboe:
"Four small fixes that should be merged for the current 3.18-rc series.
This pull request contains:
- a minor bugfix for computation of best IO priority given two
merging requests. From Jan Kara.
- the final (final) merge count issue that has been plaguing
virtio-blk. From Ming Lei.
- enable parallel reinit notify for blk-mq queues, to combine the
cost of an RCU grace period across lots of devices. From Tejun
Heo.
- an error handling fix for the SCSI_IOCTL_SEND_COMMAND ioctl. From
Tony Battersby"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: blk-merge: fix blk_recount_segments()
scsi: Fix more error handling in SCSI_IOCTL_SEND_COMMAND
blk-mq: make mq_queue_reinit_notify() freeze queues in parallel
block: Fix computation of merged request priority
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are three regression fixes, two recent (generic power domains,
suspend-to-idle) and one older (cpufreq), an ACPI blacklist entry for
one more machine having problems with Windows 8 compatibility, a minor
cpufreq driver fix (cpufreq-dt) and a fixup for new callback
definitions (generic power domains).
Specifics:
- Fix a crash in the suspend-to-idle code path introduced by a recent
commit that forgot to check a pointer against NULL before
dereferencing it (Dmitry Eremin-Solenikov).
- Fix a boot crash on Exynos5 introduced by a recent commit making
that platform use generic Device Tree bindings for power domains
which exposed a weakness in the generic power domains framework
leading to that crash (Ulf Hansson).
- Fix a crash during system resume on systems where cpufreq depends
on Operation Performance Points (OPP) for functionality, but
CONFIG_OPP is not set. This leads the cpufreq driver registration
to fail, but the resume code attempts to restore the pre-suspend
cpufreq configuration (which does not exist) nevertheless and
crashes. From Geert Uytterhoeven.
- Add a new ACPI blacklist entry for Dell Vostro 3546 that has
problems if it is reported as Windows 8 compatible to the BIOS
(Adam Lee).
- Fix swapped arguments in an error message in the cpufreq-dt driver
(Abhilash Kesavan).
- Fix up the prototypes of new callbacks in struct generic_pm_domain
to make them more useful. Users of those callbacks will be added
in 3.19 and it's better for them to be based on the correct struct
definition in mainline from the start. From Ulf Hansson and Kevin
Hilman"
* tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Domains: Fix initial default state of the need_restore flag
PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set
PM / Domains: Change prototype for the attach and detach callbacks
cpufreq: Avoid crash in resume on SMP without OPP
cpufreq: cpufreq-dt: Fix arguments in clock failure error message
ACPI / blacklist: blacklist Win8 OSI for Dell Vostro 3546
|