commit cf5d318865e25f887d49a0c6083bbc6dcd1905b1 upstream.
When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, all of the
VM's vcpus should really be turned off, adhering to the suggestions in
the PSCI spec; it is also the sane thing to do.
Also, clarify the behavior and expectations for exits to user space with
the KVM_EXIT_SYSTEM_EVENT case.
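A sketch of that flow (flag and helper names are illustrative of this
era's arm/arm64 KVM code, not a verbatim quote of the patch):
	/* Forward a PSCI SYSTEM_OFF/SYSTEM_RESET to user space. */
	static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
	{
		int i;
		struct kvm_vcpu *tmp;

		/* Turn off every vcpu of the VM, as the PSCI spec suggests. */
		kvm_for_each_vcpu(i, tmp, vcpu->kvm)
			tmp->arch.pause = true;		/* flag name illustrative */

		memset(&vcpu->run->system_event, 0,
		       sizeof(vcpu->run->system_event));
		vcpu->run->system_event.type = type;	/* SHUTDOWN or RESET */
		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
	}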
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit b856a59141b1066d3c896a0d0231f84dabd040af upstream.
When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also
reset the HCR, because we now modify the HCR dynamically to
enable/disable trapping of guest accesses to the VM registers.
This is crucial for VM reboot to work, since otherwise we will not
perform the necessary cache maintenance operations when faulting in
pages with the guest MMU off.
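In sketch form (assuming the arm64 names; HCR_GUEST_FLAGS is the default
trap configuration):
	/* Restore the default HCR on KVM_ARM_VCPU_INIT so the dynamic
	 * trap bits are back in their initial state. */
	static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
	{
		vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
	}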
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 3ad8b3de526a76fbe9466b366059e4958957b88f upstream.
The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.
Implement the expected functionality and clarify the ABI.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 03f1d4c17edb31b41b14ca3a749ae38d2dd6639d upstream.
If a VCPU was originally started with power off (typically to be brought
up by PSCI in SMP configurations), there is no need to clear the
POWER_OFF flag in the kernel, as this flag is only tested during the
init ioctl itself.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 849260c72c6b8bd53850cb00b80027db3a273c2c upstream.
Readonly memslots are often used to implement emulation of ROMs and
NOR flashes, in which case the guest may legally map these regions as
uncached.
To deal with the incoherency associated with uncached guest mappings,
treat all readonly memslots as incoherent, and ensure that pages that
belong to regions tagged as such are flushed to DRAM before being passed
to the guest.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 840f4bfbe03f1ce94ade8fdf84e8cd925ef15a48 upstream.
To allow handling of incoherent memslots in a subsequent patch, this
patch adds a parameter 'ipa_uncached' to cache_coherent_guest_page()
so that we can instruct it to flush the page's contents to DRAM even
if the guest has caching globally enabled.
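A sketch of the resulting logic (the cache-enabled test and flush helper
are illustrative, not the exact upstream code):
	static void cache_coherent_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
					      unsigned long size,
					      bool ipa_uncached)
	{
		/* Nothing to do when the guest mapping is coherent. */
		if (vcpu_has_cache_enabled(vcpu) && !ipa_uncached)
			return;

		/* Clean the page to the point of coherency (DRAM). */
		flush_dcache_to_poc((void *)hva, size);	/* illustrative helper */
	}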
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
Register MIDR_EL1 is masked to get the variant and revision fields, then
compared against midr_range_min and midr_range_max when checking
whether a CPU is affected by any particular erratum. However, the
variant and revision fields in MIDR_EL1 are separated by 16 bits, so the
min and max of the MIDR range must be constructed accordingly; otherwise
the patch will not be applied when the variant field is non-zero.
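For illustration (macro names are assumptions in the spirit of the arm64
code, not the exact diff):
	/* MIDR_EL1 layout: revision in bits [3:0], variant in bits [23:20]. */
	#define MIDR_REVISION_MASK	0xf
	#define MIDR_VARIANT_SHIFT	20
	#define MIDR_VARIANT_MASK	(0xf << MIDR_VARIANT_SHIFT)

	/* Wrong: packs (variant=1, revision=2) as 0x12. */
	#define MIDR_RANGE_BAD(var, rev)	(((var) << 4) | (rev))

	/* Right: place the variant in its architectural position. */
	#define MIDR_RANGE_OK(var, rev)	(((var) << MIDR_VARIANT_SHIFT) | (rev))

	static bool midr_in_range(u32 midr, u32 range_min, u32 range_max)
	{
		midr &= MIDR_REVISION_MASK | MIDR_VARIANT_MASK;
		return midr >= range_min && midr <= range_max;
	}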
Acked-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Bo Yan <byan@nvidia.com>
[will: use MIDR_VARIANT_SHIFT to construct upper bound]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit 6d1966dfd6e0ad2f8aa4b664ae1a62e33abe1998)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
When running a compat (AArch32) userspace on Cortex-A53, a load at EL0
from a virtual address that matches the bottom 32 bits of the virtual
address used by a recent load at (AArch64) EL1 might return incorrect
data.
This patch works around the issue by writing to the contextidr_el1
register on the exception return path when returning to a 32-bit task.
This workaround is patched in at runtime based on the MIDR value of the
processor.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit 905e8c5dcaa147163672b06fe9dcb5abaacbc711)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
Not all of the errata we have workarounds for apply necessarily to all
SoCs, so people compiling a kernel for one very specific SoC may not
need to patch the kernel.
Introduce a new submenu in the "Platform selection" menu to allow
people to turn off certain bug workarounds if they are not affected. By
default all of them are enabled.
Normal users or distribution kernels shouldn't bother to deselect any
of them, since the alternatives framework will take care of patching
the workarounds in only if needed.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[will: moved kconfig menu under `Kernel Features']
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit c0a01b84b1fdbd98bff5bca5b201fe73fda7e9d9)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
The ARM erratum 832075 applies to certain revisions of Cortex-A57,
one of the workarounds is to change device loads to use
load-acquire semantics.
This is achieved using the alternatives framework.
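For illustration, a runtime-patched MMIO read might look like this
sketch (using the arm64 ALTERNATIVE() macro; the wrapper name is
illustrative):
	/* Patched from a plain load to a load-acquire when the affected
	 * Cortex-A57 revision is detected at runtime. */
	static inline u32 readl_patched(const volatile void __iomem *addr)
	{
		u32 val;

		asm volatile(ALTERNATIVE("ldr %w0, [%1]",	/* default */
					 "ldar %w0, [%1]",	/* workaround */
					 ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
			     : "=r" (val) : "r" (addr));
		return val;
	}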
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit 5afaa1fc1b320cec48affa7e6949f2493f875c12)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
The ARM errata 819472, 826319, 827319 and 824069 define the same
workaround for these hardware issues in certain Cortex-A53 parts.
Use the new alternatives framework and the CPU MIDR detection to
patch "cache clean" into "cache clean and invalidate" instructions if
an affected CPU is detected at runtime.
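A sketch of the patched sequence (the wrapper is illustrative; the
capability name follows the upstream convention):
	/* "clean" is patched into "clean + invalidate" on CPUs where the
	 * erratum capability bit was set during boot. */
	static inline void clean_dcache_line(const void *addr)
	{
		asm volatile(ALTERNATIVE("dc cvac, %0",		/* clean */
					 "dc civac, %0",	/* clean+inval */
					 ARM64_WORKAROUND_CLEAN_CACHE)
			     : : "r" (addr) : "memory");
	}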
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[will: add __maybe_unused to squash gcc warning]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit 301bcfac42897dbd1b0b3c1be49f24654a1bc49e)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
After each CPU has been started, we iterate through a list of
CPU features or bugs to detect CPUs which need (or could benefit
from) kernel code patches.
For each feature/bug there is a function which checks if that
particular CPU is affected. We will later provide some more generic
functions for common things like testing for certain MIDR ranges.
We do this for every CPU to cover big.LITTLE systems properly as
well.
If a certain feature/bug has been detected, the capability bit will
be set, so that later the call to apply_alternatives() will trigger
the actual code patching.
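In outline, the per-CPU check looks like this sketch (structure and
helper names approximate the upstream code):
	struct arm64_cpu_capabilities {
		const char *desc;	/* printed once when first detected */
		u16 capability;		/* bit to set in the capability bitmap */
		bool (*matches)(const struct arm64_cpu_capabilities *);
	};

	static void check_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
	{
		/* Called on every CPU, so big.LITTLE systems are covered. */
		for (; caps->matches; caps++)
			if (caps->matches(caps))
				cpus_set_cap(caps->capability);
	}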
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit e116a375423393cdb94714e90a96857005d58428)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
With a blatant copy of some x86 bits we introduce the alternative
runtime patching "framework" to arm64.
This is quite basic for now and we only provide the functions we need
at this time.
This is connected to the newly introduced feature bits.
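At its core the framework is a table of patch descriptors emitted next
to the original code; in sketch form (field names approximate the
upstream layout, the patch helper is illustrative):
	struct alt_instr {
		s32 orig_offset;	/* offset to original instruction(s) */
		s32 alt_offset;		/* offset to replacement instruction(s) */
		u16 cpufeature;		/* feature/bug bit gating the patch */
		u8  orig_len;		/* size of the original sequence */
		u8  alt_len;		/* size of the replacement sequence */
	};

	static void apply_alternatives(struct alt_instr *begin, struct alt_instr *end)
	{
		struct alt_instr *alt;

		for (alt = begin; alt < end; alt++) {
			if (!cpus_have_cap(alt->cpufeature))
				continue;
			/* copy_and_flush() stands in for the real patcher:
			 * memcpy the replacement over the original, then do
			 * the required I-cache maintenance. */
			copy_and_flush((void *)&alt->orig_offset + alt->orig_offset,
				       (void *)&alt->alt_offset + alt->alt_offset,
				       alt->alt_len);
		}
	}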
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit e039ee4ee3fcf174736f2cb0a2eed6cb908348a6)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
To take note of whether at least one CPU in the system needs a bug
workaround or would benefit from a code optimization, we create a new
bitmap to hold (artificial) feature bits.
Since elf_hwcap is part of the userland ABI, we leave it alone and
introduce a new data structure for this (along with some accessors).
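A sketch of the bitmap and its accessors (close in spirit to the patch):
	DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);	/* separate from elf_hwcap */

	static inline bool cpus_have_cap(unsigned int num)
	{
		if (num >= ARM64_NCAPS)
			return false;
		return test_bit(num, cpu_hwcaps);
	}

	static inline void cpus_set_cap(unsigned int num)
	{
		if (num < ARM64_NCAPS)
			set_bit(num, cpu_hwcaps);
	}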
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.18.y
(cherry picked from commit 930da09f5e50dd22fb0a8600388da8677d62d671)
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 80313b3078fcd2ca51970880d90757f05879a193 ]
The ASRock Q1900DC-ITX mainboard (Baytrail-D) hangs randomly in
both BIOS and UEFI mode while rebooting unless reboot=pci is
used. Add a quirk to reboot via the pci method.
The problem is very intermittent and hard to debug: it might reboot
just fine 40 times in a row, but fail half a dozen times the next day.
It seems to be slightly less common in BIOS CSM mode than native UEFI
(with the CSM disabled), but it does happen in either mode. Since I
started testing this patch in late January, rebooting has been 100%
reliable.
Most of the time it already hangs during POST, but occasionally it
might make it through the bootloader and the kernel might even start
booting, only to hang before the mode switch. The same symptoms occur
with grub-efi, gummiboot and grub-pc, as well as with (at least)
kernels 3.16-3.19 and 4.0-rc6 (I haven't tried kernels older than 3.16).
Upgrading to the most current mainboard firmware of the ASRock
Q1900DC-ITX, version 1.20, does not improve the situation.
( Searching the web seems to suggest that other Bay Trail-D mainboards
might be affected as well. )
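The quirk itself is a DMI match entry; a sketch (the match strings are
illustrative; set_pci_reboot() is the existing x86 callback):
	static const struct dmi_system_id reboot_dmi_table[] = {
		{	/* ASRock Q1900DC-ITX hangs on reboot unless reboot=pci */
			.callback = set_pci_reboot,
			.ident = "ASRock Q1900DC-ITX",
			.matches = {
				DMI_MATCH(DMI_BOARD_VENDOR, "ASRock"),
				DMI_MATCH(DMI_BOARD_NAME, "Q1900DC-ITX"),
			},
		},
		{ }
	};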
Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: <stable@vger.kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/20150330224427.0fb58e42@mir
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit fea559f303567e558bfab9c8ba4a2af5b309205a ]
Implement arch_irq_work_has_interrupt() for powerpc
Commit 9b01f5bf3 introduced a dependency on "IRQ work self-IPIs" for
full dynamic ticks to be enabled, by expecting architectures to
implement a suitable arch_irq_work_has_interrupt() routine.
Several arches have implemented this routine, including x86 (3010279f)
and arm (09f6edd4), but powerpc was omitted.
This patch implements this routine for powerpc.
The symptom, at boot (on powerpc systems) with "nohz_full=<CPU list>",
is that the following is displayed:
NO_HZ: Can't run full dynticks because arch doesn't support irq work self-IPIs
after this patch:
NO_HZ: Full dynticks CPUs: <CPU list>.
Tested against 3.19.
powerpc implements "IRQ work self-IPIs" by setting the decrementer to 1 in
arch_irq_work_raise(), which causes a decrementer exception on the next
timebase tick. We then handle the work in __timer_interrupt().
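Given that, the new routine only has to report that the self-IPI
exists; a sketch:
	/* powerpc can raise IRQ work via the decrementer, so say so. */
	static inline bool arch_irq_work_has_interrupt(void)
	{
		return true;
	}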
CC: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[mpe: Flesh out change log, fix ws & include guards, remove include of processor.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d52356e7f48e400ca258c6763a232a92fa82ff68 ]
Space allocated for the paca is based on nr_cpu_ids, but
pnv_alloc_idle_core_states() iterates over the paca with
cpu_nr_cores() * threads_per_core, which uses NR_CPUS. This causes
pnv_alloc_idle_core_states() to write over memory outside of the paca
array, which may later lead to various panics.
Fixes: 7cba160ad789 (powernv/cpuidle: Redesign idle states management)
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit fdc0074c5fc8c7adb8186cbb123fe2082d9bd05f ]
As the sunxi usb clocks all contain a reset controller, it is not
possible to build the sunxi clock driver without RESET_CONTROLLER
enabled. Doing so results in an undefined symbol error:
drivers/built-in.o: In function `sunxi_gates_clk_setup':
linux/drivers/clk/sunxi/clk-sunxi.c:1071: undefined reference to
`reset_controller_register'
This can happen when building a minimal kernel without PHY_SUN4I_USB.
The new A80 mmc clocks, which also use a reset control themselves,
make the dependency issue visible at compile time instead of link
time.
This patch makes ARCH_SUNXI select ARCH_HAS_RESET_CONTROLLER and
RESET_CONTROLLER.
Fixes: 559482d1f950 ("ARM: sunxi: Split the various SoCs support in Kconfig")
Cc: <stable@vger.kernel.org> # 3.16+
Reported-by: Lourens Rozema <ik@lourensrozema.nl>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e4140819dadc3624accac8294881bca8a3cba4ed ]
A malicious signal handler / restorer can DoS the system by fudging the
user regs saved on the stack, causing weird things such as sigreturn
returning to a user mode PC while the cpu state is still kernel mode.
Ensure that in the sigreturn path status32 always has the U bit set; any
other bogosity (garbage PC, etc.) will be taken care of by the normal
user mode exception mechanisms.
Reproducer signal handler:
void handle_sig(int signo, siginfo_t *info, void *context)
{
	ucontext_t *uc = context;
	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);

	regs->scratch.status32 = 0;
}
Before the fix, kernel would go off to weeds like below:
--------->8-----------
[ARCLinux]$ ./signal-test
Path: /signal-test
CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65
task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000
[ECR ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698
[EFA ]: 0x00000010
[BLINK ]: 0x2007c1ee
[ERET ]: 0x10698
[STAT32]: 0x00000000 : <--------
BTA: 0x00010680 SP: 0x5ffe7e48 FP: 0x00000000
LPS: 0x20003c6c LPE: 0x20003c70 LPC: 0x00000000
...
--------->8-----------
Reported-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 6914e1e3f63caa829431160f0f7093292daef2d5 ]
The regfile provided to the SA_SIGINFO signal handler as ucontext was
off by one due to pt_regs gutter cleanups in 2013.
Before handling a signal, user pt_regs are copied onto user_regs_struct
and copied back later. Both structs are binary compatible. This was all
fine until commit 2fa919045b72 ("ARC: pt_regs update #2"), which removed
the empty stack slot at the top of pt_regs (corresponding to the first
pad) and made the corresponding fixup in struct user_regs_struct (the
pad there was moved out of @scratch - not removed altogether, as it is
part of the ptrace ABI):
struct user_regs_struct {
+	long pad;
	struct {
-		long pad;
		long bta, lp_start, lp_end, ....;
	} scratch;
	...
}
This meant that user_regs_struct was now off by one register w.r.t.
pt_regs, and the signal code needs to adjust user_regs_struct.scratch
to reflect pt_regs, which is what this commit does.
This problem was hidden for 2 years because both save and restore,
despite using the wrong location, were using the same location. Only an
interim inspection (reproducer below) exposed the issue.
void handle_segv(int signo, siginfo_t *info, void *context)
{
	ucontext_t *uc = context;
	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);

	printf("regs %x %x\n",			/* <=== prints 7 8 (vs. 8 9) */
	       regs->scratch.r8, regs->scratch.r9);
}

int main()
{
	struct sigaction sa;

	sa.sa_sigaction = handle_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, NULL);

	asm volatile(
	"mov r7, 7	\n"
	"mov r8, 8	\n"
	"mov r9, 9	\n"
	"mov r10, 10	\n"
	:::"r7","r8","r9","r10");

	*((unsigned int*)0x10) = 0;	/* trigger the SIGSEGV */
}
Fixes: 2fa919045b72ec892e "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
CC: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit bb344ca5b90df62b1a3b7a35c6a9d00b306a170d ]
Commit 746c9e9f92dd "of/base: Fix PowerPC address parsing hack" limited
the applicability of the workaround whereby a missing ranges is treated
as an empty ranges. This workaround was hiding a bug in the etsec2
device tree nodes, which have children with reg but no ranges.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit f6ff04149637723261aa4738958b0098b929ee9e ]
We currently use the device tree update code in the kernel after
resuming from a suspend operation to re-sync the kernel's view of the
device tree with that of the hypervisor. The code as it stands is not
endian safe, as it relies on parsing buffers returned by RTAS calls
that thus contain data in big endian format.
This patch annotates variables and structure members with __be types
and performs the necessary byte swaps to cpu endian for data that needs
to be parsed.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Cyril Bur <cyrilbur@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e53f21bce4d35a93b23d8fa1a840860f6c74f59e ]
The idle_task_exit() function may call switch_mm() with next ==
&init_mm. On arm64, init_mm.pgd cannot be used for user mappings, so
this patch simply sets the reserved TTBR0.
Cc: <stable@vger.kernel.org>
Reported-by: Jon Medhurst (Tixy) <tixy@linaro.org>
Tested-by: Jon Medhurst (Tixy) <tixy@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 44d5f6f5901e996744858c175baee320ccf1eda3 ]
Commit 2ba9f0d changed CONFIG_KVM_BOOK3S_64_HV to a tristate to allow
the HV/PR bits to be built as modules. But the MCE code still depends
on CONFIG_KVM_BOOK3S_64_HV, which is wrong. When the user selects
CONFIG_KVM_BOOK3S_64_HV=m to build the HV/PR bits as a separate module,
the relevant MCE code gets excluded.
This patch fixes the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER. This
makes sure that the relevant MCE code is included when the HV/PR bits
are built as separate modules.
Fixes: 2ba9f0d88750 ("kvm: powerpc: book3s: Support building HV and PR KVM as module")
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 7d53d25578486d65bd7cd242bc7816b40e55e62b ]
ehrpwm tbclk is wrongly modelled as deriving from dpll_per_m2_ck.
The TRM says tbclk is derived from SYSCLKOUT, and SYSCLKOUT is nothing
but the functional clock of pwmss (l4ls_gclk).
Fix this by changing the source of ehrpwmx_tbclk to l4ls_gclk.
Fixes: 4da1c67719f61 ("add tbclk data for ehrpwm")
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 6e22616eba7e25fac5aa6cb6563471afa1815ec2 ]
ehrpwm tbclk is wrongly modelled as deriving from dpll_per_m2_ck.
The TRM says tbclk is derived from SYSCLKOUT, and SYSCLKOUT is nothing
but the functional clock of pwmss (l4ls_gclk).
Fix this by changing the source of ehrpwmx_tbclk to l4ls_gclk.
Fixes: 9e100ebafb91 ("Fix ehrpwm tbclk data")
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d2192ea09858a8535b056fcede1a41d824e0b3d8 ]
Fixes: ee6c750761 ("ARM: dts: dra7 clock data")
On DRA7x, for DPLL_IVA, the ref clock (CLKINP) is connected to sys_clk1
and the bypass input (CLKINPULOW) is connected to the
iva_dpll_hs_clk_div clock. But the bypass input is not directly routed
to the bypass clkout; instead, both CLKINP and CLKINPULOW are connected
to the bypass clkout via a mux. This mux is controlled by the bit
CM_CLKSEL_DPLL_IVA[23]:DPLL_BYP_CLKSEL, and its POR value is zero, which
selects CLKINP as the bypass clkout. This means iva_dpll_hs_clk_div is
not the bypass clock for dpll_iva_ck.
Fix this by adding another mux clock as parent in bypass mode.
This design is common to most of the PLLs, and the rest have only one
bypass clock. Below is a list of the DPLLs that need this fix:
DPLL_IVA, DPLL_DDR,
DPLL_DSP, DPLL_EVE,
DPLL_GMAC, DPLL_PER,
DPLL_USB and DPLL_CORE
Signed-off-by: Ravikumar Kattekola <rk@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 84e871660bebfddb9a62ebd6f19d02536e782f0a ]
at91rm9200 standby and suspend to RAM have been broken since
00482a4078f4. It is wrongly using AT91_BASE_SYS, which is a physical
address that doesn't correspond to any register on at91rm9200.
Use the correct at91_ramc_base[0] instead.
Fixes: 00482a4078f4 (ARM: at91: implement the standby function for pm/cpuidle)
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 40f737791d4dab26bf23a6331609c604142228bd ]
USB vbus 5V comes from the PMIC SWBST, so set swbst_reg as vbus's
parent regulator. This fixes a bug where the vbus voltage was incorrect
because swbst_reg was disabled after boot.
Cc: stable@vger.kernel.org
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 00e7977dd1bbd46e336d7ef907d0fb6b6a4c294f ]
Prevent 16-bit APIC IDs from being truncated by using the correct mask.
This fixes booting large systems, where the wrong core would receive
the startup and init IPIs, causing a hang.
Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1415089784-28779-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e893286918d2cde3a94850d8f7101cd1039e0c62 ]
On gcc5 the kernel does not link:
ld: .eh_frame_hdr table[4] FDE at 0000000000000648 overlaps table[5] FDE at 0000000000000670.
This is because prior GCC versions always emitted NOPs for ALIGN
directives, but gcc5 started omitting them.
.LSTARTFDEDLSI1 says:
	/* HACK: The dwarf2 unwind routines will subtract 1 from the
	   return address to get an address in the middle of the
	   presumed call instruction. Since we didn't get here via
	   a call, we need to include the nop before the real start
	   to make up for it. */
	.long .LSTART_sigreturn-1-.	/* PC-relative start address */
But commit 69d0627a7f6e ("x86 vDSO: reorder vdso32 code") from 2.6.25
replaced .org __kernel_vsyscall+32,0x90 by ALIGN right before
__kernel_sigreturn.
Of course, ALIGN need not generate any NOPs there. In particular, gcc5
collapses vclock_gettime.o and int80.o together with no NOPs generated
for the ALIGN.
So fix this by adding at least a single NOP at that point, and then
aligning the function, possibly with more NOPs.
Kudos for reporting and diagnosing should go to Richard.
Reported-by: Richard Biener <rguenther@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1425543211-12542-1-git-send-email-jslaby@suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit dc9be0fac70a2ad86e31a81372bb0bdfb6945353 ]
POWER supports irqfds but forgot to advertise them. Some userspace does
not check for the capability, but others check it---thus they work on
x86 and s390 but not POWER.
To avoid other architectures making the same mistake in the future, let
common code handle KVM_CAP_IRQFD the same way as KVM_CAP_IRQFD_RESAMPLE.
Reported-and-tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 297e21053a52f060944e9f0de4c64fad9bcd72fc
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit f4c3686386393c120710dd34df2a74183ab805fd ]
drop_fpu() does clear_used_math() and usually this is correct
because tsk == current.
However switch_fpu_finish()->restore_fpu_checking() is called before
__switch_to() updates the "current_task" variable. If it fails,
we will wrongly clear the PF_USED_MATH flag of the previous task.
So use clear_stopped_child_used_math() instead.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit a7c80ebcac3068b1c3cb27d538d29558c30010c8 ]
math_state_restore() assumes it is called with irqs disabled,
but this is not true if the caller is __restore_xstate_sig().
This means that if ia32_fxstate == T and __copy_from_user()
fails, __restore_xstate_sig() returns with irqs disabled too.
This triggers:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
dump_stack
___might_sleep
? _raw_spin_unlock_irqrestore
__might_sleep
down_read
? _raw_spin_unlock_irqrestore
print_vma_addr
signal_fault
sys32_rt_sigreturn
Change __restore_xstate_sig() to call set_used_math()
unconditionally. This avoids enabling and disabling interrupts
in math_state_restore(). If copy_from_user() fails, we can
simply do fpu_finit() by hand.
[ Note: this is only the first step. math_state_restore() should
not check used_math(), it should set this flag. While
init_fpu() should simply die. ]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit ccfe8c3f7e52ae83155cb038753f4c75b774ca8a ]
The kernel crypto API logic requires the caller to provide the
length of (ciphertext || authentication tag) as cryptlen for the
AEAD decryption operation. Thus, the cipher implementation must
calculate the size of the plaintext output itself and cannot simply use
cryptlen.
The RFC4106 GCM decryption operation tries to overwrite cryptlen bytes
of memory in req->dst. As the destination buffer for decryption only
needs to hold the plaintext, but cryptlen references the input buffer
holding (ciphertext || authentication tag), the RFC4106 GCM operation's
assumption about the destination buffer length is too large. This patch
simply uses the already calculated plaintext size.
In addition, this patch fixes the offset calculation of the AAD buffer
pointer: as mentioned before, cryptlen already includes the size of the
tag. Thus, the tag does not need to be added. With the addition, the AAD
will be written beyond the already allocated buffer.
Note, this fixes a kernel crash that can be triggered from user space
via AF_ALG(aead) -- simply use the libkcapi test application
from [1] and update it to use rfc4106-gcm-aes.
Using [1], the changes were tested using CAVS vectors to demonstrate
that the crypto operation still delivers the right results.
[1] http://www.chronox.de/libkcapi.html
CC: Tadeusz Struk <tadeusz.struk@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 001eabfd54c0cbf9d7d16264ddc8cc0bee67e3ed ]
This updates the bit sliced AES module to the latest version in the
upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
bug in the XTS decryption path, where data chunked in a certain way
could trigger the ciphertext stealing code, which is not supposed to
be active in the kernel build (The kernel implementation of XTS only
supports round multiples of the AES block size of 16 bytes, whereas
the conformant OpenSSL implementation of XTS supports inputs of
arbitrary size by applying ciphertext stealing). This is fixed in
the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
around the offending instructions.
The upstream code also contains the change applied by Russell to
build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
but implemented slightly differently.
Cc: stable@vger.kernel.org
Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions")
Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 285994a62c80f1d72c6924282bcb59608098d5ec ]
The ARM architecture allows the caching of intermediate page table
levels and page table freeing requires a sequence like:
	pmd_clear()
	TLB invalidation
	pte page freeing
With commit 5e5f6dc10546 (arm64: mm: enable HAVE_RCU_TABLE_FREE logic),
the page table freeing batching was moved from tlb_remove_page() to
tlb_remove_table(). The former takes care of TLB invalidation as this is
also shared with pte clearing and page cache page freeing. The latter,
however, does not invalidate the TLBs for intermediate page table levels
as it probably relies on the architecture code to do it if required.
When mm->mm_users < 2, tlb_remove_table() does not do any batching and
page table pages are freed before tlb_finish_mmu(), which performs the
actual TLB invalidation.
This patch introduces __tlb_flush_pgtable() for arm64 and calls it
directly from {pte,pmd,pud}_free_tlb(), without relying on deferred
page table freeing.
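In sketch form (the ASID encoding and barriers follow the usual arm64
TLBI pattern; details are illustrative):
	/* Invalidate cached intermediate walk entries for 'uaddr' before
	 * the page table page that mapped it is freed. */
	static inline void __tlb_flush_pgtable(struct mm_struct *mm,
					       unsigned long uaddr)
	{
		unsigned long addr = uaddr >> 12 | ((unsigned long)ASID(mm) << 48);

		dsb(ishst);				/* order the preceding pmd_clear() */
		asm("tlbi vae1is, %0" : : "r" (addr));	/* by-VA, inner shareable */
		dsb(ish);
	}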
Fixes: 5e5f6dc10546 ("arm64: mm: enable HAVE_RCU_TABLE_FREE logic")
Reported-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit fb7332a9fedfd62b1ba6530c86f39f0fa38afd49 ]
On architectures with hardware broadcasting of TLB invalidation
messages, it makes sense to reduce the range of the mmu_gather
structure when unmapping page ranges based on the dirty address
information passed to tlb_remove_tlb_entry.
arm64 already does this by directly manipulating the start/end fields
of the gather structure, but this confuses the generic code which
does not expect these fields to change and can end up calculating
invalid, negative ranges when forcing a flush in zap_pte_range.
This patch moves the minimal range calculation out of the arm64 code
and into the generic implementation, simplifying zap_pte_range in the
process (which no longer needs to care about start/end, since they will
point to the appropriate ranges already). With the range being tracked
by core code, the need_flush flag is dropped in favour of checking that
the end of the range has actually been set.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 7132813c384515c9dede1ae20e56f3895feb7f1e ]
The current implementation doesn't zero out the pages it allocates.
Honor the __GFP_ZERO flag and zero them out if it is set.
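The shape of the fix, as a sketch with hypothetical helper names:
	static void *alloc_coherent_sketch(size_t size, gfp_t flags)
	{
		void *ptr = backend_alloc(size, flags);	/* hypothetical backend */

		/* Zero the buffer ourselves, since this path bypasses the
		 * generic page allocator's __GFP_ZERO handling. */
		if (ptr && (flags & __GFP_ZERO))
			memset(ptr, 0, size);
		return ptr;
	}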
Cc: <stable@vger.kernel.org> # v3.14+
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 2077cef4d5c29cf886192ec32066f783d6a80db8 ]
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src. The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.
For example, considering NG4memcpy, the main unrolled loop begins like
this:
	load  src + 0x00
	load  src + 0x08
	load  src + 0x10
	load  src + 0x18
	load  src + 0x20
	store dst + 0x00
Assume dst is 64-byte aligned and, say, dst is src - 8 for this
memcpy() call. That store at the end is the first one to the cache
line; it initializes the whole line, which thus clobbers "src + 0x28"
before it even gets loaded.
To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well. We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.
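A standalone sketch of such a fallback, in plain C rather than sparc
assembly, to show the shape of it:
	#include <stddef.h>
	#include <stdint.h>

	/* Simple forward copy, mildly optimized for 8-byte-aligned src/dst
	 * with a length that is a multiple of 8; zero length falls through. */
	static void *simple_memmove_fwd(void *dst, const void *src, size_t n)
	{
		if ((((uintptr_t)dst | (uintptr_t)src | n) & 7) == 0) {
			uint64_t *d = dst;
			const uint64_t *s = src;

			for (; n; n -= 8)
				*d++ = *s++;
		} else {
			char *d = dst;
			const char *s = src;

			while (n--)
				*d++ = *s++;
		}
		return dst;
	}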
Reported-by: David Ahern <david.ahern@oracle.com>
Reported-by: Bob Picco <bpicco@meloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 31aaa98c248da766ece922bbbe8cc78cfd0bc920 ]
With the increase in the number of CPUs, calls to functions that dump
output to the console (e.g., arch_trigger_all_cpu_backtrace) can take a
long time to complete. If IRQs are disabled, eventually the NMI
watchdog kicks in and creates more havoc. Avoid this by telling the NMI
watchdog that everything is ok.
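A sketch of the pattern (the dump helper is hypothetical):
	#include <linux/nmi.h>

	static void dump_all_cpus(void)
	{
		int cpu;

		for_each_online_cpu(cpu) {
			dump_backtrace_for_cpu(cpu);	/* hypothetical, printk-heavy */
			touch_nmi_watchdog();		/* tell the watchdog all is ok */
		}
	}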
Signed-off-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d51291cb8f32bfae6b331e1838651f3ddefa73a5 ]
Currently perf-stat (aka, counting mode) does not work:
$ perf stat ls
...
Performance counter stats for 'ls':
1.585665 task-clock (msec) # 0.580 CPUs utilized
24 context-switches # 0.015 M/sec
0 cpu-migrations # 0.000 K/sec
86 page-faults # 0.054 M/sec
<not supported> cycles
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
<not supported> instructions
<not supported> branches
<not supported> branch-misses
0.002735100 seconds time elapsed
The reason is that the state is never reset (it stays with
PERF_HES_UPTODATE set). Add a call to sparc_pmu_enable_event during the
added_event handling. Clean up the encoding since pmu_start calls
sparc_pmu_enable_event, which does the same. Passing PERF_EF_RELOAD to
sparc_pmu_start means the call to sparc_perf_event_set_period can be
removed as well.
With this patch:
$ perf stat ls
...
Performance counter stats for 'ls':
1.552890 task-clock (msec) # 0.552 CPUs utilized
24 context-switches # 0.015 M/sec
0 cpu-migrations # 0.000 K/sec
86 page-faults # 0.055 M/sec
5,748,997 cycles # 3.702 GHz
<not supported> stalled-cycles-frontend:HG
<not supported> stalled-cycles-backend:HG
1,684,362 instructions:HG # 0.29 insns per cycle
295,133 branches:HG # 190.054 M/sec
28,007 branch-misses:HG # 9.49% of all branches
0.002815665 seconds time elapsed
Signed-off-by: David Ahern <david.ahern@oracle.com>
Acked-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 5b0d4b5514bbcce69b516d0742f2cfc84ebd6db3 ]
perf_pmu_disable is called by core perf code before pmu->del, and the
enable function is called by core perf code afterwards, so there is no
need to call them again within sparc_pmu_del.
Ditto for pmu->add and sparc_pmu_add.
Signed-off-by: David Ahern <david.ahern@oracle.com>
Acked-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 53eb2516972b8c4628651dfcb926cb9ef8b2864a ]
A bug was reported that the semtimedop() system call was always
failing with ENOSYS.
Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
from getting through to the semaphore ops switch statement.
This is corrected by changing the comparison to "call <= SEMTIMEDOP".
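In sketch form, with the call numbers spelled out (the dispatch helper
is hypothetical):
	#define SEMOP		1
	#define SEMGET		2
	#define SEMCTL		3
	#define SEMTIMEDOP	4

	static long ipc_sem_dispatch(unsigned int call)
	{
		/* Was "call <= SEMCTL", which rejected SEMTIMEDOP (4). */
		if (call <= SEMTIMEDOP)
			return do_semaphore_ops(call);	/* hypothetical helper */
		return -ENOSYS;
	}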
Orabug: 20633375
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 66d0f7ec9f1038452178b1993fc07fd96d30fd38 ]
Load balancing can be triggered in the critical sections protected by
srmmu_context_spinlock in destroy_context() and switch_mm(), and it can
hang the cpu waiting for the rq lock of another cpu that in turn has
called switch_mm and is hanging on srmmu_context_spinlock, leading to
deadlock. So, disable interrupts while taking srmmu_context_spinlock in
destroy_context() and switch_mm() so we don't deadlock.
See also commit 77b838fa1ef0 ("[SPARC64]: destroy_context() needs to disable
interrupts.")
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 6f963ec2d6bf2476a16799eece920acb2100ff1c upstream.
When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:
ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98
We are missing a call to of_node_get(): pnv_pci_to_phb_node() should
call of_node_get(), otherwise np's reference count isn't incremented
and the node might go away. Rename pnv_pci_to_phb_node() to
pnv_pci_get_phb_node() so it's clear it calls of_node_get().
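In sketch form (the lookup mirrors the usual powerpc PCI pattern):
	/* Return the PHB device node with its refcount bumped, so the
	 * caller's eventual of_node_put() is balanced. */
	struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev)
	{
		struct pci_controller *hose = pci_bus_to_host(dev->bus);

		return of_node_get(hose->dn);
	}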
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 13648b0118a24f4fc76c34e6c7b6ccf447e46a2a upstream.
/proc/<pid>/maps currently doesn't annotate the stack vma with "[stack]".
This is because KSTK_ESP is expected to return the usermode SP of tsk,
while it currently returns the kernel mode SP of a sleeping tsk.
While the fix is trivial, we also need to adjust the ARC kernel stack
unwinder to not use KSTK_SP and friends any more.
Reported-and-suggested-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit b3cffac04eca9af46e1e23560a8ee22b1bd36d43 upstream.
Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.
Let's save the actual PC in the structure so that the correct value is
accessible later.
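In trace-event terms the change looks like this sketch (fields trimmed;
the point is copying the PC at trace time rather than storing the vcpu
pointer):
	TRACE_EVENT(kvm_exit,
		TP_PROTO(struct kvm_vcpu *vcpu, unsigned int reason),
		TP_ARGS(vcpu, reason),

		TP_STRUCT__entry(
			__field(unsigned long, pc)	/* saved value, not a pointer */
			__field(unsigned int, reason)
		),

		TP_fast_assign(
			__entry->pc = vcpu->arch.pc;	/* snapshot the PC now */
			__entry->reason = reason;
		),

		TP_printk("[%u] PC: 0x%08lx", __entry->reason, __entry->pc)
	);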
Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 4ff6f8e61eb7f96d3ca535c6d240f863ccd6fb7d upstream.
This has been broken for a long time: it first broke in 2.6.35, then
was almost fixed in 2.6.36, but this one-liner slipped through the
cracks.
The bug shows up as an infinite loop in Windows 7 (and newer) boot on
32-bit hosts without EPT.
Windows uses CMPXCHG8B to write to page tables, which causes a
page fault if running without EPT; the emulator is then called from
kvm_mmu_page_fault. The loop then happens if the higher 4 bytes are
not 0; the common case for this is that the NX bit (bit 63) is 1.
Fixes: 6550e1f165f384f3a46b60a1be9aba4bc3c2adad
Fixes: 16518d5ada690643453eb0aef3cc7841d3623c2d
Reported-by: Erik Rull <erik.rull@rdsoftware.de>
Tested-by: Erik Rull <erik.rull@rdsoftware.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
commit 06c8173eb92bbfc03a0fe8bb64315857d0badd06 upstream.
Commit:
f31a9f7c7169 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
introduced alternative instructions for XSAVES/XRSTORS and commit:
adb9d526e982 ("x86/xsaves: Add xsaves and xrstors support for booting time")
added support for the XSAVES/XRSTORS instructions at boot time.
Unfortunately both failed to properly protect them against faulting:
The 'xstate_fault' macro will use the closest label named '1'
backward and that ends up in the .altinstr_replacement section
rather than in .text. This means that the kernel will never find
in the __ex_table the .text address where this instruction might
fault, leading to serious problems if userspace manages to
trigger the fault.
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
[ Improved the changelog, fixed some whitespace noise. ]
Acked-by: Borislav Petkov <bp@alien8.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Allan Xavier <mr.a.xavier@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: adb9d526e982 ("x86/xsaves: Add xsaves and xrstors support for booting time")
Fixes: f31a9f7c7169 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|