linux-toradex.git/arch/arm/kvm, branch v4.4.136

arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls

2018-02-16T19:09:45+00:00

commit 20e8175d246e9f9deb377f2784b3e7dfb2ad3e86 upstream.

KVM doesn't follow the SMCCC when it comes to unimplemented calls,
and injects an UNDEF instead of returning an error. Since firmware
calls are now used for security mitigation, they are becoming more
common, and the UNDEF is counterproductive.

Instead, let's follow the SMCCC which states that -1 must be returned
to the caller when getting an unknown function number.
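
As a rough illustration of the behaviour described above, here is a
user-space sketch, not the kernel change itself. The names
struct vcpu_regs, handle_fw_call and FN_KNOWN are hypothetical, and the
function ID is an arbitrary value chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest's argument/return registers. */
struct vcpu_regs {
	uint32_t r[4];
	bool undef_injected;
};

/* Arbitrary function ID that this toy model pretends to implement. */
#define FN_KNOWN 0x84000000u

/* Per the commit text, -1 is returned for an unknown function number. */
#define RET_UNKNOWN_FUNCTION ((uint32_t)-1)

/* Handle a trapped SMC/HVC: unknown calls get -1 in r0, never an UNDEF. */
static void handle_fw_call(struct vcpu_regs *regs)
{
	if (regs->r[0] == FN_KNOWN)
		regs->r[0] = 0;                    /* pretend success */
	else
		regs->r[0] = RET_UNKNOWN_FUNCTION; /* error, guest keeps running */
}
```

A guest that probes for a mitigation call can then test r0 for -1 and
fall back, instead of having to survive an undefined-instruction
exception.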

Tested-by: Ard Biesheuvel 
Signed-off-by: Marc Zyngier 
Signed-off-by: Catalin Marinas 
Signed-off-by: Greg Kroah-Hartman

KVM: Fix stack-out-of-bounds read in write_mmio

2018-01-17T08:35:24+00:00

commit e39d200fa5bf5b94a0948db0dae44c1b73b84a56 upstream.

Reported by syzkaller:

  BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
  Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298

  CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
  Call Trace:
   dump_stack+0xab/0xe1
   print_address_description+0x6b/0x290
   kasan_report+0x28a/0x370
   write_mmio+0x11e/0x270 [kvm]
   emulator_read_write_onepage+0x311/0x600 [kvm]
   emulator_read_write+0xef/0x240 [kvm]
   emulator_fix_hypercall+0x105/0x150 [kvm]
   em_hypercall+0x2b/0x80 [kvm]
   x86_emulate_insn+0x2b1/0x1640 [kvm]
   x86_emulate_instruction+0x39a/0xb90 [kvm]
   handle_exception+0x1b4/0x4d0 [kvm_intel]
   vcpu_enter_guest+0x15a0/0x2640 [kvm]
   kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
   kvm_vcpu_ioctl+0x479/0x880 [kvm]
   do_vfs_ioctl+0x142/0x9a0
   SyS_ioctl+0x74/0x80
   entry_SYSCALL_64_fastpath+0x23/0x9a

The vmmcall patching path writes the 3-byte opcode 0F 01 C1 (vmcall)
to guest memory. However, the write_mmio tracepoint always prints 8 bytes
through *(u64 *)val, since kvm splits mmio accesses into chunks of at most
8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This
patch fixes it by accessing only the bytes that are actually operated on.
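
The idea of touching only the bytes that were written can be sketched in
user-space C as below. This is an illustration under assumptions, not
the patch: mmio_trace_val is a hypothetical helper, and the resulting
value matches the trace output format only on a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build the 64-bit value a trace record would show for an MMIO write of
 * `len` bytes at `data`. Only the bytes actually written are copied; the
 * remaining bytes stay zero instead of picking up adjacent stack contents.
 */
static uint64_t mmio_trace_val(const void *data, unsigned int len)
{
	uint64_t val = 0;

	if (len > sizeof(val))
		len = sizeof(val);   /* never copy more than the field holds */
	if (data)
		memcpy(&val, data, len);
	return val;
}
```

With the three opcode bytes 0F 01 C1 and len 3, a little-endian host
yields 0xc1010f, the same value as in the "After patch" trace line.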

Before patch:

syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f

After patch:

syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f

Reported-by: Dmitry Vyukov 
Reviewed-by: Darren Kenny 
Reviewed-by: Marc Zyngier 
Tested-by: Marc Zyngier 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Signed-off-by: Wanpeng Li 
Signed-off-by: Paolo Bonzini 
Cc: Mathieu Desnoyers 
Signed-off-by: Greg Kroah-Hartman

arm: KVM: Survive unknown traps from guests

2017-12-16T09:33:52+00:00

[ Upstream commit f050fe7a9164945dd1c28be05bf00e8cfb082ccf ]

Currently we BUG() if we see a HSR.EC value we don't recognise. As
configurable disables/enables are added to the architecture (controlled
by RES1/RES0 bits respectively), with associated synchronous exceptions,
it may be possible for a guest to trigger exceptions with classes that
we don't recognise.

While we can't service these exceptions in a manner useful to the guest,
we can avoid bringing down the host. Per ARM DDI 0406C.c, all currently
unallocated HSR EC encodings are reserved, and per ARM DDI
0487A.k_iss10775, page G6-4395, EC values within the range 0x00 - 0x2c
are reserved for future use with synchronous exceptions, and EC values
within the range 0x2d - 0x3f may be used for either synchronous or
asynchronous exceptions.

The patch makes KVM handle any unknown EC by injecting an UNDEFINED
exception into the guest, with a corresponding (ratelimited) warning in
the host dmesg. We could later improve on this with a new (opt-in)
exit to the host userspace.
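
A minimal user-space model of this dispatch policy follows. All names
(struct vcpu, handle_exit, handle_unknown_ec) and the crude counter
standing in for rate limiting are assumptions made for the example; only
the EC field position (bits [31:26]) comes from the architecture.

```c
#include <stdint.h>
#include <stdio.h>

#define EC_SHIFT 26
#define EC_MASK  0x3fu

struct vcpu { int undef_pending; };

typedef int (*exit_handler_fn)(struct vcpu *vcpu, uint32_t hsr);

/* Example of a recognised class (0x01 is the WFI/WFE trap). */
static int handle_wfx(struct vcpu *vcpu, uint32_t hsr)
{
	(void)vcpu;
	(void)hsr;
	return 1;
}

/* Fallback: warn a bounded number of times and queue an UNDEF for the guest. */
static int handle_unknown_ec(struct vcpu *vcpu, uint32_t hsr)
{
	static unsigned int warned;

	if (warned < 10) {
		warned++;
		fprintf(stderr, "unknown exception class: hsr 0x%08x\n",
			(unsigned int)hsr);
	}
	vcpu->undef_pending = 1;
	return 1;   /* resume the guest; the host keeps running */
}

static exit_handler_fn handlers[EC_MASK + 1] = {
	[0x01] = handle_wfx,
};

/* Dispatch on the EC field; unpopulated slots fall back instead of BUG(). */
static int handle_exit(struct vcpu *vcpu, uint32_t hsr)
{
	uint32_t ec = (hsr >> EC_SHIFT) & EC_MASK;
	exit_handler_fn fn = handlers[ec] ? handlers[ec] : handle_unknown_ec;

	return fn(vcpu, hsr);
}
```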

Cc: Dave Martin 
Cc: Suzuki K Poulose 
Reviewed-by: Christoffer Dall 
Signed-off-by: Mark Rutland 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

kvm: arm/arm64: Force reading uncached stage2 PGD

2017-09-07T06:34:10+00:00

commit 2952a6070e07ebdd5896f1f5b861acad677caded upstream.

Make sure we don't use a cached value of the KVM stage2 PGD while
resetting the PGD.
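
The commit text does not show the code. One common way to force a fresh
read of a shared pointer is a volatile access in the style of the
kernel's READ_ONCE(); the sketch below assumes that approach and uses a
simplified user-space macro (GNU C __typeof__) together with a
hypothetical struct kvm_model.

```c
#include <stddef.h>

/* Simplified user-space stand-in for a READ_ONCE-style accessor. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical model of the per-VM state holding the stage2 PGD. */
struct kvm_model {
	void *pgd;
};

/*
 * Take ownership of the PGD: read it exactly once from memory, clear the
 * shared pointer, and hand the snapshot back to the caller for freeing.
 */
static void *take_pgd(struct kvm_model *kvm)
{
	void *pgd = READ_ONCE(kvm->pgd);

	if (pgd)
		kvm->pgd = NULL;
	return pgd;
}
```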

Cc: Marc Zyngier 
Signed-off-by: Suzuki K Poulose 
Reviewed-by: Christoffer Dall 
Signed-off-by: Christoffer Dall 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Greg Kroah-Hartman

kvm: arm/arm64: Fix race in resetting stage2 PGD

2017-09-07T06:34:09+00:00

commit 6c0d706b563af732adb094c5bf807437e8963e84 upstream.

In kvm_free_stage2_pgd() we check the stage2 PGD before taking
the lock, and take the lock only if the PGD is valid. We then unmap
the page tables and release the lock. The PGD is reset only after
the lock has been dropped, which opens a race: another thread
waiting on, or even holding, the lock could see that the PGD is
still valid, proceed to perform a stage2 operation, and later
encounter a NULL PGD.

[223090.242280] Unable to handle kernel NULL pointer dereference at virtual address 00000040
[223090.262330] PC is at unmap_stage2_range+0x8c/0x428
[223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
[223090.262531] Call trace:
[223090.262533] [] unmap_stage2_range+0x8c/0x428
[223090.262535] [] kvm_unmap_hva_handler+0x2c/0x3c
[223090.262537] [] handle_hva_to_gpa+0xb0/0x104
[223090.262539] [] kvm_unmap_hva+0x5c/0xbc
[223090.262543] [] kvm_mmu_notifier_invalidate_page+0x50/0x8c
[223090.262547] [] __mmu_notifier_invalidate_page+0x5c/0x84
[223090.262551] [] try_to_unmap_one+0x1d0/0x4a0
[223090.262553] [] rmap_walk+0x1cc/0x2e0
[223090.262555] [] try_to_unmap+0x74/0xa4
[223090.262557] [] migrate_pages+0x31c/0x5ac
[223090.262561] [] compact_zone+0x3fc/0x7ac
[223090.262563] [] compact_zone_order+0x94/0xb0
[223090.262564] [] try_to_compact_pages+0x108/0x290
[223090.262569] [] __alloc_pages_direct_compact+0x70/0x1ac
[223090.262571] [] __alloc_pages_nodemask+0x434/0x9f4
[223090.262572] [] alloc_pages_vma+0x230/0x254
[223090.262574] [] do_huge_pmd_anonymous_page+0x114/0x538
[223090.262576] [] handle_mm_fault+0xd40/0x17a4
[223090.262577] [] __get_user_pages+0x12c/0x36c
[223090.262578] [] get_user_pages_unlocked+0xa4/0x1b8
[223090.262579] [] __gfn_to_pfn_memslot+0x280/0x31c
[223090.262580] [] gfn_to_pfn_prot+0x4c/0x5c
[223090.262582] [] kvm_handle_guest_abort+0x240/0x774
[223090.262584] [] handle_exit+0x11c/0x1ac
[223090.262586] [] kvm_arch_vcpu_ioctl_run+0x31c/0x648
[223090.262587] [] kvm_vcpu_ioctl+0x378/0x768
[223090.262590] [] do_vfs_ioctl+0x324/0x5a4
[223090.262591] [] SyS_ioctl+0x90/0xa4
[223090.262595] [] el0_svc_naked+0x38/0x3c

This patch moves the stage2 PGD manipulation under the lock.
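
The locking pattern can be modelled in user space with a pthread mutex
standing in for the kernel lock. This is a sketch under assumptions:
struct kvm_model, free_stage2_pgd and stage2_op are hypothetical names,
and the "unmapped" flag stands in for the real page-table teardown.

```c
#include <pthread.h>
#include <stddef.h>

struct kvm_model {
	pthread_mutex_t lock;
	void *pgd;
	int unmapped;
};

/* Check, unmap and reset the PGD in one critical section. */
static void *free_stage2_pgd(struct kvm_model *kvm)
{
	void *pgd = NULL;

	pthread_mutex_lock(&kvm->lock);
	if (kvm->pgd) {
		kvm->unmapped = 1;   /* stands in for unmapping the tables */
		pgd = kvm->pgd;
		kvm->pgd = NULL;     /* reset while still holding the lock */
	}
	pthread_mutex_unlock(&kvm->lock);

	return pgd;                  /* caller frees the memory afterwards */
}

/* Any other stage2 operation sees either a valid PGD or NULL, never both. */
static int stage2_op(struct kvm_model *kvm)
{
	int ok;

	pthread_mutex_lock(&kvm->lock);
	ok = kvm->pgd != NULL;
	pthread_mutex_unlock(&kvm->lock);
	return ok;
}
```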

Reported-by: Alexander Graf 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Reviewed-by: Christoffer Dall 
Reviewed-by: Marc Zyngier 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Christoffer Dall 
Signed-off-by: Greg Kroah-Hartman

KVM: arm/arm64: Handle hva aging while destroying the vm

2017-08-13T02:29:09+00:00

commit 7e5a672289c9754d07e1c3b33649786d3d70f5e4 upstream.

The mmu_notifier_release() callback of KVM triggers cleaning up
the stage2 page table on kvm-arm. However, other notifier callbacks
can run in parallel with mmu_notifier_release(), and those callbacks
could end up operating on an empty stage2 page table. Make sure we
check for this in all the notifier callbacks.
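
One way to picture "check it for all the notifier callbacks" is a single
guard in a shared entry point, as in this user-space sketch. The names
(struct kvm_model, handle_hva, age_handler) are hypothetical and the
model ignores locking.

```c
#include <stddef.h>

struct kvm_model {
	void *pgd;      /* NULL once the stage2 tables have been torn down */
	int walks;      /* number of page-table walks performed */
};

typedef int (*hva_handler)(struct kvm_model *kvm);

/* Example callback body that would walk the stage2 tables. */
static int age_handler(struct kvm_model *kvm)
{
	kvm->walks++;
	return 1;
}

/* Common entry point: bail out before any handler touches a missing PGD. */
static int handle_hva(struct kvm_model *kvm, hva_handler fn)
{
	if (!kvm->pgd)
		return 0;
	return fn(kvm);
}
```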

Fixes: commit 293f29363 ("kvm-arm: Unmap shadow pagetables properly")
Reported-by: Alex Graf 
Reviewed-by: Christoffer Dall 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Marc Zyngier 
Signed-off-by: Greg Kroah-Hartman

KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages

2017-06-14T11:16:25+00:00

commit d6dbdd3c8558cad3b6d74cc357b408622d122331 upstream.

Under memory pressure, we start ageing pages, which amounts to parsing
the page tables. Since we don't want to allocate any extra level,
we pass NULL for our private allocation cache, which means that
stage2_get_pud() is allowed to fail. This results in the following
splat:

[ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 1520.417741] pgd = ffff810f52fef000
[ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
[ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[ 1520.435156] Modules linked in:
[ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
[ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
[ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
[ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
[ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
[ 1520.478917] pc : [] lr : [] pstate: 40000145
[ 1520.486325] sp : ffff800ce04e33d0
[ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
[ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
[ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
[ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
[ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
[ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
[ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
[ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
[ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
[ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
[ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
[ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
[ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
[ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
[ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
[ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
[...]
[ 1521.510735] [] stage2_get_pmd+0x34/0x110
[ 1521.516221] [] kvm_age_hva_handler+0x44/0xf0
[ 1521.522054] [] handle_hva_to_gpa+0xb8/0xe8
[ 1521.527716] [] kvm_age_hva+0x44/0xf0
[ 1521.532854] [] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
[ 1521.539992] [] __mmu_notifier_clear_flush_young+0x88/0xd0
[ 1521.546958] [] page_referenced_one+0xf0/0x188
[ 1521.552881] [] rmap_walk_anon+0xec/0x250
[ 1521.558370] [] rmap_walk+0x78/0xa0
[ 1521.563337] [] page_referenced+0x164/0x180
[ 1521.569002] [] shrink_active_list+0x178/0x3b8
[ 1521.574922] [] shrink_node_memcg+0x328/0x600
[ 1521.580758] [] shrink_node+0xc4/0x328
[ 1521.585986] [] do_try_to_free_pages+0xc0/0x340
[ 1521.592000] [] try_to_free_pages+0xcc/0x240
[...]

The trivial fix is to handle this NULL pud value early, rather than
dereferencing it blindly.
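
The shape of such an early check, in a toy two-level table, might look
like the following. The types and the lookup_pud/lookup_pmd helpers are
hypothetical simplifications and do not mirror the kernel's page-table
types.

```c
#include <stddef.h>

struct pmd { unsigned long val; };
struct pud { struct pmd *pmd_table; };

/*
 * Return the upper-level entry. If it is not present, fall back to the
 * caller's allocation cache, which is NULL when allocation is not wanted.
 */
static struct pud *lookup_pud(struct pud *present, struct pud *cache)
{
	if (present)
		return present;
	return cache;
}

/* Look up the next level, handling a failed upper-level lookup early. */
static struct pmd *lookup_pmd(struct pud *present, struct pud *cache,
			      unsigned int index)
{
	struct pud *pud = lookup_pud(present, cache);

	if (!pud || !pud->pmd_table)
		return NULL;         /* nothing mapped, nothing to age */
	return &pud->pmd_table[index];
}
```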

Signed-off-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
Signed-off-by: Christoffer Dall 
Signed-off-by: Greg Kroah-Hartman

arm: KVM: Allow unaligned accesses at HYP

2017-06-14T11:16:21+00:00

commit 33b5c38852b29736f3b472dd095c9a18ec22746f upstream.

We currently have the HSCTLR.A bit set, trapping unaligned accesses
at HYP, but we're not really prepared to deal with it.

Since the rest of the kernel is pretty happy about that, let's follow
its example and set HSCTLR.A to zero. Modern CPUs don't really care.
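
In terms of register values, clearing the alignment-check bit amounts to
the bit operation below. This is only an illustration in C: the helper
name is hypothetical, the A bit is assumed to be bit 1 as in the SCTLR
layout, and HYP initialisation code would program the register directly
rather than call a function like this.

```c
#include <stdint.h>

/* Alignment check enable, assumed to be bit 1 of the control register. */
#define SCTLR_A (1u << 1)

/* Return the register value with alignment fault checking disabled. */
static uint32_t hsctlr_allow_unaligned(uint32_t hsctlr)
{
	return hsctlr & ~SCTLR_A;
}
```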

Signed-off-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
Signed-off-by: Greg Kroah-Hartman

KVM: arm/arm64: fix races in kvm_psci_vcpu_on

2017-05-20T12:27:00+00:00

commit 6c7a5dce22b3f3cc44be098e2837fa6797edb8b8 upstream.

Fix potential races in kvm_psci_vcpu_on() by taking the kvm->lock
mutex.  In general, it's a bad idea to allow more than one PSCI_CPU_ON
to process the same target VCPU at the same time.  One such problem
that may arise is that one PSCI_CPU_ON could be resetting the target
vcpu, which fills the entire sys_regs array with a temporary value
including the MPIDR register, while another looks up the VCPU based
on the MPIDR value, resulting in no target VCPU found.  Resolves both
races found with the kvm-unit-tests/arm/psci unit test.
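
A user-space model of serialising CPU_ON requests is sketched below. The
structures and function names are hypothetical, a pthread mutex stands
in for kvm->lock, and the return codes follow the PSCI convention
(0 success, -2 invalid parameters, -4 already on).

```c
#include <pthread.h>
#include <stddef.h>

#define PSCI_RET_SUCCESS         0
#define PSCI_RET_INVALID_PARAMS (-2)
#define PSCI_RET_ALREADY_ON     (-4)

struct vcpu_model {
	unsigned long mpidr;
	int powered_off;
};

struct vm_model {
	pthread_mutex_t lock;
	struct vcpu_model vcpus[2];
};

/* Lookup plus state change; must only run with vm->lock held. */
static int cpu_on_locked(struct vm_model *vm, unsigned long target)
{
	size_t i;

	for (i = 0; i < sizeof(vm->vcpus) / sizeof(vm->vcpus[0]); i++) {
		struct vcpu_model *v = &vm->vcpus[i];

		if (v->mpidr != target)
			continue;
		if (!v->powered_off)
			return PSCI_RET_ALREADY_ON;
		v->powered_off = 0;   /* reset and start the target */
		return PSCI_RET_SUCCESS;
	}
	return PSCI_RET_INVALID_PARAMS;
}

/* Only one CPU_ON at a time may look up and reset a target VCPU. */
static int psci_cpu_on(struct vm_model *vm, unsigned long target)
{
	int ret;

	pthread_mutex_lock(&vm->lock);
	ret = cpu_on_locked(vm, target);
	pthread_mutex_unlock(&vm->lock);
	return ret;
}
```

Because the lookup and the reset happen in one critical section, a
second caller can never observe the temporary MPIDR value written during
a reset in progress.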

Reviewed-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
Reported-by: Levente Kurusa 
Suggested-by: Christoffer Dall 
Signed-off-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Greg Kroah-Hartman

kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd

2017-04-27T07:09:33+00:00

commit 8b3405e345b5a098101b0c31b264c812bba045d9 upstream.

In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g., munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire guest memory range
while holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.
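
The "unmap one PUD range, then give others a chance" loop can be
modelled as follows. Everything here is an assumption made for
illustration: the 1 GiB PUD_RANGE, the counters, and the unconditional
unlock/lock pair, where kernel code would typically use a conditional
helper such as cond_resched_lock().

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical size covered by one PUD entry in this model. */
#define PUD_RANGE (1ULL << 30)

struct kvm_model {
	pthread_mutex_t mmu_lock;
	unsigned int ranges_unmapped;
	unsigned int lock_breaks;
};

/* Unmap [start, start + size) one PUD-sized chunk at a time. */
static void unmap_all(struct kvm_model *kvm, uint64_t start, uint64_t size)
{
	uint64_t addr = start;
	uint64_t end = start + size;

	pthread_mutex_lock(&kvm->mmu_lock);
	while (addr < end) {
		/* Next PUD boundary, clamped to the end of the range. */
		uint64_t next = (addr + PUD_RANGE) & ~(PUD_RANGE - 1);

		if (next > end || next < addr)
			next = end;

		kvm->ranges_unmapped++;   /* stands in for the real unmap */
		addr = next;

		if (addr < end) {
			/* Let waiters in before starting the next chunk. */
			pthread_mutex_unlock(&kvm->mmu_lock);
			kvm->lock_breaks++;
			pthread_mutex_lock(&kvm->mmu_lock);
		}
	}
	pthread_mutex_unlock(&kvm->mmu_lock);
}
```

Bounding each critical section to one chunk is what keeps other lock
users from waiting for the whole guest address space to be unmapped.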

Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Cc: Paolo Bonzini 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: Mark Rutland 
Signed-off-by: Suzuki K Poulose 
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Christoffer Dall 
Signed-off-by: Greg Kroah-Hartman