linux-toradex.git/virt, branch v3.17.8

kvm: fix excessive pages un-pinning in kvm_iommu_map error path.

2014-11-14T18:10:28+00:00

commit 3d32e4dbe71374a6780eaf51d719d76f9a9bf22f upstream.

The third parameter of kvm_unpin_pages() when called from
kvm_iommu_map_pages() is wrong, it should be the number of pages to un-pin
and not the page size.

This error was facilitated with an inconsistent API: kvm_pin_pages() takes
a size, but kvn_unpin_pages() takes a number of pages, so fix the problem
by matching the two.

This was introduced by commit 350b8bd ("kvm: iommu: fix the third parameter
of kvm_iommu_put_pages (CVE-2014-3601)"), which fixes the lack of
un-pinning for pages intended to be un-pinned (i.e. memory leak) but
unfortunately potentially aggravated the number of pages we un-pin that
should have stayed pinned. As far as I understand though, the same
practical mitigations apply.

This issue was found during review of Red Hat 6.6 patches to prepare
Ksplice rebootless updates.

Thanks to Vegard for his time on a late Friday evening to help me in
understanding this code.

Fixes: 350b8bd ("kvm: iommu: fix the third parameter of... (CVE-2014-3601)")
Signed-off-by: Quentin Casasnovas 
Signed-off-by: Vegard Nossum 
Signed-off-by: Jamie Iles 
Reviewed-by: Sasha Levin 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Greg Kroah-Hartman

kvm: don't take vcpu mutex for obviously invalid vcpu ioctls

2014-10-30T16:43:06+00:00

commit 2ea75be3219571d0ec009ce20d9971e54af96e09 upstream.

vcpu ioctls can hang the calling thread if issued while a vcpu is running.
However, invalid ioctls can happen when userspace tries to probe the kind
of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case,
we know the ioctl is going to be rejected as invalid anyway and we can
fail before trying to take the vcpu mutex.

This patch does not change functionality, it just makes invalid ioctls
fail faster.

Signed-off-by: David Matlack 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Greg Kroah-Hartman

KVM: do not bias the generation number in kvm_current_mmio_generation

2014-10-30T16:43:06+00:00

commit 00f034a12fdd81210d58116326d92780aac5c238 upstream.

The next patch will give a meaning (a la seqcount) to the low bit of the
generation number.  Ensure that it matches between kvm->memslots->generation
and kvm_current_mmio_generation().

Reviewed-by: David Matlack 
Reviewed-by: Xiao Guangrong 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Greg Kroah-Hartman

kvm: fix potentially corrupt mmio cache

2014-10-30T16:43:06+00:00

commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream.

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&kvm->srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&kvm->srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&kvm->srcu)             |
...                                         |
                       |
...                                         |
                                  |
                                            |

By incrementing the generation after synchronizing with kvm->srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots.  It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE.  This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited.  Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Signed-off-by: David Matlack 
Reviewed-by: Xiao Guangrong 
Reviewed-by: David Matlack 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Greg Kroah-Hartman

Merge tag 'kvm-arm-for-v3.17-rc7-or-final' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

2014-09-23T13:18:02+00:00

Fixes unaligned access to the gicv2 virtual cpu status.

arm/arm64: KVM: Fix unaligned access bug on gicv2 access

2014-09-22T21:05:56+00:00

We were using an atomic bitop on the vgic_v2.vgic_elrsr field which was
not aligned to the natural size on 64-bit platforms.  This bug showed up
after QEMU correctly identifies the pl011 line as being level-triggered,
and not edge-triggered.

These data structures are protected by a spinlock so simply use a
non-atomic version of the accessor instead.

Tested-by: Joel Schopp 
Reported-by: Riku Voipio 
Signed-off-by: Christoffer Dall

KVM: correct null pid check in kvm_vcpu_yield_to()

2014-09-22T11:21:29+00:00

Correct a simple mistake of checking the wrong variable
before a dereference, resulting in the dereference not being
properly protected by rcu_dereference().

Signed-off-by: Sam Bobroff 
Signed-off-by: Paolo Bonzini

KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()

2014-09-14T14:26:05+00:00

Read-only memory ranges may be backed by the zero page, so avoid
misidentifying it a a MMIO pfn.

This fixes another issue I identified when testing QEMU+KVM_UEFI, where
a read to an uninitialized emulated NOR flash brought in the zero page,
but mapped as a read-write device region, because kvm_is_mmio_pfn()
misidentifies it as a MMIO pfn due to its PG_reserved bit being set.

Signed-off-by: Ard Biesheuvel 
Fixes: b88657674d39 ("ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping")
Signed-off-by: Paolo Bonzini

virt/kvm/assigned-dev.c: Set 'dev->irq_source_id' to '-1' after free it

2014-08-19T13:12:28+00:00

As a generic function, deassign_guest_irq() assumes it can be called
even if assign_guest_irq() is not be called successfully (which can be
triggered by ioctl from user mode, indirectly).

So for assign_guest_irq() failure process, need set 'dev->irq_source_id'
to -1 after free 'dev->irq_source_id', or deassign_guest_irq() may free
it again.

Signed-off-by: Chen Gang 
Signed-off-by: Paolo Bonzini

kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)

2014-08-19T13:04:45+00:00

The third parameter of kvm_iommu_put_pages is wrong,
It should be 'gfn - slot->base_gfn'.

By making gfn very large, malicious guest or userspace can cause kvm to
go to this error path, and subsequently to pass a huge value as size.
Alternatively if gfn is small, then pages would be pinned but never
unpinned, causing host memory leak and local DOS.

Passing a reasonable but large value could be the most dangerous case,
because it would unpin a page that should have stayed pinned, and thus
allow the device to DMA into arbitrary memory.  However, this cannot
happen because of the condition that can trigger the error:

- out of memory (where you can't allocate even a single page)
  should not be possible for the attacker to trigger

- when exceeding the iommu's address space, guest pages after gfn
  will also exceed the iommu's address space, and inside
  kvm_iommu_put_pages() the iommu_iova_to_phys() will fail.  The
  page thus would not be unpinned at all.

Reported-by: Jack Morgenstein 
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Paolo Bonzini