<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/kvm_host.h, branch v3.2.32</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>KVM: Ensure all vcpus are consistent with in-kernel irqchip settings</title>
<updated>2012-05-30T23:43:10+00:00</updated>
<author>
<name>Avi Kivity</name>
<email>avi@redhat.com</email>
</author>
<published>2012-03-05T12:23:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=645b177cbfce6b695bdbe0b4c131de584821840d'/>
<id>645b177cbfce6b695bdbe0b4c131de584821840d</id>
<content type='text'>
(cherry picked from commit 3e515705a1f46beb1c942bb8043c16f8ac7b1e9e)

If some vcpus are created before KVM_CREATE_IRQCHIP, then
irqchip_in_kernel() and vcpu-&gt;arch.apic will be inconsistent, leading
to potential NULL pointer dereferences.

Fix by:
- ensuring that no vcpus are installed when KVM_CREATE_IRQCHIP is called
- ensuring that a vcpu has an apic if it is installed after KVM_CREATE_IRQCHIP

This is somewhat long winded because vcpu-&gt;arch.apic is created without
kvm-&gt;lock held.

Based on earlier patch by Michael Ellerman.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
(cherry picked from commit 3e515705a1f46beb1c942bb8043c16f8ac7b1e9e)

If some vcpus are created before KVM_CREATE_IRQCHIP, then
irqchip_in_kernel() and vcpu-&gt;arch.apic will be inconsistent, leading
to potential NULL pointer dereferences.

Fix by:
- ensuring that no vcpus are installed when KVM_CREATE_IRQCHIP is called
- ensuring that a vcpu has an apic if it is installed after KVM_CREATE_IRQCHIP

This is somewhat long winded because vcpu-&gt;arch.apic is created without
kvm-&gt;lock held.

Based on earlier patch by Michael Ellerman.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: unmap pages from the iommu when slots are removed</title>
<updated>2012-05-11T12:14:07+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2012-04-27T21:54:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1e57aab4e6c549804298f07fac0b6fc77f10fab2'/>
<id>1e57aab4e6c549804298f07fac0b6fc77f10fab2</id>
<content type='text'>
commit 32f6daad4651a748a58a3ab6da0611862175722f upstream.

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown.  A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa.  This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 32f6daad4651a748a58a3ab6da0611862175722f upstream.

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown.  A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa.  This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Fix simultaneous NMIs</title>
<updated>2011-09-25T16:52:59+00:00</updated>
<author>
<name>Avi Kivity</name>
<email>avi@redhat.com</email>
</author>
<published>2011-09-20T10:43:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7460fb4a340033107530df19e7e125bd0969bfb2'/>
<id>7460fb4a340033107530df19e7e125bd0969bfb2</id>
<content type='text'>
If simultaneous NMIs happen, we're supposed to queue the second
and next (collapsing them), but currently we sometimes collapse
the second into the first.

Fix by using a counter for pending NMIs instead of a bool; since
the counter limit depends on whether the processor is currently
in an NMI handler, which can only be checked in vcpu context
(via the NMI mask), we add a new KVM_REQ_NMI to request recalculation
of the counter.

Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If simultaneous NMIs happen, we're supposed to queue the second
and next (collapsing them), but currently we sometimes collapse
the second into the first.

Fix by using a counter for pending NMIs instead of a bool; since
the counter limit depends on whether the processor is currently
in an NMI handler, which can only be checked in vcpu context
(via the NMI mask), we add a new KVM_REQ_NMI to request recalculation
of the counter.

Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Clean up and extend rate-limited output</title>
<updated>2011-09-25T16:52:43+00:00</updated>
<author>
<name>Jan Kiszka</name>
<email>jan.kiszka@siemens.com</email>
</author>
<published>2011-09-12T09:26:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bd80158aff71a80292f96d9baea1a65bc0ce87b3'/>
<id>bd80158aff71a80292f96d9baea1a65bc0ce87b3</id>
<content type='text'>
The use of printk_ratelimit is discouraged, replace it with
pr*_ratelimited or __ratelimit. While at it, convert remaining
guest-triggerable printks to rate-limited variants.

Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The use of printk_ratelimit is discouraged, replace it with
pr*_ratelimited or __ratelimit. While at it, convert remaining
guest-triggerable printks to rate-limited variants.

Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Intelligent device lookup on I/O bus</title>
<updated>2011-09-25T16:17:59+00:00</updated>
<author>
<name>Sasha Levin</name>
<email>levinsasha928@gmail.com</email>
</author>
<published>2011-07-27T13:00:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=743eeb0b01d2fbf4154bf87bff1ebb6fb18aeb7a'/>
<id>743eeb0b01d2fbf4154bf87bff1ebb6fb18aeb7a</id>
<content type='text'>
Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
is to call the read or write callback for each device registered
on the bus until we find a device which handles it.

Since the number of devices on a bus can be significant due to ioeventfds
and coalesced MMIO zones, this leads to a lot of overhead on each IO
operation.

Instead of registering devices, we now register ranges which points to
a device. Lookup is done using an efficient bsearch instead of a linear
search.

Performance test was conducted by comparing exit count per second with
200 ioeventfds created on one byte and the guest is trying to access a
different byte continuously (triggering usermode exits).
Before the patch the guest has achieved 259k exits per second, after the
patch the guest does 274k exits per second.

Cc: Avi Kivity &lt;avi@redhat.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
is to call the read or write callback for each device registered
on the bus until we find a device which handles it.

Since the number of devices on a bus can be significant due to ioeventfds
and coalesced MMIO zones, this leads to a lot of overhead on each IO
operation.

Instead of registering devices, we now register ranges which points to
a device. Lookup is done using an efficient bsearch instead of a linear
search.

Performance test was conducted by comparing exit count per second with
200 ioeventfds created on one byte and the guest is trying to access a
different byte continuously (triggering usermode exits).
Before the patch the guest has achieved 259k exits per second, after the
patch the guest does 274k exits per second.

Cc: Avi Kivity &lt;avi@redhat.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Make coalesced mmio use a device per zone</title>
<updated>2011-09-25T16:17:57+00:00</updated>
<author>
<name>Sasha Levin</name>
<email>levinsasha928@gmail.com</email>
</author>
<published>2011-07-20T17:59:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2b3c246a682c50f5415c71fc5387a114a6f0d643'/>
<id>2b3c246a682c50f5415c71fc5387a114a6f0d643</id>
<content type='text'>
This patch changes coalesced mmio to create one mmio device per
zone instead of handling all zones in one device.

Doing so enables us to take advantage of existing locking and prevents
a race condition between coalesced mmio registration/unregistration
and lookups.

Suggested-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch changes coalesced mmio to create one mmio device per
zone instead of handling all zones in one device.

Doing so enables us to take advantage of existing locking and prevents
a race condition between coalesced mmio registration/unregistration
and lookups.

Suggested-by: Avi Kivity &lt;avi@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: MMU: filter out the mmio pfn from the fault pfn</title>
<updated>2011-07-24T08:50:34+00:00</updated>
<author>
<name>Xiao Guangrong</name>
<email>xiaoguangrong@cn.fujitsu.com</email>
</author>
<published>2011-07-11T19:28:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fce92dce79dbf5fff39c7ac2fb149729d79b7a39'/>
<id>fce92dce79dbf5fff39c7ac2fb149729d79b7a39</id>
<content type='text'>
If the page fault is caused by mmio, the gfn can not be found in memslots, and
'bad_pfn' is returned on gfn_to_hva path, so we can use 'bad_pfn' to identify
the mmio page fault.
And, to clarify the meaning of mmio pfn, we return fault page instead of bad
page when the gfn is not allowd to prefetch

Signed-off-by: Xiao Guangrong &lt;xiaoguangrong@cn.fujitsu.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the page fault is caused by mmio, the gfn can not be found in memslots, and
'bad_pfn' is returned on gfn_to_hva path, so we can use 'bad_pfn' to identify
the mmio page fault.
And, to clarify the meaning of mmio pfn, we return fault page instead of bad
page when the gfn is not allowd to prefetch

Signed-off-by: Xiao Guangrong &lt;xiaoguangrong@cn.fujitsu.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Steal time implementation</title>
<updated>2011-07-14T09:59:14+00:00</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@redhat.com</email>
</author>
<published>2011-07-11T19:28:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c9aaa8957f203bd6df83b002fb40b98390bed078'/>
<id>c9aaa8957f203bd6df83b002fb40b98390bed078</id>
<content type='text'>
To implement steal time, we need the hypervisor to pass the guest
information about how much time was spent running other processes
outside the VM, while the vcpu had meaningful work to do - halt
time does not count.

This information is acquired through the run_delay field of
delayacct/schedstats infrastructure, that counts time spent in a
runqueue but not running.

Steal time is a per-cpu information, so the traditional MSR-based
infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
memory area address containing information about steal time

This patch contains the hypervisor part of the steal time infrasructure,
and can be backported independently of the guest portion.

[avi, yongjie: export delayacct_on, to avoid build failures in some configs]

Signed-off-by: Glauber Costa &lt;glommer@redhat.com&gt;
Tested-by: Eric B Munson &lt;emunson@mgebm.net&gt;
CC: Rik van Riel &lt;riel@redhat.com&gt;
CC: Jeremy Fitzhardinge &lt;jeremy.fitzhardinge@citrix.com&gt;
CC: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: Anthony Liguori &lt;aliguori@us.ibm.com&gt;
Signed-off-by: Yongjie Ren &lt;yongjie.ren@intel.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To implement steal time, we need the hypervisor to pass the guest
information about how much time was spent running other processes
outside the VM, while the vcpu had meaningful work to do - halt
time does not count.

This information is acquired through the run_delay field of
delayacct/schedstats infrastructure, that counts time spent in a
runqueue but not running.

Steal time is a per-cpu information, so the traditional MSR-based
infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
memory area address containing information about steal time

This patch contains the hypervisor part of the steal time infrasructure,
and can be backported independently of the guest portion.

[avi, yongjie: export delayacct_on, to avoid build failures in some configs]

Signed-off-by: Glauber Costa &lt;glommer@redhat.com&gt;
Tested-by: Eric B Munson &lt;emunson@mgebm.net&gt;
CC: Rik van Riel &lt;riel@redhat.com&gt;
CC: Jeremy Fitzhardinge &lt;jeremy.fitzhardinge@citrix.com&gt;
CC: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: Anthony Liguori &lt;aliguori@us.ibm.com&gt;
Signed-off-by: Yongjie Ren &lt;yongjie.ren@intel.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: introduce kvm_read_guest_cached</title>
<updated>2011-07-12T10:17:01+00:00</updated>
<author>
<name>Gleb Natapov</name>
<email>gleb@redhat.com</email>
</author>
<published>2011-07-11T19:28:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e03b644fe68b1c6401465b02724d261538dba10f'/>
<id>e03b644fe68b1c6401465b02724d261538dba10f</id>
<content type='text'>
Introduce kvm_read_guest_cached() function in addition to write one we
already have.

[ by glauber: export function signature in kvm header ]

Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Glauber Costa &lt;glommer@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Tested-by: Eric Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce kvm_read_guest_cached() function in addition to write one we
already have.

[ by glauber: export function signature in kvm header ]

Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Glauber Costa &lt;glommer@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Tested-by: Eric Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6</title>
<updated>2011-05-23T22:39:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-05-23T22:39:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5e152b4c9e0fce6149c74406346a7ae7e7a17727'/>
<id>5e152b4c9e0fce6149c74406346a7ae7e7a17727</id>
<content type='text'>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
  PCI: Don't use dmi_name_in_vendors in quirk
  PCI: remove unused AER functions
  PCI/sysfs: move bus cpuaffinity to class dev_attrs
  PCI: add rescan to /sys/.../pci_bus/.../
  PCI: update bridge resources to get more big ranges when allocating space (again)
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state
  PCI: Track the size of each saved capability data area
  PCI/e1000e: Add and use pci_disable_link_state_locked()
  x86/PCI: derive pcibios_last_bus from ACPI MCFG
  PCI: add latency tolerance reporting enable/disable support
  PCI: add OBFF enable/disable support
  PCI: add ID-based ordering enable/disable support
  PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
  PCI: Set PCIE maxpayload for card during hotplug insertion
  PCI/ACPI: Report _OSC control mask returned on failure to get control
  x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
  PCI: handle positive error codes
  PCI: check pci_vpd_pci22_wait() return
  PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
  ...

Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be4461
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
  PCI: Don't use dmi_name_in_vendors in quirk
  PCI: remove unused AER functions
  PCI/sysfs: move bus cpuaffinity to class dev_attrs
  PCI: add rescan to /sys/.../pci_bus/.../
  PCI: update bridge resources to get more big ranges when allocating space (again)
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state
  PCI: Track the size of each saved capability data area
  PCI/e1000e: Add and use pci_disable_link_state_locked()
  x86/PCI: derive pcibios_last_bus from ACPI MCFG
  PCI: add latency tolerance reporting enable/disable support
  PCI: add OBFF enable/disable support
  PCI: add ID-based ordering enable/disable support
  PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
  PCI: Set PCIE maxpayload for card during hotplug insertion
  PCI/ACPI: Report _OSC control mask returned on failure to get control
  x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
  PCI: handle positive error codes
  PCI: check pci_vpd_pci22_wait() return
  PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
  ...

Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be4461
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
</pre>
</div>
</content>
</entry>
</feed>
