<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/kernel/exceptions-64s.S, branch v3.4.9</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc/irq: Make alignment &amp; program interrupt behave the same</title>
<updated>2012-05-08T23:42:33+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-05-08T03:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a3512b2dd57cb653bb33645ca9c934436e547e3c'/>
<id>a3512b2dd57cb653bb33645ca9c934436e547e3c</id>
<content type='text'>
Alignment was the last user of the ENABLE_INTS macro, which we can
now remove. All non-syscall exceptions now disable interrupts on
entry, they get re-enabled conditionally from C code. Don't
unconditionally re-enable in program check either, check the
original context.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Alignment was the last user of the ENABLE_INTS macro, which we can
now remove. All non-syscall exceptions now disable interrupts on
entry, they get re-enabled conditionally from C code. Don't
unconditionally re-enable in program check either, check the
original context.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2012-03-28T21:35:31+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-28T21:35:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2e7580b0e75d771d93e24e681031a165b1d31071'/>
<id>2e7580b0e75d771d93e24e681031a165b1d31071</id>
<content type='text'>
Pull kvm updates from Avi Kivity:
 "Changes include timekeeping improvements, support for assigning host
  PCI devices that share interrupt lines, s390 user-controlled guests, a
  large ppc update, and random fixes."

This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: Convert intx_mask_lock to spin lock
  KVM: x86: fix kvm_write_tsc() TSC matching thinko
  x86: kvmclock: abstract save/restore sched_clock_state
  KVM: nVMX: Fix erroneous exception bitmap check
  KVM: Ignore the writes to MSR_K7_HWCR(3)
  KVM: MMU: make use of -&gt;root_level in reset_rsvds_bits_mask
  KVM: PMU: add proper support for fixed counter 2
  KVM: PMU: Fix raw event check
  KVM: PMU: warn when pin control is set in eventsel msr
  KVM: VMX: Fix delayed load of shared MSRs
  KVM: use correct tlbs dirty type in cmpxchg
  KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
  KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
  KVM: x86 emulator: Allow PM/VM86 switch during task switch
  KVM: SVM: Fix CPL updates
  KVM: x86 emulator: VM86 segments must have DPL 3
  KVM: x86 emulator: Fix task switch privilege checks
  arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
  KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
  KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull kvm updates from Avi Kivity:
 "Changes include timekeeping improvements, support for assigning host
  PCI devices that share interrupt lines, s390 user-controlled guests, a
  large ppc update, and random fixes."

This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: Convert intx_mask_lock to spin lock
  KVM: x86: fix kvm_write_tsc() TSC matching thinko
  x86: kvmclock: abstract save/restore sched_clock_state
  KVM: nVMX: Fix erroneous exception bitmap check
  KVM: Ignore the writes to MSR_K7_HWCR(3)
  KVM: MMU: make use of -&gt;root_level in reset_rsvds_bits_mask
  KVM: PMU: add proper support for fixed counter 2
  KVM: PMU: Fix raw event check
  KVM: PMU: warn when pin control is set in eventsel msr
  KVM: VMX: Fix delayed load of shared MSRs
  KVM: use correct tlbs dirty type in cmpxchg
  KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
  KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
  KVM: x86 emulator: Allow PM/VM86 switch during task switch
  KVM: SVM: Fix CPL updates
  KVM: x86 emulator: VM86 segments must have DPL 3
  KVM: x86 emulator: Fix task switch privilege checks
  arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
  KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
  KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Rework lazy-interrupt handling</title>
<updated>2012-03-09T02:25:06+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-03-06T07:27:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7230c5644188cd9e3fb380cc97dde00c464a3ba7'/>
<id>7230c5644188cd9e3fb380cc97dde00c464a3ba7</id>
<content type='text'>
The current implementation of lazy interrupts handling has some
issues that this tries to address.

We don't do the various workarounds we need to do when re-enabling
interrupts in some cases such as when returning from an interrupt
and thus we may still lose or get delayed decrementer or doorbell
interrupts.

The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.

Additionally, we tend to keep interrupts hard disabled in a number
of cases, such as decrementer interrupts, external interrupts, or
when a masked decrementer interrupt is pending. This is sub-optimal.

This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling from the ground up.

The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.

When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field.

We then implement replaying of the missed interrupts either by
re-using the existing exception frame (in exception exit case) or via
the creation of a new one from an assembly trampoline (in the
arch_local_irq_enable case).

This removes the need to play with the decrementer to try to create
fake interrupts, among others.

In addition, this adds a few refinements:

 - We no longer  hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.

 - Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.

 - On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appear to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter). (This also fixes a bug where BookE
perfmon interrupts could clobber r14 ... oops)

 - We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.

Signed-off-by-yet: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2:

- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
  to retrigger an interrupt without preventing hard-enable

v3:

 - Fix or vs. ori bug on Book3E
 - Fix enabling of interrupts for some exceptions on Book3E

v4:

 - Fix resend of doorbells on return from interrupt on Book3E

v5:

 - Rebased on top of my latest series, which involves some significant
rework of some aspects of the patch.

v6:
 - 32-bit compile fix
 - more compile fixes with various .config combos
 - factor out the asm code to soft-disable interrupts
 - remove the C wrapper around preempt_schedule_irq

v7:
 - Fix a bug with hard irq state tracking on native power7
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current implementation of lazy interrupts handling has some
issues that this tries to address.

We don't do the various workarounds we need to do when re-enabling
interrupts in some cases such as when returning from an interrupt
and thus we may still lose or get delayed decrementer or doorbell
interrupts.

The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.

Additionally, we tend to keep interrupts hard disabled in a number
of cases, such as decrementer interrupts, external interrupts, or
when a masked decrementer interrupt is pending. This is sub-optimal.

This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling from the ground up.

The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.

When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field.

We then implement replaying of the missed interrupts either by
re-using the existing exception frame (in exception exit case) or via
the creation of a new one from an assembly trampoline (in the
arch_local_irq_enable case).

This removes the need to play with the decrementer to try to create
fake interrupts, among others.

In addition, this adds a few refinements:

 - We no longer  hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.

 - Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.

 - On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appear to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter). (This also fixes a bug where BookE
perfmon interrupts could clobber r14 ... oops)

 - We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.

Signed-off-by-yet: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2:

- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
  to retrigger an interrupt without preventing hard-enable

v3:

 - Fix or vs. ori bug on Book3E
 - Fix enabling of interrupts for some exceptions on Book3E

v4:

 - Fix resend of doorbells on return from interrupt on Book3E

v5:

 - Rebased on top of my latest series, which involves some significant
rework of some aspects of the patch.

v6:
 - 32-bit compile fix
 - more compile fixes with various .config combos
 - factor out the asm code to soft-disable interrupts
 - remove the C wrapper around preempt_schedule_irq

v7:
 - Fix a bug with hard irq state tracking on native power7
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Replace mfmsr instructions with load from PACA kernel_msr field</title>
<updated>2012-03-08T23:55:20+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-03-02T00:33:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d9ada91ae2969ae6b6dc3574fd08a6ebda5df766'/>
<id>d9ada91ae2969ae6b6dc3574fd08a6ebda5df766</id>
<content type='text'>
On 64-bit, the mfmsr instruction can be quite slow, slower
than loading a field from the cache-hot PACA, which happens
to already contain the value we want in most cases.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On 64-bit, the mfmsr instruction can be quite slow, slower
than loading a field from the cache-hot PACA, which happens
to already contain the value we want in most cases.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Disable interrupts in 64-bit kernel FP and vector faults</title>
<updated>2012-03-08T23:55:10+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-03-01T04:47:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f2f79e3a3c19ae745d0439d6e0eed31df28de3c'/>
<id>9f2f79e3a3c19ae745d0439d6e0eed31df28de3c</id>
<content type='text'>
If we get a floating point, altivec or vsx unavaible interrupt in
kernel, we trigger a kernel error. There is no point preserving
the interrupt state, in fact, that can even make debugging harder
as the processor state might change (we may even preempt) between
taking the exception and landing in a debugger.

So just make those 3 disable interrupts unconditionally.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2: On BookE only disable when hitting the kernel unavailable
    path, otherwise it will fail to restore softe as
    fast_exception_return doesn't do it.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we get a floating point, altivec or vsx unavaible interrupt in
kernel, we trigger a kernel error. There is no point preserving
the interrupt state, in fact, that can even make debugging harder
as the processor state might change (we may even preempt) between
taking the exception and landing in a debugger.

So just make those 3 disable interrupts unconditionally.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2: On BookE only disable when hitting the kernel unavailable
    path, otherwise it will fail to restore softe as
    fast_exception_return doesn't do it.
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Call do_page_fault() with interrupts off</title>
<updated>2012-03-08T23:55:08+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-03-07T05:48:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a546498f3bf9aac311c66f965186373aee2ca0b0'/>
<id>a546498f3bf9aac311c66f965186373aee2ca0b0</id>
<content type='text'>
We currently turn interrupts back to their previous state before
calling do_page_fault(). This can be annoying when debugging as
a bad fault will potentially have lost some processor state before
getting into the debugger.

We also end up calling some generic code with interrupts enabled
such as notify_page_fault() with interrupts enabled, which could
be unexpected.

This changes our code to behave more like other architectures,
and make the assembly entry code call into do_page_faults() with
interrupts disabled. They are conditionally re-enabled from
within do_page_fault() in the same spot x86 does it.

While there, add the might_sleep() test in the case of a successful
trylock of the mmap semaphore, again like x86.

Also fix a bug in the existing assembly where r12 (_MSR) could get
clobbered by C calls (the DTL accounting in the exception common
macro and DISABLE_INTS) in some cases.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2. Add the r12 clobber fix
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We currently turn interrupts back to their previous state before
calling do_page_fault(). This can be annoying when debugging as
a bad fault will potentially have lost some processor state before
getting into the debugger.

We also end up calling some generic code with interrupts enabled
such as notify_page_fault() with interrupts enabled, which could
be unexpected.

This changes our code to behave more like other architectures,
and make the assembly entry code call into do_page_faults() with
interrupts disabled. They are conditionally re-enabled from
within do_page_fault() in the same spot x86 does it.

While there, add the might_sleep() test in the case of a successful
trylock of the mmap semaphore, again like x86.

Also fix a bug in the existing assembly where r12 (_MSR) could get
clobbered by C calls (the DTL accounting in the exception common
macro and DISABLE_INTS) in some cases.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
---

v2. Add the r12 clobber fix
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Rework runlatch code</title>
<updated>2012-03-08T23:55:02+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-03-01T01:45:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe1952fc0afb9a2e4c79f103c08aef5d13db1873'/>
<id>fe1952fc0afb9a2e4c79f103c08aef5d13db1873</id>
<content type='text'>
This moves the inlines into system.h and changes the runlatch
code to use the thread local flags (non-atomic) rather than
the TIF flags (atomic) to keep track of the latch state.

The code to turn it back on in an asynchronous interrupt is
now simplified and partially inlined.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This moves the inlines into system.h and changes the runlatch
code to use the thread local flags (non-atomic) rather than
the TIF flags (atomic) to keep track of the latch state.

The code to turn it back on in an asynchronous interrupt is
now simplified and partially inlined.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use the same interrupt prolog for perfmon as other interrupts</title>
<updated>2012-03-08T23:55:00+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-02-29T23:52:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7450f6f03e9d6dc95d2014c4cceac8adf98560e8'/>
<id>7450f6f03e9d6dc95d2014c4cceac8adf98560e8</id>
<content type='text'>
The perfmon interrupt is the sole user of a special variant of the
interrupt prolog which differs from the one used by external and timer
interrupts in that it saves the non-volatile GPRs and doesn't turn the
runlatch on.

The former is unnecessary and the later is arguably incorrect, so
let's clean that up by using the same prolog. While at it we rename
that prolog to use the _ASYNC prefix.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The perfmon interrupt is the sole user of a special variant of the
interrupt prolog which differs from the one used by external and timer
interrupts in that it saves the non-volatile GPRs and doesn't turn the
runlatch on.

The former is unnecessary and the later is arguably incorrect, so
let's clean that up by using the same prolog. While at it we rename
that prolog to use the _ASYNC prefix.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Remove legacy iSeries bits from assembly files</title>
<updated>2012-03-08T23:54:59+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2012-02-28T02:44:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4f8cf36f48b4648a5231e9fc8e49faea377246f4'/>
<id>4f8cf36f48b4648a5231e9fc8e49faea377246f4</id>
<content type='text'>
This removes the various bits of assembly in the kernel entry,
exception handling and SLB management code that were specific
to running under the legacy iSeries hypervisor which is no
longer supported.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This removes the various bits of assembly in the kernel entry,
exception handling and SLB management code that were specific
to running under the legacy iSeries hypervisor which is no
longer supported.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: PPC: Implement MMIO emulation support for Book3S HV guests</title>
<updated>2012-03-05T12:52:37+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2011-12-12T12:36:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=697d3899dcb4bcd918d060a92db57b794e56b077'/>
<id>697d3899dcb4bcd918d060a92db57b794e56b077</id>
<content type='text'>
This provides the low-level support for MMIO emulation in Book3S HV
guests.  When the guest tries to map a page which is not covered by
any memslot, that page is taken to be an MMIO emulation page.  Instead
of inserting a valid HPTE, we insert an HPTE that has the valid bit
clear but another hypervisor software-use bit set, which we call
HPTE_V_ABSENT, to indicate that this is an absent page.  An
absent page is treated much like a valid page as far as guest hcalls
(H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that
an absent HPTE doesn't need to be invalidated with tlbie since it
was never valid as far as the hardware is concerned.

When the guest accesses a page for which there is an absent HPTE, it
will take a hypervisor data storage interrupt (HDSI) since we now set
the VPM1 bit in the LPCR.  Our HDSI handler for HPTE-not-present faults
looks up the hash table and if it finds an absent HPTE mapping the
requested virtual address, will switch to kernel mode and handle the
fault in kvmppc_book3s_hv_page_fault(), which at present just calls
kvmppc_hv_emulate_mmio() to set up the MMIO emulation.

This is based on an earlier patch by Benjamin Herrenschmidt, but since
heavily reworked.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Alexander Graf &lt;agraf@suse.de&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This provides the low-level support for MMIO emulation in Book3S HV
guests.  When the guest tries to map a page which is not covered by
any memslot, that page is taken to be an MMIO emulation page.  Instead
of inserting a valid HPTE, we insert an HPTE that has the valid bit
clear but another hypervisor software-use bit set, which we call
HPTE_V_ABSENT, to indicate that this is an absent page.  An
absent page is treated much like a valid page as far as guest hcalls
(H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that
an absent HPTE doesn't need to be invalidated with tlbie since it
was never valid as far as the hardware is concerned.

When the guest accesses a page for which there is an absent HPTE, it
will take a hypervisor data storage interrupt (HDSI) since we now set
the VPM1 bit in the LPCR.  Our HDSI handler for HPTE-not-present faults
looks up the hash table and if it finds an absent HPTE mapping the
requested virtual address, will switch to kernel mode and handle the
fault in kvmppc_book3s_hv_page_fault(), which at present just calls
kvmppc_hv_emulate_mmio() to set up the MMIO emulation.

This is based on an earlier patch by Benjamin Herrenschmidt, but since
heavily reworked.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Alexander Graf &lt;agraf@suse.de&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</pre>
</div>
</content>
</entry>
</feed>
