<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/uapi/linux/kvm.h, branch v7.0-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge tag 'kvm-s390-next-7.0-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD</title>
<updated>2026-02-11T17:52:27+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-11T17:52:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b1195183ed42f1522fae3fe44ebee3af437aa000'/>
<id>b1195183ed42f1522fae3fe44ebee3af437aa000</id>
<content type='text'>
- gmap rewrite: completely new memory management for kvm/s390
- vSIE improvement
- maintainership change for s390 vfio-pci
- small quality of life improvement for protected guests
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- gmap rewrite: completely new memory management for kvm/s390
- vSIE improvement
- maintainership change for s390 vfio-pci
- small quality of life improvement for protected guests
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-svm-6.20' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2026-02-09T17:51:37+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-09T17:51:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4215ee0d7bb5358882375c84d3cd0488bb5813b2'/>
<id>4215ee0d7bb5358882375c84d3cd0488bb5813b2</id>
<content type='text'>
KVM SVM changes for 6.20

 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure.

 - Add support for virtualizing ERAPS.  Note, correct virtualization of ERAPS
   relies on an upcoming, publicly announced change in the APM to reduce the
   set of conditions where hardware (i.e. KVM) *must* flush the RAP.

 - Ignore nSVM intercepts for instructions that are not supported according to
   L1's virtual CPU model.

 - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath
   for EPT Misconfig.

 - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM,
   and allow userspace to restore nested state with GIF=0.

 - Treat exit_code as an unsigned 64-bit value through all of KVM.

 - Add support for fetching SNP certificates from userspace.

 - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD
   or VMSAVE on behalf of L2.

 - Misc fixes and cleanups.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
KVM SVM changes for 6.20

 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure.

 - Add support for virtualizing ERAPS.  Note, correct virtualization of ERAPS
   relies on an upcoming, publicly announced change in the APM to reduce the
   set of conditions where hardware (i.e. KVM) *must* flush the RAP.

 - Ignore nSVM intercepts for instructions that are not supported according to
   L1's virtual CPU model.

 - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath
   for EPT Misconfig.

 - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM,
   and allow userspace to restore nested state with GIF=0.

 - Treat exit_code as an unsigned 64-bit value through all of KVM.

 - Add support for fetching SNP certificates from userspace.

 - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD
   or VMSAVE on behalf of L2.

 - Misc fixes and cleanups.
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: s390: Add explicit padding to struct kvm_s390_keyop</title>
<updated>2026-02-06T10:40:36+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2026-02-06T09:17:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f7ab71f178d56447e5efb55b65436feb68662f8f'/>
<id>f7ab71f178d56447e5efb55b65436feb68662f8f</id>
<content type='text'>
The newly added structure causes a warning about implied padding:

./usr/include/linux/kvm.h:1247:1: error: padding struct size to alignment boundary with 6 bytes [-Werror=padded]

The padding can lead to leaking kernel data and ABI incompatibilies
when used on x86. Neither of these is a problem in this specific
patch, but it's best to avoid it and use explicit padding fields
in general.

Fixes: 0ee4ddc1647b ("KVM: s390: Storage key manipulation IOCTL")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The newly added structure causes a warning about implied padding:

./usr/include/linux/kvm.h:1247:1: error: padding struct size to alignment boundary with 6 bytes [-Werror=padded]

The padding can lead to leaking kernel data and ABI incompatibilies
when used on x86. Neither of these is a problem in this specific
patch, but it's best to avoid it and use explicit padding fields
in general.

Fixes: 0ee4ddc1647b ("KVM: s390: Storage key manipulation IOCTL")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: s390: Storage key manipulation IOCTL</title>
<updated>2026-02-04T16:00:10+00:00</updated>
<author>
<name>Claudio Imbrenda</name>
<email>imbrenda@linux.ibm.com</email>
</author>
<published>2026-02-04T15:02:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0ee4ddc1647b8b3b9e7a94d798a1774a764428c1'/>
<id>0ee4ddc1647b8b3b9e7a94d798a1774a764428c1</id>
<content type='text'>
Add a new IOCTL to allow userspace to manipulate storage keys directly.

This will make it easier to write selftests related to storage keys.

Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new IOCTL to allow userspace to manipulate storage keys directly.

This will make it easier to write selftests related to storage keys.

Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: Introduce KVM_EXIT_SNP_REQ_CERTS for SNP certificate-fetching</title>
<updated>2026-01-23T17:14:15+00:00</updated>
<author>
<name>Michael Roth</name>
<email>michael.roth@amd.com</email>
</author>
<published>2026-01-09T23:17:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fa9893fadbc245e179cb17f3c371c67471b5a8a8'/>
<id>fa9893fadbc245e179cb17f3c371c67471b5a8a8</id>
<content type='text'>
For SEV-SNP, the host can optionally provide a certificate table to the
guest when it issues an attestation request to firmware (see GHCB 2.0
specification regarding "SNP Extended Guest Requests"). This certificate
table can then be used to verify the endorsement key used by firmware to
sign the attestation report.

While it is possible for guests to obtain the certificates through other
means, handling it via the host provides more flexibility in being able
to keep the certificate data in sync with the endorsement key throughout
host-side operations that might resulting in the endorsement key
changing.

In the case of KVM, userspace will be responsible for fetching the
certificate table and keeping it in sync with any modifications to the
endorsement key by other userspace management tools. Define a new
KVM_EXIT_SNP_REQ_CERTS event where userspace is provided with the GPA of
the buffer the guest has provided as part of the attestation request so
that userspace can write the certificate data into it while relying on
filesystem-based locking to keep the certificates up-to-date relative to
the endorsement keys installed/utilized by firmware at the time the
certificates are fetched.

[Melody: Update the documentation scheme about how file locking is
         expected to happen.]

Reviewed-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Tested-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Tested-by: Dionna Glaze &lt;dionnaglaze@google.com&gt;
Signed-off-by: Michael Roth &lt;michael.roth@amd.com&gt;
Signed-off-by: Melody Wang &lt;huibo.wang@amd.com&gt;
Signed-off-by: Michael Roth &lt;michael.roth@amd.com&gt;
Link: https://patch.msgid.link/20260109231732.1160759-2-michael.roth@amd.com
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For SEV-SNP, the host can optionally provide a certificate table to the
guest when it issues an attestation request to firmware (see GHCB 2.0
specification regarding "SNP Extended Guest Requests"). This certificate
table can then be used to verify the endorsement key used by firmware to
sign the attestation report.

While it is possible for guests to obtain the certificates through other
means, handling it via the host provides more flexibility in being able
to keep the certificate data in sync with the endorsement key throughout
host-side operations that might resulting in the endorsement key
changing.

In the case of KVM, userspace will be responsible for fetching the
certificate table and keeping it in sync with any modifications to the
endorsement key by other userspace management tools. Define a new
KVM_EXIT_SNP_REQ_CERTS event where userspace is provided with the GPA of
the buffer the guest has provided as part of the attestation request so
that userspace can write the certificate data into it while relying on
filesystem-based locking to keep the certificates up-to-date relative to
the endorsement keys installed/utilized by firmware at the time the
certificates are fetched.

[Melody: Update the documentation scheme about how file locking is
         expected to happen.]

Reviewed-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Tested-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Tested-by: Dionna Glaze &lt;dionnaglaze@google.com&gt;
Signed-off-by: Michael Roth &lt;michael.roth@amd.com&gt;
Signed-off-by: Melody Wang &lt;huibo.wang@amd.com&gt;
Signed-off-by: Michael Roth &lt;michael.roth@amd.com&gt;
Link: https://patch.msgid.link/20260109231732.1160759-2-michael.roth@amd.com
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots</title>
<updated>2026-01-22T13:24:49+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2026-01-19T02:29:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f174a9ffcd48d78a45d560c02ce4071ded036b53'/>
<id>f174a9ffcd48d78a45d560c02ce4071ded036b53</id>
<content type='text'>
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.

However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.

Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.

A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.

Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Oliver Upton &lt;oupton@kernel.org&gt;
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Signed-off-by: Zhou Wang &lt;wangzhou1@hisilicon.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.

However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.

Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.

A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.

Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Oliver Upton &lt;oupton@kernel.org&gt;
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Signed-off-by: Zhou Wang &lt;wangzhou1@hisilicon.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'kvm-s390-next-6.19-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD</title>
<updated>2025-12-02T17:58:47+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2025-12-02T17:58:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e0c26d47def7382d7dbd9cad58bc653aed75737a'/>
<id>e0c26d47def7382d7dbd9cad58bc653aed75737a</id>
<content type='text'>
- SCA rework
- VIRT_XFER_TO_GUEST_WORK support
- Operation exception forwarding support
- Cleanups
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- SCA rework
- VIRT_XFER_TO_GUEST_WORK support
- Operation exception forwarding support
- Cleanups
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: s390: Add capability that forwards operation exceptions</title>
<updated>2025-11-21T09:26:03+00:00</updated>
<author>
<name>Janosch Frank</name>
<email>frankja@linux.ibm.com</email>
</author>
<published>2025-07-08T12:57:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e8678e740ecde2ae4a0404fd9b4ed2b726e236d'/>
<id>8e8678e740ecde2ae4a0404fd9b4ed2b726e236d</id>
<content type='text'>
Setting KVM_CAP_S390_USER_OPEREXEC will forward all operation
exceptions to user space. This also includes the 0x0000 instructions
managed by KVM_CAP_S390_USER_INSTR0. It's helpful if user space wants
to emulate instructions which do not (yet) have an opcode.

While we're at it refine the documentation for
KVM_CAP_S390_USER_INSTR0.

Signed-off-by: Janosch Frank &lt;frankja@linux.ibm.com&gt;
Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Signed-off-by: Janosch Frank &lt;frankja@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Setting KVM_CAP_S390_USER_OPEREXEC will forward all operation
exceptions to user space. This also includes the 0x0000 instructions
managed by KVM_CAP_S390_USER_INSTR0. It's helpful if user space wants
to emulate instructions which do not (yet) have an opcode.

While we're at it refine the documentation for
KVM_CAP_S390_USER_INSTR0.

Signed-off-by: Janosch Frank &lt;frankja@linux.ibm.com&gt;
Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Signed-off-by: Janosch Frank &lt;frankja@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: arm64: VM exit to userspace to handle SEA</title>
<updated>2025-11-12T09:27:12+00:00</updated>
<author>
<name>Jiaqi Yan</name>
<email>jiaqiyan@google.com</email>
</author>
<published>2025-10-13T18:59:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ad9c62bd8946621ed02ac94131a921222508a8bc'/>
<id>ad9c62bd8946621ed02ac94131a921222508a8bc</id>
<content type='text'>
When APEI fails to handle a stage-2 synchronous external abort (SEA),
today KVM injects an asynchronous SError to the VCPU then resumes it,
which usually results in unpleasant guest kernel panic.

One major situation of guest SEA is when vCPU consumes recoverable
uncorrected memory error (UER). Although SError and guest kernel panic
effectively stops the propagation of corrupted memory, guest may
re-use the corrupted memory if auto-rebooted; in worse case, guest
boot may run into poisoned memory. So there is room to recover from
an UER in a more graceful manner.

Alternatively KVM can redirect the synchronous SEA event to VMM to
- Reduce blast radius if possible. VMM can inject a SEA to VCPU via
  KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
  consumption or fault is not from guest kernel, blast radius can be
  limited to the triggering thread in guest userspace, so VM can
  keep running.
- Allow VMM to protect from future memory poison consumption by
  unmapping the page from stage-2, or to interrupt guest of the
  poisoned page so guest kernel can unmap it from stage-1 page table.
- Allow VMM to track SEA events that VM customers care about, to restart
  VM when certain number of distinct poison events have happened,
  to provide observability to customers in log management UI.

Introduce an userspace-visible feature to enable VMM handle SEA:
- KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior
  when host APEI fails to claim a SEA, userspace can opt in this new
  capability to let KVM exit to userspace during SEA if it is not
  owned by host.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this.
  KVM fills kvm_run.arm_sea with as much as possible information about
  the SEA, enabling VMM to emulate SEA to guest by itself.
  - Sanitized ESR_EL2. The general rule is to keep only the bits
    useful for userspace and relevant to guest memory.
  - Flags indicating if faulting guest physical address is valid.
  - Faulting guest physical and virtual addresses if valid.

Signed-off-by: Jiaqi Yan &lt;jiaqiyan@google.com&gt;
Co-developed-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Signed-off-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Link: https://msgid.link/20251013185903.1372553-2-jiaqiyan@google.com
Signed-off-by: Oliver Upton &lt;oupton@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When APEI fails to handle a stage-2 synchronous external abort (SEA),
today KVM injects an asynchronous SError to the VCPU then resumes it,
which usually results in unpleasant guest kernel panic.

One major situation of guest SEA is when vCPU consumes recoverable
uncorrected memory error (UER). Although SError and guest kernel panic
effectively stops the propagation of corrupted memory, guest may
re-use the corrupted memory if auto-rebooted; in worse case, guest
boot may run into poisoned memory. So there is room to recover from
an UER in a more graceful manner.

Alternatively KVM can redirect the synchronous SEA event to VMM to
- Reduce blast radius if possible. VMM can inject a SEA to VCPU via
  KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
  consumption or fault is not from guest kernel, blast radius can be
  limited to the triggering thread in guest userspace, so VM can
  keep running.
- Allow VMM to protect from future memory poison consumption by
  unmapping the page from stage-2, or to interrupt guest of the
  poisoned page so guest kernel can unmap it from stage-1 page table.
- Allow VMM to track SEA events that VM customers care about, to restart
  VM when certain number of distinct poison events have happened,
  to provide observability to customers in log management UI.

Introduce an userspace-visible feature to enable VMM handle SEA:
- KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior
  when host APEI fails to claim a SEA, userspace can opt in this new
  capability to let KVM exit to userspace during SEA if it is not
  owned by host.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this.
  KVM fills kvm_run.arm_sea with as much as possible information about
  the SEA, enabling VMM to emulate SEA to guest by itself.
  - Sanitized ESR_EL2. The general rule is to keep only the bits
    useful for userspace and relevant to guest memory.
  - Flags indicating if faulting guest physical address is valid.
  - Faulting guest physical and virtual addresses if valid.

Signed-off-by: Jiaqi Yan &lt;jiaqiyan@google.com&gt;
Co-developed-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Signed-off-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Link: https://msgid.link/20251013185903.1372553-2-jiaqiyan@google.com
Signed-off-by: Oliver Upton &lt;oupton@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: guest_memfd: Add INIT_SHARED flag, reject user page faults if not set</title>
<updated>2025-10-10T21:25:23+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2025-10-03T23:25:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe2bf6234e947bf5544db6d386af1df2a8db80f3'/>
<id>fe2bf6234e947bf5544db6d386af1df2a8db80f3</id>
<content type='text'>
Add a guest_memfd flag to allow userspace to state that the underlying
memory should be configured to be initialized as shared, and reject user
page faults if the guest_memfd instance's memory isn't shared.  Because
KVM doesn't yet support in-place private&lt;=&gt;shared conversions, all
guest_memfd memory effectively follows the initial state.

Alternatively, KVM could deduce the initial state based on MMAP, which for
all intents and purposes is what KVM currently does.  However, implicitly
deriving the default state based on MMAP will result in a messy ABI when
support for in-place conversions is added.

For x86 CoCo VMs, which don't yet support MMAP, memory is currently private
by default (otherwise the memory would be unusable).  If MMAP implies
memory is shared by default, then the default state for CoCo VMs will vary
based on MMAP, and from userspace's perspective, will change when in-place
conversion support is added.  I.e. to maintain guest&lt;=&gt;host ABI, userspace
would need to immediately convert all memory from shared=&gt;private, which
is both ugly and inefficient.  The inefficiency could be avoided by adding
a flag to state that memory is _private_ by default, irrespective of MMAP,
but that would lead to an equally messy and hard to document ABI.

Bite the bullet and immediately add a flag to control the default state so
that the effective behavior is explicit and straightforward.

Fixes: 3d3a04fad25a ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Fuad Tabba &lt;tabba@google.com&gt;
Tested-by: Fuad Tabba &lt;tabba@google.com&gt;
Reviewed-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Tested-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Link: https://lore.kernel.org/r/20251003232606.4070510-3-seanjc@google.com
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a guest_memfd flag to allow userspace to state that the underlying
memory should be configured to be initialized as shared, and reject user
page faults if the guest_memfd instance's memory isn't shared.  Because
KVM doesn't yet support in-place private&lt;=&gt;shared conversions, all
guest_memfd memory effectively follows the initial state.

Alternatively, KVM could deduce the initial state based on MMAP, which for
all intents and purposes is what KVM currently does.  However, implicitly
deriving the default state based on MMAP will result in a messy ABI when
support for in-place conversions is added.

For x86 CoCo VMs, which don't yet support MMAP, memory is currently private
by default (otherwise the memory would be unusable).  If MMAP implies
memory is shared by default, then the default state for CoCo VMs will vary
based on MMAP, and from userspace's perspective, will change when in-place
conversion support is added.  I.e. to maintain guest&lt;=&gt;host ABI, userspace
would need to immediately convert all memory from shared=&gt;private, which
is both ugly and inefficient.  The inefficiency could be avoided by adding
a flag to state that memory is _private_ by default, irrespective of MMAP,
but that would lead to an equally messy and hard to document ABI.

Bite the bullet and immediately add a flag to control the default state so
that the effective behavior is explicit and straightforward.

Fixes: 3d3a04fad25a ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Fuad Tabba &lt;tabba@google.com&gt;
Tested-by: Fuad Tabba &lt;tabba@google.com&gt;
Reviewed-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Tested-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Link: https://lore.kernel.org/r/20251003232606.4070510-3-seanjc@google.com
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
