path: root/arch/x86/include
2025-11-18KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=nSean Christopherson
Disable support for flushing the L1 data cache to mitigate L1TF if CPU mitigations are disabled for the entire kernel. KVM's mitigation of L1TF is in no way special enough to justify ignoring CONFIG_CPU_MITIGATIONS=n. Deliberately use CPU_MITIGATIONS instead of the more precise MITIGATION_L1TF, as MITIGATION_L1TF only controls the default behavior, i.e. CONFIG_MITIGATION_L1TF=n doesn't completely disable L1TF mitigations in the kernel. Keep the vmentry_l1d_flush module param to avoid breaking existing setups, and leverage the .set path to alert the user to the fact that vmentry_l1d_flush will be ignored. Don't bother validating the incoming value; if an admin misconfigures vmentry_l1d_flush, the fact that the bad configuration won't be detected when running with CONFIG_CPU_MITIGATIONS=n is likely the least of their worries. Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://patch.msgid.link/20251113233746.1703361-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
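As a rough sketch of the .set approach described above (vmx_setup_l1d_flush_param() is a hypothetical name standing in for VMX's existing param-parsing path, not necessarily what the patch uses):

  /* Sketch: keep the param, but warn and ignore it when mitigations are off. */
  static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
  {
          if (!IS_ENABLED(CONFIG_CPU_MITIGATIONS)) {
                  pr_warn_once("CPU mitigations disabled, ignoring vmentry_l1d_flush\n");
                  return 0;
          }
          return vmx_setup_l1d_flush_param(s);    /* hypothetical existing path */
  }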
2025-11-18x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERSSean Christopherson
Now that VMX encodes its own sequence for clearing CPU buffers, move VM_CLEAR_CPU_BUFFERS into SVM to minimize the chances of KVM botching a mitigation in the future, e.g. using VM_CLEAR_CPU_BUFFERS instead of checking multiple mitigation flags. No functional change intended. Reviewed-by: Brendan Jackman <jackmanb@google.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251113233746.1703361-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-18x86/bugs: Use an x86 feature to track the MMIO Stale Data mitigationSean Christopherson
Convert the MMIO Stale Data mitigation tracking from a static branch into an x86 feature flag so that it can be used via ALTERNATIVE_2 in KVM. No functional change intended. Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://patch.msgid.link/20251113233746.1703361-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-18x86/bugs: Decouple ALTERNATIVE usage from VERW macro definitionSean Christopherson
Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers" and "clear if guest can access host MMIO" in a single statement. No functional change intended. Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://patch.msgid.link/20251113233746.1703361-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
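A minimal sketch of the decoupling, with illustrative macro and symbol names (the exact identifiers in the patch may differ):

  /* Raw VERW sequence with no ALTERNATIVE baked in (illustrative). */
  #define __CLEAR_CPU_BUFFERS                     \
          verw _ASM_RIP(mds_verw_sel)

  /* Common case: a single-feature ALTERNATIVE wrapper. */
  #define CLEAR_CPU_BUFFERS                       \
          ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF

  /* KVM can now fold two conditions into one statement (assumed flag name). */
          ALTERNATIVE_2 "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF, \
                            __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_MMIO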
2025-11-18x86/alternatives: Disable LASS when patching kernel codeSohil Mehta
For patching, the kernel initializes a temporary mm area in the lower half of the address range. LASS blocks these accesses because its enforcement relies on bit 63 of the virtual address, as opposed to SMAP, which depends on the _PAGE_BIT_USER bit in the page table. Disable LASS enforcement by toggling the RFLAGS.AC bit during patching to avoid triggering a #GP fault. Introduce LASS-specific STAC/CLAC helpers to set the AC bit only on platforms that need it. Name the wrappers lass_stac()/_clac() instead of lass_disable()/_enable() because they only control the kernel data access enforcement. The entire LASS mechanism (including instruction fetch enforcement) is controlled by the CR4.LASS bit. Describe the usage of the new helpers in comparison to the ones used for SMAP. Also, add comments to explain when the existing stac()/clac() should be used. While at it, move the duplicated "barrier" comment to the same block. The text poking functions use standard memcpy()/memset() while patching kernel code. However, objtool complains about calling such dynamic functions within an AC=1 region. See warning #9, regarding function calls with UACCESS enabled, in tools/objtool/Documentation/objtool.txt. To pacify objtool, one option is to add memcpy() and memset() to the list of allowed-functions. However, that would provide a blanket exemption for all usages of memcpy() and memset(). Instead, replace the standard calls in the text poking functions with their unoptimized, always-inlined versions. Considering that patching is usually small, there is no performance impact expected. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251118182911.2983253-5-sohil.mehta%40intel.com
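A sketch of what such wrappers might look like, assuming the X86_FEATURE_LASS flag from the enumeration patch and the kernel's alternative() helper (the actual implementation may differ):

  /* Sketch: toggle RFLAGS.AC only on LASS-capable parts. */
  static __always_inline void lass_stac(void)
  {
          alternative("", "stac", X86_FEATURE_LASS);
  }

  static __always_inline void lass_clac(void)
  {
          alternative("", "clac", X86_FEATURE_LASS);
  }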
2025-11-18x86/asm: Introduce inline memcpy and memsetPeter Zijlstra (Intel)
Provide inline memcpy and memset functions that can be used instead of the GCC builtins when necessary. The immediate use case is for the text poking functions to avoid the standard memcpy()/memset() calls because objtool complains about such dynamic calls within an AC=1 region. See tools/objtool/Documentation/objtool.txt, warning #9, regarding function calls with UACCESS enabled. Some user copy functions such as copy_user_generic() and __clear_user() have similar rep_{movs,stos} usages. But, those are highly specialized and hard to combine or reuse for other things. Define these new helpers for all other usages that need a completely unoptimized, strictly inline version of memcpy() or memset(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251118182911.2983253-4-sohil.mehta%40intel.com
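A plausible shape for such a helper (a sketch; the names and exact form in the patch may differ) is a strictly-inlined rep-string copy that can never be turned into a call:

  static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
  {
          void *ret = to;

          /* Unoptimized byte copy; always emitted inline, never a call. */
          asm volatile("rep movsb"
                       : "+D" (to), "+S" (from), "+c" (len)
                       : : "memory");
          return ret;
  }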
2025-11-18x86/cpufeatures: Enumerate the LASS feature bitsSohil Mehta
Linear Address Space Separation (LASS) is a security feature that mitigates a class of side-channel attacks relying on speculative access across the user/kernel boundary. Privilege mode based access protection already exists today with paging and features such as SMEP and SMAP. However, to enforce these protections, the processor must traverse the paging structures in memory. An attacker can use timing information resulting from this traversal to determine details about the paging structures, and to determine the layout of the kernel memory. LASS provides the same mode-based protections as paging but without traversing the paging structures. Because the protections are enforced prior to page-walks, an attacker will not be able to derive paging-based timing information from the various caching structures such as the TLBs, mid-level caches, page walker, data caches, etc. LASS enforcement relies on the kernel implementation to divide the 64-bit virtual address space into two halves:

  Addr[63]=0 -> User address space
  Addr[63]=1 -> Kernel address space

Any data access or code execution across address spaces typically results in a #GP fault, with an #SS generated in some rare cases. The LASS enforcement for kernel data accesses is dependent on CR4.SMAP being set. The enforcement can be disabled by toggling the RFLAGS.AC bit similar to SMAP. Define the CPU feature bits to enumerate LASS. Also, disable the feature at compile time on 32-bit kernels. Use a direct dependency on X86_32 (instead of !X86_64) to make it easier to combine with similar 32-bit specific dependencies in the future. LASS mitigates a class of side-channel speculative attacks, such as Spectre LAM, described in the paper, "Leaky Address Masking: Exploiting Unmasked Spectre Gadgets with Noncanonical Address Translation". Add the "lass" flag to /proc/cpuinfo to indicate that the feature is supported by hardware and enabled by the kernel. This allows userspace to determine if the system is secure against such attacks. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Xin Li (Intel) <xin@zytor.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251118182911.2983253-2-sohil.mehta%40intel.com
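For illustration, the split that LASS enforces boils down to a single-bit classification of the virtual address (a conceptual sketch, not kernel code):

  #include <stdbool.h>
  #include <stdint.h>

  /* Conceptual: LASS classifies an address purely by bit 63. */
  static bool lass_is_kernel_address(uint64_t vaddr)
  {
          return vaddr >> 63;     /* 1 = kernel half, 0 = user half */
  }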
2025-11-17Revert "x86: kvm: rate-limit global clock updates"Lei Chen
This reverts commit 7e44e4495a398eb553ce561f29f9148f40a3448f. Commit 7e44e4495a39 ("x86: kvm: rate-limit global clock updates") intended to use a kvmclock_update_work to sync the NTP correction across all vCPUs' kvmclocks, building on commit 0061d53daf26f ("KVM: x86: limit difference between kvmclock updates"). Since kvmclock has been switched to the monotonic raw clock, this commit can be reverted. Signed-off-by: Lei Chen <lei.chen@smartx.com> Link: https://patch.msgid.link/20250819152027.1687487-3-lei.chen@smartx.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-17Revert "x86: kvm: introduce periodic global clock updates"Lei Chen
This reverts commit 332967a3eac06f6379283cf155c84fe7cd0537c2. Commit 332967a3eac0 ("x86: kvm: introduce periodic global clock updates") introduced a 300s interval work to sync NTP corrections across all vCPUs. Since commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") switched kvmclock to the monotonic raw clock, NTP corrections no longer need to be taken into consideration. Signed-off-by: Lei Chen <lei.chen@smartx.com> Link: https://patch.msgid.link/20250819152027.1687487-2-lei.chen@smartx.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-15x86/hyperv: Enable build of hypervisor crashdump collection filesMukesh Rathor
Enable the build of the new files introduced in the earlier commits and add a call to do the setup during boot. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> [ wei: fix build ] Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-14Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov:

 - Fix interaction between livepatch and BPF fexit programs (Song Liu)
   With Steven and Masami acks.

 - Fix stack ORC unwind from BPF kprobe_multi (Jiri Olsa)
   With Steven and Masami acks.

 - Fix out of bounds access in widen_imprecise_scalars() in the verifier (Eduard Zingerman)

 - Fix conflicts between MPTCP and BPF sockmap (Jiayuan Chen)

 - Fix net_sched storage collision with BPF data_meta/data_end (Eric Dumazet)

 - Add _impl suffix to BPF kfuncs with implicit args to avoid breaking them in bpf-next when KF_IMPLICIT_ARGS is added (Mykyta Yatsenko)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test widen_imprecise_scalars() with different stack depth
  bpf: account for current allocated stack depth in widen_imprecise_scalars()
  bpf: Add bpf_prog_run_data_pointers()
  selftests/bpf: Add mptcp test with sockmap
  mptcp: Fix proto fallback detection with BPF
  mptcp: Disallow MPTCP subflows from sockmap
  selftests/bpf: Add stacktrace ips test for raw_tp
  selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi
  x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe
  Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()"
  bpf: add _impl suffix for bpf_stream_vprintk() kfunc
  bpf: add _impl suffix for bpf_task_work_schedule* kfuncs
  selftests/bpf: Add tests for livepatch + bpf trampoline
  ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct()
  ftrace: Fix BPF fexit with livepatch
2025-11-14x86/sgx: Fix a typo in the kernel-doc comment for enum sgx_attributeSean Christopherson
Use the exact enum name when documenting "enum sgx_attribute" to fix a warning if the file is fed into kernel-doc processing: WARNING: ./arch/x86/include/asm/sgx.h:139 expecting prototype for enum sgx_attributes. Prototype was for enum sgx_attribute instead Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112160708.1343355-6-seanjc%40google.com
2025-11-14x86/sgx: Remove superfluous asterisk from copyright comment in asm/sgx.hSean Christopherson
Drop an asterisk from a file-level copyright comment so that the comment isn't interpreted as a kernel-doc comment. E.g. if arch/x86/include/asm/sgx.h is fed into kernel-doc processing: WARNING: ./arch/x86/include/asm/sgx.h:2 This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112160708.1343355-5-seanjc%40google.com
2025-11-14x86/sgx: Document structs and enums with '@', not '%'Sean Christopherson
Use '@' to document structure members and enum values in kernel-doc markup, as per Documentation/doc-guide/kernel-doc.rst and flagged by make htmldocs. WARNING: arch/x86/include/uapi/asm/sgx.h:17 Enum value 'SGX_PAGE_MEASURE' not described in enum 'sgx_page_flags' Opportunistically add a missing ':' for SGX_CHILD_PRESENT. Closes: https://lore.kernel.org/all/20251106145506.145fc620@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112160708.1343355-4-seanjc%40google.com
2025-11-14x86/sgx: Add kernel-doc descriptions for params passed to vDSO user handlerSean Christopherson
Add kernel-doc markup for the register parameters passed by the vDSO blob to the user handler to suppress build warnings, e.g. WARNING: arch/x86/include/uapi/asm/sgx.h:157 function parameter 'r8' not described in 'sgx_enclave_user_handler_t' Call out that except for RSP, the registers are undefined on asynchronous exits as far as the vDSO ABI is concerned. E.g. the vDSO's exception handler clobbers RDX, RDI, and RSI, and the kernel doesn't guarantee that R8 or R9 will be zero (the synthetic value loaded by the CPU). Closes: https://lore.kernel.org/all/20251106145506.145fc620@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112160708.1343355-3-seanjc%40google.com
2025-11-14x86/sgx: Add a missing colon in kernel-doc markup for "struct sgx_enclave_run"Sean Christopherson
Add a missing ':' for the description of sgx_enclave_run.reserved so that documentation for the member is correctly generated: WARNING: arch/x86/include/uapi/asm/sgx.h:184 struct member 'reserved' not described in 'sgx_enclave_run' Closes: https://lore.kernel.org/all/20251106145506.145fc620@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112160708.1343355-2-seanjc%40google.com
2025-11-14KVM: SEV: Publish supported SEV-SNP policy bitsTom Lendacky
Define the set of policy bits that KVM currently knows do not require any implementation support within KVM. Provide this value to userspace via the KVM_GET_DEVICE_ATTR ioctl. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/c596f7529518f3f826a57970029451d9385949e5.1761593632.git.thomas.lendacky@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-13KVM: SVM: Don't skip unrelated instruction if INT3/INTO is replacedOmar Sandoval
When re-injecting a soft interrupt from an INT3, INTO, or (select) INTn instruction, discard the exception and retry the instruction if the code stream is changed (e.g. by a different vCPU) between when the CPU executes the instruction and when KVM decodes the instruction to get the next RIP. As effectively predicted by commit 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction"), failure to verify that the correct INTn instruction was decoded can effectively clobber guest state due to decoding the wrong instruction and thus specifying the wrong next RIP. The bug most often manifests as "Oops: int3" panics on static branch checks in Linux guests. Enabling or disabling a static branch in Linux uses the kernel's "text poke" code patching mechanism. To modify code while other CPUs may be executing that code, Linux (temporarily) replaces the first byte of the original instruction with an int3 (opcode 0xcc), then patches in the new code stream except for the first byte, and finally replaces the int3 with the first byte of the new code stream. If a CPU hits the int3, i.e. executes the code while it's being modified, then the guest kernel must look up the RIP to determine how to handle the #BP, e.g. by emulating the new instruction. If the RIP is incorrect, then this lookup fails and the guest kernel panics. The bug reproduces almost instantly by hacking the guest kernel to repeatedly check a static branch[1] while running a drgn script[2] on the host to constantly swap out the memory containing the guest's TSS. [1]: https://gist.github.com/osandov/44d17c51c28c0ac998ea0334edf90b5a [2]: https://gist.github.com/osandov/10e45e45afa29b11e0c7209247afc00b Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction") Cc: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Link: https://patch.msgid.link/1cc6dcdf36e3add7ee7c8d90ad58414eeb6c3d34.1762278762.git.osandov@fb.com Signed-off-by: Sean Christopherson <seanjc@google.com>
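Conceptually, the fix amounts to verifying the decoded instruction before trusting the computed next RIP (a pseudocode sketch; the helper names are assumptions, not KVM's API):

  /*
   * Sketch: only re-inject the soft interrupt if the code stream still
   * contains the INT3/INTO/INTn that originally trapped; otherwise drop
   * the event and re-execute the (possibly rewritten) instruction.
   */
  if (!insn_matches_expected_softint(vcpu, expected_vector)) {   /* hypothetical */
          kvm_drop_pending_event(vcpu);                          /* hypothetical */
          return;         /* retry from the original RIP */
  }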
2025-11-13Merge tag 'v6.18-rc5' into objtool/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-11-12x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possibleSean Christopherson
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(), and use the helper macro to export symbols for KVM throughout x86 if and only if KVM will build one or more modules, and only for those modules. To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be built (because no vendor modules are selected), let arch code #define EXPORT_SYMBOL_FOR_KVM to suppress/override the exports. Note, the set of symbols to restrict to KVM was generated by manual search and audit; any "misses" are due to human error, not some grand plan. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Kai Huang <kai.huang@intel.com> Tested-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
2025-11-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini:
 "Arm:

   - Fix trapping regression when no in-kernel irqchip is present

   - Check host-provided, untrusted ranges and offsets in pKVM

   - Fix regression restoring the ID_PFR1_EL1 register

   - Fix vgic ITS locking issues when LPIs are not directly injected

  Arm selftests:

   - Correct target CPU programming in vgic_lpi_stress selftest

   - Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest

  RISC-V:

   - Fix check for local interrupts on riscv32

   - Read HGEIP CSR on the correct cpu when checking for IMSIC interrupts

   - Remove automatic I/O mapping from kvm_arch_prepare_memory_region()

  x86:

   - Inject #UD if the guest attempts to execute SEAMCALL or TDCALL, as KVM doesn't support virtualizing the instructions, but the instructions are gated only by VMXON. That is, they will VM-Exit instead of taking a #UD and until now this resulted in KVM exiting to userspace with an emulation error.

   - Unload the "FPU" when emulating INIT of XSTATE features if and only if the FPU is actually loaded, instead of trying to predict when KVM will emulate an INIT (CET support missed the MP_STATE path). Add sanity checks to detect and harden against similar bugs in the future.

   - Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is unloaded.

   - Use a raw spinlock for svm->ir_list_lock as the lock is taken during schedule(), and "normal" spinlocks are sleepable locks when PREEMPT_RT=y.

   - Remove guest_memfd bindings on memslot deletion when a gmem file is dying to fix a use-after-free race found by syzkaller.

   - Fix a goof in the EPT Violation handler where KVM checks the wrong variable when determining if the reported GVA is valid.

   - Fix and simplify the handling of LBR virtualization on AMD, which was made buggy and unnecessarily complicated by nested VM support

  Misc:

   - Update Oliver's email address"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
  KVM: nSVM: Fix and simplify LBR virtualization handling with nested
  KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
  KVM: SVM: Mark VMCB_LBR dirty when MSR_IA32_DEBUGCTLMSR is updated
  MAINTAINERS: Switch myself to using kernel.org address
  KVM: arm64: vgic-v3: Release reserved slot outside of lpi_xa's lock
  KVM: arm64: vgic-v3: Reinstate IRQ lock ordering for LPI xarray
  KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip
  KVM: arm64: Set ID_{AA64PFR0,PFR1}_EL1.GIC when GICv3 is configured
  KVM: arm64: Make all 32bit ID registers fully writable
  KVM: VMX: Fix check for valid GVA on an EPT violation
  KVM: guest_memfd: Remove bindings on memslot deletion when gmem is dying
  KVM: SVM: switch to raw spinlock for svm->ir_list_lock
  KVM: SVM: Make avic_ga_log_notifier() local to avic.c
  KVM: SVM: Unregister KVM's GALog notifier on kvm-amd.ko exit
  KVM: SVM: Initialize per-CPU svm_data at the end of hardware setup
  KVM: x86: Call out MSR_IA32_S_CET is not handled by XSAVES
  KVM: x86: Harden KVM against imbalanced load/put of guest FPU state
  KVM: x86: Unload "FPU" state on INIT if and only if its currently in-use
  KVM: arm64: Check the untrusted offset in FF-A memory share
  KVM: arm64: Check range args for pKVM mem transitions
  ...
2025-11-10x86/percpu: Use BIT_WORD() and BIT_MASK() macrosUros Bizjak
Use BIT_WORD() and BIT_MASK() macros from <linux/bits.h> in <arch/x86/include/asm/percpu.h> instead of open-coding them. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20250907184915.78041-1-ubizjak@gmail.com
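For reference, the conversion replaces open-coded index/mask arithmetic with the shared helpers, along these lines:

  /* Before: open-coded word index and bit mask. */
  unsigned long *p    = addr + (nr / BITS_PER_LONG);
  unsigned long mask  = 1UL << (nr % BITS_PER_LONG);

  /* After: equivalent, using the <linux/bits.h> macros. */
  unsigned long *p2   = addr + BIT_WORD(nr);
  unsigned long mask2 = BIT_MASK(nr);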
2025-11-09Merge tag 'kvm-x86-fixes-6.18-rc5' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 fixes for 6.18:

 - Inject #UD if the guest attempts to execute SEAMCALL or TDCALL, as KVM doesn't support virtualizing the instructions, but the instructions are gated only by VMXON, i.e. will VM-Exit instead of taking a #UD and thus result in KVM exiting to userspace with an emulation error.

 - Unload the "FPU" when emulating INIT of XSTATE features if and only if the FPU is actually loaded, instead of trying to predict when KVM will emulate an INIT (CET support missed the MP_STATE path). Add sanity checks to detect and harden against similar bugs in the future.

 - Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is unloaded.

 - Use a raw spinlock for svm->ir_list_lock as the lock is taken during schedule(), and "normal" spinlocks are sleepable locks when PREEMPT_RT=y.

 - Remove guest_memfd bindings on memslot deletion when a gmem file is dying to fix a use-after-free race found by syzkaller.

 - Fix a goof in the EPT Violation handler where KVM checks the wrong variable when determining if the reported GVA is valid.
2025-11-08Merge tag 'x86-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull x86 fixes from Ingo Molnar:

 - Fix AMD PCI root device caching regression that triggers on certain firmware variants

 - Fix the zen5_rdseed_microcode[] array to be NULL-terminated

 - Add more AMD models to microcode signature checking

* tag 'x86-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Add more known models to entry sign checking
  x86/CPU/AMD: Add missing terminator for zen5_rdseed_microcode
  x86/amd_node: Fix AMD root device caching
2025-11-07KVM: TDX: Explicitly set user-return MSRs that *may* be clobbered by the TDX-ModuleSean Christopherson
Set all user-return MSRs to their post-TD-exit value when preparing to run a TDX vCPU to ensure the value that KVM expects to be loaded after running the vCPU is indeed the value that's loaded in hardware. If the TDX-Module doesn't actually enter the guest, i.e. doesn't do VM-Enter, then it won't "restore" VMM state, i.e. won't clobber user-return MSRs to their expected post-run values, in which case simply updating KVM's "cached" value will effectively corrupt the cache due to hardware still holding the original value. In theory, KVM could conditionally update the current user-return value if and only if tdh_vp_enter() succeeds, but in practice "success" doesn't guarantee the TDX-Module actually entered the guest, e.g. if the TDX-Module synthesizes an EPT Violation because it suspects a zero-step attack. Force-load the expected values instead of trying to decipher whether or not the TDX-Module restored/clobbered MSRs, as the risk doesn't justify the benefits. Effectively avoiding four WRMSRs once per run loop (even if the vCPU is scheduled out, user-return MSRs only need to be reloaded if the CPU exits to userspace or runs a non-TDX vCPU) is likely in the noise when amortized over all entries, given the cost of running a TDX vCPU. E.g. the cost of the WRMSRs is somewhere between ~300 and ~500 cycles, whereas the cost of a _single_ roundtrip to/from a TDX guest is thousands of cycles. Fixes: e0b4f31a3c65 ("KVM: TDX: restore user ret MSRs") Cc: stable@vger.kernel.org Cc: Yan Zhao <yan.y.zhao@intel.com> Cc: Xiaoyao Li <xiaoyao.li@intel.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://patch.msgid.link/20251030191528.3380553-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
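In rough outline, the force-load is a loop over the TDX user-return MSRs using KVM's user-return machinery (a sketch; the tdx_uret_msrs table is an assumed shape):

  /*
   * Sketch: unconditionally load the expected post-TD-exit values so the
   * user-return MSR cache can't go stale if the TDX-Module didn't VM-Enter.
   */
  static void tdx_user_return_msr_update_cache(void)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++)     /* assumed table */
                  kvm_set_user_return_msr(tdx_uret_msrs[i].slot,
                                          tdx_uret_msrs[i].defval, -1ull);
  }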
2025-11-07perf/x86/intel: Add counter group support for arch-PEBSDapeng Mi
Based on the previous adaptive PEBS counter snapshot support, add counter group support for architectural PEBS. Since arch-PEBS shares the same counter group layout with adaptive PEBS, directly reuse the __setup_pebs_counter_group() helper to process arch-PEBS counter groups. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251029102136.61364-13-dapeng1.mi@linux.intel.com
2025-11-07perf/x86/intel: Setup PEBS data configuration and enable legacy groupsDapeng Mi
Unlike legacy PEBS, arch-PEBS provides per-counter PEBS data configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs. This patch obtains the PEBS data configuration from the event attribute, writes the configuration to the IA32_PMC_GPx/FXx_CFG_C MSRs, and enables the corresponding PEBS groups. Note that this patch only enables XMM SIMD regs sampling for arch-PEBS; sampling of the other SIMD regs (OPMASK/YMM/ZMM) on arch-PEBS will be supported after PMI-based SIMD regs (OPMASK/YMM/ZMM) sampling is supported. Co-developed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251029102136.61364-12-dapeng1.mi@linux.intel.com
2025-11-07perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSRDapeng Mi
Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the arch-PEBS buffer physical address. This patch allocates the arch-PEBS buffer and then initializes the IA32_PEBS_BASE MSR with the buffer's physical address. Co-developed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com
2025-11-07perf/x86/intel: Process arch-PEBS records or record fragmentsDapeng Mi
A significant difference from adaptive PEBS is that an arch-PEBS record supports fragments, meaning a record could be split into several independent fragments, each with its own arch-PEBS header. This patch defines the architectural PEBS record layout structures and adds helpers to process arch-PEBS records or fragments. Only legacy PEBS groups like the basic, GPR, XMM and LBR groups are supported in this patch; capturing the newly added YMM/ZMM/OPMASK vector registers will be supported in the future. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251029102136.61364-9-dapeng1.mi@linux.intel.com
2025-11-07perf/x86/intel: Initialize architectural PEBSDapeng Mi
arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the supported arch-PEBS capabilities and counters bitmap. This patch parses these 2 sub-leaves and initializes the arch-PEBS capabilities and corresponding structures. Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist with arch-PEBS, arch-PEBS doesn't need to manipulate these MSRs. Thus add a simple pair of __intel_pmu_pebs_enable/disable() callbacks for arch-PEBS. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251029102136.61364-6-dapeng1.mi@linux.intel.com
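A sketch of the enumeration step (the sub-leaf numbers follow the description above; the field layout shown is an assumption):

  unsigned int eax, ebx, ecx, edx;

  /* Sub-leaf 4: arch-PEBS capabilities (assumed decoding). */
  cpuid_count(0x23, 4, &eax, &ebx, &ecx, &edx);
  x86_pmu.arch_pebs_cap.caps = ((u64)ebx << 32) | eax;

  /* Sub-leaf 5: bitmap of counters supporting arch-PEBS (assumed decoding). */
  cpuid_count(0x23, 5, &eax, &ebx, &ecx, &edx);
  x86_pmu.arch_pebs_cap.counters = ((u64)ebx << 32) | eax;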
2025-11-05x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probeJiri Olsa
Currently we don't get a stack trace via the ORC unwinder on top of the fgraph exit handler. We can see that when generating a stacktrace from a kretprobe_multi bpf program, which is based on fprobe/fgraph. The reason is that the ORC unwind code won't get past the return_to_handler callback installed by the fgraph return probe machinery. Solve this by creating the stack frame in return_to_handler that the ftrace_graph_ret_addr() function expects, to recover the original return address and continue with the unwind. Also update the pt_regs data with cs/flags/rsp, which are needed for successful stack retrieval from the ebpf bpf_get_stackid helper:

  - in get_perf_callchain we check user_mode(regs), so CS has to be set
  - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED has to be unset

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-11-05KVM: TDX: Convert INIT_MEM_REGION and INIT_VCPU to "unlocked" vCPU ioctlSean Christopherson
Handle the KVM_TDX_INIT_MEM_REGION and KVM_TDX_INIT_VCPU vCPU sub-ioctls in the unlocked variant, i.e. outside of vcpu->mutex, in anticipation of taking kvm->lock along with all other vCPU mutexes, at which point the sub-ioctls _must_ start without vcpu->mutex held. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Co-developed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251030200951.3402865-24-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-05KVM: TDX: WARN if mirror SPTE doesn't have full RWX when creating S-EPT mappingSean Christopherson
Pass in the mirror_spte to kvm_x86_ops.set_external_spte() to provide symmetry with .remove_external_spte(), and assert in TDX that the mirror SPTE is shadow-present with full RWX permissions (the TDX-Module doesn't allow the hypervisor to control protections). Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251030200951.3402865-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-05KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte()Sean Christopherson
Drop the return code from kvm_x86_ops.remove_external_spte(), a.k.a. tdx_sept_remove_private_spte(), as KVM simply does a KVM_BUG_ON() on failure, and that KVM_BUG_ON() is redundant since all error paths in TDX also do a KVM_BUG_ON(). Opportunistically pass the spte instead of the pfn, as the API is clearly about removing an spte. Suggested-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251030200951.3402865-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-05x86/mce: Unify AMD DFR handler with MCA PollingYazen Ghannam
AMD systems optionally support a deferred error interrupt. The interrupt should be used as another signal to trigger MCA polling. This is similar to how other MCA interrupts are handled. Deferred errors do not require any special handling related to the interrupt, e.g. resetting or rearming the interrupt, etc. However, Scalable MCA systems include a pair of registers, MCA_DESTAT and MCA_DEADDR, that should be checked for valid errors. This check should be done whenever MCA registers are polled. Currently, the deferred error interrupt does this check, but the MCA polling function does not. Call the MCA polling function when handling the deferred error interrupt. This keeps all "polling" cases in a common function. Add an SMCA status check helper. This will do the same status check and register clearing that the interrupt handler has done. And it extends the common polling flow to find AMD deferred errors. Clear the MCA_DESTAT register at the end of the handler rather than the beginning. This maintains the procedure that the 'status' register must be cleared as the final step. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20251104-wip-mca-updates-v8-0-66c8eacf67b9@amd.com
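A sketch of the status-check helper described above (the MSR accessor names are illustrative; MSR_AMD64_SMCA_MCx_DESTAT() and MCI_STATUS_VAL are existing definitions):

  /* Sketch: check the SMCA deferred-error registers during MCA polling. */
  static void smca_check_deferred_error(unsigned int bank)
  {
          u64 status = mce_rdmsrq(MSR_AMD64_SMCA_MCx_DESTAT(bank));  /* illustrative accessor */

          if (!(status & MCI_STATUS_VAL))
                  return;

          /* ... log the error using the MCA_DESTAT/MCA_DEADDR contents ... */

          /* Clear the status register last, per the MCA procedure. */
          mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
  }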
2025-11-05x86: uaccess: don't use runtime-const rewriting in modulesLinus Torvalds
The runtime-const infrastructure was never designed to handle the modular case, because the constant fixup is only done at boot time for core kernel code. But by the time I used it for the x86-64 user space limit handling in commit 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue"), I had completely repressed that fact. And it all happens to work because the only code that currently actually gets inlined by modules is for the access_ok() limit check, where the default constant value works even when not fixed up. Because at least I had intentionally made it be something that is in the non-canonical address space region. But it's technically very wrong, and it does mean that at least in theory, the use of 'access_ok()' + '__get_user()' can trigger the same speculation issue with non-canonical addresses that the original commit was all about. The pattern is unusual enough that this probably doesn't matter in practice, but very wrong is still very wrong. Also, let's fix it before the nice optimized scoped user accessor helpers that Thomas Gleixner is working on cause this pseudo-constant to then be more widely used. This all came up due to an unrelated discussion with Mateusz Guzik about using the runtime const infrastructure for names_cachep accesses too. There the modular case was much more obviously broken, and Mateusz noted it in his 'v2' of the patch series. That then made me notice how broken 'access_ok()' had been in modules all along. Mea culpa, mea maxima culpa. Fix it by simply not using the runtime-const code in modules, and just using the USER_PTR_MAX variable value instead. This is not performance-critical like the core user accessor functions (get_user() and friends) are. Also make sure this doesn't get forgotten the next time somebody wants to do runtime constant optimizations by having the x86 runtime-const.h header file error out if included by modules. Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue") Acked-by: Borislav Petkov <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Triggered-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/all/20251030105242.801528-1-mjguzik@gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
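The shape of the fix, sketched under the assumption that USER_PTR_MAX is the variable backing the runtime constant (names follow the commit text; the exact header layout may differ):

  /*
   * uaccess sketch: modules fall back to the variable, core kernel gets
   * the boot-time patched constant.
   */
  #ifdef MODULE
  #define USER_PTR_MAX_CONST      (USER_PTR_MAX)
  #else
  #define USER_PTR_MAX_CONST      runtime_const_ptr(USER_PTR_MAX)
  #endif

  /* runtime-const.h: refuse inclusion from modules (sketch). */
  #ifdef MODULE
  #error "runtime constants are only usable by core kernel code"
  #endif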
2025-11-04x86/cpufeatures: Correct LKGS feature flag descriptionBorislav Petkov (AMD)
Quotation marks in cpufeatures.h comments are special and when the comment begins with a quoted string, that string lands in /proc/cpuinfo, turning it into a user-visible one. The LKGS comment doesn't begin with a quoted string, but just in case, drop the quoted "kernel" in there to avoid confusion. And while at it, simply change the description into what the LKGS instruction does, for more clarity. No functional changes. Reviewed-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20251015103548.10194-1-bp@kernel.org
2025-11-04KVM: x86: Add a helper to dedup reporting of unhandled VM-ExitsSean Christopherson
Add and use a helper, kvm_prepare_unexpected_reason_exit(), to dedup the code that fills the exit reason and CPU when KVM encounters a VM-Exit that KVM doesn't know how to handle. Reviewed-by: Yao Yuan <yaoyuan@linux.alibaba.com> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Acked-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251030185004.3372256-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-04x86/ptrace: Always inline trivial accessorsPeter Zijlstra
A KASAN build bloats these single load/store helpers such that it fails to inline them: vmlinux.o: error: objtool: irqentry_exit+0x5e8: call to instruction_pointer_set() with UACCESS enabled Make sure the compiler isn't allowed to do stupid. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251031105435.GU4068168@noisy.programming.kicks-ass.net
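For example, the accessor in question is a single store that must never become an out-of-line call:

  static __always_inline void instruction_pointer_set(struct pt_regs *regs,
                                                      unsigned long val)
  {
          regs->ip = val;     /* trivial store; force inlining even under KASAN */
  }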
2025-11-04x86/futex: Convert to scoped user accessThomas Gleixner
Replace the open coded implementation with the scoped user access guards. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251027083745.799714344@linutronix.de
2025-11-03x86/uaccess: Use unsafe wrappers for ASM GOTOThomas Gleixner
ASM GOTO is miscompiled by GCC when it is used inside an auto cleanup scope:

  bool foo(u32 __user *p, u32 val)
  {
          scoped_guard(pagefault)
                  unsafe_put_user(val, p, efault);
          return true;
  efault:
          return false;
  }

It ends up leaking the pagefault disable counter in the fault path. Clang, at least, fails the build. Rename unsafe_*_user() to arch_unsafe_*_user(), which makes the generic uaccess header wrap it with a local label that makes both compilers emit correct code. Same for the kernel_nofault() variants. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251027083745.294359925@linutronix.de
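The generic-header wrapping that makes both compilers emit correct code looks roughly like this (a sketch of the local-label trick; the real generic macro differs in detail):

  /*
   * Sketch: the goto target seen by ASM GOTO is a local label, so the
   * cleanup scope is only left by the final plain goto.
   */
  #define unsafe_put_user(x, ptr, label)                  \
  do {                                                    \
          __label__ local_label;                          \
          arch_unsafe_put_user(x, ptr, local_label);      \
          break;                                          \
  local_label:                                            \
          goto label;                                     \
  } while (0)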
2025-11-03x86/amd_node: Fix AMD root device cachingYazen Ghannam
Recent AMD node rework removed the "search and count" method of caching AMD root devices. This depended on the value from a Data Fabric register that was expected to hold the PCI bus of one of the root devices attached to that fabric. However, this expectation is incorrect. The register, when read from PCI config space, returns the bitwise-OR of the buses of all attached root devices. This behavior is benign on AMD reference design boards, since the bus numbers are aligned. This results in a bitwise-OR value matching one of the buses. For example, 0x00 | 0x40 | 0xA0 | 0xE0 = 0xE0. This behavior breaks on boards where the bus numbers are not exactly aligned. For example, 0x00 | 0x07 | 0xE0 | 0x15 = 0x1F. The examples above are for AMD node 0. The first root device on other nodes will not be 0x00. The first root device for other nodes will depend on the total number of root devices, the system topology, and the specific PCI bus number assignment. For example, a system with 2 AMD nodes could have this: Node 0 : 0x00 0x07 0x0e 0x15 Node 1 : 0x1c 0x23 0x2a 0x31 The bus numbering style in the reference boards is not a requirement. The numbering found in other boards is not incorrect. Therefore, the root device caching method needs to be adjusted. Go back to the "search and count" method used before the recent rework. Search for root devices using PCI class code rather than fixed PCI IDs. This keeps the goal of the rework (remove dependency on PCI IDs) while being able to support various board designs. Merge helper functions to reduce code duplication. [ bp: Reflow comment. ] Fixes: 40a5f6ffdfc8 ("x86/amd_nb: Simplify root device search") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/all/20251028-fix-amd-root-v2-1-843e38f8be2c@amd.com
2025-10-31x86/mm: Ensure clear_page() variants always have __kcfi_typeid_ symbolsNathan Chancellor
When building with CONFIG_CFI=y and CONFIG_LTO_CLANG_FULL=y, there is a series of errors from the various versions of clear_page() not having __kcfi_typeid_ symbols.

  $ cat kernel/configs/repro.config
  CONFIG_CFI=y
  # CONFIG_LTO_NONE is not set
  CONFIG_LTO_CLANG_FULL=y

  $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config bzImage
  ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_rep
  >>> referenced by ld-temp.o
  >>>               vmlinux.o:(__cfi_clear_page_rep)

  ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_orig
  >>> referenced by ld-temp.o
  >>>               vmlinux.o:(__cfi_clear_page_orig)

  ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_erms
  >>> referenced by ld-temp.o
  >>>               vmlinux.o:(__cfi_clear_page_erms)

With full LTO, it is possible for LLVM to realize that these functions never have their address taken (as they are only used within an alternative, which will make them a direct call) across the whole kernel and either drop or skip generating their kCFI type identification symbols. clear_page_{rep,orig,erms}() are defined in clear_page_64.S with SYM_TYPED_FUNC_START as a result of 2981557cb040 ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI"), as exported functions are free to be called indirectly thus need kCFI type identifiers. Use KCFI_REFERENCE with these clear_page() functions to force LLVM to see these functions as address-taken and generate and keep the kCFI type identifiers. Fixes: 2981557cb040 ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI") Closes: https://github.com/ClangBuiltLinux/linux/issues/2128 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://patch.msgid.link/20251013-x86-fix-clear_page-cfi-full-lto-errors-v1-1-d69534c0be61@kernel.org
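The fix itself is one reference per function, which forces LLVM to treat them as address-taken (KCFI_REFERENCE is the macro named in the commit; the placement shown is a sketch):

  /* Keep the kCFI type symbols alive under full LTO. */
  KCFI_REFERENCE(clear_page_rep);
  KCFI_REFERENCE(clear_page_orig);
  KCFI_REFERENCE(clear_page_erms);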
2025-10-30x86/sev: Include XSS value in GHCB CPUID requestJohn Allen
When a guest issues a CPUID instruction for Fn0000000D_x01, the hypervisor may be intercepting the CPUID instruction and need to access the guest XSS value. For SEV-ES, the XSS value is encrypted and needs to be included in the GHCB to be visible to the hypervisor. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/all/20250924200852.4452-3-john.allen@amd.com/
2025-10-30x86/boot: Move boot_*msr helpers to asm/shared/msr.hJohn Allen
The boot_{rdmsr,wrmsr}() helpers are *just* the barebones MSR access functionality, without any tracing or exception handling glue as it is done in kernel proper. Move these helpers to asm/shared/msr.h and rename to raw_{rdmsr,wrmsr}() to indicate what they are. [ bp: Correct the reason why those helpers exist. I should've caught that in the original patch that added them: 176db622573f ("x86/boot: Introduce helpers for MSR reads/writes") but oh well... - fixup include path delimiters to <> ] Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/all/20250924200852.4452-2-john.allen@amd.com
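Their shape, sketched (the real helpers operate on a struct msr; this simplified version returns a plain u64):

  /* Barebones MSR access: no tracing, no exception handling (sketch). */
  static inline u64 raw_rdmsr(unsigned int reg)
  {
          u32 lo, hi;

          asm volatile("rdmsr" : "=a" (lo), "=d" (hi) : "c" (reg));
          return ((u64)hi << 32) | lo;
  }

  static inline void raw_wrmsr(unsigned int reg, u64 val)
  {
          asm volatile("wrmsr"
                       : : "c" (reg), "a" ((u32)val), "d" ((u32)(val >> 32))
                       : "memory");
  }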
2025-10-30x86/smpboot: Mark native_play_dead() as __noreturnThorsten Blum
native_play_dead() ends by calling the non-returning function hlt_play_dead() and therefore also never returns. The !CONFIG_HOTPLUG_CPU stub version of native_play_dead() unconditionally calls BUG() and does not return either. Add the __noreturn attribute to both function definitions and their declaration to document this behavior and to potentially improve compiler optimizations. Remove the obsolete comment, and add native_play_dead() to the objtool's list of __noreturn functions. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20251027155107.183136-1-thorsten.blum@linux.dev Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
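The annotation itself is small; roughly (a sketch of the declaration and the !CONFIG_HOTPLUG_CPU stub):

  #ifdef CONFIG_HOTPLUG_CPU
  void __noreturn native_play_dead(void);
  #else
  static inline void __noreturn native_play_dead(void)
  {
          BUG();          /* the stub never returns either */
  }
  #endif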
2025-10-30x86/cpu: Add/fix core comments for {Panther,Nova} LakeTony Luck
The E-core in Panther Lake is Darkmont, not Crestmont. Nova Lake is built from Coyote Cove (P-core) and Arctic Wolf (E-core). Fixes: 43bb700cff6b ("x86/cpu: Update Intel Family comments") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20251028172948.6721-1-tony.luck@intel.com
2025-10-30unwind_user/x86: Fix arch=um buildPeter Zijlstra
Add CONFIG_HAVE_UNWIND_USER_FP guards to make sure this code doesn't break arch=um builds. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/oe-kbuild-all/202510291919.FFGyU7nq-lkp@intel.com/
2025-10-29unwind_user/x86: Teach FP unwind about start of functionPeter Zijlstra
When userspace is interrupted at the start of a function, before we get a chance to complete the frame, the unwind will miss one caller. X86 has a uprobe-specific fixup for this; add bits to the generic unwinder to support this. Suggested-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251024145156.GM4068168@noisy.programming.kicks-ass.net
2025-10-29unwind_user/x86: Enable frame pointer unwinding on x86Josh Poimboeuf
Use ARCH_INIT_USER_FP_FRAME to describe how frame pointers are unwound on x86, and enable CONFIG_HAVE_UNWIND_USER_FP accordingly so the unwind_user interfaces can be used. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20250827193828.347397433@kernel.org
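On x86-64 the frame description reduces to three offsets around the standard "push %rbp; mov %rsp, %rbp" prologue (a sketch; the field names assume the unwind_user frame structure and may not match exactly):

  /*
   * Sketch: saved RBP lives at (%rbp), the return address at 8(%rbp),
   * and the caller's CFA is 16 bytes above the frame pointer.
   */
  #define ARCH_INIT_USER_FP_FRAME                         \
          .cfa_off = (s32)sizeof(long) *  2,              \
          .ra_off  = (s32)sizeof(long) * -1,              \
          .fp_off  = (s32)sizeof(long) * -2,              \
          .use_fp  = true,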