| author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-02 15:45:15 -1000 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-02 15:45:15 -1000 |
| commit | 6803bd7956ca8fc43069c2e42016f17f3c2fbf30 (patch) | |
| tree | ebcd7d47efe649781817dd0d7664c7c618645b21 /arch/loongarch/kvm/timer.c | |
| parent | 5be9911406ada8fe6187db7ce402f7ff4c21ebdf (diff) | |
| parent | 45b890f7689eb0aba454fc5831d2d79763781677 (diff) | |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its
guest
- Optimization for vSGI injection, opportunistically compressing
MPIDR to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems,
reducing the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
LoongArch:
- New architecture for kvm.
The hardware uses the same model as x86, s390 and RISC-V, where
guest/host mode is orthogonal to supervisor/user mode. The
virtualization extensions are very similar to MIPS, therefore the
code also has some similarities but it's been cleaned up to avoid
some of the historical bogosities that are found in arch/mips. The
kernel emulates MMU, timer and CSR accesses, while interrupt
controllers are only emulated in userspace, at least for now.
RISC-V:
- Support for the Smstateen and Zicond extensions
- Support for virtualizing senvcfg
- Support for virtualized SBI debug console (DBCN)
S390:
- Nested page table management can be monitored through tracepoints
and statistics
x86:
- Fix incorrect handling of VMX posted interrupt descriptor in
KVM_SET_LAPIC, which could result in a dropped timer IRQ
- Avoid WARN on systems with Intel IPI virtualization
- Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs
without forcing more common use cases to eat the extra memory
overhead.
- Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
SBPB, aka Selective Branch Predictor Barrier).
- Fix a bug where restoring a vCPU snapshot that was taken within 1
second of creating the original vCPU would cause KVM to try to
synchronize the vCPU's TSC and thus clobber the correct TSC being
set by userspace.
- Compute guest wall clock using a single TSC read to avoid
generating an inaccurate time, e.g. if the vCPU is preempted
between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which
complain about a "Firmware Bug" if the bit isn't set for select
F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to
appease Windows Server 2022.
- Don't apply side effects to Hyper-V's synthetic timer on writes
from userspace to fix an issue where the auto-enable behavior can
trigger spurious interrupts, i.e. do auto-enabling only for guest
writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the
dirty log without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as
appropriate.
- Harden the fast page fault path to guard against encountering an
invalid root when walking SPTEs.
- Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
- Use the fast path directly from the timer callback when delivering
Xen timer events, instead of waiting for the next iteration of the
run loop. This was not done so far because previously proposed code
had races, but now care is taken to stop the hrtimer at critical
points such as restarting the timer or saving the timer information
for userspace.
- Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future
flag.
- Optimize injection of PMU interrupts that are simultaneous with
NMIs.
- Usual handful of fixes for typos and other warts.
x86 - MTRR/PAT fixes and optimizations:
- Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
- Zap EPT entries when non-coherent DMA assignment stops/start to
prevent using stale entries with the wrong memtype.
- Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y
This was done as a workaround for virtual machine BIOSes that did
not bother to clear CR0.CD (because ancient KVM/QEMU did not bother
to set it, in turn), and there's zero reason to extend the quirk to
also ignore guest PAT.
x86 - SEV fixes:
- Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts
SHUTDOWN while running an SEV-ES guest.
- Clean up the recognition of emulation failures on SEV guests, when
KVM would like to "skip" the instruction but it had already been
partially emulated. This makes it possible to drop a hack that
second guessed the (insufficient) information provided by the
emulator, and just do the right thing.
Documentation:
- Various updates and fixes, mostly for x86
- MTRR and PAT fixes and optimizations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits)
KVM: selftests: Avoid using forced target for generating arm64 headers
tools headers arm64: Fix references to top srcdir in Makefile
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: x86: Service NMI requests after PMI requests in VM-Enter path
KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
KVM: arm64: Refine _EL2 system register list that require trap reinjection
arm64: Add missing _EL2 encodings
arm64: Add missing _EL12 encodings
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU register test for unimplemented counters
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: Introduce vpmu_counter_access test
tools: Import arm_pmuv3.h
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
...
Diffstat (limited to 'arch/loongarch/kvm/timer.c')
| -rw-r--r-- | arch/loongarch/kvm/timer.c | 197 |
1 files changed, 197 insertions, 0 deletions
diff --git a/arch/loongarch/kvm/timer.c b/arch/loongarch/kvm/timer.c
new file mode 100644
index 000000000000..284bf553fefe
--- /dev/null
+++ b/arch/loongarch/kvm/timer.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2023 Loongson Technology Corporation Limited
+ */
+
+#include <linux/kvm_host.h>
+#include <asm/kvm_csr.h>
+#include <asm/kvm_vcpu.h>
+
+/*
+ * ktime_to_tick() - Scale ktime_t to timer tick value.
+ */
+static inline u64 ktime_to_tick(struct kvm_vcpu *vcpu, ktime_t now)
+{
+	u64 delta;
+
+	delta = ktime_to_ns(now);
+	return div_u64(delta * vcpu->arch.timer_mhz, MNSEC_PER_SEC);
+}
+
+static inline u64 tick_to_ns(struct kvm_vcpu *vcpu, u64 tick)
+{
+	return div_u64(tick * MNSEC_PER_SEC, vcpu->arch.timer_mhz);
+}
+
+/*
+ * Push timer forward on timeout.
+ * Handle an hrtimer event by push the hrtimer forward a period.
+ */
+static enum hrtimer_restart kvm_count_timeout(struct kvm_vcpu *vcpu)
+{
+	unsigned long cfg, period;
+
+	/* Add periodic tick to current expire time */
+	cfg = kvm_read_sw_gcsr(vcpu->arch.csr, LOONGARCH_CSR_TCFG);
+	if (cfg & CSR_TCFG_PERIOD) {
+		period = tick_to_ns(vcpu, cfg & CSR_TCFG_VAL);
+		hrtimer_add_expires_ns(&vcpu->arch.swtimer, period);
+		return HRTIMER_RESTART;
+	} else
+		return HRTIMER_NORESTART;
+}
+
+/* Low level hrtimer wake routine */
+enum hrtimer_restart kvm_swtimer_wakeup(struct hrtimer *timer)
+{
+	struct kvm_vcpu *vcpu;
+
+	vcpu = container_of(timer, struct kvm_vcpu, arch.swtimer);
+	kvm_queue_irq(vcpu, INT_TI);
+	rcuwait_wake_up(&vcpu->wait);
+
+	return kvm_count_timeout(vcpu);
+}
+
+/*
+ * Initialise the timer to the specified frequency, zero it
+ */
+void kvm_init_timer(struct kvm_vcpu *vcpu, unsigned long timer_hz)
+{
+	vcpu->arch.timer_mhz = timer_hz >> 20;
+
+	/* Starting at 0 */
+	kvm_write_sw_gcsr(vcpu->arch.csr, LOONGARCH_CSR_TVAL, 0);
+}
+
+/*
+ * Restore hard timer state and enable guest to access timer registers
+ * without trap, should be called with irq disabled
+ */
+void kvm_acquire_timer(struct kvm_vcpu *vcpu)
+{
+	unsigned long cfg;
+
+	cfg = read_csr_gcfg();
+	if (!(cfg & CSR_GCFG_TIT))
+		return;
+
+	/* Enable guest access to hard timer */
+	write_csr_gcfg(cfg & ~CSR_GCFG_TIT);
+
+	/*
+	 * Freeze the soft-timer and sync the guest stable timer with it. We do
+	 * this with interrupts disabled to avoid latency.
+	 */
+	hrtimer_cancel(&vcpu->arch.swtimer);
+}
+
+/*
+ * Restore soft timer state from saved context.
+ */
+void kvm_restore_timer(struct kvm_vcpu *vcpu)
+{
+	unsigned long cfg, delta, period;
+	ktime_t expire, now;
+	struct loongarch_csrs *csr = vcpu->arch.csr;
+
+	/*
+	 * Set guest stable timer cfg csr
+	 */
+	cfg = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG);
+	kvm_restore_hw_gcsr(csr, LOONGARCH_CSR_ESTAT);
+	kvm_restore_hw_gcsr(csr, LOONGARCH_CSR_TCFG);
+	if (!(cfg & CSR_TCFG_EN)) {
+		/* Guest timer is disabled, just restore timer registers */
+		kvm_restore_hw_gcsr(csr, LOONGARCH_CSR_TVAL);
+		return;
+	}
+
+	/*
+	 * Set remainder tick value if not expired
+	 */
+	now = ktime_get();
+	expire = vcpu->arch.expire;
+	if (ktime_before(now, expire))
+		delta = ktime_to_tick(vcpu, ktime_sub(expire, now));
+	else {
+		if (cfg & CSR_TCFG_PERIOD) {
+			period = cfg & CSR_TCFG_VAL;
+			delta = ktime_to_tick(vcpu, ktime_sub(now, expire));
+			delta = period - (delta % period);
+		} else
+			delta = 0;
+		/*
+		 * Inject timer here though sw timer should inject timer
+		 * interrupt async already, since sw timer may be cancelled
+		 * during injecting intr async in function kvm_acquire_timer
+		 */
+		kvm_queue_irq(vcpu, INT_TI);
+	}
+
+	write_gcsr_timertick(delta);
+}
+
+/*
+ * Save guest timer state and switch to software emulation of guest
+ * timer. The hard timer must already be in use, so preemption should be
+ * disabled.
+ */
+static void _kvm_save_timer(struct kvm_vcpu *vcpu)
+{
+	unsigned long ticks, delta;
+	ktime_t expire;
+	struct loongarch_csrs *csr = vcpu->arch.csr;
+
+	ticks = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TVAL);
+	delta = tick_to_ns(vcpu, ticks);
+	expire = ktime_add_ns(ktime_get(), delta);
+	vcpu->arch.expire = expire;
+	if (ticks) {
+		/*
+		 * Update hrtimer to use new timeout
+		 * HRTIMER_MODE_PINNED is suggested since vcpu may run in
+		 * the same physical cpu in next time
+		 */
+		hrtimer_cancel(&vcpu->arch.swtimer);
+		hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED);
+	} else
+		/*
+		 * Inject timer interrupt so that hall polling can dectect and exit
+		 */
+		kvm_queue_irq(vcpu, INT_TI);
+}
+
+/*
+ * Save guest timer state and switch to soft guest timer if hard timer was in
+ * use.
+ */
+void kvm_save_timer(struct kvm_vcpu *vcpu)
+{
+	unsigned long cfg;
+	struct loongarch_csrs *csr = vcpu->arch.csr;
+
+	preempt_disable();
+	cfg = read_csr_gcfg();
+	if (!(cfg & CSR_GCFG_TIT)) {
+		/* Disable guest use of hard timer */
+		write_csr_gcfg(cfg | CSR_GCFG_TIT);
+
+		/* Save hard timer state */
+		kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TCFG);
+		kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TVAL);
+		if (kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG) & CSR_TCFG_EN)
+			_kvm_save_timer(vcpu);
+	}
+
+	/* Save timer-related state to vCPU context */
+	kvm_save_hw_gcsr(csr, LOONGARCH_CSR_ESTAT);
+	preempt_enable();
+}
+
+void kvm_reset_timer(struct kvm_vcpu *vcpu)
+{
+	write_gcsr_timercfg(0);
+	kvm_write_sw_gcsr(vcpu->arch.csr, LOONGARCH_CSR_TCFG, 0);
+	hrtimer_cancel(&vcpu->arch.swtimer);
+}
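
Reviewer note: the tick arithmetic in ktime_to_tick(), tick_to_ns() and the periodic branch of kvm_restore_timer() is easier to check outside the kernel. The following is a minimal user-space sketch, not the kernel code. All `sketch_` names are hypothetical, and the value of MNSEC_PER_SEC is an assumption (NSEC_PER_SEC >> 20), since its definition is not part of this diff. It also ignores the overflow behaviour of the 64-bit multiply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: the kernel's MNSEC_PER_SEC is NSEC_PER_SEC >> 20, matching the
 * ">> 20" scaling that kvm_init_timer() applies to timer_hz. The definition
 * is not visible in this diff.
 */
#define SKETCH_NSEC_PER_SEC  1000000000ULL
#define SKETCH_MNSEC_PER_SEC (SKETCH_NSEC_PER_SEC >> 20)

/* Hypothetical stand-in for vcpu->arch.timer_mhz as set in kvm_init_timer(). */
static uint64_t sketch_timer_mhz(uint64_t timer_hz)
{
	return timer_hz >> 20;
}

/* Same formula as ktime_to_tick(): nanoseconds to timer ticks. */
static uint64_t sketch_ns_to_tick(uint64_t timer_mhz, uint64_t ns)
{
	return ns * timer_mhz / SKETCH_MNSEC_PER_SEC;
}

/* Same formula as tick_to_ns(): timer ticks to nanoseconds. */
static uint64_t sketch_tick_to_ns(uint64_t timer_mhz, uint64_t tick)
{
	assert(timer_mhz != 0);	/* the divisor must be non-zero */
	return tick * SKETCH_MNSEC_PER_SEC / timer_mhz;
}

/*
 * Periodic branch of kvm_restore_timer(): the timer expired late_ticks ago,
 * so the ticks left until the next period boundary are
 * period - (late_ticks % period).
 */
static uint64_t sketch_periodic_remainder(uint64_t period, uint64_t late_ticks)
{
	assert(period != 0);	/* a zero period would divide by zero */
	return period - (late_ticks % period);
}
```

With a 100 MHz timer the scaled frequency is 95 and the scaled divisor is 953, so both conversions use the ratio 95/953 rather than exactly 1/10. A periodic timer whose period is exactly a multiple of the elapsed time reloads a full period, not zero. The asserts on the divisors are an addition of this sketch; the kernel functions in the diff perform no such check.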
