| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-10 12:00:46 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-10 12:00:46 -0800 |
| commit | 4d84667627c4ff70826b349c449bbaf63b9af4e5 (patch) | |
| tree | 27340dbd3ed4e7bbb8b95f53c1af4c4733331a47 | |
| parent | a9aabb3b839aba094ed80861054993785c61462c (diff) | |
| parent | 7db06e329af30dcb170a6782c1714217ad65033d (diff) | |
Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance event updates from Ingo Molnar:
"x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
(Dapeng Mi)
Compared to previous iterations of the Intel PMU code, there have
been a lot of changes, which center on three main areas:
- Introduce the Off-Module Response (OMR) facility to replace the
Off-Core Response (OCR) facility (see the sketch below)
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature (sketched below)
- Likewise, a large series adds uncore PMU support for Intel Diamond
Rapids (DMR) CPUs (Zide Chen). This centers on four main areas (see
the discovery-domain sketch below):
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and each IMH
die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
- Also add missing PMON units for Intel Panther Lake, and support
Nova Lake (NVL), which largely maps to Panther Lake (Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang and
Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang; see the LVTPC handoff sketch below)
- Add Intel cstate driver support for Wildcat Lake (WCL) CPUs, a
low-power variant of Panther Lake (Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code (Martin Schiller)
Performance enhancements:
- Speed up kexec shutdown by avoiding unnecessary cross CPU calls
(Jan H. Schönherr)
- Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize the
unwinding code for other architectures (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic() to kmap_local_page() (Keke Ming; see
the sketch below)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"
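A condensed sketch of how the new OMR extra registers are selected, mirroring the EXTRA_REG_OMR_0..3 case added to intel_fixup_er() in the patch below; the stand-alone helper name is illustrative, and the consecutive MSR_OMR_0..MSR_OMR_3 layout is what the patch's `MSR_OMR_0 + er_idx` arithmetic relies on:

```c
/*
 * When an event must fall back to an alternate OMR register, its umask is
 * rewritten so that umask bit N selects MSR_OMR_N (four OMR MSRs replace
 * the two OFFCORE_RSP MSRs of the older OCR facility).
 */
static void omr_fixup_er(struct hw_perf_event *hwc, int idx)
{
	int er_idx = idx - EXTRA_REG_OMR_0;		/* 0..3 */

	hwc->config &= ~ARCH_PERFMON_EVENTSEL_UMASK;
	hwc->config |= 1ULL << (8 + er_idx);		/* umask bit er_idx */
	hwc->extra_reg.reg = MSR_OMR_0 + er_idx;
}
```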
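The "RDPMC user disable" feature adds a per-counter control bit (ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE, enumerated by CPUID.23H.EBX[2]) on top of the existing global rdpmc sysfs attribute. Below is a minimal stand-alone sketch of the decision made in intel_pmu_update_rdpmc_user_disable() in the patch; the helper name and enum wrapper are illustrative, the values match the X86_USER_RDPMC_* macros introduced by the series:

```c
#include <stdbool.h>

/* Values match the X86_USER_RDPMC_* macros added by the series. */
enum x86_user_rdpmc_mode {
	X86_USER_RDPMC_NEVER_ENABLE       = 0,
	X86_USER_RDPMC_CONDITIONAL_ENABLE = 1,	/* default */
	X86_USER_RDPMC_ALWAYS_ENABLE      = 2,
};

/*
 * Return true when the per-counter "RDPMC user disable" bit should be set,
 * i.e. when user space must not read this counter with RDPMC:
 *  - mode 2: never set it, user-space RDPMC is enabled unconditionally;
 *  - mode 1: set it only for system-wide counters; per-task counters are
 *    cleared on context switch, so they cannot leak data;
 *  - mode 0: always set it.
 */
static bool rdpmc_user_disable_needed(enum x86_user_rdpmc_mode mode,
				      bool per_task_event)
{
	if (mode == X86_USER_RDPMC_ALWAYS_ENABLE)
		return false;
	if (mode == X86_USER_RDPMC_CONDITIONAL_ENABLE && per_task_event)
		return false;
	return true;
}
```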
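The uncore rework replaces the single "use_discovery" flag with per-domain discovery configuration, which is what lets DMR describe its two discovery domains. The descriptor below is taken from the dmr_uncore_init added in the patch, with explanatory comments added:

```c
static const struct uncore_plat_init dmr_uncore_init __initconst = {
	.pci_init		  = dmr_uncore_pci_init,
	.mmio_init		  = dmr_uncore_mmio_init,

	/* Domain 0: IMH dies, discovery table found via a PCI device. */
	.domain[0].base_is_pci	  = true,
	.domain[0].discovery_base = DMR_UNCORE_DISCOVERY_TABLE_DEVICE,
	.domain[0].units_ignore	  = dmr_uncore_imh_units_ignore,

	/*
	 * Domain 1: CBB dies, discovery portal read from an MSR; the MMIO
	 * global-control register is written to unfreeze the counters.
	 */
	.domain[1].discovery_base = CBB_UNCORE_DISCOVERY_MSR,
	.domain[1].units_ignore	  = dmr_uncore_cbb_units_ignore,
	.domain[1].global_init	  = uncore_mmio_global_init,
};
```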
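The mediated vPMU hands the local APIC's LVTPC entry over to the guest while its PMU context is loaded. The sketch below condenses the hooks added in arch/x86/events/core.c: PMIs are routed to a dedicated IRQ vector instead of NMI, and perf_event_nmi_handler() returns NMI_DONE early while guest_lvtpc_loaded is set, so host PMI handling cannot clobber guest counter state:

```c
static DEFINE_PER_CPU(bool, guest_lvtpc_loaded);

void perf_load_guest_lvtpc(u32 guest_lvtpc)
{
	u32 masked = guest_lvtpc & APIC_LVT_MASKED;

	/* Route PMIs to the dedicated mediated-PMI IRQ vector. */
	apic_write(APIC_LVTPC,
		   APIC_DM_FIXED | PERF_GUEST_MEDIATED_PMI_VECTOR | masked);
	this_cpu_write(guest_lvtpc_loaded, true);
}

void perf_put_guest_lvtpc(void)
{
	/* Hand the LVTPC back to the host: PMIs arrive as NMIs again. */
	this_cpu_write(guest_lvtpc_loaded, false);
	apic_write(APIC_LVTPC, APIC_DM_NMI);
}
```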
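The uprobes conversion is mechanical across arm, arm64, mips and riscv: the XOL page is now mapped with kmap_local_page(), whose mapping is valid only in the calling context and does not implicitly disable preemption the way kmap_atomic() did. The shape of the change, condensed here from the MIPS variant in the patch:

```c
void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
			   void *src, unsigned long len)
{
	unsigned long kaddr, kstart;

	/* Initialize the slot */
	kaddr = (unsigned long)kmap_local_page(page);	/* was kmap_atomic() */
	kstart = kaddr + (vaddr & ~PAGE_MASK);
	memcpy((void *)kstart, src, len);
	flush_icache_range(kstart, kstart + len);
	kunmap_local((void *)kaddr);			/* was kunmap_atomic() */
}
```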
* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
s390: remove kvm_types.h from Kbuild
uprobes: Fix incorrect lockdep condition in filter_chain()
x86/ibs: Fix typo in dc_l2tlb_miss comment
x86/uprobes: Fix XOL allocation failure for 32-bit tasks
perf/x86/intel/uncore: Convert comma to semicolon
perf/x86/intel: Add support for rdpmc user disable feature
perf/x86: Use macros to replace magic numbers in attr_rdpmc
perf/x86/intel: Add core PMU support for Novalake
perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
perf/x86/intel: Add core PMU support for DMR
perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
perf/core: Fix slow perf_event_task_exit() with LBR callstacks
perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
uprobes: use kmap_local_page() for temporary page mappings
arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
perf/x86/intel/uncore: Add Nova Lake support
...
45 files changed, 2154 insertions, 522 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc new file mode 100644 index 000000000000..59ec18bbb418 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc @@ -0,0 +1,44 @@ +What: /sys/bus/event_source/devices/cpu.../rdpmc +Date: November 2011 +KernelVersion: 3.10 +Contact: Linux kernel mailing list linux-kernel@vger.kernel.org +Description: The /sys/bus/event_source/devices/cpu.../rdpmc attribute + is used to show/manage if rdpmc instruction can be + executed in user space. This attribute supports 3 numbers. + - rdpmc = 0 + user space rdpmc is globally disabled for all PMU + counters. + - rdpmc = 1 + user space rdpmc is globally enabled only in event mmap + ioctl called time window. If the mmap region is unmapped, + user space rdpmc is disabled again. + - rdpmc = 2 + user space rdpmc is globally enabled for all PMU + counters. + + In the Intel platforms supporting counter level's user + space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the + meaning of 3 numbers is extended to + - rdpmc = 0 + global user space rdpmc and counter level's user space + rdpmc of all counters are both disabled. + - rdpmc = 1 + No changes on behavior of global user space rdpmc. + counter level's rdpmc of system-wide events is disabled + but counter level's rdpmc of non-system-wide events is + enabled. + - rdpmc = 2 + global user space rdpmc and counter level's user space + rdpmc of all counters are both enabled unconditionally. + + The default value of rdpmc is 1. + + Please notice: + - global user space rdpmc's behavior would change + immediately along with the rdpmc value's change, + but the behavior of counter level's user space rdpmc + won't take effect immediately until the event is + reactivated or recreated. + - The rdpmc attribute is global, even for x86 hybrid + platforms. For example, changing cpu_core/rdpmc will + also change cpu_atom/rdpmc. 
diff --git a/arch/arm/probes/uprobes/core.c b/arch/arm/probes/uprobes/core.c index 3d96fb41d624..0e1c6b9e7e54 100644 --- a/arch/arm/probes/uprobes/core.c +++ b/arch/arm/probes/uprobes/core.c @@ -113,7 +113,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, void *src, unsigned long len) { - void *xol_page_kaddr = kmap_atomic(page); + void *xol_page_kaddr = kmap_local_page(page); void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK); preempt_disable(); @@ -126,7 +126,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, preempt_enable(); - kunmap_atomic(xol_page_kaddr); + kunmap_local(xol_page_kaddr); } diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c index c451ca17a656..2a9b0bc083a3 100644 --- a/arch/arm64/kernel/probes/uprobes.c +++ b/arch/arm64/kernel/probes/uprobes.c @@ -15,7 +15,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, void *src, unsigned long len) { - void *xol_page_kaddr = kmap_atomic(page); + void *xol_page_kaddr = kmap_local_page(page); void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK); /* @@ -32,7 +32,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, sync_icache_aliases((unsigned long)dst, (unsigned long)dst + len); done: - kunmap_atomic(xol_page_kaddr); + kunmap_local(xol_page_kaddr); } unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) diff --git a/arch/mips/kernel/uprobes.c b/arch/mips/kernel/uprobes.c index 401b148f8917..05cfc320992b 100644 --- a/arch/mips/kernel/uprobes.c +++ b/arch/mips/kernel/uprobes.c @@ -214,11 +214,11 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, unsigned long kaddr, kstart; /* Initialize the slot */ - kaddr = (unsigned long)kmap_atomic(page); + kaddr = (unsigned long)kmap_local_page(page); kstart = kaddr + (vaddr & ~PAGE_MASK); memcpy((void *)kstart, src, len); flush_icache_range(kstart, kstart + len); - kunmap_atomic((void *)kaddr); + kunmap_local((void *)kaddr); } /** diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c index cc15f7ca6cc1..f0d0691a8688 100644 --- a/arch/riscv/kernel/probes/uprobes.c +++ b/arch/riscv/kernel/probes/uprobes.c @@ -165,7 +165,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, void *src, unsigned long len) { /* Initialize the slot */ - void *kaddr = kmap_atomic(page); + void *kaddr = kmap_local_page(page); void *dst = kaddr + (vaddr & ~PAGE_MASK); unsigned long start = (unsigned long)dst; @@ -178,5 +178,5 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, } flush_icache_range(start, start + len); - kunmap_atomic(kaddr); + kunmap_local(kaddr); } diff --git a/arch/s390/include/asm/Kbuild b/arch/s390/include/asm/Kbuild index 297bf7157968..80bad7de7a04 100644 --- a/arch/s390/include/asm/Kbuild +++ b/arch/s390/include/asm/Kbuild @@ -5,6 +5,5 @@ generated-y += syscall_table.h generated-y += unistd_nr.h generic-y += asm-offsets.h -generic-y += kvm_types.h generic-y += mcs_spinlock.h generic-y += mmzone.h diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c index 94e626cc6a07..a9b72997103d 100644 --- a/arch/x86/entry/entry_fred.c +++ b/arch/x86/entry/entry_fred.c @@ -114,6 +114,7 @@ static idtentry_t sysvec_table[NR_SYSTEM_VECTORS] __ro_after_init = { SYSVEC(IRQ_WORK_VECTOR, irq_work), + SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, perf_guest_mediated_pmi_handler), SYSVEC(POSTED_INTR_VECTOR, kvm_posted_intr_ipi), 
SYSVEC(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi), SYSVEC(POSTED_INTR_NESTED_VECTOR, kvm_posted_intr_nested_ipi), diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index 44656d2fb555..0c92ed5f464b 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -1439,6 +1439,8 @@ static int __init amd_core_pmu_init(void) amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64; + x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU; + /* Update PMC handling functions */ x86_pmu.enable_all = amd_pmu_v2_enable_all; x86_pmu.disable_all = amd_pmu_v2_disable_all; diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 576baa9a52c5..d4468efbdcd9 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -30,6 +30,7 @@ #include <linux/device.h> #include <linux/nospec.h> #include <linux/static_call.h> +#include <linux/kvm_types.h> #include <asm/apic.h> #include <asm/stacktrace.h> @@ -56,6 +57,8 @@ DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = { .pmu = &pmu, }; +static DEFINE_PER_CPU(bool, guest_lvtpc_loaded); + DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key); DEFINE_STATIC_KEY_FALSE(rdpmc_always_available_key); DEFINE_STATIC_KEY_FALSE(perf_is_hybrid); @@ -1760,6 +1763,25 @@ void perf_events_lapic_init(void) apic_write(APIC_LVTPC, APIC_DM_NMI); } +#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU +void perf_load_guest_lvtpc(u32 guest_lvtpc) +{ + u32 masked = guest_lvtpc & APIC_LVT_MASKED; + + apic_write(APIC_LVTPC, + APIC_DM_FIXED | PERF_GUEST_MEDIATED_PMI_VECTOR | masked); + this_cpu_write(guest_lvtpc_loaded, true); +} +EXPORT_SYMBOL_FOR_KVM(perf_load_guest_lvtpc); + +void perf_put_guest_lvtpc(void) +{ + this_cpu_write(guest_lvtpc_loaded, false); + apic_write(APIC_LVTPC, APIC_DM_NMI); +} +EXPORT_SYMBOL_FOR_KVM(perf_put_guest_lvtpc); +#endif /* CONFIG_PERF_GUEST_MEDIATED_PMU */ + static int perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs) { @@ -1768,6 +1790,17 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs) int ret; /* + * Ignore all NMIs when the CPU's LVTPC is configured to route PMIs to + * PERF_GUEST_MEDIATED_PMI_VECTOR, i.e. when an NMI time can't be due + * to a PMI. Attempting to handle a PMI while the guest's context is + * loaded will generate false positives and clobber guest state. Note, + * the LVTPC is switched to/from the dedicated mediated PMI IRQ vector + * while host events are quiesced. + */ + if (this_cpu_read(guest_lvtpc_loaded)) + return NMI_DONE; + + /* * All PMUs/events that share this PMI handler should make sure to * increment active_events for their events. */ @@ -2130,7 +2163,8 @@ static int __init init_hw_perf_events(void) pr_cont("%s PMU driver.\n", x86_pmu.name); - x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */ + /* enable userspace RDPMC usage by default */ + x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE; for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next) quirk->func(); @@ -2582,6 +2616,27 @@ static ssize_t get_attr_rdpmc(struct device *cdev, return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc); } +/* + * Behaviors of rdpmc value: + * - rdpmc = 0 + * global user space rdpmc and counter level's user space rdpmc of all + * counters are both disabled. + * - rdpmc = 1 + * global user space rdpmc is enabled in mmap enabled time window and + * counter level's user space rdpmc is enabled for only non system-wide + * events. Counter level's user space rdpmc of system-wide events is + * still disabled by default. 
This won't introduce counter data leak for + * non system-wide events since their count data would be cleared when + * context switches. + * - rdpmc = 2 + * global user space rdpmc and counter level's user space rdpmc of all + * counters are enabled unconditionally. + * + * Suppose the rdpmc value won't be changed frequently, don't dynamically + * reschedule events to make the new rpdmc value take effect on active perf + * events immediately, the new rdpmc value would only impact the new + * activated perf events. This makes code simpler and cleaner. + */ static ssize_t set_attr_rdpmc(struct device *cdev, struct device_attribute *attr, const char *buf, size_t count) @@ -2610,12 +2665,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev, */ if (val == 0) static_branch_inc(&rdpmc_never_available_key); - else if (x86_pmu.attr_rdpmc == 0) + else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE) static_branch_dec(&rdpmc_never_available_key); if (val == 2) static_branch_inc(&rdpmc_always_available_key); - else if (x86_pmu.attr_rdpmc == 2) + else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE) static_branch_dec(&rdpmc_always_available_key); on_each_cpu(cr4_update_pce, NULL, 1); @@ -3073,11 +3128,12 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap) cap->version = x86_pmu.version; cap->num_counters_gp = x86_pmu_num_counters(NULL); cap->num_counters_fixed = x86_pmu_num_counters_fixed(NULL); - cap->bit_width_gp = x86_pmu.cntval_bits; - cap->bit_width_fixed = x86_pmu.cntval_bits; + cap->bit_width_gp = cap->num_counters_gp ? x86_pmu.cntval_bits : 0; + cap->bit_width_fixed = cap->num_counters_fixed ? x86_pmu.cntval_bits : 0; cap->events_mask = (unsigned int)x86_pmu.events_maskl; cap->events_mask_len = x86_pmu.events_mask_len; cap->pebs_ept = x86_pmu.pebs_ept; + cap->mediated = !!(pmu.capabilities & PERF_PMU_CAP_MEDIATED_VPMU); } EXPORT_SYMBOL_FOR_KVM(perf_get_x86_pmu_capability); diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index bdf3f0d0fe21..f3ae1f8ee3cd 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -232,6 +232,29 @@ static struct event_constraint intel_skt_event_constraints[] __read_mostly = { EVENT_CONSTRAINT_END }; +static struct event_constraint intel_arw_event_constraints[] __read_mostly = { + FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ + FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ + FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */ + FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */ + FIXED_EVENT_CONSTRAINT(0x0073, 4), /* TOPDOWN_BAD_SPECULATION.ALL */ + FIXED_EVENT_CONSTRAINT(0x019c, 5), /* TOPDOWN_FE_BOUND.ALL */ + FIXED_EVENT_CONSTRAINT(0x02c2, 6), /* TOPDOWN_RETIRING.ALL */ + INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1), + INTEL_UEVENT_CONSTRAINT(0x02b7, 0x2), + INTEL_UEVENT_CONSTRAINT(0x04b7, 0x4), + INTEL_UEVENT_CONSTRAINT(0x08b7, 0x8), + INTEL_UEVENT_CONSTRAINT(0x01d4, 0x1), + INTEL_UEVENT_CONSTRAINT(0x02d4, 0x2), + INTEL_UEVENT_CONSTRAINT(0x04d4, 0x4), + INTEL_UEVENT_CONSTRAINT(0x08d4, 0x8), + INTEL_UEVENT_CONSTRAINT(0x0175, 0x1), + INTEL_UEVENT_CONSTRAINT(0x0275, 0x2), + INTEL_UEVENT_CONSTRAINT(0x21d3, 0x1), + INTEL_UEVENT_CONSTRAINT(0x22d3, 0x1), + EVENT_CONSTRAINT_END +}; + static struct event_constraint intel_skl_event_constraints[] = { FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ @@ -435,6 +458,62 @@ static struct extra_reg intel_lnc_extra_regs[] 
__read_mostly = { EVENT_EXTRA_END }; +static struct event_constraint intel_pnc_event_constraints[] = { + FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ + FIXED_EVENT_CONSTRAINT(0x0100, 0), /* INST_RETIRED.PREC_DIST */ + FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ + FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */ + FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */ + FIXED_EVENT_CONSTRAINT(0x0400, 3), /* SLOTS */ + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_RETIRING, 0), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BAD_SPEC, 1), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FE_BOUND, 2), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BE_BOUND, 3), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_HEAVY_OPS, 4), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BR_MISPREDICT, 5), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FETCH_LAT, 6), + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_MEM_BOUND, 7), + + INTEL_EVENT_CONSTRAINT(0x20, 0xf), + INTEL_EVENT_CONSTRAINT(0x79, 0xf), + + INTEL_UEVENT_CONSTRAINT(0x0275, 0xf), + INTEL_UEVENT_CONSTRAINT(0x0176, 0xf), + INTEL_UEVENT_CONSTRAINT(0x04a4, 0x1), + INTEL_UEVENT_CONSTRAINT(0x08a4, 0x1), + INTEL_UEVENT_CONSTRAINT(0x01cd, 0xfc), + INTEL_UEVENT_CONSTRAINT(0x02cd, 0x3), + + INTEL_EVENT_CONSTRAINT(0xd0, 0xf), + INTEL_EVENT_CONSTRAINT(0xd1, 0xf), + INTEL_EVENT_CONSTRAINT(0xd4, 0xf), + INTEL_EVENT_CONSTRAINT(0xd6, 0xf), + INTEL_EVENT_CONSTRAINT(0xdf, 0xf), + INTEL_EVENT_CONSTRAINT(0xce, 0x1), + + INTEL_UEVENT_CONSTRAINT(0x01b1, 0x8), + INTEL_UEVENT_CONSTRAINT(0x0847, 0xf), + INTEL_UEVENT_CONSTRAINT(0x0446, 0xf), + INTEL_UEVENT_CONSTRAINT(0x0846, 0xf), + INTEL_UEVENT_CONSTRAINT(0x0148, 0xf), + + EVENT_CONSTRAINT_END +}; + +static struct extra_reg intel_pnc_extra_regs[] __read_mostly = { + /* must define OMR_X first, see intel_alt_er() */ + INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OMR_0, 0x40ffffff0000ffffull, OMR_0), + INTEL_UEVENT_EXTRA_REG(0x022a, MSR_OMR_1, 0x40ffffff0000ffffull, OMR_1), + INTEL_UEVENT_EXTRA_REG(0x042a, MSR_OMR_2, 0x40ffffff0000ffffull, OMR_2), + INTEL_UEVENT_EXTRA_REG(0x082a, MSR_OMR_3, 0x40ffffff0000ffffull, OMR_3), + INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd), + INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE), + INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE), + INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE), + INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE), + EVENT_EXTRA_END +}; + EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3"); EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3"); EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2"); @@ -650,6 +729,102 @@ static __initconst const u64 glc_hw_cache_extra_regs }, }; +static __initconst const u64 pnc_hw_cache_event_ids + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = +{ + [ C(L1D ) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = 0x81d0, + [ C(RESULT_MISS) ] = 0xe124, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = 0x82d0, + }, + }, + [ C(L1I ) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_MISS) ] = 0xe424, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + }, + [ C(LL ) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = 0x12a, + [ C(RESULT_MISS) ] = 0x12a, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = 0x12a, + [ C(RESULT_MISS) ] = 0x12a, + }, + }, + [ C(DTLB) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = 0x81d0, + [ C(RESULT_MISS) ] = 0xe12, + }, + [ C(OP_WRITE) ] = { + 
[ C(RESULT_ACCESS) ] = 0x82d0, + [ C(RESULT_MISS) ] = 0xe13, + }, + }, + [ C(ITLB) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = 0xe11, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + [ C(OP_PREFETCH) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + }, + [ C(BPU ) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = 0x4c4, + [ C(RESULT_MISS) ] = 0x4c5, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + [ C(OP_PREFETCH) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + }, + [ C(NODE) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = -1, + [ C(RESULT_MISS) ] = -1, + }, + }, +}; + +static __initconst const u64 pnc_hw_cache_extra_regs + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = +{ + [ C(LL ) ] = { + [ C(OP_READ) ] = { + [ C(RESULT_ACCESS) ] = 0x4000000000000001, + [ C(RESULT_MISS) ] = 0xFFFFF000000001, + }, + [ C(OP_WRITE) ] = { + [ C(RESULT_ACCESS) ] = 0x4000000000000002, + [ C(RESULT_MISS) ] = 0xFFFFF000000002, + }, + }, +}; + /* * Notes on the events: * - data reads do not include code reads (comparable to earlier tables) @@ -2167,6 +2342,26 @@ static __initconst const u64 tnt_hw_cache_extra_regs }, }; +static __initconst const u64 arw_hw_cache_extra_regs + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { + [C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = 0x4000000000000001, + [C(RESULT_MISS)] = 0xFFFFF000000001, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = 0x4000000000000002, + [C(RESULT_MISS)] = 0xFFFFF000000002, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = 0x0, + [C(RESULT_MISS)] = 0x0, + }, + }, +}; + EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_tnt, "event=0x71,umask=0x0"); EVENT_ATTR_STR(topdown-retiring, td_retiring_tnt, "event=0xc2,umask=0x0"); EVENT_ATTR_STR(topdown-bad-spec, td_bad_spec_tnt, "event=0x73,umask=0x6"); @@ -2225,6 +2420,22 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = { EVENT_EXTRA_END }; +static struct extra_reg intel_arw_extra_regs[] __read_mostly = { + /* must define OMR_X first, see intel_alt_er() */ + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0), + INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1), + INTEL_UEVENT_EXTRA_REG(0x04b7, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2), + INTEL_UEVENT_EXTRA_REG(0x08b7, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3), + INTEL_UEVENT_EXTRA_REG(0x01d4, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0), + INTEL_UEVENT_EXTRA_REG(0x02d4, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1), + INTEL_UEVENT_EXTRA_REG(0x04d4, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2), + INTEL_UEVENT_EXTRA_REG(0x08d4, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3), + INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0), + INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0), + INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1), + EVENT_EXTRA_END +}; + EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01"); EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02"); EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02"); @@ -2917,6 +3128,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event) bits |= INTEL_FIXED_0_USER; if (hwc->config & ARCH_PERFMON_EVENTSEL_OS) bits |= INTEL_FIXED_0_KERNEL; + if (hwc->config & 
ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE) + bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE; /* * ANY bit is supported in v3 and up @@ -3052,6 +3265,27 @@ static void intel_pmu_enable_event_ext(struct perf_event *event) __intel_pmu_update_event_ext(hwc->idx, ext); } +static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event) +{ + if (!x86_pmu_has_rdpmc_user_disable(event->pmu)) + return; + + /* + * Counter scope's user-space rdpmc is disabled by default + * except two cases. + * a. rdpmc = 2 (user space rdpmc enabled unconditionally) + * b. rdpmc = 1 and the event is not a system-wide event. + * The count of non-system-wide events would be cleared when + * context switches, so no count data is leaked. + */ + if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE || + (x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE && + event->ctx->task)) + event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; + else + event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; +} + DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext); static void intel_pmu_enable_event(struct perf_event *event) @@ -3060,6 +3294,8 @@ static void intel_pmu_enable_event(struct perf_event *event) struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; + intel_pmu_update_rdpmc_user_disable(event); + if (unlikely(event->attr.precise_ip)) static_call(x86_pmu_pebs_enable)(event); @@ -3532,17 +3768,32 @@ static int intel_alt_er(struct cpu_hw_events *cpuc, struct extra_reg *extra_regs = hybrid(cpuc->pmu, extra_regs); int alt_idx = idx; - if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1)) - return idx; - - if (idx == EXTRA_REG_RSP_0) - alt_idx = EXTRA_REG_RSP_1; + switch (idx) { + case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1: + if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1)) + return idx; + if (++alt_idx > EXTRA_REG_RSP_1) + alt_idx = EXTRA_REG_RSP_0; + if (config & ~extra_regs[alt_idx].valid_mask) + return idx; + break; - if (idx == EXTRA_REG_RSP_1) - alt_idx = EXTRA_REG_RSP_0; + case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3: + if (!(x86_pmu.flags & PMU_FL_HAS_OMR)) + return idx; + if (++alt_idx > EXTRA_REG_OMR_3) + alt_idx = EXTRA_REG_OMR_0; + /* + * Subtracting EXTRA_REG_OMR_0 ensures to get correct + * OMR extra_reg entries which start from 0. + */ + if (config & ~extra_regs[alt_idx - EXTRA_REG_OMR_0].valid_mask) + return idx; + break; - if (config & ~extra_regs[alt_idx].valid_mask) - return idx; + default: + break; + } return alt_idx; } @@ -3550,16 +3801,26 @@ static int intel_alt_er(struct cpu_hw_events *cpuc, static void intel_fixup_er(struct perf_event *event, int idx) { struct extra_reg *extra_regs = hybrid(event->pmu, extra_regs); - event->hw.extra_reg.idx = idx; + int er_idx; - if (idx == EXTRA_REG_RSP_0) { - event->hw.config &= ~INTEL_ARCH_EVENT_MASK; - event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event; - event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0; - } else if (idx == EXTRA_REG_RSP_1) { + event->hw.extra_reg.idx = idx; + switch (idx) { + case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1: + er_idx = idx - EXTRA_REG_RSP_0; event->hw.config &= ~INTEL_ARCH_EVENT_MASK; - event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event; - event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1; + event->hw.config |= extra_regs[er_idx].event; + event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0 + er_idx; + break; + + case EXTRA_REG_OMR_0 ... 
EXTRA_REG_OMR_3: + er_idx = idx - EXTRA_REG_OMR_0; + event->hw.config &= ~ARCH_PERFMON_EVENTSEL_UMASK; + event->hw.config |= 1ULL << (8 + er_idx); + event->hw.extra_reg.reg = MSR_OMR_0 + er_idx; + break; + + default: + pr_warn("The extra reg idx %d is not supported.\n", idx); } } @@ -5633,6 +5894,8 @@ static void update_pmu_cap(struct pmu *pmu) hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2; if (ebx_0.split.eq) hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ; + if (ebx_0.split.rdpmc_user_disable) + hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; if (eax_0.split.cntr_subleaf) { cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF, @@ -5695,6 +5958,8 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu) else pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS; + pmu->pmu.capabilities |= PERF_PMU_CAP_MEDIATED_VPMU; + intel_pmu_check_event_constraints_all(&pmu->pmu); intel_pmu_check_extra_regs(pmu->extra_regs); @@ -7209,6 +7474,20 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu) hybrid(pmu, extra_regs) = intel_lnc_extra_regs; } +static __always_inline void intel_pmu_init_pnc(struct pmu *pmu) +{ + intel_pmu_init_glc(pmu); + x86_pmu.flags &= ~PMU_FL_HAS_RSP_1; + x86_pmu.flags |= PMU_FL_HAS_OMR; + memcpy(hybrid_var(pmu, hw_cache_event_ids), + pnc_hw_cache_event_ids, sizeof(hw_cache_event_ids)); + memcpy(hybrid_var(pmu, hw_cache_extra_regs), + pnc_hw_cache_extra_regs, sizeof(hw_cache_extra_regs)); + hybrid(pmu, event_constraints) = intel_pnc_event_constraints; + hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints; + hybrid(pmu, extra_regs) = intel_pnc_extra_regs; +} + static __always_inline void intel_pmu_init_skt(struct pmu *pmu) { intel_pmu_init_grt(pmu); @@ -7217,6 +7496,19 @@ static __always_inline void intel_pmu_init_skt(struct pmu *pmu) static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr); } +static __always_inline void intel_pmu_init_arw(struct pmu *pmu) +{ + intel_pmu_init_grt(pmu); + x86_pmu.flags &= ~PMU_FL_HAS_RSP_1; + x86_pmu.flags |= PMU_FL_HAS_OMR; + memcpy(hybrid_var(pmu, hw_cache_extra_regs), + arw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs)); + hybrid(pmu, event_constraints) = intel_arw_event_constraints; + hybrid(pmu, pebs_constraints) = intel_arw_pebs_event_constraints; + hybrid(pmu, extra_regs) = intel_arw_extra_regs; + static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr); +} + __init int intel_pmu_init(void) { struct attribute **extra_skl_attr = &empty_attrs; @@ -7314,6 +7606,9 @@ __init int intel_pmu_init(void) pr_cont(" AnyThread deprecated, "); } + /* The perf side of core PMU is ready to support the mediated vPMU. */ + x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU; + /* * Many features on and after V6 require dynamic constraint, * e.g., Arch PEBS, ACR. 
@@ -7405,6 +7700,7 @@ __init int intel_pmu_init(void) case INTEL_ATOM_SILVERMONT_D: case INTEL_ATOM_SILVERMONT_MID: case INTEL_ATOM_AIRMONT: + case INTEL_ATOM_AIRMONT_NP: case INTEL_ATOM_SILVERMONT_MID2: memcpy(hw_cache_event_ids, slm_hw_cache_event_ids, sizeof(hw_cache_event_ids)); @@ -7866,9 +8162,21 @@ __init int intel_pmu_init(void) x86_pmu.extra_regs = intel_rwc_extra_regs; pr_cont("Granite Rapids events, "); name = "granite_rapids"; + goto glc_common; + + case INTEL_DIAMONDRAPIDS_X: + intel_pmu_init_pnc(NULL); + x86_pmu.pebs_latency_data = pnc_latency_data; + + pr_cont("Panthercove events, "); + name = "panthercove"; + goto glc_base; glc_common: intel_pmu_init_glc(NULL); + intel_pmu_pebs_data_source_skl(true); + + glc_base: x86_pmu.pebs_ept = 1; x86_pmu.hw_config = hsw_hw_config; x86_pmu.get_event_constraints = glc_get_event_constraints; @@ -7878,7 +8186,6 @@ __init int intel_pmu_init(void) mem_attr = glc_events_attrs; td_attr = glc_td_events_attrs; tsx_attr = glc_tsx_events_attrs; - intel_pmu_pebs_data_source_skl(true); break; case INTEL_ALDERLAKE: @@ -8042,6 +8349,33 @@ __init int intel_pmu_init(void) name = "arrowlake_h_hybrid"; break; + case INTEL_NOVALAKE: + case INTEL_NOVALAKE_L: + pr_cont("Novalake Hybrid events, "); + name = "novalake_hybrid"; + intel_pmu_init_hybrid(hybrid_big_small); + + x86_pmu.pebs_latency_data = nvl_latency_data; + x86_pmu.get_event_constraints = mtl_get_event_constraints; + x86_pmu.hw_config = adl_hw_config; + + td_attr = lnl_hybrid_events_attrs; + mem_attr = mtl_hybrid_mem_attrs; + tsx_attr = adl_hybrid_tsx_attrs; + extra_attr = boot_cpu_has(X86_FEATURE_RTM) ? + mtl_hybrid_extra_attr_rtm : mtl_hybrid_extra_attr; + + /* Initialize big core specific PerfMon capabilities.*/ + pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX]; + intel_pmu_init_pnc(&pmu->pmu); + + /* Initialize Atom core specific PerfMon capabilities.*/ + pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX]; + intel_pmu_init_arw(&pmu->pmu); + + intel_pmu_pebs_data_source_lnl(); + break; + default: switch (x86_pmu.version) { case 1: diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c index fa67fda6e45b..f3d5ee07f8f2 100644 --- a/arch/x86/events/intel/cstate.c +++ b/arch/x86/events/intel/cstate.c @@ -41,7 +41,7 @@ * MSR_CORE_C1_RES: CORE C1 Residency Counter * perf code: 0x00 * Available model: SLM,AMT,GLM,CNL,ICX,TNT,ADL,RPL - * MTL,SRF,GRR,ARL,LNL,PTL + * MTL,SRF,GRR,ARL,LNL,PTL,WCL,NVL * Scope: Core (each processor core has a MSR) * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter * perf code: 0x01 @@ -53,19 +53,20 @@ * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW, * SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX, * TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF, - * GRR,ARL,LNL,PTL + * GRR,ARL,LNL,PTL,WCL,NVL * Scope: Core * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter * perf code: 0x03 * Available model: SNB,IVB,HSW,BDW,SKL,CNL,KBL,CML, * ICL,TGL,RKL,ADL,RPL,MTL,ARL,LNL, - * PTL + * PTL,WCL,NVL * Scope: Core * MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter. * perf code: 0x00 * Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL, * KBL,CML,ICL,ICX,TGL,TNT,RKL,ADL, - * RPL,SPR,MTL,ARL,LNL,SRF,PTL + * RPL,SPR,MTL,ARL,LNL,SRF,PTL,WCL, + * NVL * Scope: Package (physical package) * MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter. 
* perf code: 0x01 @@ -78,7 +79,7 @@ * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW, * SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX, * TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF, - * ARL,LNL,PTL + * ARL,LNL,PTL,WCL,NVL * Scope: Package (physical package) * MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter. * perf code: 0x03 @@ -97,11 +98,12 @@ * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter. * perf code: 0x06 * Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL, - * TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL + * TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL, + * WCL,NVL * Scope: Package (physical package) * MSR_MODULE_C6_RES_MS: Module C6 Residency Counter. * perf code: 0x00 - * Available model: SRF,GRR + * Available model: SRF,GRR,NVL * Scope: A cluster of cores shared L2 cache * */ @@ -527,6 +529,18 @@ static const struct cstate_model lnl_cstates __initconst = { BIT(PERF_CSTATE_PKG_C10_RES), }; +static const struct cstate_model nvl_cstates __initconst = { + .core_events = BIT(PERF_CSTATE_CORE_C1_RES) | + BIT(PERF_CSTATE_CORE_C6_RES) | + BIT(PERF_CSTATE_CORE_C7_RES), + + .module_events = BIT(PERF_CSTATE_MODULE_C6_RES), + + .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) | + BIT(PERF_CSTATE_PKG_C6_RES) | + BIT(PERF_CSTATE_PKG_C10_RES), +}; + static const struct cstate_model slm_cstates __initconst = { .core_events = BIT(PERF_CSTATE_CORE_C1_RES) | BIT(PERF_CSTATE_CORE_C6_RES), @@ -599,6 +613,7 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = { X86_MATCH_VFM(INTEL_ATOM_SILVERMONT, &slm_cstates), X86_MATCH_VFM(INTEL_ATOM_SILVERMONT_D, &slm_cstates), X86_MATCH_VFM(INTEL_ATOM_AIRMONT, &slm_cstates), + X86_MATCH_VFM(INTEL_ATOM_AIRMONT_NP, &slm_cstates), X86_MATCH_VFM(INTEL_BROADWELL, &snb_cstates), X86_MATCH_VFM(INTEL_BROADWELL_D, &snb_cstates), @@ -638,6 +653,7 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = { X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X, &icx_cstates), X86_MATCH_VFM(INTEL_GRANITERAPIDS_X, &icx_cstates), X86_MATCH_VFM(INTEL_GRANITERAPIDS_D, &icx_cstates), + X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &srf_cstates), X86_MATCH_VFM(INTEL_TIGERLAKE_L, &icl_cstates), X86_MATCH_VFM(INTEL_TIGERLAKE, &icl_cstates), @@ -654,6 +670,9 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = { X86_MATCH_VFM(INTEL_ARROWLAKE_U, &adl_cstates), X86_MATCH_VFM(INTEL_LUNARLAKE_M, &lnl_cstates), X86_MATCH_VFM(INTEL_PANTHERLAKE_L, &lnl_cstates), + X86_MATCH_VFM(INTEL_WILDCATLAKE_L, &lnl_cstates), + X86_MATCH_VFM(INTEL_NOVALAKE, &nvl_cstates), + X86_MATCH_VFM(INTEL_NOVALAKE_L, &nvl_cstates), { }, }; MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match); diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index feb1c3cf63e4..5027afc97b65 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -34,6 +34,17 @@ struct pebs_record_32 { */ +union omr_encoding { + struct { + u8 omr_source : 4; + u8 omr_remote : 1; + u8 omr_hitm : 1; + u8 omr_snoop : 1; + u8 omr_promoted : 1; + }; + u8 omr_full; +}; + union intel_x86_pebs_dse { u64 val; struct { @@ -73,6 +84,30 @@ union intel_x86_pebs_dse { unsigned int lnc_addr_blk:1; unsigned int ld_reserved6:18; }; + struct { + unsigned int pnc_dse: 8; + unsigned int pnc_l2_miss:1; + unsigned int pnc_stlb_clean_hit:1; + unsigned int pnc_stlb_any_hit:1; + unsigned int pnc_stlb_miss:1; + unsigned int pnc_locked:1; + unsigned int pnc_data_blk:1; + unsigned int pnc_addr_blk:1; + unsigned int pnc_fb_full:1; + unsigned int ld_reserved8:16; + }; + struct { + unsigned int arw_dse:8; + unsigned int arw_l2_miss:1; + unsigned int arw_xq_promotion:1; + 
unsigned int arw_reissue:1; + unsigned int arw_stlb_miss:1; + unsigned int arw_locked:1; + unsigned int arw_data_blk:1; + unsigned int arw_addr_blk:1; + unsigned int arw_fb_full:1; + unsigned int ld_reserved9:16; + }; }; @@ -228,6 +263,108 @@ void __init intel_pmu_pebs_data_source_lnl(void) __intel_pmu_pebs_data_source_cmt(data_source); } +/* Version for Panthercove and later */ + +/* L2 hit */ +#define PNC_PEBS_DATA_SOURCE_MAX 16 +static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = { + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */ + OP_LH | LEVEL(L0) | P(SNOOP, NONE), /* 0x01: L0 hit */ + OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x02: L1 hit */ + OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x03: L1 Miss Handling Buffer hit */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x04: L2 Hit Clean */ + 0, /* 0x05: Reserved */ + 0, /* 0x06: Reserved */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x07: L2 Hit Snoop HIT */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HITM), /* 0x08: L2 Hit Snoop Hit Modified */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x09: Prefetch Promotion */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x0a: Cross Core Prefetch Promotion */ + 0, /* 0x0b: Reserved */ + 0, /* 0x0c: Reserved */ + 0, /* 0x0d: Reserved */ + 0, /* 0x0e: Reserved */ + OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */ +}; + +/* Version for Arctic Wolf and later */ + +/* L2 hit */ +#define ARW_PEBS_DATA_SOURCE_MAX 16 +static u64 arw_pebs_l2_hit_data_source[ARW_PEBS_DATA_SOURCE_MAX] = { + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */ + OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x01: L1 hit */ + OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: WCB Hit */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x03: L2 Hit Clean */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x04: L2 Hit Snoop HIT */ + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HITM), /* 0x05: L2 Hit Snoop Hit Modified */ + OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x06: uncached */ + 0, /* 0x07: Reserved */ + 0, /* 0x08: Reserved */ + 0, /* 0x09: Reserved */ + 0, /* 0x0a: Reserved */ + 0, /* 0x0b: Reserved */ + 0, /* 0x0c: Reserved */ + 0, /* 0x0d: Reserved */ + 0, /* 0x0e: Reserved */ + 0, /* 0x0f: Reserved */ +}; + +/* L2 miss */ +#define OMR_DATA_SOURCE_MAX 16 +static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = { + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: invalid */ + 0, /* 0x01: Reserved */ + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE), /* 0x02: local CA shared cache */ + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */ + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO), /* 0x04: other CA IO agent */ + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE), /* 0x05: other CA shared cache */ + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */ + OP_LH | LEVEL(RAM) | P(REGION, MMIO), /* 0x07: MMIO */ + OP_LH | LEVEL(RAM) | P(REGION, MEM0), /* 0x08: Memory region 0 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM1), /* 0x09: Memory region 1 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM2), /* 0x0a: Memory region 2 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM3), /* 0x0b: Memory region 3 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM4), /* 0x0c: Memory region 4 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM5), /* 0x0d: Memory region 5 */ + OP_LH | LEVEL(RAM) | 
P(REGION, MEM6), /* 0x0e: Memory region 6 */ + OP_LH | LEVEL(RAM) | P(REGION, MEM7), /* 0x0f: Memory region 7 */ +}; + +static u64 parse_omr_data_source(u8 dse) +{ + union omr_encoding omr; + u64 val = 0; + + omr.omr_full = dse; + val = omr_data_source[omr.omr_source]; + if (omr.omr_source > 0x1 && omr.omr_source < 0x7) + val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0; + else if (omr.omr_source > 0x7) + val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM); + + if (omr.omr_remote) + val |= REM; + + val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT); + + if (omr.omr_source == 0x2) { + u8 snoop = omr.omr_snoop | omr.omr_promoted; + + if (snoop == 0x0) + val |= P(SNOOP, NA); + else if (snoop == 0x1) + val |= P(SNOOP, MISS); + else if (snoop == 0x2) + val |= P(SNOOP, HIT); + else if (snoop == 0x3) + val |= P(SNOOP, NONE); + } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) { + val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0; + } + + return val; +} + static u64 precise_store_data(u64 status) { union intel_x86_pebs_dse dse; @@ -356,6 +493,44 @@ u64 cmt_latency_data(struct perf_event *event, u64 status) dse.mtl_fwd_blk); } +static u64 arw_latency_data(struct perf_event *event, u64 status) +{ + union intel_x86_pebs_dse dse; + union perf_mem_data_src src; + u64 val; + + dse.val = status; + + if (!dse.arw_l2_miss) + val = arw_pebs_l2_hit_data_source[dse.arw_dse & 0xf]; + else + val = parse_omr_data_source(dse.arw_dse); + + if (!val) + val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA); + + if (dse.arw_stlb_miss) + val |= P(TLB, MISS) | P(TLB, L2); + else + val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2); + + if (dse.arw_locked) + val |= P(LOCK, LOCKED); + + if (dse.arw_data_blk) + val |= P(BLK, DATA); + if (dse.arw_addr_blk) + val |= P(BLK, ADDR); + if (!dse.arw_data_blk && !dse.arw_addr_blk) + val |= P(BLK, NA); + + src.val = val; + if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW) + src.mem_op = P(OP, STORE); + + return src.val; +} + static u64 lnc_latency_data(struct perf_event *event, u64 status) { union intel_x86_pebs_dse dse; @@ -411,6 +586,54 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status) return lnl_latency_data(event, status); } +u64 pnc_latency_data(struct perf_event *event, u64 status) +{ + union intel_x86_pebs_dse dse; + union perf_mem_data_src src; + u64 val; + + dse.val = status; + + if (!dse.pnc_l2_miss) + val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf]; + else + val = parse_omr_data_source(dse.pnc_dse); + + if (!val) + val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA); + + if (dse.pnc_stlb_miss) + val |= P(TLB, MISS) | P(TLB, L2); + else + val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2); + + if (dse.pnc_locked) + val |= P(LOCK, LOCKED); + + if (dse.pnc_data_blk) + val |= P(BLK, DATA); + if (dse.pnc_addr_blk) + val |= P(BLK, ADDR); + if (!dse.pnc_data_blk && !dse.pnc_addr_blk) + val |= P(BLK, NA); + + src.val = val; + if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW) + src.mem_op = P(OP, STORE); + + return src.val; +} + +u64 nvl_latency_data(struct perf_event *event, u64 status) +{ + struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu); + + if (pmu->pmu_type == hybrid_small) + return arw_latency_data(event, status); + + return pnc_latency_data(event, status); +} + static u64 load_latency_data(struct perf_event *event, u64 status) { union intel_x86_pebs_dse dse; @@ -1070,6 +1293,17 @@ struct event_constraint intel_grt_pebs_event_constraints[] = { EVENT_CONSTRAINT_END }; +struct event_constraint intel_arw_pebs_event_constraints[] = { + /* Allow all events as PEBS with no flags 
*/ + INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xff), + INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xff), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x01d4, 0x1), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x02d4, 0x2), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x04d4, 0x4), + INTEL_FLAGS_UEVENT_CONSTRAINT(0x08d4, 0x8), + EVENT_CONSTRAINT_END +}; + struct event_constraint intel_nehalem_pebs_event_constraints[] = { INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */ INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */ @@ -1285,6 +1519,33 @@ struct event_constraint intel_lnc_pebs_event_constraints[] = { EVENT_CONSTRAINT_END }; +struct event_constraint intel_pnc_pebs_event_constraints[] = { + INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */ + INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL), + + INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc), + INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3), + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_STORES */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_INST_RETIRED.LOCK_LOADS */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf), /* MEM_INST_RETIRED.SPLIT_LOADS */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf), /* MEM_INST_RETIRED.SPLIT_STORES */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf), /* MEM_INST_RETIRED.ALL_LOADS */ + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf), /* MEM_INST_RETIRED.ALL_STORES */ + + INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf), + + INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf), + INTEL_FLAGS_EVENT_CONSTRAINT(0xd6, 0xf), + + /* + * Everything else is handled by PMU_FL_PEBS_ALL, because we + * need the full constraints from the main table. + */ + + EVENT_CONSTRAINT_END +}; + struct event_constraint *intel_pebs_constraints(struct perf_event *event) { struct event_constraint *pebs_constraints = hybrid(event->pmu, pebs_constraints); diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c index 6e41de355bd8..fb991e0ac614 100644 --- a/arch/x86/events/intel/p6.c +++ b/arch/x86/events/intel/p6.c @@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void) */ pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n"); x86_pmu.attr_rdpmc_broken = 1; - x86_pmu.attr_rdpmc = 0; + x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE; } } diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c index e228e564b15e..4684649109d9 100644 --- a/arch/x86/events/intel/uncore.c +++ b/arch/x86/events/intel/uncore.c @@ -436,7 +436,7 @@ uncore_get_event_constraint(struct intel_uncore_box *box, struct perf_event *eve if (type->constraints) { for_each_event_constraint(c, type->constraints) { - if ((event->hw.config & c->cmask) == c->code) + if (constraint_match(c, event->hw.config)) return c; } } @@ -1697,152 +1697,181 @@ err: return ret; } -struct intel_uncore_init_fun { - void (*cpu_init)(void); - int (*pci_init)(void); - void (*mmio_init)(void); - /* Discovery table is required */ - bool use_discovery; - /* The units in the discovery table should be ignored. */ - int *uncore_units_ignore; -}; +static int uncore_mmio_global_init(u64 ctl) +{ + void __iomem *io_addr; + + io_addr = ioremap(ctl, sizeof(ctl)); + if (!io_addr) + return -ENOMEM; + + /* Clear freeze bit (0) to enable all counters. 
*/ + writel(0, io_addr); -static const struct intel_uncore_init_fun nhm_uncore_init __initconst = { + iounmap(io_addr); + return 0; +} + +static const struct uncore_plat_init nhm_uncore_init __initconst = { .cpu_init = nhm_uncore_cpu_init, }; -static const struct intel_uncore_init_fun snb_uncore_init __initconst = { +static const struct uncore_plat_init snb_uncore_init __initconst = { .cpu_init = snb_uncore_cpu_init, .pci_init = snb_uncore_pci_init, }; -static const struct intel_uncore_init_fun ivb_uncore_init __initconst = { +static const struct uncore_plat_init ivb_uncore_init __initconst = { .cpu_init = snb_uncore_cpu_init, .pci_init = ivb_uncore_pci_init, }; -static const struct intel_uncore_init_fun hsw_uncore_init __initconst = { +static const struct uncore_plat_init hsw_uncore_init __initconst = { .cpu_init = snb_uncore_cpu_init, .pci_init = hsw_uncore_pci_init, }; -static const struct intel_uncore_init_fun bdw_uncore_init __initconst = { +static const struct uncore_plat_init bdw_uncore_init __initconst = { .cpu_init = snb_uncore_cpu_init, .pci_init = bdw_uncore_pci_init, }; -static const struct intel_uncore_init_fun snbep_uncore_init __initconst = { +static const struct uncore_plat_init snbep_uncore_init __initconst = { .cpu_init = snbep_uncore_cpu_init, .pci_init = snbep_uncore_pci_init, }; -static const struct intel_uncore_init_fun nhmex_uncore_init __initconst = { +static const struct uncore_plat_init nhmex_uncore_init __initconst = { .cpu_init = nhmex_uncore_cpu_init, }; -static const struct intel_uncore_init_fun ivbep_uncore_init __initconst = { +static const struct uncore_plat_init ivbep_uncore_init __initconst = { .cpu_init = ivbep_uncore_cpu_init, .pci_init = ivbep_uncore_pci_init, }; -static const struct intel_uncore_init_fun hswep_uncore_init __initconst = { +static const struct uncore_plat_init hswep_uncore_init __initconst = { .cpu_init = hswep_uncore_cpu_init, .pci_init = hswep_uncore_pci_init, }; -static const struct intel_uncore_init_fun bdx_uncore_init __initconst = { +static const struct uncore_plat_init bdx_uncore_init __initconst = { .cpu_init = bdx_uncore_cpu_init, .pci_init = bdx_uncore_pci_init, }; -static const struct intel_uncore_init_fun knl_uncore_init __initconst = { +static const struct uncore_plat_init knl_uncore_init __initconst = { .cpu_init = knl_uncore_cpu_init, .pci_init = knl_uncore_pci_init, }; -static const struct intel_uncore_init_fun skl_uncore_init __initconst = { +static const struct uncore_plat_init skl_uncore_init __initconst = { .cpu_init = skl_uncore_cpu_init, .pci_init = skl_uncore_pci_init, }; -static const struct intel_uncore_init_fun skx_uncore_init __initconst = { +static const struct uncore_plat_init skx_uncore_init __initconst = { .cpu_init = skx_uncore_cpu_init, .pci_init = skx_uncore_pci_init, }; -static const struct intel_uncore_init_fun icl_uncore_init __initconst = { +static const struct uncore_plat_init icl_uncore_init __initconst = { .cpu_init = icl_uncore_cpu_init, .pci_init = skl_uncore_pci_init, }; -static const struct intel_uncore_init_fun tgl_uncore_init __initconst = { +static const struct uncore_plat_init tgl_uncore_init __initconst = { .cpu_init = tgl_uncore_cpu_init, .mmio_init = tgl_uncore_mmio_init, }; -static const struct intel_uncore_init_fun tgl_l_uncore_init __initconst = { +static const struct uncore_plat_init tgl_l_uncore_init __initconst = { .cpu_init = tgl_uncore_cpu_init, .mmio_init = tgl_l_uncore_mmio_init, }; -static const struct intel_uncore_init_fun rkl_uncore_init __initconst = { +static const 
struct uncore_plat_init rkl_uncore_init __initconst = { .cpu_init = tgl_uncore_cpu_init, .pci_init = skl_uncore_pci_init, }; -static const struct intel_uncore_init_fun adl_uncore_init __initconst = { +static const struct uncore_plat_init adl_uncore_init __initconst = { .cpu_init = adl_uncore_cpu_init, .mmio_init = adl_uncore_mmio_init, }; -static const struct intel_uncore_init_fun mtl_uncore_init __initconst = { +static const struct uncore_plat_init mtl_uncore_init __initconst = { .cpu_init = mtl_uncore_cpu_init, .mmio_init = adl_uncore_mmio_init, }; -static const struct intel_uncore_init_fun lnl_uncore_init __initconst = { +static const struct uncore_plat_init lnl_uncore_init __initconst = { .cpu_init = lnl_uncore_cpu_init, .mmio_init = lnl_uncore_mmio_init, }; -static const struct intel_uncore_init_fun ptl_uncore_init __initconst = { +static const struct uncore_plat_init ptl_uncore_init __initconst = { .cpu_init = ptl_uncore_cpu_init, .mmio_init = ptl_uncore_mmio_init, - .use_discovery = true, + .domain[0].discovery_base = UNCORE_DISCOVERY_MSR, + .domain[0].global_init = uncore_mmio_global_init, }; -static const struct intel_uncore_init_fun icx_uncore_init __initconst = { +static const struct uncore_plat_init nvl_uncore_init __initconst = { + .cpu_init = nvl_uncore_cpu_init, + .mmio_init = ptl_uncore_mmio_init, + .domain[0].discovery_base = PACKAGE_UNCORE_DISCOVERY_MSR, + .domain[0].global_init = uncore_mmio_global_init, +}; + +static const struct uncore_plat_init icx_uncore_init __initconst = { .cpu_init = icx_uncore_cpu_init, .pci_init = icx_uncore_pci_init, .mmio_init = icx_uncore_mmio_init, }; -static const struct intel_uncore_init_fun snr_uncore_init __initconst = { +static const struct uncore_plat_init snr_uncore_init __initconst = { .cpu_init = snr_uncore_cpu_init, .pci_init = snr_uncore_pci_init, .mmio_init = snr_uncore_mmio_init, }; -static const struct intel_uncore_init_fun spr_uncore_init __initconst = { +static const struct uncore_plat_init spr_uncore_init __initconst = { .cpu_init = spr_uncore_cpu_init, .pci_init = spr_uncore_pci_init, .mmio_init = spr_uncore_mmio_init, - .use_discovery = true, - .uncore_units_ignore = spr_uncore_units_ignore, + .domain[0].base_is_pci = true, + .domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE, + .domain[0].units_ignore = spr_uncore_units_ignore, }; -static const struct intel_uncore_init_fun gnr_uncore_init __initconst = { +static const struct uncore_plat_init gnr_uncore_init __initconst = { .cpu_init = gnr_uncore_cpu_init, .pci_init = gnr_uncore_pci_init, .mmio_init = gnr_uncore_mmio_init, - .use_discovery = true, - .uncore_units_ignore = gnr_uncore_units_ignore, + .domain[0].base_is_pci = true, + .domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE, + .domain[0].units_ignore = gnr_uncore_units_ignore, +}; + +static const struct uncore_plat_init dmr_uncore_init __initconst = { + .pci_init = dmr_uncore_pci_init, + .mmio_init = dmr_uncore_mmio_init, + .domain[0].base_is_pci = true, + .domain[0].discovery_base = DMR_UNCORE_DISCOVERY_TABLE_DEVICE, + .domain[0].units_ignore = dmr_uncore_imh_units_ignore, + .domain[1].discovery_base = CBB_UNCORE_DISCOVERY_MSR, + .domain[1].units_ignore = dmr_uncore_cbb_units_ignore, + .domain[1].global_init = uncore_mmio_global_init, }; -static const struct intel_uncore_init_fun generic_uncore_init __initconst = { +static const struct uncore_plat_init generic_uncore_init __initconst = { .cpu_init = intel_uncore_generic_uncore_cpu_init, .pci_init = intel_uncore_generic_uncore_pci_init, .mmio_init = 
intel_uncore_generic_uncore_mmio_init, + .domain[0].base_is_pci = true, + .domain[0].discovery_base = PCI_ANY_ID, + .domain[1].discovery_base = UNCORE_DISCOVERY_MSR, }; static const struct x86_cpu_id intel_uncore_match[] __initconst = { @@ -1894,6 +1923,8 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = { X86_MATCH_VFM(INTEL_LUNARLAKE_M, &lnl_uncore_init), X86_MATCH_VFM(INTEL_PANTHERLAKE_L, &ptl_uncore_init), X86_MATCH_VFM(INTEL_WILDCATLAKE_L, &ptl_uncore_init), + X86_MATCH_VFM(INTEL_NOVALAKE, &nvl_uncore_init), + X86_MATCH_VFM(INTEL_NOVALAKE_L, &nvl_uncore_init), X86_MATCH_VFM(INTEL_SAPPHIRERAPIDS_X, &spr_uncore_init), X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X, &spr_uncore_init), X86_MATCH_VFM(INTEL_GRANITERAPIDS_X, &gnr_uncore_init), @@ -1903,14 +1934,25 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = { X86_MATCH_VFM(INTEL_ATOM_CRESTMONT_X, &gnr_uncore_init), X86_MATCH_VFM(INTEL_ATOM_CRESTMONT, &gnr_uncore_init), X86_MATCH_VFM(INTEL_ATOM_DARKMONT_X, &gnr_uncore_init), + X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &dmr_uncore_init), {}, }; MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match); +static bool uncore_use_discovery(struct uncore_plat_init *config) +{ + for (int i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) { + if (config->domain[i].discovery_base) + return true; + } + + return false; +} + static int __init intel_uncore_init(void) { const struct x86_cpu_id *id; - struct intel_uncore_init_fun *uncore_init; + struct uncore_plat_init *uncore_init; int pret = 0, cret = 0, mret = 0, ret; if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) @@ -1921,16 +1963,15 @@ static int __init intel_uncore_init(void) id = x86_match_cpu(intel_uncore_match); if (!id) { - if (!uncore_no_discover && intel_uncore_has_discovery_tables(NULL)) - uncore_init = (struct intel_uncore_init_fun *)&generic_uncore_init; - else + uncore_init = (struct uncore_plat_init *)&generic_uncore_init; + if (uncore_no_discover || !uncore_discovery(uncore_init)) return -ENODEV; } else { - uncore_init = (struct intel_uncore_init_fun *)id->driver_data; - if (uncore_no_discover && uncore_init->use_discovery) + uncore_init = (struct uncore_plat_init *)id->driver_data; + if (uncore_no_discover && uncore_use_discovery(uncore_init)) return -ENODEV; - if (uncore_init->use_discovery && - !intel_uncore_has_discovery_tables(uncore_init->uncore_units_ignore)) + if (uncore_use_discovery(uncore_init) && + !uncore_discovery(uncore_init)) return -ENODEV; } diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h index d8815fff7588..c35918c01afa 100644 --- a/arch/x86/events/intel/uncore.h +++ b/arch/x86/events/intel/uncore.h @@ -33,6 +33,8 @@ #define UNCORE_EXTRA_PCI_DEV_MAX 4 #define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff) +#define UNCORE_EVENT_CONSTRAINT_RANGE(c, e, n) \ + EVENT_CONSTRAINT_RANGE(c, e, n, 0xff) #define UNCORE_IGNORE_END -1 @@ -47,6 +49,25 @@ struct uncore_event_desc; struct freerunning_counters; struct intel_uncore_topology; +struct uncore_discovery_domain { + /* MSR address or PCI device used as the discovery base */ + u32 discovery_base; + bool base_is_pci; + int (*global_init)(u64 ctl); + + /* The units in the discovery table should be ignored. 
*/ + int *units_ignore; +}; + +#define UNCORE_DISCOVERY_DOMAINS 2 +struct uncore_plat_init { + void (*cpu_init)(void); + int (*pci_init)(void); + void (*mmio_init)(void); + + struct uncore_discovery_domain domain[UNCORE_DISCOVERY_DOMAINS]; +}; + struct intel_uncore_type { const char *name; int num_counters; @@ -597,6 +618,8 @@ extern struct pci_extra_dev *uncore_extra_pci_dev; extern struct event_constraint uncore_constraint_empty; extern int spr_uncore_units_ignore[]; extern int gnr_uncore_units_ignore[]; +extern int dmr_uncore_imh_units_ignore[]; +extern int dmr_uncore_cbb_units_ignore[]; /* uncore_snb.c */ int snb_uncore_pci_init(void); @@ -613,6 +636,7 @@ void adl_uncore_cpu_init(void); void lnl_uncore_cpu_init(void); void mtl_uncore_cpu_init(void); void ptl_uncore_cpu_init(void); +void nvl_uncore_cpu_init(void); void tgl_uncore_mmio_init(void); void tgl_l_uncore_mmio_init(void); void adl_uncore_mmio_init(void); @@ -645,6 +669,8 @@ void spr_uncore_mmio_init(void); int gnr_uncore_pci_init(void); void gnr_uncore_cpu_init(void); void gnr_uncore_mmio_init(void); +int dmr_uncore_pci_init(void); +void dmr_uncore_mmio_init(void); /* uncore_nhmex.c */ void nhmex_uncore_cpu_init(void); diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c index 7d57ce706feb..b46575254dbe 100644 --- a/arch/x86/events/intel/uncore_discovery.c +++ b/arch/x86/events/intel/uncore_discovery.c @@ -12,24 +12,6 @@ static struct rb_root discovery_tables = RB_ROOT; static int num_discovered_types[UNCORE_ACCESS_MAX]; -static bool has_generic_discovery_table(void) -{ - struct pci_dev *dev; - int dvsec; - - dev = pci_get_device(PCI_VENDOR_ID_INTEL, UNCORE_DISCOVERY_TABLE_DEVICE, NULL); - if (!dev) - return false; - - /* A discovery table device has the unique capability ID. 
*/ - dvsec = pci_find_next_ext_capability(dev, 0, UNCORE_EXT_CAP_ID_DISCOVERY); - pci_dev_put(dev); - if (dvsec) - return true; - - return false; -} - static int logical_die_id; static int get_device_die_id(struct pci_dev *dev) @@ -52,7 +34,7 @@ static int get_device_die_id(struct pci_dev *dev) static inline int __type_cmp(const void *key, const struct rb_node *b) { - struct intel_uncore_discovery_type *type_b = __node_2_type(b); + const struct intel_uncore_discovery_type *type_b = __node_2_type(b); const u16 *type_id = key; if (type_b->type > *type_id) @@ -115,7 +97,7 @@ get_uncore_discovery_type(struct uncore_unit_discovery *unit) static inline int pmu_idx_cmp(const void *key, const struct rb_node *b) { - struct intel_uncore_discovery_unit *unit; + const struct intel_uncore_discovery_unit *unit; const unsigned int *id = key; unit = rb_entry(b, struct intel_uncore_discovery_unit, node); @@ -173,7 +155,7 @@ int intel_uncore_find_discovery_unit_id(struct rb_root *units, int die, static inline bool unit_less(struct rb_node *a, const struct rb_node *b) { - struct intel_uncore_discovery_unit *a_node, *b_node; + const struct intel_uncore_discovery_unit *a_node, *b_node; a_node = rb_entry(a, struct intel_uncore_discovery_unit, node); b_node = rb_entry(b, struct intel_uncore_discovery_unit, node); @@ -259,23 +241,24 @@ uncore_insert_box_info(struct uncore_unit_discovery *unit, } static bool -uncore_ignore_unit(struct uncore_unit_discovery *unit, int *ignore) +uncore_ignore_unit(struct uncore_unit_discovery *unit, + struct uncore_discovery_domain *domain) { int i; - if (!ignore) + if (!domain || !domain->units_ignore) return false; - for (i = 0; ignore[i] != UNCORE_IGNORE_END ; i++) { - if (unit->box_type == ignore[i]) + for (i = 0; domain->units_ignore[i] != UNCORE_IGNORE_END ; i++) { + if (unit->box_type == domain->units_ignore[i]) return true; } return false; } -static int __parse_discovery_table(resource_size_t addr, int die, - bool *parsed, int *ignore) +static int __parse_discovery_table(struct uncore_discovery_domain *domain, + resource_size_t addr, int die, bool *parsed) { struct uncore_global_discovery global; struct uncore_unit_discovery unit; @@ -303,6 +286,9 @@ static int __parse_discovery_table(resource_size_t addr, int die, if (!io_addr) return -ENOMEM; + if (domain->global_init && domain->global_init(global.ctl)) + return -ENODEV; + /* Parsing Unit Discovery State */ for (i = 0; i < global.max_units; i++) { memcpy_fromio(&unit, io_addr + (i + 1) * (global.stride * 8), @@ -314,7 +300,7 @@ static int __parse_discovery_table(resource_size_t addr, int die, if (unit.access_type >= UNCORE_ACCESS_MAX) continue; - if (uncore_ignore_unit(&unit, ignore)) + if (uncore_ignore_unit(&unit, domain)) continue; uncore_insert_box_info(&unit, die); @@ -325,9 +311,9 @@ static int __parse_discovery_table(resource_size_t addr, int die, return 0; } -static int parse_discovery_table(struct pci_dev *dev, int die, - u32 bar_offset, bool *parsed, - int *ignore) +static int parse_discovery_table(struct uncore_discovery_domain *domain, + struct pci_dev *dev, int die, + u32 bar_offset, bool *parsed) { resource_size_t addr; u32 val; @@ -347,20 +333,17 @@ static int parse_discovery_table(struct pci_dev *dev, int die, } #endif - return __parse_discovery_table(addr, die, parsed, ignore); + return __parse_discovery_table(domain, addr, die, parsed); } -static bool intel_uncore_has_discovery_tables_pci(int *ignore) +static bool uncore_discovery_pci(struct uncore_discovery_domain *domain) { u32 device, val, entry_id, 
bar_offset; int die, dvsec = 0, ret = true; struct pci_dev *dev = NULL; bool parsed = false; - if (has_generic_discovery_table()) - device = UNCORE_DISCOVERY_TABLE_DEVICE; - else - device = PCI_ANY_ID; + device = domain->discovery_base; /* * Start a new search and iterates through the list of @@ -386,7 +369,7 @@ static bool intel_uncore_has_discovery_tables_pci(int *ignore) if (die < 0) continue; - parse_discovery_table(dev, die, bar_offset, &parsed, ignore); + parse_discovery_table(domain, dev, die, bar_offset, &parsed); } } @@ -399,7 +382,7 @@ err: return ret; } -static bool intel_uncore_has_discovery_tables_msr(int *ignore) +static bool uncore_discovery_msr(struct uncore_discovery_domain *domain) { unsigned long *die_mask; bool parsed = false; @@ -417,13 +400,13 @@ static bool intel_uncore_has_discovery_tables_msr(int *ignore) if (__test_and_set_bit(die, die_mask)) continue; - if (rdmsrq_safe_on_cpu(cpu, UNCORE_DISCOVERY_MSR, &base)) + if (rdmsrq_safe_on_cpu(cpu, domain->discovery_base, &base)) continue; if (!base) continue; - __parse_discovery_table(base, die, &parsed, ignore); + __parse_discovery_table(domain, base, die, &parsed); } cpus_read_unlock(); @@ -432,10 +415,23 @@ static bool intel_uncore_has_discovery_tables_msr(int *ignore) return parsed; } -bool intel_uncore_has_discovery_tables(int *ignore) +bool uncore_discovery(struct uncore_plat_init *init) { - return intel_uncore_has_discovery_tables_msr(ignore) || - intel_uncore_has_discovery_tables_pci(ignore); + struct uncore_discovery_domain *domain; + bool ret = false; + int i; + + for (i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) { + domain = &init->domain[i]; + if (domain->discovery_base) { + if (!domain->base_is_pci) + ret |= uncore_discovery_msr(domain); + else + ret |= uncore_discovery_pci(domain); + } + } + + return ret; } void intel_uncore_clear_discovery_tables(void) diff --git a/arch/x86/events/intel/uncore_discovery.h b/arch/x86/events/intel/uncore_discovery.h index dff75c98e22f..e1330342b92e 100644 --- a/arch/x86/events/intel/uncore_discovery.h +++ b/arch/x86/events/intel/uncore_discovery.h @@ -2,9 +2,15 @@ /* Store the full address of the global discovery table */ #define UNCORE_DISCOVERY_MSR 0x201e +/* Base address of uncore perfmon discovery table for CBB domain */ +#define CBB_UNCORE_DISCOVERY_MSR 0x710 +/* Base address of uncore perfmon discovery table for the package */ +#define PACKAGE_UNCORE_DISCOVERY_MSR 0x711 /* Generic device ID of a discovery table device */ #define UNCORE_DISCOVERY_TABLE_DEVICE 0x09a7 +/* Device ID used on DMR */ +#define DMR_UNCORE_DISCOVERY_TABLE_DEVICE 0x09a1 /* Capability ID for a discovery table device */ #define UNCORE_EXT_CAP_ID_DISCOVERY 0x23 /* First DVSEC offset */ @@ -136,7 +142,7 @@ struct intel_uncore_discovery_type { u16 num_units; /* number of units */ }; -bool intel_uncore_has_discovery_tables(int *ignore); +bool uncore_discovery(struct uncore_plat_init *init); void intel_uncore_clear_discovery_tables(void); void intel_uncore_generic_uncore_cpu_init(void); int intel_uncore_generic_uncore_pci_init(void); diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c index 807e582b8f17..3dbc6bacbd9d 100644 --- a/arch/x86/events/intel/uncore_snb.c +++ b/arch/x86/events/intel/uncore_snb.c @@ -245,6 +245,30 @@ #define MTL_UNC_HBO_CTR 0x2048 #define MTL_UNC_HBO_CTRL 0x2042 +/* PTL Low Power Bridge register */ +#define PTL_UNC_IA_CORE_BRIDGE_PER_CTR0 0x2028 +#define PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0 0x2022 + +/* PTL Santa register */ +#define 
PTL_UNC_SANTA_CTR0 0x2418 +#define PTL_UNC_SANTA_CTRL0 0x2412 + +/* PTL cNCU register */ +#define PTL_UNC_CNCU_MSR_OFFSET 0x140 + +/* NVL cNCU register */ +#define NVL_UNC_CNCU_BOX_CTL 0x202e +#define NVL_UNC_CNCU_FIXED_CTR 0x2028 +#define NVL_UNC_CNCU_FIXED_CTRL 0x2022 + +/* NVL SANTA register */ +#define NVL_UNC_SANTA_CTR0 0x2048 +#define NVL_UNC_SANTA_CTRL0 0x2042 + +/* NVL CBOX register */ +#define NVL_UNC_CBOX_PER_CTR0 0x2108 +#define NVL_UNC_CBOX_PERFEVTSEL0 0x2102 + DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7"); DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15"); DEFINE_UNCORE_FORMAT_ATTR(chmask, chmask, "config:8-11"); @@ -1921,8 +1945,36 @@ void ptl_uncore_mmio_init(void) ptl_uncores); } +static struct intel_uncore_type ptl_uncore_ia_core_bridge = { + .name = "ia_core_bridge", + .num_counters = 2, + .num_boxes = 1, + .perf_ctr_bits = 48, + .perf_ctr = PTL_UNC_IA_CORE_BRIDGE_PER_CTR0, + .event_ctl = PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0, + .event_mask = ADL_UNC_RAW_EVENT_MASK, + .ops = &icl_uncore_msr_ops, + .format_group = &adl_uncore_format_group, +}; + +static struct intel_uncore_type ptl_uncore_santa = { + .name = "santa", + .num_counters = 2, + .num_boxes = 2, + .perf_ctr_bits = 48, + .perf_ctr = PTL_UNC_SANTA_CTR0, + .event_ctl = PTL_UNC_SANTA_CTRL0, + .event_mask = ADL_UNC_RAW_EVENT_MASK, + .msr_offset = SNB_UNC_CBO_MSR_OFFSET, + .ops = &icl_uncore_msr_ops, + .format_group = &adl_uncore_format_group, +}; + static struct intel_uncore_type *ptl_msr_uncores[] = { &mtl_uncore_cbox, + &ptl_uncore_ia_core_bridge, + &ptl_uncore_santa, + &mtl_uncore_cncu, NULL }; @@ -1930,7 +1982,40 @@ void ptl_uncore_cpu_init(void) { mtl_uncore_cbox.num_boxes = 6; mtl_uncore_cbox.ops = &lnl_uncore_msr_ops; + + mtl_uncore_cncu.num_counters = 2; + mtl_uncore_cncu.num_boxes = 2; + mtl_uncore_cncu.msr_offset = PTL_UNC_CNCU_MSR_OFFSET; + mtl_uncore_cncu.single_fixed = 0; + uncore_msr_uncores = ptl_msr_uncores; } /* end of Panther Lake uncore support */ + +/* Nova Lake uncore support */ + +static struct intel_uncore_type *nvl_msr_uncores[] = { + &mtl_uncore_cbox, + &ptl_uncore_santa, + &mtl_uncore_cncu, + NULL +}; + +void nvl_uncore_cpu_init(void) +{ + mtl_uncore_cbox.num_boxes = 12; + mtl_uncore_cbox.perf_ctr = NVL_UNC_CBOX_PER_CTR0; + mtl_uncore_cbox.event_ctl = NVL_UNC_CBOX_PERFEVTSEL0; + + ptl_uncore_santa.perf_ctr = NVL_UNC_SANTA_CTR0; + ptl_uncore_santa.event_ctl = NVL_UNC_SANTA_CTRL0; + + mtl_uncore_cncu.box_ctl = NVL_UNC_CNCU_BOX_CTL; + mtl_uncore_cncu.fixed_ctr = NVL_UNC_CNCU_FIXED_CTR; + mtl_uncore_cncu.fixed_ctl = NVL_UNC_CNCU_FIXED_CTRL; + + uncore_msr_uncores = nvl_msr_uncores; +} + +/* end of Nova Lake uncore support */ diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index e1f370b8d065..7ca0429c4004 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -471,6 +471,18 @@ #define SPR_C0_MSR_PMON_BOX_FILTER0 0x200e +/* DMR */ +#define DMR_IMH1_HIOP_MMIO_BASE 0x1ffff6ae7000 +#define DMR_HIOP_MMIO_SIZE 0x8000 +#define DMR_CXLCM_EVENT_MASK_EXT 0xf +#define DMR_HAMVF_EVENT_MASK_EXT 0xffffffff +#define DMR_PCIE4_EVENT_MASK_EXT 0xffffff + +#define UNCORE_DMR_ITC 0x30 + +#define DMR_IMC_PMON_FIXED_CTR 0x18 +#define DMR_IMC_PMON_FIXED_CTL 0x10 + DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7"); DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6"); DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21"); @@ -486,6 +498,10 @@ DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18"); 
DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19"); DEFINE_UNCORE_FORMAT_ATTR(tid_en2, tid_en, "config:16"); DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23"); +DEFINE_UNCORE_FORMAT_ATTR(inv2, inv, "config:21"); +DEFINE_UNCORE_FORMAT_ATTR(thresh_ext, thresh_ext, "config:32-35"); +DEFINE_UNCORE_FORMAT_ATTR(thresh10, thresh, "config:23-32"); +DEFINE_UNCORE_FORMAT_ATTR(thresh9_2, thresh, "config:23-31"); DEFINE_UNCORE_FORMAT_ATTR(thresh9, thresh, "config:24-35"); DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31"); DEFINE_UNCORE_FORMAT_ATTR(thresh6, thresh, "config:24-29"); @@ -494,6 +510,13 @@ DEFINE_UNCORE_FORMAT_ATTR(occ_sel, occ_sel, "config:14-15"); DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30"); DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51"); DEFINE_UNCORE_FORMAT_ATTR(occ_edge_det, occ_edge_det, "config:31"); +DEFINE_UNCORE_FORMAT_ATTR(port_en, port_en, "config:32-35"); +DEFINE_UNCORE_FORMAT_ATTR(rs3_sel, rs3_sel, "config:36"); +DEFINE_UNCORE_FORMAT_ATTR(rx_sel, rx_sel, "config:37"); +DEFINE_UNCORE_FORMAT_ATTR(tx_sel, tx_sel, "config:38"); +DEFINE_UNCORE_FORMAT_ATTR(iep_sel, iep_sel, "config:39"); +DEFINE_UNCORE_FORMAT_ATTR(vc_sel, vc_sel, "config:40-47"); +DEFINE_UNCORE_FORMAT_ATTR(port_sel, port_sel, "config:48-55"); DEFINE_UNCORE_FORMAT_ATTR(ch_mask, ch_mask, "config:36-43"); DEFINE_UNCORE_FORMAT_ATTR(ch_mask2, ch_mask, "config:36-47"); DEFINE_UNCORE_FORMAT_ATTR(fc_mask, fc_mask, "config:44-46"); @@ -813,76 +836,37 @@ static struct intel_uncore_ops snbep_uncore_pci_ops = { static struct event_constraint snbep_uncore_cbox_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x01, 0x1), UNCORE_EVENT_CONSTRAINT(0x02, 0x3), - UNCORE_EVENT_CONSTRAINT(0x04, 0x3), - UNCORE_EVENT_CONSTRAINT(0x05, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x04, 0x5, 0x3), UNCORE_EVENT_CONSTRAINT(0x07, 0x3), UNCORE_EVENT_CONSTRAINT(0x09, 0x3), UNCORE_EVENT_CONSTRAINT(0x11, 0x1), - UNCORE_EVENT_CONSTRAINT(0x12, 0x3), - UNCORE_EVENT_CONSTRAINT(0x13, 0x3), - UNCORE_EVENT_CONSTRAINT(0x1b, 0xc), - UNCORE_EVENT_CONSTRAINT(0x1c, 0xc), - UNCORE_EVENT_CONSTRAINT(0x1d, 0xc), - UNCORE_EVENT_CONSTRAINT(0x1e, 0xc), + UNCORE_EVENT_CONSTRAINT_RANGE(0x12, 0x13, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x1b, 0x1e, 0xc), UNCORE_EVENT_CONSTRAINT(0x1f, 0xe), UNCORE_EVENT_CONSTRAINT(0x21, 0x3), UNCORE_EVENT_CONSTRAINT(0x23, 0x3), - UNCORE_EVENT_CONSTRAINT(0x31, 0x3), - UNCORE_EVENT_CONSTRAINT(0x32, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), - UNCORE_EVENT_CONSTRAINT(0x35, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x35, 0x3), UNCORE_EVENT_CONSTRAINT(0x36, 0x1), - UNCORE_EVENT_CONSTRAINT(0x37, 0x3), - UNCORE_EVENT_CONSTRAINT(0x38, 0x3), - UNCORE_EVENT_CONSTRAINT(0x39, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x37, 0x39, 0x3), UNCORE_EVENT_CONSTRAINT(0x3b, 0x1), EVENT_CONSTRAINT_END }; static struct event_constraint snbep_uncore_r2pcie_constraints[] = { - UNCORE_EVENT_CONSTRAINT(0x10, 0x3), - UNCORE_EVENT_CONSTRAINT(0x11, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3), UNCORE_EVENT_CONSTRAINT(0x12, 0x1), UNCORE_EVENT_CONSTRAINT(0x23, 0x3), - UNCORE_EVENT_CONSTRAINT(0x24, 0x3), - UNCORE_EVENT_CONSTRAINT(0x25, 0x3), - UNCORE_EVENT_CONSTRAINT(0x26, 0x3), - UNCORE_EVENT_CONSTRAINT(0x32, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x24, 0x26, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x34, 0x3), EVENT_CONSTRAINT_END }; static struct event_constraint snbep_uncore_r3qpi_constraints[] = { - 
UNCORE_EVENT_CONSTRAINT(0x10, 0x3), - UNCORE_EVENT_CONSTRAINT(0x11, 0x3), - UNCORE_EVENT_CONSTRAINT(0x12, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3), UNCORE_EVENT_CONSTRAINT(0x13, 0x1), - UNCORE_EVENT_CONSTRAINT(0x20, 0x3), - UNCORE_EVENT_CONSTRAINT(0x21, 0x3), - UNCORE_EVENT_CONSTRAINT(0x22, 0x3), - UNCORE_EVENT_CONSTRAINT(0x23, 0x3), - UNCORE_EVENT_CONSTRAINT(0x24, 0x3), - UNCORE_EVENT_CONSTRAINT(0x25, 0x3), - UNCORE_EVENT_CONSTRAINT(0x26, 0x3), - UNCORE_EVENT_CONSTRAINT(0x28, 0x3), - UNCORE_EVENT_CONSTRAINT(0x29, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2a, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2b, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2c, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2d, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2e, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2f, 0x3), - UNCORE_EVENT_CONSTRAINT(0x30, 0x3), - UNCORE_EVENT_CONSTRAINT(0x31, 0x3), - UNCORE_EVENT_CONSTRAINT(0x32, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), - UNCORE_EVENT_CONSTRAINT(0x36, 0x3), - UNCORE_EVENT_CONSTRAINT(0x37, 0x3), - UNCORE_EVENT_CONSTRAINT(0x38, 0x3), - UNCORE_EVENT_CONSTRAINT(0x39, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x20, 0x26, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x34, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3), EVENT_CONSTRAINT_END }; @@ -3011,24 +2995,15 @@ static struct intel_uncore_type hswep_uncore_qpi = { }; static struct event_constraint hswep_uncore_r2pcie_constraints[] = { - UNCORE_EVENT_CONSTRAINT(0x10, 0x3), - UNCORE_EVENT_CONSTRAINT(0x11, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3), UNCORE_EVENT_CONSTRAINT(0x13, 0x1), - UNCORE_EVENT_CONSTRAINT(0x23, 0x1), - UNCORE_EVENT_CONSTRAINT(0x24, 0x1), - UNCORE_EVENT_CONSTRAINT(0x25, 0x1), + UNCORE_EVENT_CONSTRAINT_RANGE(0x23, 0x25, 0x1), UNCORE_EVENT_CONSTRAINT(0x26, 0x3), UNCORE_EVENT_CONSTRAINT(0x27, 0x1), - UNCORE_EVENT_CONSTRAINT(0x28, 0x3), - UNCORE_EVENT_CONSTRAINT(0x29, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3), UNCORE_EVENT_CONSTRAINT(0x2a, 0x1), - UNCORE_EVENT_CONSTRAINT(0x2b, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2c, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2d, 0x3), - UNCORE_EVENT_CONSTRAINT(0x32, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), - UNCORE_EVENT_CONSTRAINT(0x35, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x2b, 0x2d, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x35, 0x3), EVENT_CONSTRAINT_END }; @@ -3043,38 +3018,17 @@ static struct intel_uncore_type hswep_uncore_r2pcie = { static struct event_constraint hswep_uncore_r3qpi_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x01, 0x3), - UNCORE_EVENT_CONSTRAINT(0x07, 0x7), - UNCORE_EVENT_CONSTRAINT(0x08, 0x7), - UNCORE_EVENT_CONSTRAINT(0x09, 0x7), - UNCORE_EVENT_CONSTRAINT(0x0a, 0x7), + UNCORE_EVENT_CONSTRAINT_RANGE(0x7, 0x0a, 0x7), UNCORE_EVENT_CONSTRAINT(0x0e, 0x7), - UNCORE_EVENT_CONSTRAINT(0x10, 0x3), - UNCORE_EVENT_CONSTRAINT(0x11, 0x3), - UNCORE_EVENT_CONSTRAINT(0x12, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3), UNCORE_EVENT_CONSTRAINT(0x13, 0x1), - UNCORE_EVENT_CONSTRAINT(0x14, 0x3), - UNCORE_EVENT_CONSTRAINT(0x15, 0x3), - UNCORE_EVENT_CONSTRAINT(0x1f, 0x3), - UNCORE_EVENT_CONSTRAINT(0x20, 0x3), - UNCORE_EVENT_CONSTRAINT(0x21, 0x3), - UNCORE_EVENT_CONSTRAINT(0x22, 0x3), - UNCORE_EVENT_CONSTRAINT(0x23, 0x3), - UNCORE_EVENT_CONSTRAINT(0x25, 0x3), - UNCORE_EVENT_CONSTRAINT(0x26, 0x3), - UNCORE_EVENT_CONSTRAINT(0x28, 0x3), - UNCORE_EVENT_CONSTRAINT(0x29, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2c, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2d, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2e, 0x3), - 
UNCORE_EVENT_CONSTRAINT(0x2f, 0x3), - UNCORE_EVENT_CONSTRAINT(0x31, 0x3), - UNCORE_EVENT_CONSTRAINT(0x32, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), - UNCORE_EVENT_CONSTRAINT(0x36, 0x3), - UNCORE_EVENT_CONSTRAINT(0x37, 0x3), - UNCORE_EVENT_CONSTRAINT(0x38, 0x3), - UNCORE_EVENT_CONSTRAINT(0x39, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x25, 0x26, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x34, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3), EVENT_CONSTRAINT_END }; @@ -3348,8 +3302,7 @@ static struct event_constraint bdx_uncore_r2pcie_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x25, 0x1), UNCORE_EVENT_CONSTRAINT(0x26, 0x3), UNCORE_EVENT_CONSTRAINT(0x28, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2c, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2d, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2d, 0x3), EVENT_CONSTRAINT_END }; @@ -3364,35 +3317,18 @@ static struct intel_uncore_type bdx_uncore_r2pcie = { static struct event_constraint bdx_uncore_r3qpi_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x01, 0x7), - UNCORE_EVENT_CONSTRAINT(0x07, 0x7), - UNCORE_EVENT_CONSTRAINT(0x08, 0x7), - UNCORE_EVENT_CONSTRAINT(0x09, 0x7), - UNCORE_EVENT_CONSTRAINT(0x0a, 0x7), + UNCORE_EVENT_CONSTRAINT_RANGE(0x07, 0x0a, 0x7), UNCORE_EVENT_CONSTRAINT(0x0e, 0x7), - UNCORE_EVENT_CONSTRAINT(0x10, 0x3), - UNCORE_EVENT_CONSTRAINT(0x11, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3), UNCORE_EVENT_CONSTRAINT(0x13, 0x1), - UNCORE_EVENT_CONSTRAINT(0x14, 0x3), - UNCORE_EVENT_CONSTRAINT(0x15, 0x3), - UNCORE_EVENT_CONSTRAINT(0x1f, 0x3), - UNCORE_EVENT_CONSTRAINT(0x20, 0x3), - UNCORE_EVENT_CONSTRAINT(0x21, 0x3), - UNCORE_EVENT_CONSTRAINT(0x22, 0x3), - UNCORE_EVENT_CONSTRAINT(0x23, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3), UNCORE_EVENT_CONSTRAINT(0x25, 0x3), UNCORE_EVENT_CONSTRAINT(0x26, 0x3), - UNCORE_EVENT_CONSTRAINT(0x28, 0x3), - UNCORE_EVENT_CONSTRAINT(0x29, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2c, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2d, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2e, 0x3), - UNCORE_EVENT_CONSTRAINT(0x2f, 0x3), - UNCORE_EVENT_CONSTRAINT(0x33, 0x3), - UNCORE_EVENT_CONSTRAINT(0x34, 0x3), - UNCORE_EVENT_CONSTRAINT(0x36, 0x3), - UNCORE_EVENT_CONSTRAINT(0x37, 0x3), - UNCORE_EVENT_CONSTRAINT(0x38, 0x3), - UNCORE_EVENT_CONSTRAINT(0x39, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x33, 0x34, 0x3), + UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3), EVENT_CONSTRAINT_END }; @@ -3699,8 +3635,7 @@ static struct event_constraint skx_uncore_iio_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x95, 0xc), UNCORE_EVENT_CONSTRAINT(0xc0, 0xc), UNCORE_EVENT_CONSTRAINT(0xc5, 0xc), - UNCORE_EVENT_CONSTRAINT(0xd4, 0xc), - UNCORE_EVENT_CONSTRAINT(0xd5, 0xc), + UNCORE_EVENT_CONSTRAINT_RANGE(0xd4, 0xd5, 0xc), EVENT_CONSTRAINT_END }; @@ -4049,34 +3984,24 @@ static struct freerunning_counters skx_iio_freerunning[] = { [SKX_IIO_MSR_UTIL] = { 0xb08, 0x1, 0x10, 8, 36 }, }; +#define INTEL_UNCORE_FR_EVENT_DESC(name, umask, scl) \ + INTEL_UNCORE_EVENT_DESC(name, \ + "event=0xff,umask=" __stringify(umask)),\ + INTEL_UNCORE_EVENT_DESC(name.scale, __stringify(scl)), \ + INTEL_UNCORE_EVENT_DESC(name.unit, "MiB") + static struct uncore_event_desc skx_uncore_iio_freerunning_events[] = { /* 
Free-Running IO CLOCKS Counter */ INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"), /* Free-Running IIO BANDWIDTH Counters */ - INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"), - INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"), - INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_out_port0, "event=0xff,umask=0x24"), - INTEL_UNCORE_EVENT_DESC(bw_out_port0.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_out_port0.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_out_port1, "event=0xff,umask=0x25"), - INTEL_UNCORE_EVENT_DESC(bw_out_port1.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_out_port1.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_out_port2, "event=0xff,umask=0x26"), - INTEL_UNCORE_EVENT_DESC(bw_out_port2.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_out_port2.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_out_port3, "event=0xff,umask=0x27"), - INTEL_UNCORE_EVENT_DESC(bw_out_port3.scale, "3.814697266e-6"), - INTEL_UNCORE_EVENT_DESC(bw_out_port3.unit, "MiB"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x24, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x25, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x26, 3.814697266e-6), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x27, 3.814697266e-6), /* Free-running IIO UTILIZATION Counters */ INTEL_UNCORE_EVENT_DESC(util_in_port0, "event=0xff,umask=0x30"), INTEL_UNCORE_EVENT_DESC(util_out_port0, "event=0xff,umask=0x31"), @@ -4466,14 +4391,9 @@ static struct intel_uncore_type skx_uncore_m2pcie = { }; static struct event_constraint skx_uncore_m3upi_constraints[] = { - UNCORE_EVENT_CONSTRAINT(0x1d, 0x1), - UNCORE_EVENT_CONSTRAINT(0x1e, 0x1), + UNCORE_EVENT_CONSTRAINT_RANGE(0x1d, 0x1e, 0x1), UNCORE_EVENT_CONSTRAINT(0x40, 0x7), - UNCORE_EVENT_CONSTRAINT(0x4e, 0x7), - UNCORE_EVENT_CONSTRAINT(0x4f, 0x7), - UNCORE_EVENT_CONSTRAINT(0x50, 0x7), - UNCORE_EVENT_CONSTRAINT(0x51, 0x7), - UNCORE_EVENT_CONSTRAINT(0x52, 0x7), + UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x52, 0x7), EVENT_CONSTRAINT_END }; @@ -4891,30 +4811,14 @@ static struct uncore_event_desc snr_uncore_iio_freerunning_events[] = { /* Free-Running IIO CLOCKS Counter */ INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"), /* Free-Running IIO BANDWIDTH IN Counters */ - INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"), - INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"), - 
INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port4, "event=0xff,umask=0x24"), - INTEL_UNCORE_EVENT_DESC(bw_in_port4.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port4.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port5, "event=0xff,umask=0x25"), - INTEL_UNCORE_EVENT_DESC(bw_in_port5.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port5.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port6, "event=0xff,umask=0x26"), - INTEL_UNCORE_EVENT_DESC(bw_in_port6.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port6.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(bw_in_port7, "event=0xff,umask=0x27"), - INTEL_UNCORE_EVENT_DESC(bw_in_port7.scale, "3.0517578125e-5"), - INTEL_UNCORE_EVENT_DESC(bw_in_port7.unit, "MiB"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.0517578125e-5), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.0517578125e-5), { /* end: all zeroes */ }, }; @@ -5247,12 +5151,8 @@ static struct freerunning_counters snr_imc_freerunning[] = { static struct uncore_event_desc snr_uncore_imc_freerunning_events[] = { INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"), - INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"), - INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"), - INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"), + INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5), + INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5), { /* end: all zeroes */ }, }; @@ -5659,14 +5559,9 @@ static struct intel_uncore_type icx_uncore_upi = { }; static struct event_constraint icx_uncore_m3upi_constraints[] = { - UNCORE_EVENT_CONSTRAINT(0x1c, 0x1), - UNCORE_EVENT_CONSTRAINT(0x1d, 0x1), - UNCORE_EVENT_CONSTRAINT(0x1e, 0x1), - UNCORE_EVENT_CONSTRAINT(0x1f, 0x1), + UNCORE_EVENT_CONSTRAINT_RANGE(0x1c, 0x1f, 0x1), UNCORE_EVENT_CONSTRAINT(0x40, 0x7), - UNCORE_EVENT_CONSTRAINT(0x4e, 0x7), - UNCORE_EVENT_CONSTRAINT(0x4f, 0x7), - UNCORE_EVENT_CONSTRAINT(0x50, 0x7), + UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x50, 0x7), EVENT_CONSTRAINT_END }; @@ -5817,19 +5712,10 @@ static struct freerunning_counters icx_imc_freerunning[] = { static struct uncore_event_desc icx_uncore_imc_freerunning_events[] = { INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"), - INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"), - INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"), - INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"), - INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"), - - INTEL_UNCORE_EVENT_DESC(ddrt_read, "event=0xff,umask=0x30"), - INTEL_UNCORE_EVENT_DESC(ddrt_read.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(ddrt_read.unit, "MiB"), - 
INTEL_UNCORE_EVENT_DESC(ddrt_write, "event=0xff,umask=0x31"), - INTEL_UNCORE_EVENT_DESC(ddrt_write.scale, "6.103515625e-5"), - INTEL_UNCORE_EVENT_DESC(ddrt_write.unit, "MiB"), + INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5), + INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5), + INTEL_UNCORE_FR_EVENT_DESC(ddrt_read, 0x30, 6.103515625e-5), + INTEL_UNCORE_FR_EVENT_DESC(ddrt_write, 0x31, 6.103515625e-5), { /* end: all zeroes */ }, }; @@ -6158,10 +6044,7 @@ static struct intel_uncore_ops spr_uncore_mmio_offs8_ops = { static struct event_constraint spr_uncore_cxlcm_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x02, 0x0f), UNCORE_EVENT_CONSTRAINT(0x05, 0x0f), - UNCORE_EVENT_CONSTRAINT(0x40, 0xf0), - UNCORE_EVENT_CONSTRAINT(0x41, 0xf0), - UNCORE_EVENT_CONSTRAINT(0x42, 0xf0), - UNCORE_EVENT_CONSTRAINT(0x43, 0xf0), + UNCORE_EVENT_CONSTRAINT_RANGE(0x40, 0x43, 0xf0), UNCORE_EVENT_CONSTRAINT(0x4b, 0xf0), UNCORE_EVENT_CONSTRAINT(0x52, 0xf0), EVENT_CONSTRAINT_END @@ -6462,7 +6345,11 @@ static int uncore_type_max_boxes(struct intel_uncore_type **types, for (node = rb_first(type->boxes); node; node = rb_next(node)) { unit = rb_entry(node, struct intel_uncore_discovery_unit, node); - if (unit->id > max) + /* + * on DMR IMH2, the unit id starts from 0x8000, + * and we don't need to count it. + */ + if ((unit->id > max) && (unit->id < 0x8000)) max = unit->id; } return max + 1; @@ -6709,3 +6596,386 @@ void gnr_uncore_mmio_init(void) } /* end of GNR uncore support */ + +/* DMR uncore support */ +#define UNCORE_DMR_NUM_UNCORE_TYPES 52 + +static struct attribute *dmr_imc_uncore_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_thresh10.attr, + NULL, +}; + +static const struct attribute_group dmr_imc_uncore_format_group = { + .name = "format", + .attrs = dmr_imc_uncore_formats_attr, +}; + +static struct intel_uncore_type dmr_uncore_imc = { + .name = "imc", + .fixed_ctr_bits = 48, + .fixed_ctr = DMR_IMC_PMON_FIXED_CTR, + .fixed_ctl = DMR_IMC_PMON_FIXED_CTL, + .ops = &spr_uncore_mmio_ops, + .format_group = &dmr_imc_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct attribute *dmr_sca_uncore_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask_ext5.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_thresh8.attr, + NULL, +}; + +static const struct attribute_group dmr_sca_uncore_format_group = { + .name = "format", + .attrs = dmr_sca_uncore_formats_attr, +}; + +static struct intel_uncore_type dmr_uncore_sca = { + .name = "sca", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct attribute *dmr_cxlcm_uncore_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv2.attr, + &format_attr_thresh9_2.attr, + &format_attr_port_en.attr, + NULL, +}; + +static const struct attribute_group dmr_cxlcm_uncore_format_group = { + .name = "format", + .attrs = dmr_cxlcm_uncore_formats_attr, +}; + +static struct event_constraint dmr_uncore_cxlcm_constraints[] = { + UNCORE_EVENT_CONSTRAINT_RANGE(0x1, 0x24, 0x0f), + UNCORE_EVENT_CONSTRAINT_RANGE(0x41, 0x41, 0xf0), + UNCORE_EVENT_CONSTRAINT_RANGE(0x50, 0x5e, 0xf0), + UNCORE_EVENT_CONSTRAINT_RANGE(0x60, 0x61, 0xf0), + EVENT_CONSTRAINT_END +}; + +static struct intel_uncore_type dmr_uncore_cxlcm = { + .name = "cxlcm", + .event_mask = GENERIC_PMON_RAW_EVENT_MASK, + 
.event_mask_ext = DMR_CXLCM_EVENT_MASK_EXT, + .constraints = dmr_uncore_cxlcm_constraints, + .format_group = &dmr_cxlcm_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_hamvf = { + .name = "hamvf", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct event_constraint dmr_uncore_cbo_constraints[] = { + UNCORE_EVENT_CONSTRAINT(0x11, 0x1), + UNCORE_EVENT_CONSTRAINT_RANGE(0x19, 0x1a, 0x1), + UNCORE_EVENT_CONSTRAINT(0x1f, 0x1), + UNCORE_EVENT_CONSTRAINT(0x21, 0x1), + UNCORE_EVENT_CONSTRAINT(0x25, 0x1), + UNCORE_EVENT_CONSTRAINT(0x36, 0x1), + EVENT_CONSTRAINT_END +}; + +static struct intel_uncore_type dmr_uncore_cbo = { + .name = "cbo", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .constraints = dmr_uncore_cbo_constraints, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_santa = { + .name = "santa", + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_cncu = { + .name = "cncu", + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_sncu = { + .name = "sncu", + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_ula = { + .name = "ula", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_dda = { + .name = "dda", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct event_constraint dmr_uncore_sbo_constraints[] = { + UNCORE_EVENT_CONSTRAINT(0x1f, 0x01), + UNCORE_EVENT_CONSTRAINT(0x25, 0x01), + EVENT_CONSTRAINT_END +}; + +static struct intel_uncore_type dmr_uncore_sbo = { + .name = "sbo", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .constraints = dmr_uncore_sbo_constraints, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_ubr = { + .name = "ubr", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct attribute *dmr_pcie4_uncore_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_thresh8.attr, + &format_attr_thresh_ext.attr, + &format_attr_rs3_sel.attr, + &format_attr_rx_sel.attr, + &format_attr_tx_sel.attr, + &format_attr_iep_sel.attr, + &format_attr_vc_sel.attr, + &format_attr_port_sel.attr, + NULL, +}; + +static const struct attribute_group dmr_pcie4_uncore_format_group = { + .name = "format", + .attrs = dmr_pcie4_uncore_formats_attr, +}; + +static struct intel_uncore_type dmr_uncore_pcie4 = { + .name = "pcie4", + .event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT, + .format_group = &dmr_pcie4_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_crs = { + .name = "crs", + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_cpc = { + .name = "cpc", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_itc = { + .name = "itc", + .event_mask_ext = 
DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_otc = { + .name = "otc", + .event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT, + .format_group = &dmr_sca_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_cms = { + .name = "cms", + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type dmr_uncore_pcie6 = { + .name = "pcie6", + .event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT, + .format_group = &dmr_pcie4_uncore_format_group, + .attr_update = uncore_alias_groups, +}; + +static struct intel_uncore_type *dmr_uncores[UNCORE_DMR_NUM_UNCORE_TYPES] = { + NULL, NULL, NULL, NULL, + &spr_uncore_pcu, + &gnr_uncore_ubox, + &dmr_uncore_imc, + NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, + &dmr_uncore_sca, + &dmr_uncore_cxlcm, + NULL, NULL, NULL, + NULL, NULL, + &dmr_uncore_hamvf, + &dmr_uncore_cbo, + &dmr_uncore_santa, + &dmr_uncore_cncu, + &dmr_uncore_sncu, + &dmr_uncore_ula, + &dmr_uncore_dda, + NULL, + &dmr_uncore_sbo, + NULL, + NULL, NULL, NULL, + &dmr_uncore_ubr, + NULL, + &dmr_uncore_pcie4, + &dmr_uncore_crs, + &dmr_uncore_cpc, + &dmr_uncore_itc, + &dmr_uncore_otc, + &dmr_uncore_cms, + &dmr_uncore_pcie6, +}; + +int dmr_uncore_imh_units_ignore[] = { + 0x13, /* MSE */ + UNCORE_IGNORE_END +}; + +int dmr_uncore_cbb_units_ignore[] = { + 0x25, /* SB2UCIE */ + UNCORE_IGNORE_END +}; + +static unsigned int dmr_iio_freerunning_box_offsets[] = { + 0x0, 0x8000, 0x18000, 0x20000 +}; + +static void dmr_uncore_freerunning_init_box(struct intel_uncore_box *box) +{ + struct intel_uncore_type *type = box->pmu->type; + u64 mmio_base; + + if (box->pmu->pmu_idx >= type->num_boxes) + return; + + mmio_base = DMR_IMH1_HIOP_MMIO_BASE; + mmio_base += dmr_iio_freerunning_box_offsets[box->pmu->pmu_idx]; + + box->io_addr = ioremap(mmio_base, type->mmio_map_size); + if (!box->io_addr) + pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name); +} + +static struct intel_uncore_ops dmr_uncore_freerunning_ops = { + .init_box = dmr_uncore_freerunning_init_box, + .exit_box = uncore_mmio_exit_box, + .read_counter = uncore_mmio_read_counter, + .hw_config = uncore_freerunning_hw_config, +}; + +enum perf_uncore_dmr_iio_freerunning_type_id { + DMR_ITC_INB_DATA_BW, + DMR_ITC_BW_IN, + DMR_OTC_BW_OUT, + DMR_OTC_CLOCK_TICKS, + + DMR_IIO_FREERUNNING_TYPE_MAX, +}; + +static struct freerunning_counters dmr_iio_freerunning[] = { + [DMR_ITC_INB_DATA_BW] = { 0x4d40, 0x8, 0, 8, 48}, + [DMR_ITC_BW_IN] = { 0x6b00, 0x8, 0, 8, 48}, + [DMR_OTC_BW_OUT] = { 0x6b60, 0x8, 0, 8, 48}, + [DMR_OTC_CLOCK_TICKS] = { 0x6bb0, 0x8, 0, 1, 48}, +}; + +static struct uncore_event_desc dmr_uncore_iio_freerunning_events[] = { + /* ITC Free Running Data BW counter for inbound traffic */ + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, "3.814697266e-6"), + + /* ITC Free Running BW IN counters */ + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 
"3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, "3.814697266e-6"), + + /* ITC Free Running BW OUT counters */ + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, "3.814697266e-6"), + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, "3.814697266e-6"), + + /* Free Running Clock Counter */ + INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x40"), + { /* end: all zeroes */ }, +}; + +static struct intel_uncore_type dmr_uncore_iio_free_running = { + .name = "iio_free_running", + .num_counters = 25, + .mmio_map_size = DMR_HIOP_MMIO_SIZE, + .num_freerunning_types = DMR_IIO_FREERUNNING_TYPE_MAX, + .freerunning = dmr_iio_freerunning, + .ops = &dmr_uncore_freerunning_ops, + .event_descs = dmr_uncore_iio_freerunning_events, + .format_group = &skx_uncore_iio_freerunning_format_group, +}; + +#define UNCORE_DMR_MMIO_EXTRA_UNCORES 1 +static struct intel_uncore_type *dmr_mmio_uncores[UNCORE_DMR_MMIO_EXTRA_UNCORES] = { + &dmr_uncore_iio_free_running, +}; + +int dmr_uncore_pci_init(void) +{ + uncore_pci_uncores = uncore_get_uncores(UNCORE_ACCESS_PCI, 0, NULL, + UNCORE_DMR_NUM_UNCORE_TYPES, + dmr_uncores); + return 0; +} + +void dmr_uncore_mmio_init(void) +{ + uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, + UNCORE_DMR_MMIO_EXTRA_UNCORES, + dmr_mmio_uncores, + UNCORE_DMR_NUM_UNCORE_TYPES, + dmr_uncores); + + dmr_uncore_iio_free_running.num_boxes = + uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_DMR_ITC); +} +/* end of DMR uncore support */ diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c index 7f5007a4752a..8052596b8503 100644 --- a/arch/x86/events/msr.c +++ b/arch/x86/events/msr.c @@ -78,6 +78,7 @@ static bool test_intel(int idx, void *data) case INTEL_ATOM_SILVERMONT: case INTEL_ATOM_SILVERMONT_D: case INTEL_ATOM_AIRMONT: + case INTEL_ATOM_AIRMONT_NP: case INTEL_ATOM_GOLDMONT: case INTEL_ATOM_GOLDMONT_D: diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index ad35c546243e..fad87d3c8b2c 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -45,6 +45,10 @@ enum extra_reg_type { EXTRA_REG_FE = 4, /* fe_* */ EXTRA_REG_SNOOP_0 = 5, /* snoop response 0 */ EXTRA_REG_SNOOP_1 = 6, /* snoop response 1 */ + EXTRA_REG_OMR_0 = 7, /* OMR 0 */ + EXTRA_REG_OMR_1 = 8, /* OMR 1 */ + EXTRA_REG_OMR_2 = 9, /* OMR 2 */ + EXTRA_REG_OMR_3 = 10, /* OMR 3 */ EXTRA_REG_MAX /* number of entries needed */ }; @@ -183,6 +187,13 @@ struct amd_nb { (1ULL << PERF_REG_X86_R14) | \ (1ULL << PERF_REG_X86_R15)) +/* user space rdpmc control values */ +enum { + X86_USER_RDPMC_NEVER_ENABLE = 0, + X86_USER_RDPMC_CONDITIONAL_ENABLE = 1, + X86_USER_RDPMC_ALWAYS_ENABLE = 2, +}; + /* * Per register state. 
*/ @@ -1099,6 +1110,7 @@ do { \ #define PMU_FL_RETIRE_LATENCY 0x200 /* Support Retire Latency in PEBS */ #define PMU_FL_BR_CNTR 0x400 /* Support branch counter logging */ #define PMU_FL_DYN_CONSTRAINT 0x800 /* Needs dynamic constraint */ +#define PMU_FL_HAS_OMR 0x1000 /* has 4 equivalent OMR regs */ #define EVENT_VAR(_id) event_attr_##_id #define EVENT_PTR(_id) &event_attr_##_id.attr.attr @@ -1321,6 +1333,12 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event) return event->attr.config & hybrid(event->pmu, config_mask); } +static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu) +{ + return !!(hybrid(pmu, config_mask) & + ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE); +} + extern struct event_constraint emptyconstraint; extern struct event_constraint unconstrained; @@ -1668,6 +1686,10 @@ u64 lnl_latency_data(struct perf_event *event, u64 status); u64 arl_h_latency_data(struct perf_event *event, u64 status); +u64 pnc_latency_data(struct perf_event *event, u64 status); + +u64 nvl_latency_data(struct perf_event *event, u64 status); + extern struct event_constraint intel_core2_pebs_event_constraints[]; extern struct event_constraint intel_atom_pebs_event_constraints[]; @@ -1680,6 +1702,8 @@ extern struct event_constraint intel_glp_pebs_event_constraints[]; extern struct event_constraint intel_grt_pebs_event_constraints[]; +extern struct event_constraint intel_arw_pebs_event_constraints[]; + extern struct event_constraint intel_nehalem_pebs_event_constraints[]; extern struct event_constraint intel_westmere_pebs_event_constraints[]; @@ -1700,6 +1724,8 @@ extern struct event_constraint intel_glc_pebs_event_constraints[]; extern struct event_constraint intel_lnc_pebs_event_constraints[]; +extern struct event_constraint intel_pnc_pebs_event_constraints[]; + struct event_constraint *intel_pebs_constraints(struct perf_event *event); void intel_pmu_pebs_add(struct perf_event *event); diff --git a/arch/x86/include/asm/amd/ibs.h b/arch/x86/include/asm/amd/ibs.h index 3ee5903982c2..fcc8a5abe54e 100644 --- a/arch/x86/include/asm/amd/ibs.h +++ b/arch/x86/include/asm/amd/ibs.h @@ -110,7 +110,7 @@ union ibs_op_data3 { __u64 ld_op:1, /* 0: load op */ st_op:1, /* 1: store op */ dc_l1tlb_miss:1, /* 2: data cache L1TLB miss */ - dc_l2tlb_miss:1, /* 3: data cache L2TLB hit in 2M page */ + dc_l2tlb_miss:1, /* 3: data cache L2TLB miss in 2M page */ dc_l1tlb_hit_2m:1, /* 4: data cache L1TLB hit in 2M page */ dc_l1tlb_hit_1g:1, /* 5: data cache L1TLB hit in 1G page */ dc_l2tlb_hit_2m:1, /* 6: data cache L2TLB hit in 2M page */ diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index 6b6d472baa0b..9314642ae93c 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -19,6 +19,9 @@ typedef struct { unsigned int kvm_posted_intr_wakeup_ipis; unsigned int kvm_posted_intr_nested_ipis; #endif +#ifdef CONFIG_GUEST_PERF_EVENTS + unsigned int perf_guest_mediated_pmis; +#endif unsigned int x86_platform_ipis; /* arch dependent */ unsigned int apic_perf_irqs; unsigned int apic_irq_work_irqs; diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 3218770670d3..42bf6a58ec36 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -746,6 +746,12 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR, sysvec_kvm_posted_intr_nested # define fred_sysvec_kvm_posted_intr_nested_ipi NULL #endif +# ifdef CONFIG_GUEST_PERF_EVENTS +DECLARE_IDTENTRY_SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, 
sysvec_perf_guest_mediated_pmi_handler); +#else +# define fred_sysvec_perf_guest_mediated_pmi_handler NULL +#endif + # ifdef CONFIG_X86_POSTED_MSI DECLARE_IDTENTRY_SYSVEC(POSTED_MSI_NOTIFICATION_VECTOR, sysvec_posted_msi_notification); #else diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 47051871b436..85253fc8e384 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -77,7 +77,9 @@ */ #define IRQ_WORK_VECTOR 0xf6 -/* 0xf5 - unused, was UV_BAU_MESSAGE */ +/* IRQ vector for PMIs when running a guest with a mediated PMU. */ +#define PERF_GUEST_MEDIATED_PMI_VECTOR 0xf5 + #define DEFERRED_ERROR_VECTOR 0xf4 /* Vector on which hypervisor callbacks will be delivered */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 3d0a0950d20a..6d1b69ea01c2 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -263,6 +263,11 @@ #define MSR_SNOOP_RSP_0 0x00001328 #define MSR_SNOOP_RSP_1 0x00001329 +#define MSR_OMR_0 0x000003e0 +#define MSR_OMR_1 0x000003e1 +#define MSR_OMR_2 0x000003e2 +#define MSR_OMR_3 0x000003e3 + #define MSR_LBR_SELECT 0x000001c8 #define MSR_LBR_TOS 0x000001c9 diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 7276ba70c88a..ff5acb8b199b 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -33,6 +33,7 @@ #define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL #define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35) #define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36) +#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE (1ULL << 37) #define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40) #define INTEL_FIXED_BITS_STRIDE 4 @@ -40,6 +41,7 @@ #define INTEL_FIXED_0_USER (1ULL << 1) #define INTEL_FIXED_0_ANYTHREAD (1ULL << 2) #define INTEL_FIXED_0_ENABLE_PMI (1ULL << 3) +#define INTEL_FIXED_0_RDPMC_USER_DISABLE (1ULL << 33) #define INTEL_FIXED_3_METRICS_CLEAR (1ULL << 2) #define HSW_IN_TX (1ULL << 32) @@ -50,7 +52,7 @@ #define INTEL_FIXED_BITS_MASK \ (INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \ INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \ - ICL_FIXED_0_ADAPTIVE) + ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE) #define intel_fixed_bits_by_idx(_idx, _bits) \ ((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE)) @@ -226,7 +228,9 @@ union cpuid35_ebx { unsigned int umask2:1; /* EQ-bit Supported */ unsigned int eq:1; - unsigned int reserved:30; + /* rdpmc user disable Supported */ + unsigned int rdpmc_user_disable:1; + unsigned int reserved:29; } split; unsigned int full; }; @@ -301,6 +305,7 @@ struct x86_pmu_capability { unsigned int events_mask; int events_mask_len; unsigned int pebs_ept :1; + unsigned int mediated :1; }; /* @@ -759,6 +764,11 @@ static inline void perf_events_lapic_init(void) { } static inline void perf_check_microcode(void) { } #endif +#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU +extern void perf_load_guest_lvtpc(u32 guest_lvtpc); +extern void perf_put_guest_lvtpc(void); +#endif + #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL) extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data); extern void x86_perf_get_lbr(struct x86_pmu_lbr *lbr); diff --git a/arch/x86/include/asm/unwind_user.h b/arch/x86/include/asm/unwind_user.h index 12064284bc4e..6e469044e4de 100644 --- a/arch/x86/include/asm/unwind_user.h +++ b/arch/x86/include/asm/unwind_user.h @@ -2,11 +2,23 @@ #ifndef _ASM_X86_UNWIND_USER_H #define _ASM_X86_UNWIND_USER_H -#ifdef 
CONFIG_HAVE_UNWIND_USER_FP +#ifdef CONFIG_UNWIND_USER #include <asm/ptrace.h> #include <asm/uprobes.h> +static inline int unwind_user_word_size(struct pt_regs *regs) +{ + /* We can't unwind VM86 stacks */ + if (regs->flags & X86_VM_MASK) + return 0; + return user_64bit_mode(regs) ? 8 : 4; +} + +#endif /* CONFIG_UNWIND_USER */ + +#ifdef CONFIG_HAVE_UNWIND_USER_FP + #define ARCH_INIT_USER_FP_FRAME(ws) \ .cfa_off = 2*(ws), \ .ra_off = -1*(ws), \ @@ -19,22 +31,11 @@ .fp_off = 0, \ .use_fp = false, -static inline int unwind_user_word_size(struct pt_regs *regs) -{ - /* We can't unwind VM86 stacks */ - if (regs->flags & X86_VM_MASK) - return 0; -#ifdef CONFIG_X86_64 - if (!user_64bit_mode(regs)) - return sizeof(int); -#endif - return sizeof(long); -} - static inline bool unwind_user_at_function_start(struct pt_regs *regs) { return is_uprobe_at_func_entry(regs); } +#define unwind_user_at_function_start unwind_user_at_function_start #endif /* CONFIG_HAVE_UNWIND_USER_FP */ diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index f445bec516a0..260456588756 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -158,6 +158,9 @@ static const __initconst struct idt_data apic_idts[] = { INTG(POSTED_INTR_WAKEUP_VECTOR, asm_sysvec_kvm_posted_intr_wakeup_ipi), INTG(POSTED_INTR_NESTED_VECTOR, asm_sysvec_kvm_posted_intr_nested_ipi), # endif +#ifdef CONFIG_GUEST_PERF_EVENTS + INTG(PERF_GUEST_MEDIATED_PMI_VECTOR, asm_sysvec_perf_guest_mediated_pmi_handler), +#endif # ifdef CONFIG_IRQ_WORK INTG(IRQ_WORK_VECTOR, asm_sysvec_irq_work), # endif diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index b2fe6181960c..316730e95fc3 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -192,6 +192,13 @@ int arch_show_interrupts(struct seq_file *p, int prec) irq_stats(j)->kvm_posted_intr_wakeup_ipis); seq_puts(p, " Posted-interrupt wakeup event\n"); #endif +#ifdef CONFIG_GUEST_PERF_EVENTS + seq_printf(p, "%*s: ", prec, "VPMI"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", + irq_stats(j)->perf_guest_mediated_pmis); + seq_puts(p, " Perf Guest Mediated PMI\n"); +#endif #ifdef CONFIG_X86_POSTED_MSI seq_printf(p, "%*s: ", prec, "PMN"); for_each_online_cpu(j) @@ -349,6 +356,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_x86_platform_ipi) } #endif +#ifdef CONFIG_GUEST_PERF_EVENTS +/* + * Handler for PERF_GUEST_MEDIATED_PMI_VECTOR. + */ +DEFINE_IDTENTRY_SYSVEC(sysvec_perf_guest_mediated_pmi_handler) +{ + apic_eoi(); + inc_irq_stat(perf_guest_mediated_pmis); + perf_guest_handle_mediated_pmi(); +} +#endif + #if IS_ENABLED(CONFIG_KVM) static void dummy_handler(void) {} static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler; diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 7be8e361ca55..619dddf54424 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -1823,3 +1823,27 @@ bool is_uprobe_at_func_entry(struct pt_regs *regs) return false; } + +#ifdef CONFIG_IA32_EMULATION +unsigned long arch_uprobe_get_xol_area(void) +{ + struct thread_info *ti = current_thread_info(); + unsigned long vaddr; + + /* + * HACK: we are not in a syscall, but x86 get_unmapped_area() paths + * ignore TIF_ADDR32 and rely on in_32bit_syscall() to calculate + * vm_unmapped_area_info.high_limit. + * + * The #ifdef above doesn't cover the CONFIG_X86_X32_ABI=y case, + * but in this case in_32bit_syscall() -> in_x32_syscall() always + * (falsely) returns true because ->orig_ax == -1. 
+ */ + if (test_thread_flag(TIF_ADDR32)) + ti->status |= TS_COMPAT; + vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0); + ti->status &= ~TS_COMPAT; + + return vaddr; +} +#endif diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 278f08194ec8..d916bd766c94 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -37,6 +37,7 @@ config KVM_X86 select SCHED_INFO select PERF_EVENTS select GUEST_PERF_EVENTS + select PERF_GUEST_MEDIATED_PMU select HAVE_KVM_MSI select HAVE_KVM_CPU_RELAX_INTERCEPT select HAVE_KVM_NO_POLL diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild index 295c94a3ccc1..9aff61e7b8f2 100644 --- a/include/asm-generic/Kbuild +++ b/include/asm-generic/Kbuild @@ -32,6 +32,7 @@ mandatory-y += irq_work.h mandatory-y += kdebug.h mandatory-y += kmap_size.h mandatory-y += kprobes.h +mandatory-y += kvm_types.h mandatory-y += linkage.h mandatory-y += local.h mandatory-y += local64.h diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 9ded2e582c60..48d851fbd8ea 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -305,6 +305,7 @@ struct perf_event_pmu_context; #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100 #define PERF_PMU_CAP_AUX_PAUSE 0x0200 #define PERF_PMU_CAP_AUX_PREFER_LARGE 0x0400 +#define PERF_PMU_CAP_MEDIATED_VPMU 0x0800 /** * pmu::scope @@ -998,6 +999,11 @@ struct perf_event_groups { u64 index; }; +struct perf_time_ctx { + u64 time; + u64 stamp; + u64 offset; +}; /** * struct perf_event_context - event context structure @@ -1036,9 +1042,12 @@ struct perf_event_context { /* * Context clock, runs when context enabled. */ - u64 time; - u64 timestamp; - u64 timeoffset; + struct perf_time_ctx time; + + /* + * Context clock, runs when in the guest mode. + */ + struct perf_time_ctx timeguest; /* * These fields let us detect when two contexts have both @@ -1171,9 +1180,8 @@ struct bpf_perf_event_data_kern { * This is a per-cpu dynamically allocated data structure. 
*/ struct perf_cgroup_info { - u64 time; - u64 timestamp; - u64 timeoffset; + struct perf_time_ctx time; + struct perf_time_ctx timeguest; int active; }; @@ -1669,6 +1677,8 @@ struct perf_guest_info_callbacks { unsigned int (*state)(void); unsigned long (*get_ip)(void); unsigned int (*handle_intel_pt_intr)(void); + + void (*handle_mediated_pmi)(void); }; #ifdef CONFIG_GUEST_PERF_EVENTS @@ -1678,6 +1688,7 @@ extern struct perf_guest_info_callbacks __rcu *perf_guest_cbs; DECLARE_STATIC_CALL(__perf_guest_state, *perf_guest_cbs->state); DECLARE_STATIC_CALL(__perf_guest_get_ip, *perf_guest_cbs->get_ip); DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr); +DECLARE_STATIC_CALL(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi); static inline unsigned int perf_guest_state(void) { @@ -1694,6 +1705,11 @@ static inline unsigned int perf_guest_handle_intel_pt_intr(void) return static_call(__perf_guest_handle_intel_pt_intr)(); } +static inline void perf_guest_handle_mediated_pmi(void) +{ + static_call(__perf_guest_handle_mediated_pmi)(); +} + extern void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs); extern void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs); @@ -1914,6 +1930,13 @@ extern int perf_event_account_interrupt(struct perf_event *event); extern int perf_event_period(struct perf_event *event, u64 value); extern u64 perf_event_pause(struct perf_event *event, bool reset); +#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU +int perf_create_mediated_pmu(void); +void perf_release_mediated_pmu(void); +void perf_load_guest_context(void); +void perf_put_guest_context(void); +#endif + #else /* !CONFIG_PERF_EVENTS: */ static inline void * diff --git a/include/linux/unwind_user.h b/include/linux/unwind_user.h index 7f7282516bf5..64618618febd 100644 --- a/include/linux/unwind_user.h +++ b/include/linux/unwind_user.h @@ -5,8 +5,22 @@ #include <linux/unwind_user_types.h> #include <asm/unwind_user.h> -#ifndef ARCH_INIT_USER_FP_FRAME - #define ARCH_INIT_USER_FP_FRAME +#ifndef CONFIG_HAVE_UNWIND_USER_FP + +#define ARCH_INIT_USER_FP_FRAME(ws) + +#endif + +#ifndef ARCH_INIT_USER_FP_ENTRY_FRAME +#define ARCH_INIT_USER_FP_ENTRY_FRAME(ws) +#endif + +#ifndef unwind_user_at_function_start +static inline bool unwind_user_at_function_start(struct pt_regs *regs) +{ + return false; +} +#define unwind_user_at_function_start unwind_user_at_function_start #endif int unwind_user(struct unwind_stacktrace *trace, unsigned int max_entries); diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index ee3d36eda45d..f548fea2adec 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -242,6 +242,7 @@ extern void arch_uprobe_clear_state(struct mm_struct *mm); extern void arch_uprobe_init_state(struct mm_struct *mm); extern void handle_syscall_uprobe(struct pt_regs *regs, unsigned long bp_vaddr); extern void arch_uprobe_optimize(struct arch_uprobe *auprobe, unsigned long vaddr); +extern unsigned long arch_uprobe_get_xol_area(void); #else /* !CONFIG_UPROBES */ struct uprobes_state { }; diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 72f03153dd32..fd10aa8d697f 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1330,14 +1330,16 @@ union perf_mem_data_src { mem_snoopx : 2, /* Snoop mode, ext */ mem_blk : 3, /* Access blocked */ mem_hops : 3, /* Hop level */ - mem_rsvd : 18; + mem_region : 5, /* cache/memory regions */ + mem_rsvd : 13; 
}; }; #elif defined(__BIG_ENDIAN_BITFIELD) union perf_mem_data_src { __u64 val; struct { - __u64 mem_rsvd : 18, + __u64 mem_rsvd : 13, + mem_region : 5, /* cache/memory regions */ mem_hops : 3, /* Hop level */ mem_blk : 3, /* Access blocked */ mem_snoopx : 2, /* Snoop mode, ext */ @@ -1394,7 +1396,7 @@ union perf_mem_data_src { #define PERF_MEM_LVLNUM_L4 0x0004 /* L4 */ #define PERF_MEM_LVLNUM_L2_MHB 0x0005 /* L2 Miss Handling Buffer */ #define PERF_MEM_LVLNUM_MSC 0x0006 /* Memory-side Cache */ -/* 0x007 available */ +#define PERF_MEM_LVLNUM_L0 0x0007 /* L0 */ #define PERF_MEM_LVLNUM_UNC 0x0008 /* Uncached */ #define PERF_MEM_LVLNUM_CXL 0x0009 /* CXL */ #define PERF_MEM_LVLNUM_IO 0x000a /* I/O */ @@ -1447,6 +1449,25 @@ union perf_mem_data_src { /* 5-7 available */ #define PERF_MEM_HOPS_SHIFT 43 +/* Cache/Memory region */ +#define PERF_MEM_REGION_NA 0x0 /* Invalid */ +#define PERF_MEM_REGION_RSVD 0x01 /* Reserved */ +#define PERF_MEM_REGION_L_SHARE 0x02 /* Local CA shared cache */ +#define PERF_MEM_REGION_L_NON_SHARE 0x03 /* Local CA non-shared cache */ +#define PERF_MEM_REGION_O_IO 0x04 /* Other CA IO agent */ +#define PERF_MEM_REGION_O_SHARE 0x05 /* Other CA shared cache */ +#define PERF_MEM_REGION_O_NON_SHARE 0x06 /* Other CA non-shared cache */ +#define PERF_MEM_REGION_MMIO 0x07 /* MMIO */ +#define PERF_MEM_REGION_MEM0 0x08 /* Memory region 0 */ +#define PERF_MEM_REGION_MEM1 0x09 /* Memory region 1 */ +#define PERF_MEM_REGION_MEM2 0x0a /* Memory region 2 */ +#define PERF_MEM_REGION_MEM3 0x0b /* Memory region 3 */ +#define PERF_MEM_REGION_MEM4 0x0c /* Memory region 4 */ +#define PERF_MEM_REGION_MEM5 0x0d /* Memory region 5 */ +#define PERF_MEM_REGION_MEM6 0x0e /* Memory region 6 */ +#define PERF_MEM_REGION_MEM7 0x0f /* Memory region 7 */ +#define PERF_MEM_REGION_SHIFT 46 + #define PERF_MEM_S(a, s) \ (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT) diff --git a/init/Kconfig b/init/Kconfig index 0b425841df46..66f02480368b 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2072,6 +2072,10 @@ config GUEST_PERF_EVENTS bool depends on HAVE_PERF_EVENTS +config PERF_GUEST_MEDIATED_PMU + bool + depends on GUEST_PERF_EVENTS + config PERF_USE_VMALLOC bool help diff --git a/kernel/events/core.c b/kernel/events/core.c index 8cca80094624..e18119f30c29 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -57,6 +57,7 @@ #include <linux/task_work.h> #include <linux/percpu-rwsem.h> #include <linux/unwind_deferred.h> +#include <linux/kvm_types.h> #include "internal.h" @@ -166,6 +167,18 @@ enum event_type_t { EVENT_CPU = 0x10, EVENT_CGROUP = 0x20, + /* + * EVENT_GUEST is set when scheduling in/out events between the host + * and a guest with a mediated vPMU. Among other things, EVENT_GUEST + * is used: + * + * - In for_each_epc() to skip PMUs that don't support events in a + * MEDIATED_VPMU guest, i.e. don't need to be context switched. + * - To indicate the start/end point of the events in a guest. Guest + * running time is deducted for host-only (exclude_guest) events. 
+ */ + EVENT_GUEST = 0x40, + EVENT_FLAGS = EVENT_CGROUP | EVENT_GUEST, /* compound helpers */ EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED, EVENT_TIME_FROZEN = EVENT_TIME | EVENT_FROZEN, @@ -458,6 +471,20 @@ static cpumask_var_t perf_online_pkg_mask; static cpumask_var_t perf_online_sys_mask; static struct kmem_cache *perf_event_cache; +#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU +static DEFINE_PER_CPU(bool, guest_ctx_loaded); + +static __always_inline bool is_guest_mediated_pmu_loaded(void) +{ + return __this_cpu_read(guest_ctx_loaded); +} +#else +static __always_inline bool is_guest_mediated_pmu_loaded(void) +{ + return false; +} +#endif + /* * perf event paranoia level: * -1 - not paranoid at all @@ -779,33 +806,97 @@ do { \ ___p; \ }) -#define for_each_epc(_epc, _ctx, _pmu, _cgroup) \ +static bool perf_skip_pmu_ctx(struct perf_event_pmu_context *pmu_ctx, + enum event_type_t event_type) +{ + if ((event_type & EVENT_CGROUP) && !pmu_ctx->nr_cgroups) + return true; + if ((event_type & EVENT_GUEST) && + !(pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU)) + return true; + return false; +} + +#define for_each_epc(_epc, _ctx, _pmu, _event_type) \ list_for_each_entry(_epc, &((_ctx)->pmu_ctx_list), pmu_ctx_entry) \ - if (_cgroup && !_epc->nr_cgroups) \ + if (perf_skip_pmu_ctx(_epc, _event_type)) \ continue; \ else if (_pmu && _epc->pmu != _pmu) \ continue; \ else -static void perf_ctx_disable(struct perf_event_context *ctx, bool cgroup) +static void perf_ctx_disable(struct perf_event_context *ctx, + enum event_type_t event_type) { struct perf_event_pmu_context *pmu_ctx; - for_each_epc(pmu_ctx, ctx, NULL, cgroup) + for_each_epc(pmu_ctx, ctx, NULL, event_type) perf_pmu_disable(pmu_ctx->pmu); } -static void perf_ctx_enable(struct perf_event_context *ctx, bool cgroup) +static void perf_ctx_enable(struct perf_event_context *ctx, + enum event_type_t event_type) { struct perf_event_pmu_context *pmu_ctx; - for_each_epc(pmu_ctx, ctx, NULL, cgroup) + for_each_epc(pmu_ctx, ctx, NULL, event_type) perf_pmu_enable(pmu_ctx->pmu); } static void ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type); static void ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type); +static inline void update_perf_time_ctx(struct perf_time_ctx *time, u64 now, bool adv) +{ + if (adv) + time->time += now - time->stamp; + time->stamp = now; + + /* + * The above: time' = time + (now - timestamp), can be re-arranged + * into: time` = now + (time - timestamp), which gives a single value + * offset to compute future time without locks on. + * + * See perf_event_time_now(), which can be used from NMI context where + * it's (obviously) not possible to acquire ctx->lock in order to read + * both the above values in a consistent manner. 
+ */ + WRITE_ONCE(time->offset, time->time - time->stamp); +} + +static_assert(offsetof(struct perf_event_context, timeguest) - + offsetof(struct perf_event_context, time) == + sizeof(struct perf_time_ctx)); + +#define T_TOTAL 0 +#define T_GUEST 1 + +static inline u64 __perf_event_time_ctx(struct perf_event *event, + struct perf_time_ctx *times) +{ + u64 time = times[T_TOTAL].time; + + if (event->attr.exclude_guest) + time -= times[T_GUEST].time; + + return time; +} + +static inline u64 __perf_event_time_ctx_now(struct perf_event *event, + struct perf_time_ctx *times, + u64 now) +{ + if (is_guest_mediated_pmu_loaded() && event->attr.exclude_guest) { + /* + * (now + times[total].offset) - (now + times[guest].offset) := + * times[total].offset - times[guest].offset + */ + return READ_ONCE(times[T_TOTAL].offset) - READ_ONCE(times[T_GUEST].offset); + } + + return now + READ_ONCE(times[T_TOTAL].offset); +} + #ifdef CONFIG_CGROUP_PERF static inline bool @@ -842,12 +933,16 @@ static inline int is_cgroup_event(struct perf_event *event) return event->cgrp != NULL; } +static_assert(offsetof(struct perf_cgroup_info, timeguest) - + offsetof(struct perf_cgroup_info, time) == + sizeof(struct perf_time_ctx)); + static inline u64 perf_cgroup_event_time(struct perf_event *event) { struct perf_cgroup_info *t; t = per_cpu_ptr(event->cgrp->info, event->cpu); - return t->time; + return __perf_event_time_ctx(event, &t->time); } static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now) @@ -856,20 +951,21 @@ static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now) t = per_cpu_ptr(event->cgrp->info, event->cpu); if (!__load_acquire(&t->active)) - return t->time; - now += READ_ONCE(t->timeoffset); - return now; + return __perf_event_time_ctx(event, &t->time); + + return __perf_event_time_ctx_now(event, &t->time, now); } -static inline void __update_cgrp_time(struct perf_cgroup_info *info, u64 now, bool adv) +static inline void __update_cgrp_guest_time(struct perf_cgroup_info *info, u64 now, bool adv) { - if (adv) - info->time += now - info->timestamp; - info->timestamp = now; - /* - * see update_context_time() - */ - WRITE_ONCE(info->timeoffset, info->time - info->timestamp); + update_perf_time_ctx(&info->timeguest, now, adv); +} + +static inline void update_cgrp_time(struct perf_cgroup_info *info, u64 now) +{ + update_perf_time_ctx(&info->time, now, true); + if (is_guest_mediated_pmu_loaded()) + __update_cgrp_guest_time(info, now, true); } static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final) @@ -885,7 +981,7 @@ static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, cgrp = container_of(css, struct perf_cgroup, css); info = this_cpu_ptr(cgrp->info); - __update_cgrp_time(info, now, true); + update_cgrp_time(info, now); if (final) __store_release(&info->active, 0); } @@ -908,11 +1004,11 @@ static inline void update_cgrp_time_from_event(struct perf_event *event) * Do not update time when cgroup is not active */ if (info->active) - __update_cgrp_time(info, perf_clock(), true); + update_cgrp_time(info, perf_clock()); } static inline void -perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx) +perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest) { struct perf_event_context *ctx = &cpuctx->ctx; struct perf_cgroup *cgrp = cpuctx->cgrp; @@ -932,8 +1028,12 @@ perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx) for (css = &cgrp->css; css; css = css->parent) { cgrp = container_of(css, 
struct perf_cgroup, css); info = this_cpu_ptr(cgrp->info); - __update_cgrp_time(info, ctx->timestamp, false); - __store_release(&info->active, 1); + if (guest) { + __update_cgrp_guest_time(info, ctx->time.stamp, false); + } else { + update_perf_time_ctx(&info->time, ctx->time.stamp, false); + __store_release(&info->active, 1); + } } } @@ -964,8 +1064,7 @@ static void perf_cgroup_switch(struct task_struct *task) return; WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0); - - perf_ctx_disable(&cpuctx->ctx, true); + perf_ctx_disable(&cpuctx->ctx, EVENT_CGROUP); ctx_sched_out(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP); /* @@ -981,7 +1080,7 @@ static void perf_cgroup_switch(struct task_struct *task) */ ctx_sched_in(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP); - perf_ctx_enable(&cpuctx->ctx, true); + perf_ctx_enable(&cpuctx->ctx, EVENT_CGROUP); } static int perf_cgroup_ensure_storage(struct perf_event *event, @@ -1138,7 +1237,7 @@ static inline int perf_cgroup_connect(pid_t pid, struct perf_event *event, } static inline void -perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx) +perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest) { } @@ -1550,29 +1649,24 @@ static void perf_unpin_context(struct perf_event_context *ctx) */ static void __update_context_time(struct perf_event_context *ctx, bool adv) { - u64 now = perf_clock(); - lockdep_assert_held(&ctx->lock); - if (adv) - ctx->time += now - ctx->timestamp; - ctx->timestamp = now; + update_perf_time_ctx(&ctx->time, perf_clock(), adv); +} - /* - * The above: time' = time + (now - timestamp), can be re-arranged - * into: time` = now + (time - timestamp), which gives a single value - * offset to compute future time without locks on. - * - * See perf_event_time_now(), which can be used from NMI context where - * it's (obviously) not possible to acquire ctx->lock in order to read - * both the above values in a consistent manner. 
- */ - WRITE_ONCE(ctx->timeoffset, ctx->time - ctx->timestamp); +static void __update_context_guest_time(struct perf_event_context *ctx, bool adv) +{ + lockdep_assert_held(&ctx->lock); + + /* must be called after __update_context_time(); */ + update_perf_time_ctx(&ctx->timeguest, ctx->time.stamp, adv); } static void update_context_time(struct perf_event_context *ctx) { __update_context_time(ctx, true); + if (is_guest_mediated_pmu_loaded()) + __update_context_guest_time(ctx, true); } static u64 perf_event_time(struct perf_event *event) @@ -1585,7 +1679,7 @@ static u64 perf_event_time(struct perf_event *event) if (is_cgroup_event(event)) return perf_cgroup_event_time(event); - return ctx->time; + return __perf_event_time_ctx(event, &ctx->time); } static u64 perf_event_time_now(struct perf_event *event, u64 now) @@ -1599,10 +1693,9 @@ static u64 perf_event_time_now(struct perf_event *event, u64 now) return perf_cgroup_event_time_now(event, now); if (!(__load_acquire(&ctx->is_active) & EVENT_TIME)) - return ctx->time; + return __perf_event_time_ctx(event, &ctx->time); - now += READ_ONCE(ctx->timeoffset); - return now; + return __perf_event_time_ctx_now(event, &ctx->time, now); } static enum event_type_t get_event_type(struct perf_event *event) @@ -2422,20 +2515,23 @@ group_sched_out(struct perf_event *group_event, struct perf_event_context *ctx) } static inline void -__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, bool final) +__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, + bool final, enum event_type_t event_type) { if (ctx->is_active & EVENT_TIME) { if (ctx->is_active & EVENT_FROZEN) return; + update_context_time(ctx); - update_cgrp_time_from_cpuctx(cpuctx, final); + /* vPMU should not stop time */ + update_cgrp_time_from_cpuctx(cpuctx, !(event_type & EVENT_GUEST) && final); } } static inline void ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx) { - __ctx_time_update(cpuctx, ctx, false); + __ctx_time_update(cpuctx, ctx, false, 0); } /* @@ -2861,14 +2957,15 @@ static void task_ctx_sched_out(struct perf_event_context *ctx, static void perf_event_sched_in(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, - struct pmu *pmu) + struct pmu *pmu, + enum event_type_t event_type) { - ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED); + ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED | event_type); if (ctx) - ctx_sched_in(ctx, pmu, EVENT_PINNED); - ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE); + ctx_sched_in(ctx, pmu, EVENT_PINNED | event_type); + ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE | event_type); if (ctx) - ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE); + ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE | event_type); } /* @@ -2902,11 +2999,11 @@ static void ctx_resched(struct perf_cpu_context *cpuctx, event_type &= EVENT_ALL; - for_each_epc(epc, &cpuctx->ctx, pmu, false) + for_each_epc(epc, &cpuctx->ctx, pmu, 0) perf_pmu_disable(epc->pmu); if (task_ctx) { - for_each_epc(epc, task_ctx, pmu, false) + for_each_epc(epc, task_ctx, pmu, 0) perf_pmu_disable(epc->pmu); task_ctx_sched_out(task_ctx, pmu, event_type); @@ -2924,13 +3021,13 @@ static void ctx_resched(struct perf_cpu_context *cpuctx, else if (event_type & EVENT_PINNED) ctx_sched_out(&cpuctx->ctx, pmu, EVENT_FLEXIBLE); - perf_event_sched_in(cpuctx, task_ctx, pmu); + perf_event_sched_in(cpuctx, task_ctx, pmu, 0); - for_each_epc(epc, &cpuctx->ctx, pmu, false) + for_each_epc(epc, &cpuctx->ctx, pmu, 0) perf_pmu_enable(epc->pmu); if (task_ctx) { - 
for_each_epc(epc, task_ctx, pmu, false) + for_each_epc(epc, task_ctx, pmu, 0) perf_pmu_enable(epc->pmu); } } @@ -3479,11 +3576,10 @@ static void ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type) { struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); + enum event_type_t active_type = event_type & ~EVENT_FLAGS; struct perf_event_pmu_context *pmu_ctx; int is_active = ctx->is_active; - bool cgroup = event_type & EVENT_CGROUP; - event_type &= ~EVENT_CGROUP; lockdep_assert_held(&ctx->lock); @@ -3507,14 +3603,14 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t * * would only update time for the pinned events. */ - __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx); + __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx, event_type); /* * CPU-release for the below ->is_active store, * see __load_acquire() in perf_event_time_now() */ barrier(); - ctx->is_active &= ~event_type; + ctx->is_active &= ~active_type; if (!(ctx->is_active & EVENT_ALL)) { /* @@ -3533,9 +3629,20 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t cpuctx->task_ctx = NULL; } - is_active ^= ctx->is_active; /* changed bits */ + if (event_type & EVENT_GUEST) { + /* + * Schedule out all exclude_guest events of PMU + * with PERF_PMU_CAP_MEDIATED_VPMU. + */ + is_active = EVENT_ALL; + __update_context_guest_time(ctx, false); + perf_cgroup_set_timestamp(cpuctx, true); + barrier(); + } else { + is_active ^= ctx->is_active; /* changed bits */ + } - for_each_epc(pmu_ctx, ctx, pmu, cgroup) + for_each_epc(pmu_ctx, ctx, pmu, event_type) __pmu_ctx_sched_out(pmu_ctx, is_active); } @@ -3691,7 +3798,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next) raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING); if (context_equiv(ctx, next_ctx)) { - perf_ctx_disable(ctx, false); + perf_ctx_disable(ctx, 0); /* PMIs are disabled; ctx->nr_no_switch_fast is stable. */ if (local_read(&ctx->nr_no_switch_fast) || @@ -3715,7 +3822,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next) perf_ctx_sched_task_cb(ctx, task, false); - perf_ctx_enable(ctx, false); + perf_ctx_enable(ctx, 0); /* * RCU_INIT_POINTER here is safe because we've not @@ -3739,13 +3846,13 @@ unlock: if (do_switch) { raw_spin_lock(&ctx->lock); - perf_ctx_disable(ctx, false); + perf_ctx_disable(ctx, 0); inside_switch: perf_ctx_sched_task_cb(ctx, task, false); task_ctx_sched_out(ctx, NULL, EVENT_ALL); - perf_ctx_enable(ctx, false); + perf_ctx_enable(ctx, 0); raw_spin_unlock(&ctx->lock); } } @@ -3992,10 +4099,15 @@ static inline void group_update_userpage(struct perf_event *group_event) event_update_userpage(event); } +struct merge_sched_data { + int can_add_hw; + enum event_type_t event_type; +}; + static int merge_sched_in(struct perf_event *event, void *data) { struct perf_event_context *ctx = event->ctx; - int *can_add_hw = data; + struct merge_sched_data *msd = data; if (event->state <= PERF_EVENT_STATE_OFF) return 0; @@ -4003,13 +4115,22 @@ static int merge_sched_in(struct perf_event *event, void *data) if (!event_filter_match(event)) return 0; - if (group_can_go_on(event, *can_add_hw)) { + /* + * Don't schedule in any host events from PMU with + * PERF_PMU_CAP_MEDIATED_VPMU, while a guest is running. 
+ */ + if (is_guest_mediated_pmu_loaded() && + event->pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU && + !(msd->event_type & EVENT_GUEST)) + return 0; + + if (group_can_go_on(event, msd->can_add_hw)) { if (!group_sched_in(event, ctx)) list_add_tail(&event->active_list, get_event_list(event)); } if (event->state == PERF_EVENT_STATE_INACTIVE) { - *can_add_hw = 0; + msd->can_add_hw = 0; if (event->attr.pinned) { perf_cgroup_event_disable(event, ctx); perf_event_set_state(event, PERF_EVENT_STATE_ERROR); @@ -4032,11 +4153,15 @@ static int merge_sched_in(struct perf_event *event, void *data) static void pmu_groups_sched_in(struct perf_event_context *ctx, struct perf_event_groups *groups, - struct pmu *pmu) + struct pmu *pmu, + enum event_type_t event_type) { - int can_add_hw = 1; + struct merge_sched_data msd = { + .can_add_hw = 1, + .event_type = event_type, + }; visit_groups_merge(ctx, groups, smp_processor_id(), pmu, - merge_sched_in, &can_add_hw); + merge_sched_in, &msd); } static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx, @@ -4045,20 +4170,18 @@ static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx, struct perf_event_context *ctx = pmu_ctx->ctx; if (event_type & EVENT_PINNED) - pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu); + pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu, event_type); if (event_type & EVENT_FLEXIBLE) - pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu); + pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu, event_type); } static void ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type) { struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); + enum event_type_t active_type = event_type & ~EVENT_FLAGS; struct perf_event_pmu_context *pmu_ctx; int is_active = ctx->is_active; - bool cgroup = event_type & EVENT_CGROUP; - - event_type &= ~EVENT_CGROUP; lockdep_assert_held(&ctx->lock); @@ -4066,9 +4189,11 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t return; if (!(is_active & EVENT_TIME)) { + /* EVENT_TIME should be active while the guest runs */ + WARN_ON_ONCE(event_type & EVENT_GUEST); /* start ctx time */ __update_context_time(ctx, false); - perf_cgroup_set_timestamp(cpuctx); + perf_cgroup_set_timestamp(cpuctx, false); /* * CPU-release for the below ->is_active store, * see __load_acquire() in perf_event_time_now() @@ -4076,7 +4201,7 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t barrier(); } - ctx->is_active |= (event_type | EVENT_TIME); + ctx->is_active |= active_type | EVENT_TIME; if (ctx->task) { if (!(is_active & EVENT_ALL)) cpuctx->task_ctx = ctx; @@ -4084,21 +4209,37 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t WARN_ON_ONCE(cpuctx->task_ctx != ctx); } - is_active ^= ctx->is_active; /* changed bits */ + if (event_type & EVENT_GUEST) { + /* + * Schedule in the required exclude_guest events of PMU + * with PERF_PMU_CAP_MEDIATED_VPMU. + */ + is_active = event_type & EVENT_ALL; + + /* + * Update ctx time to set the new start time for + * the exclude_guest events. + */ + update_context_time(ctx); + update_cgrp_time_from_cpuctx(cpuctx, false); + barrier(); + } else { + is_active ^= ctx->is_active; /* changed bits */ + } /* * First go through the list and put on any pinned groups * in order to give them the best chance of going on. 
*/ if (is_active & EVENT_PINNED) { - for_each_epc(pmu_ctx, ctx, pmu, cgroup) - __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED); + for_each_epc(pmu_ctx, ctx, pmu, event_type) + __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED | (event_type & EVENT_GUEST)); } /* Then walk through the lower prio flexible groups */ if (is_active & EVENT_FLEXIBLE) { - for_each_epc(pmu_ctx, ctx, pmu, cgroup) - __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE); + for_each_epc(pmu_ctx, ctx, pmu, event_type) + __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE | (event_type & EVENT_GUEST)); } } @@ -4114,11 +4255,11 @@ static void perf_event_context_sched_in(struct task_struct *task) if (cpuctx->task_ctx == ctx) { perf_ctx_lock(cpuctx, ctx); - perf_ctx_disable(ctx, false); + perf_ctx_disable(ctx, 0); perf_ctx_sched_task_cb(ctx, task, true); - perf_ctx_enable(ctx, false); + perf_ctx_enable(ctx, 0); perf_ctx_unlock(cpuctx, ctx); goto rcu_unlock; } @@ -4131,7 +4272,7 @@ static void perf_event_context_sched_in(struct task_struct *task) if (!ctx->nr_events) goto unlock; - perf_ctx_disable(ctx, false); + perf_ctx_disable(ctx, 0); /* * We want to keep the following priority order: * cpu pinned (that don't need to move), task pinned, @@ -4141,18 +4282,18 @@ static void perf_event_context_sched_in(struct task_struct *task) * events, no need to flip the cpuctx's events around. */ if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) { - perf_ctx_disable(&cpuctx->ctx, false); + perf_ctx_disable(&cpuctx->ctx, 0); ctx_sched_out(&cpuctx->ctx, NULL, EVENT_FLEXIBLE); } - perf_event_sched_in(cpuctx, ctx, NULL); + perf_event_sched_in(cpuctx, ctx, NULL, 0); perf_ctx_sched_task_cb(cpuctx->task_ctx, task, true); if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) - perf_ctx_enable(&cpuctx->ctx, false); + perf_ctx_enable(&cpuctx->ctx, 0); - perf_ctx_enable(ctx, false); + perf_ctx_enable(ctx, 0); unlock: perf_ctx_unlock(cpuctx, ctx); @@ -5280,9 +5421,20 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache, return -ENOMEM; for (;;) { - if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) { + if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) { if (old) perf_free_ctx_data_rcu(old); + /* + * Above try_cmpxchg() pairs with try_cmpxchg() from + * detach_task_ctx_data() such that + * if we race with perf_event_exit_task(), we must + * observe PF_EXITING. 
+ */ + if (task->flags & PF_EXITING) { + /* detach_task_ctx_data() may free it already */ + if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL)) + perf_free_ctx_data_rcu(cd); + } return 0; } @@ -5328,6 +5480,8 @@ again: /* Allocate everything */ scoped_guard (rcu) { for_each_process_thread(g, p) { + if (p->flags & PF_EXITING) + continue; cd = rcu_dereference(p->perf_ctx_data); if (cd && !cd->global) { cd->global = 1; @@ -5594,6 +5748,8 @@ static void __free_event(struct perf_event *event) { struct pmu *pmu = event->pmu; + security_perf_event_free(event); + if (event->attach_state & PERF_ATTACH_CALLCHAIN) put_callchain_buffers(); @@ -5647,6 +5803,8 @@ static void __free_event(struct perf_event *event) call_rcu(&event->rcu_head, free_event_rcu); } +static void mediated_pmu_unaccount_event(struct perf_event *event); + DEFINE_FREE(__free_event, struct perf_event *, if (_T) __free_event(_T)) /* vs perf_event_alloc() success */ @@ -5656,8 +5814,7 @@ static void _free_event(struct perf_event *event) irq_work_sync(&event->pending_disable_irq); unaccount_event(event); - - security_perf_event_free(event); + mediated_pmu_unaccount_event(event); if (event->rb) { /* @@ -6180,6 +6337,138 @@ u64 perf_event_pause(struct perf_event *event, bool reset) } EXPORT_SYMBOL_GPL(perf_event_pause); +#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU +static atomic_t nr_include_guest_events __read_mostly; + +static atomic_t nr_mediated_pmu_vms __read_mostly; +static DEFINE_MUTEX(perf_mediated_pmu_mutex); + +/* !exclude_guest event of PMU with PERF_PMU_CAP_MEDIATED_VPMU */ +static inline bool is_include_guest_event(struct perf_event *event) +{ + if ((event->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU) && + !event->attr.exclude_guest) + return true; + + return false; +} + +static int mediated_pmu_account_event(struct perf_event *event) +{ + if (!is_include_guest_event(event)) + return 0; + + if (atomic_inc_not_zero(&nr_include_guest_events)) + return 0; + + guard(mutex)(&perf_mediated_pmu_mutex); + if (atomic_read(&nr_mediated_pmu_vms)) + return -EOPNOTSUPP; + + atomic_inc(&nr_include_guest_events); + return 0; +} + +static void mediated_pmu_unaccount_event(struct perf_event *event) +{ + if (!is_include_guest_event(event)) + return; + + if (WARN_ON_ONCE(!atomic_read(&nr_include_guest_events))) + return; + + atomic_dec(&nr_include_guest_events); +} + +/* + * Currently invoked at VM creation to + * - Check whether there are existing !exclude_guest events of PMU with + * PERF_PMU_CAP_MEDIATED_VPMU + * - Set nr_mediated_pmu_vms to prevent !exclude_guest event creation on + * PMUs with PERF_PMU_CAP_MEDIATED_VPMU + * + * No impact for the PMU without PERF_PMU_CAP_MEDIATED_VPMU. The perf + * still owns all the PMU resources. + */ +int perf_create_mediated_pmu(void) +{ + if (atomic_inc_not_zero(&nr_mediated_pmu_vms)) + return 0; + + guard(mutex)(&perf_mediated_pmu_mutex); + if (atomic_read(&nr_include_guest_events)) + return -EBUSY; + + atomic_inc(&nr_mediated_pmu_vms); + return 0; +} +EXPORT_SYMBOL_FOR_KVM(perf_create_mediated_pmu); + +void perf_release_mediated_pmu(void) +{ + if (WARN_ON_ONCE(!atomic_read(&nr_mediated_pmu_vms))) + return; + + atomic_dec(&nr_mediated_pmu_vms); +} +EXPORT_SYMBOL_FOR_KVM(perf_release_mediated_pmu); + +/* When loading a guest's mediated PMU, schedule out all exclude_guest events. 
*/ +void perf_load_guest_context(void) +{ + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); + + lockdep_assert_irqs_disabled(); + + guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx); + + if (WARN_ON_ONCE(__this_cpu_read(guest_ctx_loaded))) + return; + + perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST); + ctx_sched_out(&cpuctx->ctx, NULL, EVENT_GUEST); + if (cpuctx->task_ctx) { + perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST); + task_ctx_sched_out(cpuctx->task_ctx, NULL, EVENT_GUEST); + } + + perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST); + if (cpuctx->task_ctx) + perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST); + + __this_cpu_write(guest_ctx_loaded, true); +} +EXPORT_SYMBOL_GPL(perf_load_guest_context); + +void perf_put_guest_context(void) +{ + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); + + lockdep_assert_irqs_disabled(); + + guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx); + + if (WARN_ON_ONCE(!__this_cpu_read(guest_ctx_loaded))) + return; + + perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST); + if (cpuctx->task_ctx) + perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST); + + perf_event_sched_in(cpuctx, cpuctx->task_ctx, NULL, EVENT_GUEST); + + if (cpuctx->task_ctx) + perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST); + perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST); + + __this_cpu_write(guest_ctx_loaded, false); +} +EXPORT_SYMBOL_GPL(perf_put_guest_context); +#else +static int mediated_pmu_account_event(struct perf_event *event) { return 0; } +static void mediated_pmu_unaccount_event(struct perf_event *event) {} +#endif + /* * Holding the top-level event's child_mutex means that any * descendant process that has inherited this event will block @@ -6548,22 +6837,22 @@ void perf_event_update_userpage(struct perf_event *event) goto unlock; /* - * compute total_time_enabled, total_time_running - * based on snapshot values taken when the event - * was last scheduled in. + * Disable preemption to guarantee consistent time stamps are stored to + * the user page. + */ + preempt_disable(); + + /* + * Compute total_time_enabled, total_time_running based on snapshot + * values taken when the event was last scheduled in. * - * we cannot simply called update_context_time() - * because of locking issue as we can be called in - * NMI context + * We cannot simply call update_context_time() because doing so would + * lead to deadlock when called from NMI context. */ calc_timer_values(event, &now, &enabled, &running); userpg = rb->user_page; - /* - * Disable preemption to guarantee consistent time stamps are stored to - * the user page. 
- */ - preempt_disable(); + ++userpg->lock; barrier(); userpg->index = perf_event_index(event); @@ -7383,6 +7672,7 @@ struct perf_guest_info_callbacks __rcu *perf_guest_cbs; DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state); DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip); DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr); +DEFINE_STATIC_CALL_RET0(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi); void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs) { @@ -7397,6 +7687,10 @@ void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs) if (cbs->handle_intel_pt_intr) static_call_update(__perf_guest_handle_intel_pt_intr, cbs->handle_intel_pt_intr); + + if (cbs->handle_mediated_pmi) + static_call_update(__perf_guest_handle_mediated_pmi, + cbs->handle_mediated_pmi); } EXPORT_SYMBOL_GPL(perf_register_guest_info_callbacks); @@ -7408,8 +7702,8 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs) rcu_assign_pointer(perf_guest_cbs, NULL); static_call_update(__perf_guest_state, (void *)&__static_call_return0); static_call_update(__perf_guest_get_ip, (void *)&__static_call_return0); - static_call_update(__perf_guest_handle_intel_pt_intr, - (void *)&__static_call_return0); + static_call_update(__perf_guest_handle_intel_pt_intr, (void *)&__static_call_return0); + static_call_update(__perf_guest_handle_mediated_pmi, (void *)&__static_call_return0); synchronize_rcu(); } EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks); @@ -7869,13 +8163,11 @@ static void perf_output_read(struct perf_output_handle *handle, u64 read_format = event->attr.read_format; /* - * compute total_time_enabled, total_time_running - * based on snapshot values taken when the event - * was last scheduled in. + * Compute total_time_enabled, total_time_running based on snapshot + * values taken when the event was last scheduled in. * - * we cannot simply called update_context_time() - * because of locking issue as we are called in - * NMI context + * We cannot simply call update_context_time() because doing so would + * lead to deadlock when called from NMI context. 
*/ if (read_format & PERF_FORMAT_TOTAL_TIMES) calc_timer_values(event, &now, &enabled, &running); @@ -12043,7 +12335,7 @@ static void task_clock_event_update(struct perf_event *event, u64 now) static void task_clock_event_start(struct perf_event *event, int flags) { event->hw.state = 0; - local64_set(&event->hw.prev_count, event->ctx->time); + local64_set(&event->hw.prev_count, event->ctx->time.time); perf_swevent_start_hrtimer(event); } @@ -12052,7 +12344,7 @@ static void task_clock_event_stop(struct perf_event *event, int flags) event->hw.state = PERF_HES_STOPPED; perf_swevent_cancel_hrtimer(event); if (flags & PERF_EF_UPDATE) - task_clock_event_update(event, event->ctx->time); + task_clock_event_update(event, event->ctx->time.time); } static int task_clock_event_add(struct perf_event *event, int flags) @@ -12072,8 +12364,8 @@ static void task_clock_event_del(struct perf_event *event, int flags) static void task_clock_event_read(struct perf_event *event) { u64 now = perf_clock(); - u64 delta = now - event->ctx->timestamp; - u64 time = event->ctx->time + delta; + u64 delta = now - event->ctx->time.stamp; + u64 time = event->ctx->time.time + delta; task_clock_event_update(event, time); } @@ -13155,6 +13447,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu, if (err) return ERR_PTR(err); + err = mediated_pmu_account_event(event); + if (err) + return ERR_PTR(err); + /* symmetric to unaccount_event() in _free_event() */ account_event(event); @@ -14294,8 +14590,11 @@ void perf_event_exit_task(struct task_struct *task) /* * Detach the perf_ctx_data for the system-wide event. + * + * Done without holding global_ctx_data_rwsem; typically + * attach_global_ctx_data() will skip over this task, but otherwise + * attach_task_ctx_data() will observe PF_EXITING. */ - guard(percpu_read)(&global_ctx_data_rwsem); detach_task_ctx_data(task); } @@ -14798,7 +15097,8 @@ static void perf_event_exit_cpu_context(int cpu) ctx = &cpuctx->ctx; mutex_lock(&ctx->mutex); - smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1); + if (ctx->nr_events) + smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1); cpuctx->online = 0; mutex_unlock(&ctx->mutex); mutex_unlock(&pmus_lock); diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index d546d32390a8..424ef2235b07 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -179,16 +179,16 @@ bool __weak is_trap_insn(uprobe_opcode_t *insn) void uprobe_copy_from_page(struct page *page, unsigned long vaddr, void *dst, int len) { - void *kaddr = kmap_atomic(page); + void *kaddr = kmap_local_page(page); memcpy(dst, kaddr + (vaddr & ~PAGE_MASK), len); - kunmap_atomic(kaddr); + kunmap_local(kaddr); } static void copy_to_page(struct page *page, unsigned long vaddr, const void *src, int len) { - void *kaddr = kmap_atomic(page); + void *kaddr = kmap_local_page(page); memcpy(kaddr + (vaddr & ~PAGE_MASK), src, len); - kunmap_atomic(kaddr); + kunmap_local(kaddr); } static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *insn, @@ -323,7 +323,7 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d) return ret == 0 ? 
-EBUSY : ret; } - kaddr = kmap_atomic(page); + kaddr = kmap_local_page(page); ptr = kaddr + (vaddr & ~PAGE_MASK); if (unlikely(*ptr + d < 0)) { @@ -336,7 +336,7 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d) *ptr += d; ret = 0; out: - kunmap_atomic(kaddr); + kunmap_local(kaddr); put_page(page); return ret; } @@ -1138,7 +1138,7 @@ static bool filter_chain(struct uprobe *uprobe, struct mm_struct *mm) bool ret = false; down_read(&uprobe->consumer_rwsem); - list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) { + list_for_each_entry(uc, &uprobe->consumers, cons_node) { ret = consumer_filter(uc, mm); if (ret) break; @@ -1694,6 +1694,12 @@ static const struct vm_special_mapping xol_mapping = { .mremap = xol_mremap, }; +unsigned long __weak arch_uprobe_get_xol_area(void) +{ + /* Try to map as high as possible, this is only a hint. */ + return get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0); +} + /* Slot allocation for XOL */ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) { @@ -1709,9 +1715,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) } if (!area->vaddr) { - /* Try to map as high as possible, this is only a hint. */ - area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, - PAGE_SIZE, 0, 0); + area->vaddr = arch_uprobe_get_xol_area(); if (IS_ERR_VALUE(area->vaddr)) { ret = area->vaddr; goto fail; diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c index 39e270789444..90ab3c1a205e 100644 --- a/kernel/unwind/user.c +++ b/kernel/unwind/user.c @@ -31,6 +31,7 @@ static int unwind_user_next_common(struct unwind_user_state *state, { unsigned long cfa, fp, ra; + /* Get the Canonical Frame Address (CFA) */ if (frame->use_fp) { if (state->fp < state->sp) return -EINVAL; @@ -38,11 +39,9 @@ static int unwind_user_next_common(struct unwind_user_state *state, } else { cfa = state->sp; } - - /* Get the Canonical Frame Address (CFA) */ cfa += frame->cfa_off; - /* stack going in wrong direction? 
*/ + /* Make sure that stack is not going in wrong direction */ if (cfa <= state->sp) return -EINVAL; @@ -50,10 +49,11 @@ static int unwind_user_next_common(struct unwind_user_state *state, if (cfa & (state->ws - 1)) return -EINVAL; - /* Find the Return Address (RA) */ + /* Get the Return Address (RA) */ if (get_user_word(&ra, cfa, frame->ra_off, state->ws)) return -EINVAL; + /* Get the Frame Pointer (FP) */ if (frame->fp_off && get_user_word(&fp, cfa, frame->fp_off, state->ws)) return -EINVAL; @@ -67,7 +67,6 @@ static int unwind_user_next_common(struct unwind_user_state *state, static int unwind_user_next_fp(struct unwind_user_state *state) { -#ifdef CONFIG_HAVE_UNWIND_USER_FP struct pt_regs *regs = task_pt_regs(current); if (state->topmost && unwind_user_at_function_start(regs)) { @@ -81,9 +80,6 @@ static int unwind_user_next_fp(struct unwind_user_state *state) ARCH_INIT_USER_FP_FRAME(state->ws) }; return unwind_user_next_common(state, &fp_frame); -#else - return -EINVAL; -#endif } static int unwind_user_next(struct unwind_user_state *state) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 72f03153dd32..76e9d0664d0c 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -1330,14 +1330,16 @@ union perf_mem_data_src { mem_snoopx : 2, /* Snoop mode, ext */ mem_blk : 3, /* Access blocked */ mem_hops : 3, /* Hop level */ - mem_rsvd : 18; + mem_region : 5, /* cache/memory regions */ + mem_rsvd : 13; }; }; #elif defined(__BIG_ENDIAN_BITFIELD) union perf_mem_data_src { __u64 val; struct { - __u64 mem_rsvd : 18, + __u64 mem_rsvd : 13, + mem_region : 5, /* cache/memory regions */ mem_hops : 3, /* Hop level */ mem_blk : 3, /* Access blocked */ mem_snoopx : 2, /* Snoop mode, ext */ @@ -1394,7 +1396,7 @@ union perf_mem_data_src { #define PERF_MEM_LVLNUM_L4 0x0004 /* L4 */ #define PERF_MEM_LVLNUM_L2_MHB 0x0005 /* L2 Miss Handling Buffer */ #define PERF_MEM_LVLNUM_MSC 0x0006 /* Memory-side Cache */ -/* 0x007 available */ +#define PERF_MEM_LVLNUM_L0 0x0007 /* L0 */ #define PERF_MEM_LVLNUM_UNC 0x0008 /* Uncached */ #define PERF_MEM_LVLNUM_CXL 0x0009 /* CXL */ #define PERF_MEM_LVLNUM_IO 0x000a /* I/O */ @@ -1447,6 +1449,25 @@ union perf_mem_data_src { /* 5-7 available */ #define PERF_MEM_HOPS_SHIFT 43 +/* Cache/Memory region */ +#define PERF_MEM_REGION_NA 0x0 /* Invalid */ +#define PERF_MEM_REGION_RSVD 0x01 /* Reserved */ +#define PERF_MEM_REGION_L_SHARE 0x02 /* Local CA shared cache */ +#define PERF_MEM_REGION_L_NON_SHARE 0x03 /* Local CA non-shared cache */ +#define PERF_MEM_REGION_O_IO 0x04 /* Other CA IO agent */ +#define PERF_MEM_REGION_O_SHARE 0x05 /* Other CA shared cache */ +#define PERF_MEM_REGION_O_NON_SHARE 0x06 /* Other CA non-shared cache */ +#define PERF_MEM_REGION_MMIO 0x07 /* MMIO */ +#define PERF_MEM_REGION_MEM0 0x08 /* Memory region 0 */ +#define PERF_MEM_REGION_MEM1 0x09 /* Memory region 1 */ +#define PERF_MEM_REGION_MEM2 0x0a /* Memory region 2 */ +#define PERF_MEM_REGION_MEM3 0x0b /* Memory region 3 */ +#define PERF_MEM_REGION_MEM4 0x0c /* Memory region 4 */ +#define PERF_MEM_REGION_MEM5 0x0d /* Memory region 5 */ +#define PERF_MEM_REGION_MEM6 0x0e /* Memory region 6 */ +#define PERF_MEM_REGION_MEM7 0x0f /* Memory region 7 */ +#define PERF_MEM_REGION_SHIFT 46 + #define PERF_MEM_S(a, s) \ (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT) diff --git a/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h b/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h index 
47051871b436..6e1d5b955aae 100644
--- a/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
+++ b/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
@@ -77,7 +77,8 @@
 */
#define IRQ_WORK_VECTOR 0xf6
-/* 0xf5 - unused, was UV_BAU_MESSAGE */
+#define PERF_GUEST_MEDIATED_PMI_VECTOR 0xf5
+
#define DEFERRED_ERROR_VECTOR 0xf4
/* Vector on which hypervisor callbacks will be delivered */
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 956ea273c2c7..01a21b6aa031 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -939,6 +939,7 @@ static bool perf_pmu__match_wildcard(const char *pmu_name, const char *tok)
{
	const char *p, *suffix;
	bool has_hex = false;
+	bool has_underscore = false;
	size_t tok_len = strlen(tok);
	/* Check start of pmu_name for equality. */
@@ -949,13 +950,14 @@ static bool perf_pmu__match_wildcard(const char *pmu_name, const char *tok)
	if (*p == 0)
		return true;
-	if (*p == '_') {
-		++p;
-		++suffix;
-	}
-
-	/* Ensure we end in a number */
+	/* Ensure we end in a number or a mix of number and "_". */
	while (1) {
+		if (!has_underscore && (*p == '_')) {
+			has_underscore = true;
+			++p;
+			++suffix;
+		}
+
		if (!isxdigit(*p))
			return false;
		if (!has_hex)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b5b69c97665..64f58aa82137 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6482,11 +6482,14 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
	.state = kvm_guest_state,
	.get_ip = kvm_guest_get_ip,
	.handle_intel_pt_intr = NULL,
+	.handle_mediated_pmi = NULL,
};
void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void))
{
	kvm_guest_cbs.handle_intel_pt_intr = pt_intr_handler;
+	kvm_guest_cbs.handle_mediated_pmi = NULL;
+
	perf_register_guest_info_callbacks(&kvm_guest_cbs);
}
void kvm_unregister_perf_callbacks(void)
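
The mem_region field and PERF_MEM_REGION_* encodings added to union perf_mem_data_src above occupy bits 46-50 of the data-source word, directly above the 3-bit hops field at bit 43. A minimal user-space sketch of how a consumer of PERF_SAMPLE_DATA_SRC could pull the field out; the 5-bit mask, the helper and the sample value in main() are illustrative assumptions, only PERF_MEM_REGION_SHIFT and the region codes come from the patch:

/*
 * Decode the new mem_region bits from a perf_mem_data_src value.
 * Sketch only; not the perf tool's own decoder.
 */
#include <stdint.h>
#include <stdio.h>

#define PERF_MEM_REGION_SHIFT	46
#define PERF_MEM_REGION_MASK	0x1fULL		/* 5-bit field, assumed from the new layout */
#define PERF_MEM_REGION_MEM0	0x08		/* "Memory region 0" per the patch */

static const char *mem_region_name(uint64_t region)
{
	switch (region) {
	case 0x00: return "N/A";
	case 0x02: return "Local CA shared cache";
	case 0x07: return "MMIO";
	case 0x08: return "Memory region 0";
	default:   return "other";
	}
}

int main(void)
{
	/* Hypothetical data_src sample with mem_region = MEM0. */
	uint64_t data_src = (uint64_t)PERF_MEM_REGION_MEM0 << PERF_MEM_REGION_SHIFT;
	uint64_t region = (data_src >> PERF_MEM_REGION_SHIFT) & PERF_MEM_REGION_MASK;

	printf("mem_region=%#llx (%s)\n",
	       (unsigned long long)region, mem_region_name(region));
	return 0;
}

Tools that already decode mem_lvlnum and mem_hops would extend their tables the same way; the patch pays for the new field by shrinking mem_rsvd from 18 to 13 bits.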
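
update_perf_time_ctx() above relies on the identity time' = time + (now - stamp) = now + (time - stamp), publishing offset = time - stamp so that perf_event_time_now() can be evaluated from NMI context with a single READ_ONCE(). A stand-alone model of just that arithmetic, with plain integers in place of the kernel's u64 fields and no memory-ordering annotations; the struct and function names below are illustrative, not the kernel's:

/*
 * Model of the offset trick: a lockless reader computes now + offset
 * and gets the same result as the locked time + (now - stamp).
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct time_ctx {
	uint64_t time;		/* accumulated time */
	uint64_t stamp;		/* timestamp of the last update */
	uint64_t offset;	/* time - stamp, published for lockless readers */
};

static void update_time_ctx(struct time_ctx *t, uint64_t now, int adv)
{
	if (adv)
		t->time += now - t->stamp;
	t->stamp = now;
	t->offset = t->time - t->stamp;	/* may wrap; u64 modular math keeps it correct */
}

int main(void)
{
	struct time_ctx ctx = { .stamp = 100 };
	uint64_t now;

	update_time_ctx(&ctx, 150, 1);	/* writer side, normally under ctx->lock */

	now = 180;			/* reader side, e.g. from NMI */
	assert(now + ctx.offset == ctx.time + (now - ctx.stamp));
	printf("event time at now=%llu: %llu\n",
	       (unsigned long long)now, (unsigned long long)(now + ctx.offset));
	return 0;
}

The guest variant in the patch keeps a second perf_time_ctx, so an exclude_guest event's time reduces to offset[total] - offset[guest] while a mediated guest is loaded, as __perf_event_time_ctx_now() shows.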
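
perf_create_mediated_pmu() and mediated_pmu_account_event() above make !exclude_guest events on PERF_PMU_CAP_MEDIATED_VPMU PMUs and mediated-PMU VMs mutually exclusive: each side bumps its own counter on the fast path and only takes perf_mediated_pmu_mutex to check the opposite counter while its own count is still zero. A simplified user-space model of that scheme, with C11 atomics and a pthread mutex standing in for the kernel primitives; the names and the main() scenario are illustrative only:

/* Simplified model of the nr_include_guest_events vs. nr_mediated_pmu_vms exclusion. */
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int nr_include_guest_events;
static atomic_int nr_mediated_pmu_vms;
static pthread_mutex_t mediated_pmu_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Fast path: only succeeds if the counter is already non-zero. */
static int inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return 1;
	}
	return 0;
}

static int account_include_guest_event(void)
{
	if (inc_not_zero(&nr_include_guest_events))
		return 0;

	pthread_mutex_lock(&mediated_pmu_mutex);
	if (atomic_load(&nr_mediated_pmu_vms)) {
		pthread_mutex_unlock(&mediated_pmu_mutex);
		return -EOPNOTSUPP;
	}
	atomic_fetch_add(&nr_include_guest_events, 1);
	pthread_mutex_unlock(&mediated_pmu_mutex);
	return 0;
}

static int create_mediated_pmu(void)
{
	if (inc_not_zero(&nr_mediated_pmu_vms))
		return 0;

	pthread_mutex_lock(&mediated_pmu_mutex);
	if (atomic_load(&nr_include_guest_events)) {
		pthread_mutex_unlock(&mediated_pmu_mutex);
		return -EBUSY;
	}
	atomic_fetch_add(&nr_mediated_pmu_vms, 1);
	pthread_mutex_unlock(&mediated_pmu_mutex);
	return 0;
}

int main(void)
{
	printf("create VM:      %d\n", create_mediated_pmu());		/* 0 */
	printf("!exclude_guest: %d\n", account_include_guest_event());	/* -EOPNOTSUPP */
	return 0;
}

The mutex is what serializes the zero-to-one transitions of the two counters; once either counter is non-zero, the other side keeps failing until it drops back to zero.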
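
The uprobes change above turns the XOL placement hint into a __weak arch_uprobe_get_xol_area() hook so an architecture can override where the slot page lands; the series uses this to fix XOL allocation for 32-bit tasks. A stand-alone illustration of the weak-symbol pattern itself (GCC/Clang on ELF); the function name and addresses below are made up and nothing here calls the real uprobes code:

/* Generic default, analogous to the __weak arch_uprobe_get_xol_area(). */
#include <stdio.h>

__attribute__((weak)) unsigned long arch_get_xol_area_hint(void)
{
	return 0x7ffffffff000UL;	/* near the top of a 64-bit user VA, made up */
}

/*
 * A strong definition in an "arch" object file would silently replace the
 * weak one at link time, for example:
 *
 *	unsigned long arch_get_xol_area_hint(void)
 *	{
 *		return 0xfffff000UL;	// keep the area reachable for 32-bit tasks
 *	}
 */

int main(void)
{
	printf("XOL hint: %#lx\n", arch_get_xol_area_hint());
	return 0;
}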
