summaryrefslogtreecommitdiff
path: root/tools/arch/arm64/include/asm
AgeCommit message (Collapse)Author
2025-12-24tools headers: Sync arm64 headers with kernel sourcesNamhyung Kim
To pick up changes from: b0a3f0e894f34e01 ("arm64/sysreg: Replace TCR_EL1 field macros") 3bbf004c4808e2c3 ("arm64: cputype: Add Neoverse-V3AE definitions") e185c8a0d84236d1 ("arm64: cputype: Add NVIDIA Olympus definitions") 52b49bd6de29a89a ("arm64: cputype: Remove duplicate Cortex-X1C definitions") This should address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Please see tools/include/uapi/README. Note that this is still out of sync due to is_midr_in_range_list(). Reviewed-by: Leo Yan <leo.yan@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-23tools headers arm64: Add NVIDIA Olympus partBesar Wicaksono
Add the part number and MIDR for NVIDIA Olympus. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-07Merge tag 'perf-tools-for-v6.19-2025-12-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Perf event/metric description: Unify all event and metric descriptions in JSON format. Now event parsing and handling is greatly simplified by that. From users point of view, perf list will provide richer information about hardware events like the following. $ perf list hw List of pre-defined events (to be used in -e or -M): legacy hardware: branch-instructions [Retired branch instructions [This event is an alias of branches]. Unit: cpu] branch-misses [Mispredicted branch instructions. Unit: cpu] branches [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu] bus-cycles [Bus cycles,which can be different from total cycles. Unit: cpu] cache-misses [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu] cache-references [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu] cpu-cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu] cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu] instructions [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu] ref-cycles [Total cycles; not affected by CPU frequency scaling. Unit: cpu] But most notable changes would be in the perf stat. On the right side, the default metrics are better named and aligned. :) $ perf stat -- perf test -w noploop Performance counter stats for 'perf test -w noploop': 11 context-switches # 10.8 cs/sec cs_per_second 0 cpu-migrations # 0.0 migrations/sec migrations_per_second 3,612 page-faults # 3532.5 faults/sec page_faults_per_second 1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized 110,466 branch-misses # 0.0 % branch_miss_rate (88.66%) 6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%) 4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%) 27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%) TopdownL1 # 0.3 % tma_backend_bound # 9.3 % tma_bad_speculation (89.05%) # 9.7 % tma_frontend_bound (77.86%) # 80.7 % tma_retiring (88.81%) 1.025318171 seconds time elapsed 1.013248000 seconds user 0.012014000 seconds sys Deferred unwinding support: With the kernel support (commit c69993ecdd4d: "perf: Support deferred user unwind"), perf can use deferred callchains for userspace stack trace with frame pointers like below: $ perf record --call-graph fp,defer ... This will be transparent to users when it comes to other commands like perf report and perf script. They will merge the deferred callchains to the previous samples as if they were collected together. ARM SPE updates - Extensive enhancements to support various kinds of memory operations including GCS, MTE allocation tags, memcpy/memset, register access, and SIMD operations. - Add inverted data source filter (inv_data_src_filter) support to exclude certain data sources. - Improve documentation. Vendor event updates: - Intel: Updated event files for Sierra Forest, Panther Lake, Meteor Lake, Lunar Lake, Granite Rapids, and others. - Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE definitions. - RISC-V: Added JSON support for T-HEAD C920V2. Misc: - Improve pointer tracking in data type profiling. It'd give better output when the variable is using container_of() to convert type. - Annotation support for perf c2c report in TUI. Press 'a' key to enter annotation view from cacheline browser window. This will show which instruction is causing the cacheline contention. - Lots of fixes and test coverage improvements!" * tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits) libperf: Use 'extern' in LIBPERF_API visibility macro perf stat: Improve handling of termination by signal perf tests stat: Add test for error for an offline CPU perf stat: When no events, don't report an error if there is none perf tests stat: Add "--null" coverage perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map perf stat: Allow no events to open if this is a "--null" run perf test kvm: Add some basic perf kvm test coverage perf tests evlist: Add basic evlist test perf tests script dlfilter: Add a dlfilter test perf tests kallsyms: Add basic kallsyms test perf tests timechart: Add a perf timechart test perf tests top: Add basic perf top coverage test perf tests buildid: Add purge and remove testing perf tests c2c: Add a basic c2c perf c2c: Clean up some defensive gets and make asan clean perf jitdump: Fix missed dso__put perf mem-events: Don't leak online CPU map perf hist: In init, ensure mem_info is put on error paths ...
2025-12-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM - Minor fixes to KVM and selftests Loongarch: - Get VM PMU capability from HW GCFG register - Add AVEC basic support - Use 64-bit register definition for EIOINTC - Add KVM timer test cases for tools/selftests RISC/V: - SBI message passing (MPXY) support for KVM guest - Give a new, more specific error subcode for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores s390: - Always allocate ESCA (Extended System Control Area), instead of starting with the basic SCA and converting to ESCA with the addition of the 65th vCPU. The price is increased number of exits (and worse performance) on z10 and earlier processor; ESCA was introduced by z114/z196 in 2010 - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups x86: - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE caching is disabled, as there can't be any relevant SPTEs to zap - Relocate a misplaced export - Fix an async #PF bug where KVM would clear the completion queue when the guest transitioned in and out of paging mode, e.g. when handling an SMI and then returning to paged mode via RSM - Leave KVM's user-return notifier registered even when disabling virtualization, as long as kvm.ko is loaded. On reboot/shutdown, keeping the notifier registered is ok; the kernel does not use the MSRs and the callback will run cleanly and restore host MSRs if the CPU manages to return to userspace before the system goes down - Use the checked version of {get,put}_user() - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC timers can result in a hard lockup in the host - Revert the periodic kvmclock sync logic now that KVM doesn't use a clocksource that's subject to NTP corrections - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter behind CONFIG_CPU_MITIGATIONS - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path; the only reason they were handled in the fast path was to paper of a bug in the core #MC code, and that has long since been fixed - Add emulator support for AVX MOV instructions, to play nice with emulated devices whose guest drivers like to access PCI BARs with large multi-byte instructions x86 (AMD): - Fix a few missing "VMCB dirty" bugs - Fix the worst of KVM's lack of EFER.LMSLE emulation - Add AVIC support for addressing 4k vCPUs in x2AVIC mode - Fix incorrect handling of selective CR0 writes when checking intercepts during emulation of L2 instructions - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on VMRUN and #VMEXIT - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft interrupt if the guest patched the underlying code after the VM-Exit, e.g. when Linux patches code with a temporary INT3 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to userspace, and extend KVM "support" to all policy bits that don't require any actual support from KVM x86 (Intel): - Use the root role from kvm_mmu_page to construct EPTPs instead of the current vCPU state, partly as worthwhile cleanup, but mostly to pave the way for tracking per-root TLB flushes, and elide EPT flushes on pCPU migration if the root is clean from a previous flush - Add a few missing nested consistency checks - Rip out support for doing "early" consistency checks via hardware as the functionality hasn't been used in years and is no longer useful in general; replace it with an off-by-default module param to WARN if hardware fails a check that KVM does not perform - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32] on VM-Enter - Misc cleanups - Overhaul the TDX code to address systemic races where KVM (acting on behalf of userspace) could inadvertantly trigger lock contention in the TDX-Module; KVM was either working around these in weird, ugly ways, or was simply oblivious to them (though even Yan's devilish selftests could only break individual VMs, not the host kernel) - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU, if creating said vCPU failed partway through - Fix a few sparse warnings (bad annotation, 0 != NULL) - Use struct_size() to simplify copying TDX capabilities to userspace - Fix a bug where TDX would effectively corrupt user-return MSR values if the TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected Selftests: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM - Forcefully override ARCH from x86_64 to x86 to play nice with specifying ARCH=x86_64 on the command line - Extend a bunch of nested VMX to validate nested SVM as well - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT guest_memfd: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors - Misc cleanups Generic: - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for irqfd cleanup - Fix a goof in the dirty ring documentation - Fix choice of target for directed yield across different calls to kvm_vcpu_on_spin(); the function was always starting from the first vCPU instead of continuing the round-robin search" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits) KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions ...
2025-11-28Merge branch 'for-next/sysreg' into for-next/coreCatalin Marinas
* for-next/sysreg: : arm64 sysreg updates/cleanups arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS KVM: arm64: selftests: Consider all 7 possible levels of cache KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user arm64/sysreg: Add ICH_VMCR_EL2 arm64/sysreg: Move generation of RES0/RES1/UNKN to function arm64/sysreg: Support feature-specific fields with 'Prefix' descriptor arm64/sysreg: Fix checks for incomplete sysreg definitions arm64/sysreg: Replace TCR_EL1 field macros
2025-11-27KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last userBen Horgan
ARM64_FEATURE_FIELD_BITS is set to 4 but not all ID register fields are 4 bits. See for instance ID_AA64SMFR0_EL1. The last user of this define, ARM64_FEATURE_FIELD_BITS, is the set_id_regs selftest. Its logic assumes the fields aren't a single bits; assert that's the case and stop using the define. As there are no more users, ARM64_FEATURE_FIELD_BITS is removed from the arm64 tools sysreg.h header. A separate commit removes this from the kernel version of the header. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-12KVM: selftests: Test for KVM_EXIT_ARM_SEAJiaqi Yan
Test how KVM handles guest SEA when APEI is unable to claim it, and KVM_CAP_ARM_SEA_TO_USER is enabled. The behavior is triggered by consuming recoverable memory error (UER) injected via EINJ. The test asserts two major things: 1. KVM returns to userspace with KVM_EXIT_ARM_SEA exit reason, and has provided expected fault information, e.g. esr, flags, gva, gpa. 2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting SEA to guest and KVM injects expected SEA into the VCPU. Tested on a data center server running Siryn AmpereOne processor that has RAS support. Several things to notice before attempting to run this selftest: - The test relies on EINJ support in both firmware and kernel to inject UER. Otherwise the test will be skipped. - The under-test platform's APEI should be unable to claim the SEA. Otherwise the test will be skipped. - Some platform doesn't support notrigger in EINJ, which may cause APEI and GHES to offline the memory before guest can consume injected UER, and making test unable to trigger SEA. Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Link: https://msgid.link/20251013185903.1372553-3-jiaqiyan@google.com Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-11arm64: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headersThomas Huth
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's standardize now on the __ASSEMBLER__ macro that is provided by the compilers. This is a mostly mechanical patch (done with a simple "sed -i" statement), except for the following files where comments with mis-spelled macros were tweaked manually: arch/arm64/include/asm/stacktrace/frame.h arch/arm64/include/asm/kvm_ptrauth.h arch/arm64/include/asm/debug-monitors.h arch/arm64/include/asm/esr.h arch/arm64/include/asm/scs.h arch/arm64/include/asm/memory.h Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-10-23tools: arm64: Add Cortex-A720AE definitionsKuninori Morimoto
Add cputype definitions for Cortex-A720AE. These will be used for errata detection in subsequent patches. These values can be found in the Cortex-A720AE TRM: https://developer.arm.com/documentation/102828/0001/ ... in Table A-187 Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-08-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly handle 'invariant' system registers for protected VMs - Improved handling of VNCR data aborts, including external aborts - Fixes for handling of FEAT_RAS for NV guests, providing a sane fault context during SEA injection and preventing the use of RASv1p1 fault injection hardware - Ensure that page table destruction when a VM is destroyed gives an opportunity to reschedule - Large fix to KVM's infrastructure for managing guest context loaded on the CPU, addressing issues where the output of AT emulation doesn't get reflected to the guest - Fix AT S12 emulation to actually perform stage-2 translation when necessary - Avoid attempting vLPI irqbypass when GICv4 has been explicitly disabled for a VM - Minor KVM + selftest fixes RISC-V: - Fix pte settings within kvm_riscv_gstage_ioremap() - Fix comments in kvm_riscv_check_vcpu_requests() - Fix stack overrun when setting vlenb via ONE_REG x86: - Use array_index_nospec() to sanitize the target vCPU ID when handling PV IPIs and yields as the ID is guest-controlled. - Drop a superfluous cpumask_empty() check when reclaiming SEV memory, as the common case, by far, is that at least one CPU will have entered the VM, and wbnoinvd_on_cpus_mask() will naturally handle the rare case where the set of have_run_cpus is empty. Selftests (not KVM): - Rename the is_signed_type() macro in kselftest_harness.h to is_signed_var() to fix a collision with linux/overflow.h. The collision generates compiler warnings due to the two macros having different meaning" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits) KVM: arm64: nv: Fix ATS12 handling of single-stage translation KVM: arm64: Remove __vcpu_{read,write}_sys_reg_{from,to}_cpu() KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors KVM: arm64: Simplify sysreg access on exception delivery KVM: arm64: Check for SYSREGS_ON_CPU before accessing the 32bit state RISC-V: KVM: fix stack overrun when loading vlenb RISC-V: KVM: Correct kvm_riscv_check_vcpu_requests() comment RISC-V: KVM: Fix pte settings within kvm_riscv_gstage_ioremap() KVM: arm64: selftests: Sync ID_AA64MMFR3_EL1 in set_id_regs KVM: arm64: Get rid of ARM64_FEATURE_MASK() KVM: arm64: Make ID_AA64PFR1_EL1.RAS_frac writable KVM: arm64: Make ID_AA64PFR0_EL1.RAS writable KVM: arm64: Ignore HCR_EL2.FIEN set by L1 guest's EL2 KVM: arm64: Handle RASv1p1 registers arm64: Add capability denoting FEAT_RASv1p1 KVM: arm64: Reschedule as needed when destroying the stage-2 page-tables KVM: arm64: Split kvm_pgtable_stage2_destroy() selftests: harness: Rename is_signed_type() to avoid collision with overflow.h KVM: SEV: don't check have_run_cpus in sev_writeback_caches() KVM: arm64: Correctly populate FAR_EL2 on nested SEA injection ...
2025-08-21KVM: arm64: Get rid of ARM64_FEATURE_MASK()Marc Zyngier
The ARM64_FEATURE_MASK() macro was a hack introduce whilst the automatic generation of sysreg encoding was introduced, and was too unreliable to be entirely trusted. We are in a better place now, and we could really do without this macro. Get rid of it altogether. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817202158.395078-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-18tools headers: Sync arm64 headers with the kernel sourceNamhyung Kim
To pick up the changes in this cset: efe676a1a7554219 arm64: proton-pack: Add new CPUs 'k' values for branch mitigation e18c09b204e81702 arm64: Add support for HIP09 Spectre-BHB mitigation a9b5bd81b294d30a arm64: cputype: Add MIDR_CORTEX_A76AE 53a52a0ec7680287 arm64: cputype: Add comments about Qualcomm Kryo 5XX and 6XX cores 401c3333bb2396aa arm64: cputype: Add QCOM_CPU_PART_KRYO_3XX_GOLD 86edf6bdcf0571c0 smccc/kvm_guest: Enable errata based on implementation CPUs 0bc9a9e85fcf4ffb KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h But the following two changes cannot be applied since they introduced new build errors in util/arm-spe.c. So it still has the warning after this change. c8c2647e69bedf80 arm64: Make  _midr_in_range_list() an exported function e3121298c7fcaf48 arm64: Modify _midr_range() functions to read MIDR/REVIDR internally Please see tools/include/uapi/README for further details. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> perf build: [WIP] Fix arm-spe build errors Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-03Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: "perf report/top/annotate TUI: - Accept the left arrow key as a Zoom out if done on the first column - Show if source code toggle status in title, to help spotting bugs with the various disassemblers (capstone, llvm, objdump) - Provide feedback on unhandled hotkeys Build: - Better inform when certain features are not available with warnings in the build process and in 'perf version --build-options' or 'perf -vv' perf record: - Improve the --off-cpu code by synthesizing events for switch-out -> switch-in intervals using a BPF program. This can be fine tuned using a --off-cpu-thresh knob perf report: - Add 'tgid' sort key perf mem/c2c: - Add 'op', 'cache', 'snoop', 'dtlb' output fields - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling) perf ftrace: - Use process/session specific trace settings instead of messing with the global ftrace knobs perf trace: - Implement syscall summary in BPF - Support --summary-mode=cgroup - Always print return value for syscalls returning a pid - The rseq and set_robust_list don't return a pid, just -errno perf lock contention: - Symbolize zone->lock using BTF - Add -J/--inject-delay option to estimate impact on application performance by optimization of kernel locking behavior perf stat: - Improve hybrid support for the NMI watchdog warning Symbol resolution: - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust symbols - Improve Rust demangler Hardware tracing: Intel PT: - Fix PEBS-via-PT data_src - Do not default to recording all switch events - Fix pattern matching with python3 on the SQL viewer script arm64: - Fixups for the hip08 hha PMU Vendor events: - Update Intel events/metrics files for alderlake, alderlaken, arrowlake, bonnell, broadwell, broadwellde, broadwellx, cascadelakex, clearwaterforest, elkhartlake, emeraldrapids, grandridge, graniterapids, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep, nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest, skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp, westmereep-sx python support: - Add support for event counts in the python binding, add a counting.py example perf list: - Display the PMU name associated with a perf metric in JSON perf test: - Hybrid improvements for metric value validation test - Fix LBR test by ignoring idle task - Add AMD IBS sw filter ana d'ldlat' tests - Add 'perf trace --summary-mode=cgroup' test - Add tests for the various language symbol demanglers Miscellaneous: - Allow specifying the cpu an event will be tied using '-e event/cpu=N/' - Sync various headers with the kernel sources - Add annotations to use clang's -Wthread-safety and fix some problems it detected - Make dump_stack() use perf's symbol resolution to provide better backtraces - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS (Precision Event-Based Sampling), that adds timing info, the retirement latency of instructions - Various memory allocation (some detected by ASAN) and reference counting fixes - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace PERF_RECORD_COMPRESSED - Skip unsupported event types in perf.data files, don't stop when finding one - Improve lookups using hashmaps and binary searches" * tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits) perf callchain: Always populate the addr_location map when adding IP perf lock contention: Reject more than 10ms delays for safety perf trace: Set errpid to false for rseq and set_robust_list perf symbol: Move demangling code out of symbol-elf.c perf trace: Always print return value for syscalls returning a pid perf script: Print PERF_AUX_FLAG_COLLISION flag perf mem: Show absolute percent in mem_stat output perf mem: Display sort order only if it's available perf mem: Describe overhead calculation in brief perf record: Fix incorrect --user-regs comments Revert "perf thread: Ensure comm_lock held for comm_list" perf test trace_summary: Skip --bpf-summary tests if no libbpf perf test intel-pt: Skip jitdump test if no libelf perf intel-tpebs: Avoid race when evlist is being deleted perf test demangle-java: Don't segv if demangling fails perf symbol: Fix use-after-free in filename__read_build_id perf pmu: Avoid segv for missing name/alias_name in wildcarding perf machine: Factor creating a "live" machine out of dwarf-unwind perf test: Add AMD IBS sw filter test perf mem: Count L2 HITM for c2c statistic ...
2025-05-27perf arm-spe: Add support for SPE Data Source packet on HiSilicon HIP12Yicong Yang
Add data source encoding for HiSilicon HIP12 and coresponding mapping to the perf's memory data source. This will help to synthesize the data and support upper layer tools like perf-mem and perf-c2c. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: CaiJingtao <caijingtao@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Yushan Wang <wangyushan12@huawei.com> Cc: Zeng Tao <prime.zeng@hisilicon.com> Cc: xueshan2@huawei.com Link: https://lore.kernel.org/r/20250425033845.57671-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-06arm64: tools: Resync sysreg.hMarc Zyngier
Perform a bulk resync of tools/arch/arm64/include/asm/sysreg.h. Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-03-03arm64: sysreg: Add layout for ICH_MISR_EL2Marc Zyngier
The ICH_MISR_EL2-related macros are missing a number of status bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03arm64: sysreg: Add layout for ICH_VTR_EL2Marc Zyngier
The ICH_VTR_EL2-related macros are missing a number of config bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. This results in a bit of churn to repaint constants that are now generated with a different format. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03arm64: sysreg: Add layout for ICH_HCR_EL2Marc Zyngier
The ICH_HCR_EL2-related macros are missing a number of control bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. This results in a bit of churn, unfortunately. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-01-12arm64/sysreg/tools: Move TRFCR definitions to sysregJames Clark
Convert TRFCR to automatic generation. Add separate definitions for ELx and EL2 as TRFCR_EL1 doesn't have CX. This also mirrors the previous definition so no code change is required. Also add TRFCR_EL12 which will start to be used in a later commit. Unfortunately, to avoid breaking the Perf build with duplicate definition errors, the tools copy of the sysreg.h header needs to be updated at the same time rather than the usual second commit. This is because the generated version of sysreg (arch/arm64/include/generated/asm/sysreg-defs.h), is currently shared and tools/ does not have its own copy. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250106142446.628923-4-james.clark@linaro.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-12tools: arm64: Update sysreg.h header filesJames Clark
Created with the following: cp include/linux/kasan-tags.h tools/include/linux/ cp arch/arm64/include/asm/sysreg.h tools/arch/arm64/include/asm/ Update the tools copy of sysreg.h so that the next commit to add a new register doesn't have unrelated changes in it. Because the new version of sysreg.h includes kasan-tags.h, that file also now needs to be copied into tools. Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250106142446.628923-3-james.clark@linaro.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-11-14Merge tag 'kvmarm-6.13' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.13, part #1 - Support for stage-1 permission indirection (FEAT_S1PIE) and permission overlays (FEAT_S1POE), including nested virt + the emulated page table walker - Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This call was introduced in PSCIv1.3 as a mechanism to request hibernation, similar to the S4 state in ACPI - Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As part of it, introduce trivial initialization of the host's MPAM context so KVM can use the corresponding traps - PMU support under nested virtualization, honoring the guest hypervisor's trap configuration and event filtering when running a nested guest - Fixes to vgic ITS serialization where stale device/interrupt table entries are not zeroed when the mapping is invalidated by the VM - Avoid emulated MMIO completion if userspace has requested synchronous external abort injection - Various fixes and cleanups affecting pKVM, vCPU initialization, and selftests
2024-10-28tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 924725707d80bc25 ("arm64: cputype: Add Neoverse-N3 definitions") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_NEOVERSE_N3, that probably need changes in arm-spe.c, so probably we need to add it to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c in a previous update to this file: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zx-dffKdGsgkhG96@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-26tools: arm64: Grab a copy of esr.h from kernelOliver Upton
Grab esr.h and brk-imm.h for subsequent use in KVM selftests. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241025203106.3529261-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-10-02tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: db0d8a84348b876d ("arm64: errata: Enable the AC03_CPU_38 workaround for ampere1a") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_AMPERE1A, used in arm-spe.c, so probably we need to add it to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c in a previous update to this file: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: D Scott Phillips <scott@os.amperecomputing.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/ZvtFu7J-Awy2zuEJ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-07tools/include: Sync arm64 headers with the kernel sourcesNamhyung Kim
To pick up changes from: 9ef54a384526 arm64: cputype: Add Cortex-A725 definitions 58d245e03c32 arm64: cputype: Add Cortex-X1C definitions fd2ff5f0b320 arm64: cputype: Add Cortex-X925 definitions add332c40328 arm64: cputype: Add Cortex-A720 definitions be5a6f238700 arm64: cputype: Add Cortex-X3 definitions This should be used to beautify x86 syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-06-04tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 0ce85db6c2141b7f ("arm64: cputype: Add Neoverse-V3 definitions") 02a0a04676fa7796 ("arm64: cputype: Add Cortex-X4 definitions") f4d9d9dcc70b96b5 ("arm64: Add Neoverse-V2 part") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_NEOVERSE_V[23] and MIDR_NEOVERSE_V1 is used in arm-spe.c, so probably we need to add those and perhaps MIDR_CORTEX_X4 to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/Zl8cYk0Tai2fs7aM@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-14Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The most interesting parts are probably the mm changes from Ryan which optimise the creation of the linear mapping at boot and (separately) implement write-protect support for userfaultfd. Outside of our usual directories, the Kbuild-related changes under scripts/ have been acked by Masahiro whilst the drivers/acpi/ parts have been acked by Rafael and the addition of cpumask_any_and_but() has been acked by Yury. ACPI: - Support for the Firmware ACPI Control Structure (FACS) signature feature which is used to reboot out of hibernation on some systems Kbuild: - Support for building Flat Image Tree (FIT) images, where the kernel Image is compressed alongside a set of devicetree blobs Memory management: - Optimisation of our early page-table manipulation for creation of the linear mapping - Support for userfaultfd write protection, which brings along some nice cleanups to our handling of invalid but present ptes - Extend our use of range TLBI invalidation at EL1 Perf and PMUs: - Ensure that the 'pmu->parent' pointer is correctly initialised by PMU drivers - Avoid allocating 'cpumask_t' types on the stack in some PMU drivers - Fix parsing of the CPU PMU "version" field in assembly code, as it doesn't follow the usual architectural rules - Add best-effort unwinding support for USER_STACKTRACE - Minor driver fixes and cleanups Selftests: - Minor cleanups to the arm64 selftests (missing NULL check, unused variable) Miscellaneous: - Add a command-line alias for disabling 32-bit application support - Add part number for Neoverse-V2 CPUs - Minor fixes and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits) arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2 arm64/mm: Add uffd write-protect support arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG arm64/mm: Remove PTE_PROT_NONE bit arm64/mm: generalize PMD_PRESENT_INVALID for all levels arm64: simplify arch_static_branch/_jump function arm64: Add USER_STACKTRACE support arm64: Add the arm64.no32bit_el0 command line option drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset() drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group kselftest: arm64: Add a null pointer check arm64: defer clearing DAIF.D arm64: assembler: update stale comment for disable_step_tsk arm64/sysreg: Update PIE permission encodings kselftest/arm64: Remove unused parameters in abi test perf/arm-spe: Assign parents for event_source device perf/arm-smmuv3: Assign parents for event_source device perf/arm-dsu: Assign parents for event_source device perf/arm-dmc620: Assign parents for event_source device ...
2024-04-28arm64/sysreg: Update PIE permission encodingsShiqi Liu
Fix left shift overflow issue when the parameter idx is greater than or equal to 8 in the calculation of perm in PIRx_ELx_PERM macro. Fix this by modifying the encoding to use a long integer type. Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-11tools/include: Sync arm64 asm/cputype.h with the kernel sourcesNamhyung Kim
To pick up the changes from: fb091ff39479 ("arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata") This should address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240408185520.1550865-10-namhyung@kernel.org
2023-11-22tools headers: Update tools's copy of arm64/asm headersNamhyung Kim
tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-9-namhyung@kernel.org
2023-10-18tools headers arm64: Update sysreg.h with kernel sourcesJing Zhang
The users of sysreg.h (perf, KVM selftests) are now generating the necessary sysreg-defs.h; sync sysreg.h with the kernel sources and fix the KVM selftests that use macros which suffered a rename. Signed-off-by: Jing Zhang <jingzhangos@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231011195740.3349631-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-07-14tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: e910baa9c1efdf76 ("KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch don't affect things that are used in arm-spe.c (things like MIDR_NEOVERSE_N1, etc). Unsure if Apple M2 has SPE (Statistical Profiling Extension) :-) That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-18tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: decb17aeb8fa2148 ("KVM: arm64: vgic: Add Apple M2 cpus to the list of broken SEIS implementations") 07e39e60bbf0ccd5 ("arm64: Add Cortex-715 CPU part definition") 8ec8490a1950efec ("arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro") That addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h' diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Ali Saidi <alisaidi@amazon.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: D Scott Phillips <scott@os.amperecomputing.com> Cc: German Gomez <german.gomez@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/Y8fvEGCGn+227qW0@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 0e5d5ae837c8ce04 ("arm64: Add AMPERE1 to the Spectre-BHB affected list") That addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h' diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: D Scott Phillips <scott@os.amperecomputing.com> https://lore.kernel.org/lkml/Y1fy5GD7ZYvkeufv@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: cae889302ebf5a9b ("KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround") That addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h' diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/lkml/Yq8w7p4omYKNwOij@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 83bea32ac7ed37bb ("arm64: Add part number for Arm Cortex-A78AE") That addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h' diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Chanho Park <chanho61.park@samsung.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26tools arm64: Import cputype.hAli Saidi
Bring-in the kernel's arch/arm64/include/asm/cputype.h into tools/ for arm64 to make use of all the core-type definitions in perf. Replace sysreg.h with the version already imported into tools/. Committer notes: Added an entry to tools/perf/check-headers.sh, so that we get notified when the original file in the kernel sources gets modified. Tester notes: LGTM. I did the testing on both my x86 and Arm64 platforms, thanks for the fixing up. Signed-off-by: Ali Saidi <alisaidi@amazon.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick.Forrington@arm.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220324183323.31414-2-alisaidi@amazon.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-17tools: arm64: Import sysreg.hRaghavendra Rao Ananta
Bring-in the kernel's arch/arm64/include/asm/sysreg.h into tools/ for arm64 to make use of all the standard register definitions in consistence with the kernel. Make use of the register read/write definitions from sysreg.h, instead of the existing definitions. A syntax correction is needed for the files that use write_sysreg() to make it compliant with the new (kernel's) syntax. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> [maz: squashed two commits in order to keep the series bisectable] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211007233439.1826892-3-rananta@google.com Link: https://lore.kernel.org/r/20211007233439.1826892-4-rananta@google.com
2019-04-11tools: add smp_* barrier variants to include infrastructureDaniel Borkmann
Add the definition for smp_rmb(), smp_wmb(), and smp_mb() to the tools include infrastructure: this patch adds the implementation for x86-64 and arm64, and have it fall back as currently is for other archs which do not have it implemented at this point. The x86-64 one uses lock + add combination for smp_mb() with address below red zone. This is on top of 09d62154f613 ("tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers"), which didn't touch smp_* barrier implementations. Magnus recently rightfully reported however that the latter on x86-64 still wrongly falls back to sfence, lfence and mfence respectively, thus fix that for applications under tools making use of these to avoid such ugly surprises. The main header under tools (include/asm/barrier.h) will in that case not select the fallback implementation. Reported-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-01tools headers barrier: Fix arm64 tools build failure wrt ↵Will Deacon
smp_load_{acquire,release} Cheers for reporting this. I managed to reproduce the build failure with gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1). The code in question is the arm64 versions of smp_load_acquire() and smp_store_release(). Unlike other architectures, these are not built around READ_ONCE() and WRITE_ONCE() since we have instructions we can use instead of fences. Bringing our macros up-to-date with those (i.e. tweaking the union initialisation and using the special "uXX_alias_t" types) appears to fix the issue for me. Committer notes: Testing it in the systems previously failing: # time dm android-ndk:r12b-arm \ android-ndk:r15c-arm \ debian:experimental-x-arm64 \ ubuntu:14.04.4-x-linaro-arm64 \ ubuntu:16.04-x-arm \ ubuntu:16.04-x-arm64 \ ubuntu:18.04-x-arm \ ubuntu:18.04-x-arm64 1 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 2 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 3 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0 4 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0 5 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 6 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 7 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0 8 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0 Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181031174408.GA27871@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-19tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpersDaniel Borkmann
Currently, on x86-64, perf uses LFENCE and MFENCE (rmb() and mb(), respectively) when processing events from the perf ring buffer which is unnecessarily expensive as we can do more lightweight in particular given this is critical fast-path in perf. According to Peter rmb()/mb() were added back then via a94d342b9cb0 ("tools/perf: Add required memory barriers") at a time where kernel still supported chips that needed it, but nowadays support for these has been ditched completely, therefore we can fix them up as well. While for x86-64, replacing rmb() and mb() with smp_*() variants would result in just a compiler barrier for the former and LOCK + ADD for the latter (__sync_synchronize() uses slower MFENCE by the way), Peter suggested we can use smp_{load_acquire,store_release}() instead for architectures where its implementation doesn't resolve in slower smp_mb(). Thus, e.g. in x86-64 we would be able to avoid CPU barrier entirely due to TSO. For architectures where the latter needs to use smp_mb() e.g. on arm, we stick to cheaper smp_rmb() variant for fetching the head. This work adds helpers ring_buffer_read_head() and ring_buffer_write_tail() for tools infrastructure that either switches to smp_load_acquire() for architectures where it is cheaper or uses READ_ONCE() + smp_rmb() barrier for those where it's not in order to fetch the data_head from the perf control page, and it uses smp_store_release() to write the data_tail. Latter is smp_mb() + WRITE_ONCE() combination or a cheaper variant if architecture allows for it. Those that rely on smp_rmb() and smp_mb() can further improve performance in a follow up step by implementing the two under tools/arch/*/include/asm/barrier.h such that they don't have to fallback to rmb() and mb() in tools/include/asm/barrier.h. Switch perf to use ring_buffer_read_head() and ring_buffer_write_tail() so it can make use of the optimizations. Later, we convert libbpf as well to use the same helpers. Side note [0]: the topic has been raised of whether one could simply use the C11 gcc builtins [1] for the smp_load_acquire() and smp_store_release() instead: __atomic_load_n(ptr, __ATOMIC_ACQUIRE); __atomic_store_n(ptr, val, __ATOMIC_RELEASE); Kernel and (presumably) tooling shipped along with the kernel has a minimum requirement of being able to build with gcc-4.6 and the latter does not have C11 builtins. While generally the C11 memory models don't align with the kernel's, the C11 load-acquire and store-release alone /could/ suffice, however. Issue is that this is implementation dependent on how the load-acquire and store-release is done by the compiler and the mapping of supported compilers must align to be compatible with the kernel's implementation, and thus needs to be verified/tracked on a case by case basis whether they match (unless an architecture uses them also from kernel side). The implementations for smp_load_acquire() and smp_store_release() in this patch have been adapted from the kernel side ones to have a concrete and compatible mapping in place. [0] http://patchwork.ozlabs.org/patch/985422/ [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08perf tools: Move arm(64) barrier.h stuff to ↵Arnaldo Carvalho de Melo
tools/arch/arm*/include/asm/barrier.h We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cgfhreaejd7ohitdjccu9k2o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>