path: root/tools/perf/util/arm-spe-decoder
Age | Commit message | Author
2025-11-18 | perf arm_spe: Expose SIMD information in other operations | Leo Yan
The other operations contain SME data processing, ASE (Advanced SIMD) and floating-point operations. Expose this information in the records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report GCS in record | Leo Yan
Report GCS related info in records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report memset and memcpy in records | Leo Yan
Expose memset and memcpy related info in records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report associated info for SVE / SME operations | Leo Yan
SVE / SME operations can be predicated or can be gather load / scatter store operations. Save the relevant info into the record. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report extended memory operations in records | Leo Yan
Extended memory operations include atomic (AT), acquire/release (AR), and exclusive (EXCL) operations. Save the relevant information in the records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report MTE allocation tag in record | Leo Yan
Save MTE tag info in memory record. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Report register access in record | Leo Yan
Record register access info for load / store operations. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Introduce data processing macro for SVE operations | Leo Yan
Introduce the ARM_SPE_OP_DP (data processing) macro as associated information for SVE operations. For SVE register access, only ARM_SPE_OP_SVE is set; for SVE data processing, both ARM_SPE_OP_SVE and ARM_SPE_OP_DP are set together. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
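The flag combination described above can be sketched as follows. This is a minimal illustration, not the decoder's actual code: only the macro names ARM_SPE_OP_SVE and ARM_SPE_OP_DP come from the commit message, while the bit positions and the helper names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are hypothetical; only the macro names appear in the log. */
#define ARM_SPE_OP_SVE (1u << 8)
#define ARM_SPE_OP_DP  (1u << 9)

/* SVE register access: ARM_SPE_OP_SVE is set and ARM_SPE_OP_DP is clear. */
static bool is_sve_reg_access(uint32_t op)
{
	return (op & ARM_SPE_OP_SVE) && !(op & ARM_SPE_OP_DP);
}

/* SVE data processing: both flags are set together. */
static bool is_sve_data_processing(uint32_t op)
{
	const uint32_t mask = ARM_SPE_OP_SVE | ARM_SPE_OP_DP;

	return (op & mask) == mask;
}
```

Keeping DP as a modifier on top of SVE means a consumer can test for "any SVE operation" with a single bit and refine only when it needs to.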
2025-11-18 | perf arm_spe: Consolidate operation types | Leo Yan
Consolidate operation types as follows:
(a) Extract the second-level types into separate enums.
(b) Classify the second-level types for memory and SIMD operations by module. E.g., an operation may relate to general registers, SIMD/FP, SVE, etc.
(c) Let the associated information carry the details. E.g., whether an operation is a load or a store, whether it is an atomic operation, etc.
Start the enum items for the second-level types from 8 to accommodate more entries within a 32-bit integer. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Remove unused operation types | Leo Yan
Remove unused SVE operation types. These operations will be reintroduced in subsequent refactoring, but with a different format. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Decode SME data processing packet | Leo Yan
For SME data processing, decode its Effective vector length or Tile Size (ETS), and print out whether it is a floating-point operation.
After:
. 00000000: 49 00 SME-OTHER ETS 1024 FP
. 00000002: b2 18 3c d7 83 00 80 ff ff VA 0xffff800083d73c18
. 0000000b: 9a 00 00 LAT 0 XLAT
. 0000000e: 43 00 DATA-SOURCE 0
Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Decode ASE and FP fields in other operation | Leo Yan
Add a check for the other operation, which prevents incorrect classification. Parse the ASE and FP fields.
After:
. 0000002f: 48 06 OTHER ASE FP INSN-OTHER
. 00000031: b2 08 80 48 01 08 00 ff ff VA 0xffff000801488008
. 0000003a: 9a 00 00 LAT 0 XLAT
. 0000003d: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro | Leo Yan
Rename the macro to SPE_OP_PKT_OTHER_SUBCLASS_SVE to unify naming. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Decode GCS operation | Leo Yan
Decode a load or store from a GCS operation and the associated "common" field.
After:
. 00000000: 49 44 LD GCS COMM
. 00000002: b2 18 3c d7 83 00 80 ff ff VA 0xffff800083d73c18
. 0000000b: 9a 00 00 LAT 0 XLAT
. 0000000e: 43 00 DATA-SOURCE 0
Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Unify operation naming | Leo Yan
Rename the extended subclass and the SVE/SME register access subclass so that the naming is consistent across all subclasses. Add a log "SVE-SME-REG" for the SVE/SME register access, which is easier to parse. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18 | perf arm_spe: Fix memset subclass in operation | Leo Yan
The operation subclass is extracted from bits [7..1] of the payload. Since bit [0] is not parsed, there is no chance to match the memset type (0x25). As a result, the memset payload is never parsed successfully. Instead of extracting a unified bit field, change to extract the specific bits for each operation subclass. Fixes: 34fb60400e32 ("perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
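The failure mode is easy to reproduce in isolation. The sketch below assumes the old code masked bits [7:1] in place (no shift) before comparing, and that the fixed check looks at all eight bits; the helper names are hypothetical, and only the value 0x25 comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEMSET_SUBCLASS 0x25u /* memset type quoted in the commit message */

/* Old approach (assumed): one unified field taken from bits [7:1]. */
static bool old_is_memset(uint8_t payload)
{
	/* Bit [0] is cleared by the mask, so the result is always even. */
	return (payload & 0xfeu) == MEMSET_SUBCLASS;
}

/* Fixed approach (assumed mask): extract the bits this subclass needs. */
static bool new_is_memset(uint8_t payload)
{
	return (payload & 0xffu) == MEMSET_SUBCLASS;
}
```

Because 0x25 has bit [0] set, no payload can satisfy the old comparison, which is why a per-subclass bit extraction is needed.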
2025-11-13 | perf build: Remove NO_AUXTRACE build option | Ian Rogers
The NO_AUXTRACE build option was used when the __get_cpuid feature test failed or if it was provided on the command line. The option no longer avoids a dependency on a library and so having the option is just adding complexity to the code base. Remove the option CONFIG_AUXTRACE from Build files and HAVE_AUXTRACE_SUPPORT by assuming it is always defined. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-09-19 | perf arm_spe: Set HITM flag | Leo Yan
Since FEAT_SPEv1p4, Arm SPE provides two extra events: "Cache data modified" and "Data snooped". Set the snoop mode as follows:
- If both the "Cache data modified" event and the "Data snooped" event are set, which indicates a load operation that snooped from an outside cache and hit a modified copy, set the HITM flag so that false sharing can be inspected.
- If the snooped event bit is not set, and the snooped event is supported by the hardware, set NONE mode (no snoop operation).
- If the snooped event bit is not set, and the event is not supported or the event information is absent from the metadata, set NA mode (not available).
Don't set any mode for the "Cache data modified" event alone, as it hits a local modified copy. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf arm_spe: Fill memory levels for FEAT_SPEv1p4 | Leo Yan
Starting with FEAT_SPEv1p4, Arm SPE provides information on Level 2 data cache and recently fetched events. This patch fills in the memory levels for these new events. The recently fetched events are matched to the line-fill buffer (LFB). In general, the latency for accessing the LFB is higher than accessing the L1 cache but lower than accessing the L2 cache. Thus, it is placed between the L1 cache and the L2 cache in the memory hierarchy information. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf arm_spe: Decode event types for new features | Leo Yan
Decode new event types introduced by FEAT_SPEv1p4, FEAT_SPE_SME and FEAT_SPE_SME. The printed event names don't strictly follow the naming in the Arm ARM. For example, the "Cache data modified" event is shown as "HITM", and the "Data snooped" event is printed as "SNOOPED". Shorter names are easier to read while preserving core meanings. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf arm_spe: Directly propagate raw event | Leo Yan
Two sets of event bits are defined: one for generating samples and another for the raw event bits used in the backend decoder. Reduce the redundancy by using the raw event bits directly in the frontend code. To avoid overflow issues, change the type of the event variable from enum to u64. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27 | perf arm-spe: Add support for SPE Data Source packet on HiSilicon HIP12 | Yicong Yang
Add data source encoding for HiSilicon HIP12 and the corresponding mapping to perf's memory data source. This will help to synthesize the data and support upper layer tools like perf-mem and perf-c2c. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: CaiJingtao <caijingtao@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Yushan Wang <wangyushan12@huawei.com> Cc: Zeng Tao <prime.zeng@hisilicon.com> Cc: xueshan2@huawei.com Link: https://lore.kernel.org/r/20250425033845.57671-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-03-05 | perf arm-spe: Support previous branch target (PBT) address | Leo Yan
When FEAT_SPE_PBT is implemented, the previous branch target address (PBT) before the sampled operation is recorded. This commit first introduces a 'prev_br_tgt' field in the record for saving the PBT address in the decoder. If the current operation is a branch instruction, combining it with the PBT creates a chain of two consecutive branches. The branch stack stores branches in descending order, meaning a newer branch is stored in a lower entry in the stack: Arm SPE stores the latest branch in the first entry of the branch stack, and the previous branch coming from the PBT is stored in the second entry. Otherwise, if the current operation is not a branch, only the branch from the PBT is saved. PBT lacks associated information such as branch source address, branch type, and events. The branch entry fills the corresponding fields with zeros and only sets its target address.
After:
perf script -f --itrace=bl -F flags,addr,brstack
jcc ffff800080187914 0xffff8000801878fc/0xffff800080187914/P/-/-/1/COND/- 0x0/0xffff8000801878f8/-/-/-/0//-
jcc ffff8000802d12d8 0xffff8000802d12f8/0xffff8000802d12d8/P/-/-/1/COND/- 0x0/0xffff8000802d12ec/-/-/-/0//-
jcc ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/- 0x0/0xffff8000813fe200/-/-/-/0//-
jcc ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/- 0x0/0xffff8000813fe200/-/-/-/0//-
jmp ffff800081410980 0xffff800081419108/0xffff800081410980/P/-/-/1//- 0x0/0xffff800081419104/-/-/-/0//-
return ffff80008036e064 0xffff80008141ba84/0xffff80008036e064/P/-/-/1/RET/- 0x0/0xffff80008141ba60/-/-/-/0//-
jcc ffff8000803d54f0 0xffff8000803d54e8/0xffff8000803d54f0/P/-/-/1/COND/- 0x0/0xffff8000803d54e0/-/-/-/0//-
jmp ffff80008015e468 0xffff8000803d46dc/0xffff80008015e468/P/-/-/1//- 0x0/0xffff8000803d46c8/-/-/-/0//-
jmp ffff8000806e2d50 0xffff80008040f710/0xffff8000806e2d50/P/-/-/1//- 0x0/0xffff80008040f6e8/-/-/-/0//-
jcc ffff800080721704 0xffff8000807216b4/0xffff800080721704/P/-/-/1/COND/- 0x0/0xffff8000807216ac/-/-/-/0//-
Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250304111240.3378214-13-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
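The ordering rule for the two branch stack entries can be sketched as below. This is an illustration under stated assumptions rather than the perf implementation: struct br_entry and fill_branch_stack() are hypothetical names, and a zero prev_br_tgt is assumed to mean that no PBT address was recorded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified branch entry: source and target only. */
struct br_entry {
	uint64_t from;
	uint64_t to;
};

/*
 * Fill up to two entries, newest first. Returns the number of entries
 * written. A zero prev_br_tgt is treated as "no PBT" (an assumption).
 */
static int fill_branch_stack(struct br_entry out[2], bool is_branch,
			     uint64_t from, uint64_t to, uint64_t prev_br_tgt)
{
	int n = 0;

	if (is_branch) {
		/* The sampled branch is the latest one: first entry. */
		out[n].from = from;
		out[n].to = to;
		n++;
	}
	if (prev_br_tgt) {
		/* PBT carries only a target; the source is filled with zero. */
		out[n].from = 0;
		out[n].to = prev_br_tgt;
		n++;
	}
	return n;
}
```

With the first sample of the output above, the stack would hold 0xffff8000801878fc/0xffff800080187914 followed by 0x0/0xffff8000801878f8.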
2025-03-05 | perf arm-spe: Fill branch operations and events to record | Leo Yan
The newly added branch operations and events are filled into the record; the information will be consumed when synthesizing samples. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250304111240.3378214-10-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 | perf arm-spe: Decode transactional event | Leo Yan
The bit[16] in an event payload indicates an operation is in transactional state. Decode the bit. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250304111240.3378214-9-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
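Testing that bit amounts to a single shift and mask. In the sketch below the bit position is taken from the commit message, while the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position from the commit message; the names are hypothetical. */
#define EV_TRANSACTIONAL_SHIFT 16

/* Report whether an event payload marks a transactional operation. */
static bool ev_is_transactional(uint64_t payload)
{
	return (payload >> EV_TRANSACTIONAL_SHIFT) & 1;
}
```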
2025-03-05 | perf arm-spe: Extend branch operations | Leo Yan
In the Arm ARM (ARM DDI 0487, L.a), section "D18.2.7 Operation Type packet", the branch subclass is extended with Call Return (CR) and Guarded control stack data access (GCS). This commit adds support for CR and GCS operations. The IND (indirect) operation is defined only in bit [1], so its macro is updated accordingly. Move the COND (Conditional) macro into the same group with other operations for better maintenance. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250304111240.3378214-8-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09 | perf arm-spe: Add support for SPE Data Source packet on AmpereOne | Ilkka Koskinen
Decode SPE Data Source packets on AmpereOne. The field is IMPDEF. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241108202946.16835-3-ilkka@os.amperecomputing.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-14 | perf arm-spe: Rename the common data source encoding | Leo Yan
The Neoverse CPUs follow the common data source encoding, and other CPU variants can share the same format. Rename the CPU list and data source definitions as common data source names. This change prepares for appending more CPU variants. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-3-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-02 | move asm/unaligned.h to linux/unaligned.h | Al Viro
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-06-26 | perf util: Make util its own library | Ian Rogers
Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Nick Terrell <terrelln@fb.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrei Vagin <avagin@google.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Guo Ren <guoren@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: John Garry <john.g.garry@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
2024-05-28 | perf arm-spe: Unaligned pointer work around | Ian Rogers
Use get_unaligned_leXX instead of leXX_to_cpu to handle unaligned pointers. Such pointers occur with libFuzzer testing. A similar change for intel-pt was done in: https://lore.kernel.org/r/20231005190451.175568-6-adrian.hunter@intel.com Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240514052402.3031871-1-irogers@google.com
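The idea behind the change can be illustrated with a portable stand-in. The function below is a hypothetical helper written for this example, not the kernel's get_unaligned_le32(); it shows the general technique of copying bytes out before assembling the value, so the pointer is never dereferenced as a wider type.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an unaligned little-endian 32-bit load.
 * memcpy() avoids dereferencing a misaligned uint32_t pointer, and the
 * shifts make the result independent of host byte order.
 */
static uint32_t load_le32_unaligned(const void *p)
{
	uint8_t b[4];

	memcpy(b, p, sizeof(b));
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

Reading from an odd offset, such as a payload that starts one byte after a packet header, is then well defined, which is the situation a fuzzer tends to produce.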
2023-06-21 | perf arm-spe: Fix a dangling Documentation/arm64 reference | Jonathan Corbet
The arm64 documentation has moved under Documentation/arch/. Fix up a dangling reference to match. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-04 | perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store | Rob Herring
Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging Extension (MTE) and memory operations (MOPS). The memory operations are memcpy and memset. Add support for decoding these new subclasses in the raw decoding. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230327162057.4057188-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20 | perf arm-spe: Refactor arm-spe to support operation packet type | German Gomez
Extend the decoder of Arm SPE records to support more fields from the operation packet type. Not all fields are decoded by this commit, only those needed to support the SVE load/store/other operations use case. Suggested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman.Khandual@arm.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 | perf arm-spe: Add raw decoding for SPEv1.2 previous branch address | Rob Herring
Arm SPEv1.2 adds a new optional address packet type: previous branch target. The recorded address is the target virtual address of the most recently taken branch in program order. Add support for decoding the address packet in raw dumps. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230203162401.132931-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 | perf arm-spe: Only warn once for each unsupported address packet | Rob Herring
Unknown address packet indexes are not an error as the Arm architecture can (and has with SPEv1.2) define new ones and implementation defined ones are also allowed. The error message for every occurrence of the packet is needlessly noisy as well. Change the message to print just once for each unknown index. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230127205546.667740-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
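One way to print a message only once per unknown index is a small bitmap of indexes already reported. The sketch below is an assumption about how this could be done, not the code from the commit: the function name, the message text, and the choice of an 8-bit index space are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Warn the first time a given index is seen and stay silent afterwards.
 * Returns true when a warning was printed. Assumes an 8-bit index space,
 * tracked in a 256-bit bitmap.
 */
static bool warn_unknown_addr_index_once(uint8_t index)
{
	static uint64_t seen[4];
	const uint64_t bit = 1ull << (index & 63);

	if (seen[index >> 6] & bit)
		return false;

	seen[index >> 6] |= bit;
	fprintf(stderr, "ignoring unsupported address packet index: 0x%x\n",
		(unsigned int)index);
	return true;
}
```

The packet is still skipped on every occurrence; only the diagnostic is rate limited, which keeps the raw dump readable on hardware that emits newer or implementation defined indexes.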
2022-08-11 | perf arm-spe: Use SPE data source for neoverse cores | Ali Saidi
When synthesizing data from SPE, augment the type with source information for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use the same encoding. I can't find encoding information for any other SPE implementations to unify their choices with Arm's thus that is left for future work. This change populates the mem_lvl_num for Neoverse cores as well as the deprecated mem_lvl namespace. Reviewed-by: German Gomez <german.gomez@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ali Saidi <alisaidi@amazon.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-4-leo.yan@linaro.org Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-16 | perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT | Namhyung Kim
Use total latency info in the SPE counter packet as sample weight so that we can see it in local_weight and (global) weight sort keys. Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well but I'm not sure which latency it matches. So just adding total latency first. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 | perf arm-spe: Save context ID in record | German Gomez
This patch is to save context ID in record, this will be used to set TID for samples. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20211111133625.193568-4-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 | perf tools: Use __BYTE_ORDER__ | Ilya Leoshkevich
Switch from the libc-defined __BYTE_ORDER to the compiler-defined __BYTE_ORDER__ in order to make endianness detection more robust, like it was done for libbpf. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20211104132311.984703-1-iii@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
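A compile-time check based on the compiler-defined macro looks roughly like this. The macros __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ are predefined by GCC and Clang; the function name is hypothetical and exists only to make the check observable.

```c
#include <stdbool.h>

/*
 * Endianness detection using the compiler-provided macros, which need no
 * libc header. host_is_little_endian() is a name made up for this sketch.
 */
static bool host_is_little_endian(void)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return true;
#else
	return false;
#endif
}
```

Since the macros come from the compiler itself, the result does not depend on which libc headers happen to be included before the check.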
2021-04-07 | perf arm-spe: Avoid potential buffer overrun | Ian Rogers
SPE extended headers are > 1 byte so ensure the buffer contains at least this before reading. This issue was detected by fuzzing. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20210407153955.317215-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
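The kind of length check described here can be sketched as follows. The extended header pattern (top six bits equal to 0b001000) is stated as an assumption, and the function name and return convention are invented for this example rather than taken from the decoder.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_NEED_MORE_BYTES (-1)

/*
 * Return the header length in bytes, or PKT_NEED_MORE_BYTES when the
 * buffer is too short to tell. Assumes an extended header is marked by
 * the first byte matching 0b001000xx.
 */
static int spe_header_len(const uint8_t *buf, size_t len)
{
	if (len < 1)
		return PKT_NEED_MORE_BYTES;

	if ((buf[0] & 0xfc) == 0x20) {
		/* Extended header: the second byte must exist before use. */
		if (len < 2)
			return PKT_NEED_MORE_BYTES;
		return 2;
	}
	return 1;
}
```

The important part is that the length test happens before buf[1] is ever read, so a truncated trace ends decoding cleanly instead of reading past the buffer.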
2021-02-12 | perf arm-spe: Store operation type in packet | Leo Yan
This patch is to store operation type in packet structure. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20210211133856.2137-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 | perf arm-spe: Store memory address in packet | Leo Yan
This patch is to store virtual and physical memory addresses in packet, which will be used for memory samples. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210211133856.2137-2-james.clark@arm.com Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm-spe: Add support for ARMv8.3-SPEWei Li
Add support for the Armv8.3 extension for SPE. It adds an alignment field in the Events packet, and it supports the Scalable Vector Extension (SVE) in the Operation Type packet and the Events packet with two additions: - The vector length for SVE operations in the Operation Type packet; - The incomplete predicate and empty predicate fields in the Events packet. Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20201119152441.6972-17-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm_spe: Decode memory tagging propertiesAndre Przywara
When SPE records a physical address, it can additionally tag the event with information from the Memory Tagging architecture extension. Decode the two additional fields in the SPE event payload. [leoy: Refined patch to use predefined macros] Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-16-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm-spe: Add more sub classes for operation packetLeo Yan
For the operation type packet payload with the load/store class, the decoder is missing support for these sub classes: - A load/store targeting the general-purpose registers; - A load/store targeting unspecified registers; - The ARMv8.4 nested virtualisation extension can redirect system register accesses to a memory page controlled by the hypervisor. The SPE profiling feature in newer implementations can tag those memory accesses accordingly. Add the bit patterns describing these load/store sub classes, so that the perf tool can decode them properly. Inspired by Andre Przywara, refined the commit log and code for a clearer description. Co-developed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-15-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm-spe: Refactor operation packet handlingLeo Yan
Defines macros for operation packet header and formats (support sub classes for 'other', 'branch', 'load and store', etc). Uses these macros for operation packet decoding and dumping. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-14-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm-spe: Add new function arm_spe_pkt_desc_op_type()Leo Yan
The operation type packet is complex and contains sub classes, and its parsing flow causes deep indentation. For better readability, this patch introduces a new function arm_spe_pkt_desc_op_type() which is used for operation type parsing. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-13-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-26perf arm-spe: Remove size condition checking for eventsLeo Yan
In the Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.6 Events packet" describes each event bit as valid only with a specific payload size. For example, for the Last Level cache access event, the bit is defined as: E[8], byte 1 bit [0], when SZ == 0b01, when SZ == 0b10, or when SZ == 0b11. This requires the payload size to be at least 2 bytes; when byte 1 (counting from 0) is valid, E[8] (bit 0 in byte 1) can be used for the LLC access event type. For safety, the code first checks the payload size and continues to parse the event type only if the size requirement is met. However, the function arm_spe_get_payload() uses a cast when reading the payload, so any bytes beyond the valid size have already been set to zero. For this reason, there is no need to check the payload size afterwards when parsing events, so this patch removes the payload size conditions. Suggested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-12-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
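The reasoning in this commit can be illustrated with a small sketch. It is not the code of arm_spe_get_payload(): the names are hypothetical, a little-endian payload is assumed, and a byte-by-byte loop stands in for the cast the commit message mentions. What it shows is that once a short payload is widened into a 64-bit value, the bits above the valid size are zero, so testing E[8] needs no separate size check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* E[8] is byte 1, bit 0 of the Events payload, as quoted in the commit message. */
#define SKETCH_EV_LLC_ACCESS_BIT 8

/*
 * Widen a payload of 1, 2, 4 or 8 bytes into a 64-bit value.
 * Assumption for this sketch: the payload bytes are little-endian.
 * Bytes beyond `size` are never read, so the upper bits stay zero.
 */
static uint64_t sketch_get_payload(const unsigned char *buf, size_t size)
{
	uint64_t payload = 0;
	size_t i;

	assert(size == 1 || size == 2 || size == 4 || size == 8);

	for (i = 0; i < size; i++)
		payload |= (uint64_t)buf[i] << (8 * i);

	return payload;
}

/* Test the LLC access event bit without looking at the payload size. */
static int sketch_is_llc_access(uint64_t payload)
{
	return (int)((payload >> SKETCH_EV_LLC_ACCESS_BIT) & 1);
}
```

With a one-byte payload, bit 8 of the widened value is always zero, so sketch_is_llc_access() reports no LLC access event, which matches what an explicit size check would have concluded.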
2020-11-26perf arm-spe: Refactor event type handlingLeo Yan
Move the enums of event types to arm-spe-pkt-decoder.h, thus function arm_spe_pkt_desc_event() can use them for bitmasks. Suggested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201119152441.6972-11-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>