path: root/tools/perf/pmu-events/arch/common
Age | Commit message | Author
2025-11-11 | perf tool_pmu: Make core_wide and target_cpu json events | Ian Rogers
For the sake of better documentation, add core_wide and target_cpu to the tool.json. When the values of system_wide and user_requested_cpu_list are unknown, use the values from the global stat_config.

Example output showing how '-a' modifies the values in `perf stat`:
```
$ perf stat -e core_wide,target_cpu true

 Performance counter stats for 'true':

                 0      core_wide
                 0      target_cpu

       0.000993787 seconds time elapsed

       0.001128000 seconds user
       0.000000000 seconds sys

$ perf stat -e core_wide,target_cpu -a true

 Performance counter stats for 'system wide':

                 1      core_wide
                 1      target_cpu

       0.002271723 seconds time elapsed

$ perf list
...
  tool:
    core_wide
         [1 if not SMT, if SMT are events being gathered on all SMT threads
          1 otherwise 0. Unit: tool]
...
    target_cpu
         [1 if CPUs being analyzed, 0 if threads/processes. Unit: tool]
...
```
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
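A minimal sketch of what the two tool.json entries could look like. The descriptions are lifted from the perf list output above; the Unit/EventName/BriefDescription keys are assumed from existing jevents conventions rather than quoted from the patch, and the real files are plain JSON, so the // comments here are exposition only:
```
[
  {
    "Unit": "tool",            // routes the event to the tool PMU
    "EventName": "core_wide",
    "BriefDescription": "1 if not SMT, if SMT are events being gathered on all SMT threads 1 otherwise 0"
  },
  {
    "Unit": "tool",
    "EventName": "target_cpu",
    "BriefDescription": "1 if CPUs being analyzed, 0 if threads/processes"
  }
]
```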
2025-11-11 | perf stat: Add detail -d,-dd,-ddd metrics | Ian Rogers
Add metrics for the stat-shadow -d, -dd and -ddd events and hard-coded metrics. Remove the events, as these now come from the metrics.

Following this change a detailed perf stat output looks like:
```
$ perf stat -a -ddd -- sleep 1

 Performance counter stats for 'system wide':

            21,089      context-switches                #      nan cs/sec  cs_per_second
                        TopdownL1 (cpu_core)            #     14.1 %  tma_bad_speculation
                                                        #     27.3 %  tma_frontend_bound       (30.56%)
                        TopdownL1 (cpu_core)            #     31.5 %  tma_backend_bound
                                                        #     27.2 %  tma_retiring             (30.56%)
             6,302      page-faults                     #      nan faults/sec  page_faults_per_second
       928,495,163      cpu_atom/cpu-cycles/            #      nan GHz  cycles_frequency       (28.41%)
     1,841,409,834      cpu_core/cpu-cycles/            #      nan GHz  cycles_frequency       (38.51%)
                                                        #     14.5 %  tma_bad_speculation
                                                        #     16.0 %  tma_retiring             (28.41%)
                                                        #     36.8 %  tma_frontend_bound       (35.57%)
       100,859,118      cpu_atom/branches/              #      nan M/sec  branch_frequency     (42.73%)
       572,657,734      cpu_core/branches/              #      nan M/sec  branch_frequency     (54.43%)
             1,527      cpu-migrations                  #      nan migrations/sec  migrations_per_second
                                                        #     32.7 %  tma_backend_bound        (42.73%)
              0.00 msec cpu-clock                       #    0.000 CPUs utilized
                                                        #      0.0 CPUs  CPUs_utilized
       498,668,509      cpu_atom/instructions/          #     0.57  insn per cycle
                                                        #      0.6 instructions  insn_per_cycle (42.97%)
     3,281,762,225      cpu_core/instructions/          #     1.84  insn per cycle
                                                        #      1.8 instructions  insn_per_cycle (62.20%)
         4,919,511      cpu_atom/branch-misses/         #    5.43% of all branches
                                                        #      5.4 %  branch_miss_rate         (35.80%)
         7,431,776      cpu_core/branch-misses/         #    1.39% of all branches
                                                        #      1.4 %  branch_miss_rate         (62.20%)
         2,517,007      cpu_atom/LLC-loads/             #      0.1 %  llc_miss_rate            (28.62%)
         3,931,318      cpu_core/LLC-loads/             #     40.4 %  llc_miss_rate            (45.98%)
        14,918,674      cpu_core/L1-dcache-load-misses/ #    2.25% of all L1-dcache accesses
                                                        #      nan %  l1d_miss_rate            (37.80%)
        27,067,264      cpu_atom/L1-icache-load-misses/ #   15.92% of all L1-icache accesses
                                                        #     15.9 %  l1i_miss_rate            (21.47%)
       116,848,994      cpu_atom/dTLB-loads/            #      0.8 %  dtlb_miss_rate           (21.47%)
       764,870,407      cpu_core/dTLB-loads/            #      0.1 %  dtlb_miss_rate           (15.12%)

       1.006181526 seconds time elapsed
```
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
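Each of the -d/-dd/-ddd metrics shown above (llc_miss_rate, l1d_miss_rate and so on) would be expressed as a json metric roughly like the following hypothetical entry. The keys follow jevents conventions, but this expression, group name and ScaleUnit are illustrative assumptions, not quotes from the patch, and // comments are not valid in the real files:
```
{
  "MetricName": "llc_miss_rate",
  "MetricGroup": "Default",                     // hypothetical; how -d levels are gated is assumed
  "MetricExpr": "LLC-load-misses / LLC-loads",  // illustrative shadow-style ratio
  "ScaleUnit": "100%",                          // display the ratio as a percentage
  "BriefDescription": "Last level cache read miss rate"
}
```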
2025-11-11 | perf jevents: Add metric DefaultShowEvents | Ian Rogers
Some Default group metrics require their events to be shown for consistency with perf's previous behavior. Add a flag to indicate when this is the case and use it in stat-display. As the events now come from Default metrics, remove the default hardware and software events from perf stat.

Following this change the default perf stat output on an alderlake looks like:
```
$ perf stat -a -- sleep 1

 Performance counter stats for 'system wide':

            20,550      context-switches                #      nan cs/sec  cs_per_second
                        TopdownL1 (cpu_core)            #      9.0 %  tma_bad_speculation
                                                        #     28.1 %  tma_frontend_bound
                        TopdownL1 (cpu_core)            #     29.2 %  tma_backend_bound
                                                        #     33.7 %  tma_retiring
             6,685      page-faults                     #      nan faults/sec  page_faults_per_second
       790,091,064      cpu_atom/cpu-cycles/            #      nan GHz  cycles_frequency       (49.83%)
     2,563,918,366      cpu_core/cpu-cycles/            #      nan GHz  cycles_frequency
                                                        #     12.3 %  tma_bad_speculation
                                                        #     14.5 %  tma_retiring             (50.20%)
                                                        #     33.8 %  tma_frontend_bound       (50.24%)
        76,390,322      cpu_atom/branches/              #      nan M/sec  branch_frequency     (60.20%)
     1,015,173,047      cpu_core/branches/              #      nan M/sec  branch_frequency
             1,325      cpu-migrations                  #      nan migrations/sec  migrations_per_second
                                                        #     39.3 %  tma_backend_bound        (60.17%)
              0.00 msec cpu-clock                       #    0.000 CPUs utilized
                                                        #      0.0 CPUs  CPUs_utilized
       554,347,072      cpu_atom/instructions/          #     0.64  insn per cycle
                                                        #      0.6 instructions  insn_per_cycle (60.14%)
     5,228,931,991      cpu_core/instructions/          #     2.04  insn per cycle
                                                        #      2.0 instructions  insn_per_cycle
         4,308,874      cpu_atom/branch-misses/         #    5.65% of all branches
                                                        #      5.6 %  branch_miss_rate         (49.76%)
         9,890,606      cpu_core/branch-misses/         #    0.97% of all branches
                                                        #      1.0 %  branch_miss_rate

       1.005477803 seconds time elapsed
```
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
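A sketch of how the flag might be attached to a metric so that stat-display keeps printing the underlying event counts. The key name comes from the patch subject; its value format and the rest of the entry are illustrative assumptions (// comments for exposition only):
```
{
  "MetricName": "branch_miss_rate",
  "MetricGroup": "Default",
  "MetricExpr": "branch-misses / branches",  // illustrative expression
  "ScaleUnit": "100%",
  "DefaultShowEvents": "1"                   // keep the event counts visible alongside the metric
}
```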
2025-11-11 | perf jevents: Add set of common metrics based on default ones | Ian Rogers
Add support for getting a common set of metrics from a default table; adding the json metrics at the same time simplifies the generation. The metrics added are CPUs_utilized, cs_per_second, migrations_per_second, page_faults_per_second, insn_per_cycle, stalled_cycles_per_instruction, frontend_cycles_idle, backend_cycles_idle, cycles_frequency, branch_frequency and branch_miss_rate, based on the shadow metric definitions.

Following this change the default perf stat output on an alderlake looks like:
```
$ perf stat -a -- sleep 2

 Performance counter stats for 'system wide':

              0.00 msec cpu-clock                       #    0.000 CPUs utilized
            77,739      context-switches
            15,033      cpu-migrations
           321,313      page-faults
    14,355,634,225      cpu_atom/instructions/          #     1.40  insn per cycle           (35.37%)
   134,561,560,583      cpu_core/instructions/          #     3.44  insn per cycle           (57.85%)
    10,263,836,145      cpu_atom/cycles/                                                     (35.42%)
    39,138,632,894      cpu_core/cycles/                                                     (57.60%)
     2,989,658,777      cpu_atom/branches/                                                   (42.60%)
    32,170,570,388      cpu_core/branches/                                                   (57.39%)
        29,789,870      cpu_atom/branch-misses/         #    1.00% of all branches           (42.69%)
       165,991,152      cpu_core/branch-misses/         #    0.52% of all branches           (57.19%)
                        (software)                      #      nan cs/sec  cs_per_second
                        TopdownL1 (cpu_core)            #     11.9 %  tma_bad_speculation
                                                        #     19.6 %  tma_frontend_bound     (63.97%)
                        TopdownL1 (cpu_core)            #     18.8 %  tma_backend_bound
                                                        #     49.7 %  tma_retiring           (63.97%)
                        (software)                      #      nan faults/sec  page_faults_per_second
                                                        #      nan GHz  cycles_frequency     (42.88%)
                                                        #      nan GHz  cycles_frequency     (69.88%)
                        TopdownL1 (cpu_atom)            #     11.7 %  tma_bad_speculation
                                                        #     29.9 %  tma_retiring           (50.07%)
                        TopdownL1 (cpu_atom)            #     31.3 %  tma_frontend_bound     (43.09%)
                        (cpu_atom)                      #      nan M/sec  branch_frequency   (43.09%)
                                                        #      nan M/sec  branch_frequency   (70.07%)
                                                        #      nan migrations/sec  migrations_per_second
                        TopdownL1 (cpu_atom)            #     27.1 %  tma_backend_bound      (43.08%)
                        (software)                      #      0.0 CPUs  CPUs_utilized
                                                        #      1.4 instructions  insn_per_cycle (43.04%)
                                                        #      3.5 instructions  insn_per_cycle (69.99%)
                                                        #      1.0 %  branch_miss_rate       (35.46%)
                                                        #      0.5 %  branch_miss_rate       (65.02%)

       2.005626564 seconds time elapsed
```
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
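As an illustration, the shadow 'insn per cycle' definition might translate into a common json metric roughly like the entry below. The keys follow jevents conventions, but the exact expression, group and ScaleUnit are assumptions (// comments for exposition only):
```
{
  "MetricName": "insn_per_cycle",
  "MetricGroup": "Default",
  "MetricExpr": "instructions / cycles",   // mirrors the shadow 'insn per cycle' ratio
  "ScaleUnit": "1instructions",            // matches the 'instructions' unit printed above
  "BriefDescription": "Instructions retired per CPU cycle"
}
```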
2025-11-06 | perf stat: Add ScaleUnit to {cpu,task}-clock JSON description | Namhyung Kim
This changes the output of the event as shown below. In fact, that's the output it used to have before the JSON conversion.

Before:

  $ perf stat -e task-clock true

   Performance counter stats for 'true':

             313,848      task-clock                #    0.290 CPUs utilized

         0.001081223 seconds time elapsed

         0.001122000 seconds user
         0.000000000 seconds sys

After:

  $ perf stat -e task-clock true

   Performance counter stats for 'true':

                0.36 msec task-clock                #    0.297 CPUs utilized

         0.001225435 seconds time elapsed

         0.001268000 seconds user
         0.000000000 seconds sys

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 9957d8c801fe0cb90 ("perf jevents: Add common software event json")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
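The before/after numbers imply the raw counter is in nanoseconds and that ScaleUnit converts it to milliseconds, so the added field plausibly looks like the trimmed sketch below. Only ScaleUnit is the point here; the other keys are assumed from software.json conventions and the // comment is exposition only:
```
{
  "Unit": "software",
  "EventName": "task-clock",
  "ScaleUnit": "1e-6msec",   // 1e-6 converts the raw nanosecond count to msec for display
  "BriefDescription": "Per-task clock count, in nanoseconds before scaling"
}
```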
2025-10-15 | perf jevents: Add legacy-hardware and legacy-cache json | Ian Rogers
The legacy-hardware.json is added containing hardware events, similarly to the software.json file. A difference is that for the software PMU the name is known and matches sysfs, whereas in the legacy-hardware.json no Unit/PMU is specified for the events, meaning default_core is used and the events will appear for all core PMUs.

There are potentially 1216 legacy cache events; rather than list them in a json file, add a make_legacy_cache.py helper to generate them.

By using json for legacy hardware and cache events: descriptions of the events can be added; events can be marked as deprecated, such as those misleadingly named l2 (deprecated is also used to mark all events that weren't previously displayed in perf list); and the name lookup becomes case insensitive.

The C string encoding all the perf events and metrics increases in size by 123,499 bytes, which will increase the perf binary size. Later changes will remove hard-coded event parsing for legacy hardware and cache events, turning parsing overhead into a binary search during event lookup.

The event descriptions are based on those in the perf_event_open man page; credit to Vince Weaver <vincent.weaver@maine.edu>.

Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
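To illustrate, the generated legacy-cache table might contain entries along these lines: one current name and one deprecated alias. Both entries, the field names and the Deprecated encoding are assumptions based on the description above, not quotes from the generated file (// comments for exposition only):
```
[
  {
    "EventName": "L1-dcache-load-misses",
    "BriefDescription": "Level 1 data cache read misses"
  },
  {
    "EventName": "l2-loads",                 // hypothetical alias misleadingly named l2
    "BriefDescription": "Level 2 cache reads (deprecated alias)",
    "Deprecated": "1"                        // hidden from perf list by default
  }
]
```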
2025-07-26 | perf jevents: Add common software event json | Ian Rogers
Add json for software events so that in perf list the events can have a description. Common json exists for the tool PMU but it has no sysfs equivalent. Modify the map_for_pmu code to return the common map (rather than an architecture specific one) when a PMU with a common name is being looked for; this allows the events to be found.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
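Since the software PMU name matches sysfs, a software.json entry could plausibly look like the sketch below; the keys are the usual jevents ones, but this specific entry is illustrative, not quoted from the patch (// comment for exposition only):
```
{
  "Unit": "software",                 // matches the sysfs 'software' PMU by name
  "EventName": "context-switches",
  "BriefDescription": "Number of context switches"
}
```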
2024-10-10 | perf jevents: Add tool event json under a common architecture | Ian Rogers
Introduce the notion of a common architecture/model that can be used to find event tables for common PMUs like the tool PMU. By having tool events be json, standard PMU attribute configuration, descriptions, etc. can be used, and these routines are already optimized for things like binary searching.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241002032016.333748-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
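The resulting layout, as suggested by the subject and the later commits in this log, would presumably place the tool event json under a common architecture and model directory (exact paths assumed for illustration):
```
tools/perf/pmu-events/arch/common/
└── common/           <- "model" directory for PMUs shared by all architectures
    └── tool.json     <- tool PMU events (duration_time, user_time, system_time, ...)
```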