| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"Perf event/metric description:
Unify all event and metric descriptions in JSON format. Now event
parsing and handling is greatly simplified by that.
From users point of view, perf list will provide richer information
about hardware events like the following.
$ perf list hw
List of pre-defined events (to be used in -e or -M):
legacy hardware:
branch-instructions
[Retired branch instructions [This event is an alias of branches]. Unit: cpu]
branch-misses
[Mispredicted branch instructions. Unit: cpu]
branches
[Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
bus-cycles
[Bus cycles,which can be different from total cycles. Unit: cpu]
cache-misses
[Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
cache-references
[Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
cpu-cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
instructions
[Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu]
ref-cycles
[Total cycles; not affected by CPU frequency scaling. Unit: cpu]
But most notable changes would be in the perf stat. On the right side,
the default metrics are better named and aligned. :)
$ perf stat -- perf test -w noploop
Performance counter stats for 'perf test -w noploop':
11 context-switches # 10.8 cs/sec cs_per_second
0 cpu-migrations # 0.0 migrations/sec migrations_per_second
3,612 page-faults # 3532.5 faults/sec page_faults_per_second
1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized
110,466 branch-misses # 0.0 % branch_miss_rate (88.66%)
6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%)
4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%)
27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%)
TopdownL1 # 0.3 % tma_backend_bound
# 9.3 % tma_bad_speculation (89.05%)
# 9.7 % tma_frontend_bound (77.86%)
# 80.7 % tma_retiring (88.81%)
1.025318171 seconds time elapsed
1.013248000 seconds user
0.012014000 seconds sys
Deferred unwinding support:
With the kernel support (commit c69993ecdd4d: "perf: Support deferred
user unwind"), perf can use deferred callchains for userspace stack
trace with frame pointers like below:
$ perf record --call-graph fp,defer ...
This will be transparent to users when it comes to other commands like
perf report and perf script. They will merge the deferred callchains
to the previous samples as if they were collected together.
ARM SPE updates
- Extensive enhancements to support various kinds of memory
operations including GCS, MTE allocation tags, memcpy/memset,
register access, and SIMD operations.
- Add inverted data source filter (inv_data_src_filter) support to
exclude certain data sources.
- Improve documentation.
Vendor event updates:
- Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
Lake, Lunar Lake, Granite Rapids, and others.
- Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
definitions.
- RISC-V: Added JSON support for T-HEAD C920V2.
Misc:
- Improve pointer tracking in data type profiling. It'd give better
output when the variable is using container_of() to convert type.
- Annotation support for perf c2c report in TUI. Press 'a' key to
enter annotation view from cacheline browser window. This will show
which instruction is causing the cacheline contention.
- Lots of fixes and test coverage improvements!"
* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
libperf: Use 'extern' in LIBPERF_API visibility macro
perf stat: Improve handling of termination by signal
perf tests stat: Add test for error for an offline CPU
perf stat: When no events, don't report an error if there is none
perf tests stat: Add "--null" coverage
perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
perf stat: Allow no events to open if this is a "--null" run
perf test kvm: Add some basic perf kvm test coverage
perf tests evlist: Add basic evlist test
perf tests script dlfilter: Add a dlfilter test
perf tests kallsyms: Add basic kallsyms test
perf tests timechart: Add a perf timechart test
perf tests top: Add basic perf top coverage test
perf tests buildid: Add purge and remove testing
perf tests c2c: Add a basic c2c
perf c2c: Clean up some defensive gets and make asan clean
perf jitdump: Fix missed dso__put
perf mem-events: Don't leak online CPU map
perf hist: In init, ensure mem_info is put on error paths
...
|
|
Add a test that if an offline CPU is requested perf stat will fail.
$ perf test -vv "perf stat tests"
101: perf stat tests:
--- start ---
test child forked, pid 46965
Basic stat command test
Basic stat command test [Success]
Null stat command test
Null stat command test [Success]
Offline CPU stat command test (cpu 8)
Offline CPU stat command test [Success]
stat record and report test
stat record and report test [Success]
stat record and script test
stat record and script test [Success]
stat repeat weak groups test
stat repeat weak groups test [Success]
Topdown event group test
Topdown event group test [Success]
Topdown weak groups test
Topdown weak groups test [Skipped event parsing failed]
cputype test
cputype test [Success]
hybrid test
hybrid test [Success]
---- end(0) ----
101: perf stat tests : Ok
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/94313b82-888b-4f42-9fb0-4585f9e90080@linux.ibm.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Ensure "--null" does a minimal run.
Reported-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Setup qemu with KVM then run kvm stat and some host
recording/reporting/build-id tests.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add test that evlist reports expected events from perf record.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Compile a simple dlfilter and make sure it remove samples from
everything other than a test_loop.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add test that kallsyms finds a well known symbol and fails for
another.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Basic coverage for `perf timechart` doing a record and then a basic
sanity test of the generated SVG file.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The test starts a backgroup thloop workload and monitors it using
cpu-clock ensuring test_loop appears in the output.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add testing for the purge and remove commands. Use the noploop
workload rather than just a return to avoid missing samples in the
workload in perf record. Tidy up the cleanup code to cleanup when
signals happen.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add basic c2c record and report testing to gain some coverage.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Create a fake root directory for /proc/{version,modules,kallsyms} in
/tmp for testing. The kallsyms has a bad symbol in the module and it
causes the main map splitted. The test ensures it only has two maps -
kernel and the module and it finds the initial map after the module
without creating the split maps like [kernel].0 and so on.
$ perf test -vv "split kallsyms"
69: split kallsyms:
--- start ---
test child forked, pid 1016196
try to create fake root directory
create kernel maps from the fake root directory
maps__set_modules_path_dir: cannot open /tmp/perf-test.Zrv6Sy/lib/modules/X.Y.Z dir
Problems setting modules path maps, continuing anyway...
Failed to open /tmp/perf-test.Zrv6Sy/proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability to access.
Using /tmp/perf-test.Zrv6Sy/proc/kallsyms for symbols
kernel map loaded - check symbol and map
---- end(0) ----
69: split kallsyms : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add a test case for the python interpreter like below so that we can
make sure it won't break again. To validate the effect of build-ID
generation, it adds and removes the JIT'ed DSOs to/from the build-ID
cache for the test.
$ perf test -vv jitdump
84: python profiling with jitdump:
--- start ---
test child forked, pid 214316
Run python with -Xperf_jit
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.180 MB /tmp/__perf_test.perf.data.XbqZNm (140 samples) ]
Generate JIT-ed DSOs using perf inject
Add JIT-ed DSOs to the build-ID cache
Check the symbol containing the script name
Found 108 matching lines
Remove JIT-ed DSOs from the build-ID cache
---- end(0) ----
84: python profiling with jitdump : Ok
Cc: Pablo Galindo <pablogsal@gmail.com>
Link: https://docs.python.org/3/howto/perf_profiling.html#how-to-work-without-frame-pointers
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The mem-loads-aux event exists on hybrid systems but the "cpu" PMU
does not. This causes an event parsing error which erroneously makes
the test look like it is failing. Avoid naming the PMU to avoid
this. Rather than cleaning up perf.data in the directory the test is
run, explicitly send the 'perf record' output to /dev/null and avoid
any cleanup scripts.
Fixes: fc9c17b22352 ("perf test: Add a perf event fallback test")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Simplify the build ID reading code by removing the non-blocking option.
Having to pass the correct option to this function was fragile and a
mistake would result in a hang, see the linked fix. Furthermore,
compressed files are always opened blocking anyway, ignoring the
non-blocking option.
We also don't expect to read build IDs from non-regular files. The only
hits to this function that are non-regular are devices that won't be elf
files with build IDs, for example "/dev/dri/renderD129".
Now instead of opening these as non-blocking and failing to read, we
skip them. Even if something like a pipe or character device did have a
build ID, I don't think it would have worked because you need to call
read() in a loop, check for -EAGAIN and handle timeouts to make
non-blocking reads work.
Link: https://lore.kernel.org/linux-perf-users/20251022-james-perf-fix-dso-block-v1-1-c4faab150546@linaro.org/
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
perf_event_attr has gained a new field, config4, so add support for it
extending the existing configN support.
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Determine if a metric is default from `perf list --raw-dump $m` eg:
```
$ perf list --raw-dump l1_prefetch_miss_rate
Default4 l1_prefetch_miss_rate
```
If a metric has "not supported" or "no supported events" then ignore
these failures for default metrics. Tidy up the skip/fail messages in
the output to make them easier to spot/read.
```
$ perf list -vv "all metrics"
...
Testing llc_miss_rate
[Ignored llc_miss_rate] failed but as a Default metric this can be expected
Error: No supported events found. The LLC-loads event is not supported.
...
```
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/20251119104751.51960-1-tmricht@linux.ibm.com/
Reported-by: Namhyung Kim <namhyung@kernel.org>
Reported-by: James Clark <james.clark@linaro.org>
Closes: https://lore.kernel.org/lkml/aRi9xnwdLh3Dir9f@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add additional test to the maps covering
maps__fixup_overlap_and_insert. Change the test suite to be for more
than just 1 test.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
walltime_nsecs_stats is no longer used for counter values, move into
that stat_config where it controls certain things like noise
measurement.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In certain hypervisor set ups the value 0 may be returned but this is
only erroneous if the user rdpmc isn't disabled.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
print_metric_only_json and print_metric_end in stat-display.c may
create a metric value of "none" which fails validation as isfloat. Add
a helper to properly validate metric numeric values.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
errno.h isn't used in auxtrace.h so remove it and fix build failures
caused by transitive dependencies through auxtrace.h on errno.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The NO_AUXTRACE build option was used when the __get_cpuid feature
test failed or if it was provided on the command line. The option no
longer avoids a dependency on a library and so having the option is
just adding complexity to the code base. Remove the option
CONFIG_AUXTRACE from Build files and HAVE_AUXTRACE_SUPPORT by assuming
it is always defined.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Couple of independent fixes:
1. Wire in SIGSEGV handler that terminates the test with a failure code.
2. Use "--lock-cgroup" instead of "-g"; "-g" was proposed but never
merged. See commit 4d1792d0a2564caf ("perf lock contention: Add
--lock-cgroup option")
3. Call cleanup() on every normal exit so trap_cleanup() doesn't mistake
it for an unexpected signal and emit a false-negative "Unexpected
signal in main" message.
Before patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 610711
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Unexpected signal in test_aggr_cgroup
---- end(0) ----
85: kernel lock contention analysis test : Ok
After patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 602637
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Testing perf lock contention --type-filter (w/ spinlock)
Testing perf lock contention --lock-filter (w/ tasklist_lock)
Testing perf lock contention --callstack-filter (w/ unix_stream)
[Skip] Could not find 'unix_stream'
Testing perf lock contention --callstack-filter with task aggregation
[Skip] Could not find 'unix_stream'
Testing perf lock contention --cgroup-filter
Testing perf lock contention CSV output
---- end(0) ----
85: kernel lock contention analysis test : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Tycho Andersen <tycho@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This adds test cases to verify the precise ip fallback logic:
- If the system supports precise ip, for an event given with the maximum
precision level, it should be able to decrease precise_ip to find a
supported level.
- The same fallback behavior should also work in more complex scenarios,
such as event groups or when PEBS is involved
Additional fallback tests, such as those covering missing feature cases,
can be added in the future.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Ian Rogers <irogers!@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Explicitly use a metric rather than implicitly expecting '-e
instructions,cycles' to produce a metric. Use a metric with software
events to make it more compatible.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
test_stat_record_report and test_stat_record_script used default
output which triggers a bug when sending metrics. As this isn't
relevant to the test switch to using named software events.
Update the match in test_hybrid as the cycles event is now cpu-cycles
to workaround potential ARM issues.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Previously '-e cycles,instructions' would implicitly create an IPC
metric. This now has to be explicit with '-M insn_per_cycle'.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Default metrics may use unsupported events and be ignored. These
metrics shouldn't cause metric testing to fail.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Make the expectations match json metrics rather than the previous hard
coded ones.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The Default[234] metric groups may contain unsupported legacy
events. Allow those metric groups to fail.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When testing metric-only, pass a metric to perf rather than expecting
a hard coded metric value to be generated.
Remove keys that were really metric-only units and instead don't
expect metric only to have a matching json key as it encodes metrics
as {"metric_name", "metric_value"}.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The behavior of weak terms is subtle, add a test that they aren't
accidentally broken. The test finds an event with a weak 'period' and
then overrides it. In no such event is present then the test skips.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Allow the number of threads for the thloop workload to be increased
beyond the normal 2. Add error checking to the parsed time and thread
count values.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add TEST_ASSERT_EVSEL to dump the failing evsel in the event of a
failure.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add TEST_ASSERT_EVLIST to dump the failing evlist in the event of a
failure.
Add the macro to a number of tests not currently checking the evlist
length.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Just have a single test_hw_config helper that strips extended type
information in the case of hardware and hardware cache events.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Without a PMU perf matches an event against any PMU with the
event. Unfortunately some PMU drivers advertise a "cycles" event which
is typically just a core event. As tests assume a core event, switch
to use "cpu-cycles" that avoids the overloaded "cycles" event on
troublesome PMUs and is so far not overloaded. Note, on x86 this
changes a legacy event into a sysfs one.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In the event parse string, switch "cpu" to "default_core" and then
rewrite this to the first core PMU name prior to parsing. This enables
testing with a PMU on hybrid x86 and other systems that don't use
"cpu" for the core PMU name. The name "default_core" is already used
by jevents. Update test expectations to match.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Without a PMU perf matches an event against any PMU with the
event. Unfortunately some PMU drivers advertise a "cycles" event which
is typically just a core event. Switch to using "cpu-cycles" which is
an indentical legacy event but avoids the multiple PMU confusion
introduced by the PMU drivers. Note, on x86 cpu-cycles is also a sysfs
event but cycles isn't.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Switch from the test's assert_hw/test_config to the common
evsel__match code that appropriately handles events with both legacy
and sysfs/json encoding.
For tests asserting that a config value matches that placed in the
perf_event_attr just directly compare the config values.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Now the events file isn't directly parsed from a FILE but stored in a
string prior to parsing, remove the FILE argument to the associated
scanner functions as they only ever pass NULL.
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When an event/alias is created for a PMU the terms are eagerly parsed
using parse_events_terms. For a command like perf stat or perf record,
the particular event/alias will be found, the terms parsed, the
terms cloned for use in the event parsing, and then the terms used to
configure the perf_event_attr. Events/aliases may be eagerly loaded,
such as from sysfs or in perf list, in which case the aliases terms
will be little or never used. To avoid redundant work, to avoid
cloning, and to reduce memory overhead, hold the terms for an event as
a string until they need handling as a term list. This may introduce
duplicate parsing if an event is repeated in a list, but this
situation is expected to be uncommon.
Measuring the number of instructions before and after with a sysfs
event and perf stat, there is a minor reduction in the number of
instructions executed by 0.3%.
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Use strdup() instead of fixed PATH_MAX buffer for storing paths to not
waste memory.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
We already only test each kcore map once, but on slow systems
(particularly with network filesystems) even the non-kcore maps are
slow.
The test can test the same objdump output over and over which only wastes
time. Generalize the skipping mechanism to track all DSOs and addresses
so that each section is only tested once.
On a fully loaded ARM Juno (simulating a parallel 'perf test' run) with
a network filesystem, the original runtime is:
real 1m51.126s
user 0m19.445s
sys 1m15.431s
And the new runtime is:
real 0m48.873s
user 0m8.031s
sys 0m32.353s
Committer testing:
# perf test "code read"
22: Object code reading : Ok
#
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
clang-18.1.3 on Ubuntu 24.04.2 reports warning:
unroll_loop_thread.c:35:25: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
35 | : /* in */ [in] "r" (in)
| ^
unroll_loop_thread.c:39:1: warning: non-void function does not return a value [-Wreturn-type]
39 | }
| ^
Use the modifier "w" for 32-bit register access and return NULL at the
end of thread function.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-7-4305590795b2@arm.com
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: llvm@lists.linux.dev
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
clang-18.1.3 on Ubuntu 24.04.2 reports warning:
thread_loop.c:41:23: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
41 | : /* in */ [i] "r" (i), [len] "r" (len)
| ^
thread_loop.c:37:8: note: use constraint modifier "w"
37 | "add %[i], %[i], #1\n"
| ^~~~
| %w[i]
thread_loop.c:41:23: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
41 | : /* in */ [i] "r" (i), [len] "r" (len)
| ^
thread_loop.c:37:14: note: use constraint modifier "w"
37 | "add %[i], %[i], #1\n"
| ^~~~
| %w[i]
thread_loop.c:41:23: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
41 | : /* in */ [i] "r" (i), [len] "r" (len)
| ^
thread_loop.c:38:8: note: use constraint modifier "w"
38 | "cmp %[i], %[len]\n"
| ^~~~
| %w[i]
thread_loop.c:41:38: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
41 | : /* in */ [i] "r" (i), [len] "r" (len)
| ^
thread_loop.c:38:14: note: use constraint modifier "w"
38 | "cmp %[i], %[len]\n"
| ^~~~~~
| %w[len]
Use the modifier "w" for 32-bit register access.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-6-4305590795b2@arm.com
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: llvm@lists.linux.dev
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
clang-18.1.3 on Ubuntu 24.04.2 reports warning:
memcpy_thread.c:30:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
30 | }
| ^
Dismiss the warning with returning NULL from the thread function.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-5-4305590795b2@arm.com
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: llvm@lists.linux.dev
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Include event parsing and regression tests for auto counter reload
and ratio-to-prev event term.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When running on a hypervisor the expected IPC metric may be missing as
the events may fail to be read. Don't expect metric output for this
test to avoid it failing.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|