| Age | Commit message (Collapse) | Author |
|
When perf resolves symbols from kernel module ELF files (ET_REL),
it converts symbol addresses to file offsets so that sample IPs
can be matched to the correct symbol. The conversion adjusts each
symbol's st_value:
sym->st_value -= shdr->sh_addr - shdr->sh_offset;
For vmlinux (ET_EXEC), st_value is a virtual address and sh_addr
is the section's virtual base, so subtracting sh_addr and adding
sh_offset correctly yields a file offset.
For kernel modules (ET_REL), st_value is a section-relative
offset. The module loader ignores sh_addr entirely and places
symbols at module_base + st_value. Converting to file offset
requires only adding sh_offset; subtracting sh_addr introduces an
error equal to sh_addr bytes.
When .text has sh_addr == 0 -- the historical norm for simple
modules -- both formulas produce the same result and the bug is
latent. As modules gain more metadata sections before .text (.note,
.static_call.text, etc.), the linker assigns .text a non-zero
sh_addr, exposing the defect. For example, nfsd.ko on this kernel
has sh_addr=0xa80, kvm-intel.ko has sh_addr=0x1e90.
The effect is that all .text symbols in affected modules
shift by sh_addr bytes relative to sample IPs, causing perf
report to attribute samples to incorrect, nearby symbols. This
was observed as 13% of LLC-load-miss samples misattributed
to nfsd_file_get_dio_attrs when the actual hot function was
nfsd_cache_lookup, approximately 0xa80 bytes away in the symbol
table.
Use the existing dso__rel() flag (already set for ET_REL modules)
to select the correct adjustment: add sh_offset for ET_REL,
subtract (sh_addr - sh_offset) for ET_EXEC/ET_DYN.
Fixes: 0131c4ec794a ("perf tools: Make it possible to read object code from kernel modules")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The index into the cpumap array and the number of entries within the
array can never be negative, so let's make them unsigned. This is
prompted by reports that gcc 13 with -O6 is giving a
alloc-size-larger-than errors. The change makes the cpumap changes and
then updates the declaration of index variables throughout perf and
libperf to be unsigned. The two things are hard to separate as
compiler warnings about mixing signed and unsigned types breaks the
build.
Reported-by: Chingbin Li <liqb365@163.com>
Closes: https://lore.kernel.org/lkml/20260212025127.841090-1-liqb365@163.com/
Tested-by: Chingbin Li <liqb365@163.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This patch adds a new --pmu-filter option to perf-stat command to allow
filtering events on specific PMUs. This is useful when there are
multiple PMUs with same type (e.g. hisi_sicl2_cpa0 and hisi_sicl0_cpa0).
[root@localhost tmp]# perf stat -M cpa_p0_avg_bw
Performance counter stats for 'system wide':
19,417,779,115 hisi_sicl0_cpa0/cpa_cycles/ # 0.00 cpa_p0_avg_bw
0 hisi_sicl0_cpa0/cpa_p0_wr_dat/
0 hisi_sicl0_cpa0/cpa_p0_rd_dat_64b/
0 hisi_sicl0_cpa0/cpa_p0_rd_dat_32b/
19,417,751,103 hisi_sicl10_cpa0/cpa_cycles/ # 0.00 cpa_p0_avg_bw
0 hisi_sicl10_cpa0/cpa_p0_wr_dat/
0 hisi_sicl10_cpa0/cpa_p0_rd_dat_64b/
0 hisi_sicl10_cpa0/cpa_p0_rd_dat_32b/
19,417,730,679 hisi_sicl2_cpa0/cpa_cycles/ # 0.31 cpa_p0_avg_bw
75,635,749 hisi_sicl2_cpa0/cpa_p0_wr_dat/
18,520,640 hisi_sicl2_cpa0/cpa_p0_rd_dat_64b/
0 hisi_sicl2_cpa0/cpa_p0_rd_dat_32b/
19,417,674,227 hisi_sicl8_cpa0/cpa_cycles/ # 0.00 cpa_p0_avg_bw
0 hisi_sicl8_cpa0/cpa_p0_wr_dat/
0 hisi_sicl8_cpa0/cpa_p0_rd_dat_64b/
0 hisi_sicl8_cpa0/cpa_p0_rd_dat_32b/
19.417734480 seconds time elapsed
[root@localhost tmp]# perf stat --pmu-filter hisi_sicl2_cpa0 -M cpa_p0_avg_bw
Performance counter stats for 'system wide':
6,234,093,559 cpa_cycles # 0.60 cpa_p0_avg_bw
50,548,465 cpa_p0_wr_dat
7,552,182 cpa_p0_rd_dat_64b
0 cpa_p0_rd_dat_32b
6.234139320 seconds time elapsed
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The "comm" column allows grouping events by the process command. It is
intended to group like programs, despite having different PIDs. But some
workloads may adjust their own command, so that a unique identifier
(e.g. a PID or some other numeric value) is part of the command name.
This destroys the utility of "comm", forcing perf to place each unique
process name into its own bucket, which can contribute to a
combinatorial explosion of memory use in perf report.
Create a less strict version of this column, which ignores digits when
comparing command names. Commands whose names are the same (ignoring
digits) are sorted into the same histogram buckets, and displayed with
the placeholder value "<N>" in the place of digits. For example,
hypothetical command names "kworker/1" "kworker/2" "kworker/3" would
sort into the same bucket and be represented as "kworker/<N>".
Committer testing:
$ perf report -s comm,comm_nodigit | grep -F "<N>"
0.01% CPU 6/TCG CPU <N>/TCG
0.01% kworker/53:2-mm kworker/<N>:<N>-mm
0.01% migration/24 migration/<N>
0.01% kworker/24:1-ev kworker/<N>:<N>-ev
0.01% llvmpipe-8 llvmpipe-<N>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
commit e5e66adfe45a6 ("perf regs: Remove __weak attributive arch_sdt_arg_parse_op() function")
removes arch_sdt_arg_parse_op() functions and reveals missing s390 support.
The following warning is printed:
Unknown ELF machine 22, standard arguments parse will be skipped.
ELF machine 22 is the EM_S390 host. This happens with command
# ./perf record -v -- stress-ng -t 1s --matrix 0
when the event is not specified.
Add s390 specific __perf_sdt_arg_parse_op_s390() function to support
-architecture calls to arch_sdt_arg_parse_op() for s390.
The warning disappears.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Jan Polensky <japo@linux.ibm.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
To get the various fixes for v7.0.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Frame pointer callchains are not supported on s390 and dwarf
callchains are only supported on software events.
Switch the default event from the hardware 'cycles' event to the
software 'cpu-clock' or 'task-clock' on s390 if callchains are
enabled. Move some of the target initialization earlier in builtin-top
and builtin-record, so it is ready for use by evlist__new_default.
If frame pointer callchains are requested on s390 show a
warning. Modify the '-g' option of `perf top` and `perf record` to
default to dwarf callchains on s390.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
record_opts__parse_callchain is shared by builtin-record and
builtin-trace, it is declared in callchain.h. Move the declaration to
callchain.c for consistency with the header. In other cases make the
option callback a small static stub that then calls into callchain.c.
Make the no argument '-g' callchain option just a short-cut for
'--call-graph fp' so that there is consistency in how the arguments
are handled. This requires the const char* string to be strdup-ed in
__parse_callchain_report_opt. For consistency also make
parse_callchain_record use strdup and remove some unnecessary
casts. Also, be more explicit about the '-g' behavior if there is a
.perfconfig file setting.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The options are used to configure the evsel but are not themselves
configured. Make the arguments const to better capture this.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Allow the target to be const in callers.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Switch to using evsel__match rather than comparing perf_event_attr
values, this is robust on hybrid architectures.
Ensure evsel->pmu matches the evsel->core.attr.
Remove exclude bits that get set in other fallback attempts when
switching the event.
Log the event name with modifiers when switching the event on fallback.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Previously, only the first DWARF location entry was collected for each
variable. This was based on the assumption that instruction tracking
could reconstruct the remaining state. However, variables may have
different locations across different address ranges, and relying solely
on instruction tracking can miss valid type information.
Change __die_collect_vars_cb() to iterate over all location entries
using dwarf_getlocations() in a loop. This ensures that variables with
multiple location ranges are properly tracked, improving type coverage.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a function call occurs, caller-saved registers are typically
invalidated since the callee may clobber them. However, DWARF debug info
provides location ranges that indicate exactly where a variable is valid
in a register.
Track the DWARF location range end address in type_state_reg and use it
to determine if a caller-saved register should be preserved across a
call. If the current call address is within the DWARF-specified lifetime
of the variable, keep the register state valid instead of invalidating
it.
This improves type annotation for code where the compiler knows a
register value survives across calls (e.g., when the callee is known not
to clobber certain registers or when the value is reloaded after the
call at the same logical location).
Changes:
- Add `end` and `has_range` fields to die_var_type to capture DWARF
location range information
- Add `lifetime_active` and `lifetime_end` fields to type_state_reg
- Check location lifetime before invalidating caller-saved registers
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Previously, the x86 call handler returned early without invalidating
caller-saved registers when the call target symbol could not be resolved
(func == NULL). This violated the ABI which requires caller-saved
registers to be considered clobbered after any call instruction.
Fix this by:
1. Always invalidating caller-saved registers for any call instruction
(except __fentry__ which preserves registers)
2. Using dl->ops.target.name as fallback when func->name is unavailable,
allowing return type lookup for more call targets
This is a conservative change that may reduce type coverage for indirect
calls (e.g., callq *(%rax)) where we cannot determine the return type
but it ensures correctness.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add a helper function to consistently invalidate register state instead
of field assignments. This ensures kind, ok, and copied_from are all
properly cleared when a register becomes invalid.
The helper sets:
- kind = TSR_KIND_INVALID
- ok = false
- copied_from = -1
Replace all invalidation patterns with calls to this helper. No
functional change and this removes some incorrect annotations that were
caused by incomplete invalidation (e.g. a obsolete copied_from from an
invalidated register).
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a register holds a constant value (TSR_KIND_CONST) and is used with
a negative offset, treat it as a potential global variable access
instead of falling through to CFA (frame) handling.
This fixes cases like array indexing with computed offsets:
movzbl -0x7d72725a(%rax), %eax # array[%rax]
Where %rax contains a computed index and the negative offset points to a
global array. Previously this fell through to the CFA path which doesn't
handle global variables, resulting in "no type information".
The fix redirects such accesses to check_kernel which calls
get_global_var_type() to resolve the type from the global variable
cache. This is only done for kernel DSOs since the pattern relies on
kernel-specific global variable resolution. We could also treat
registers with integer types to the global variable path, but this
requires more changes.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Previously, global_var__collect() required get_global_var_info() to
succeed (i.e., the variable must have a symbol name) before caching a
global variable. This prevented variables that exist in DWARF but lack
symbol table coverage from being cached.
Remove the symbol table requirement since DW_OP_addr already provides
the variable's address directly from DWARF. The symbol table lookup is
now optional to obtain the variable name when available.
Also remove the var_offset != 0 check, which was intended to skip
variables where the access address doesn't match the symbol start. The
symbol table lookup is now optional and I found removing this check has
no effect on the annotation results for both kernel and userspace
programs.
Test results show improved annotation coverage especially for userspace
programs with RIP-relative addressing instructions.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a struct member is an array type, die_get_member_type() would stop
iterating since array types weren't handled in the loop. This caused
accesses to array elements within structs to not resolve properly.
Add array type handling by resolving the array to its element type and
calculating the offset within an element using modulo arithmetic
This improves type annotation coverage for struct members that are
arrays.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When comparing types from different scopes, first compare their type
offsets. A larger offset means the field belongs to an outer (enclosing)
struct. This helps resolve cases where a pointer is found in an inner
scope, but a struct containing that pointer exists in an outer scope.
Previously, is_better_type would prefer the pointer type, but the struct
type is actually more complete and should be chosen.
Prefer types from outer scopes when is_better_type cannot determine a
better type. This is a heuristic for the case `struct A { struct B; }`
where A and B have the same size but I think in most cases A is in the
outer scope and should be preferred.
Signed-off-by: Zecheng Li <zecheng@google.com>
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Both die_find_variable_by_reg and die_find_variable_by_addr call
match_var_offset which already performs sufficient checking and type
matching. The additional check_variable call is redundant, and its
need_pointer logic is only a heuristic. Since DWARF encodes accurate
type information, which match_var_offset verifies, skipping
check_variable improves both coverage and accuracy.
Return the matched type from die_find_variable_by_reg and
die_find_variable_by_addr via the existing `type` field in
find_var_data, removing the need for check_variable in
find_data_type_die.
Signed-off-by: Zecheng Li <zecheng@google.com>
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Preserve typedefs in match_var_offset to match the results by
__die_get_real_type. Also move the (offset == 0) branch after the
is_pointer check to ensure the correct type is used, fixing cases where
an incorrect pointer type was chosen when the access offset was 0.
Signed-off-by: Zecheng Li <zecheng@google.com>
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a variable type is wrapped in typedef/qualifiers, callers may need
to first resolve it to the underlying DW_TAG_pointer_type or
DW_TAG_array_type. A simple tag check is not enough and directly calling
__die_get_real_type() can stop at the pointer type (e.g. typedef ->
pointer) instead of the pointee type.
Add die_get_pointer_type() helper that follows typedef/qualifier chains
and returns the underlying pointer DIE. Use it in annotate-data.c so
pointer checks and dereference work correctly for typedef'd pointers.
Signed-off-by: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
According to RISC-V psABI specification, the PLT (Program Linkage Table)
has the following layout:
- The first PLT entry occupies two 16-byte entries (32 bytes total)
- Subsequent PLT entries take up 16 bytes each
This aligns with the binutils-gdb implementation which defines the same
PLT sizes for RISC-V architecture.
Update get_plt_sizes() to set plt_header_size=32 and plt_entry_size=16
for EM_RISCV, matching the architecture's standard ABI.
Since AARCH64, LOONGARCH, and RISCV have the same PLT size definition,
they are merged together.
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=bfd/elfnn-riscv.c
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Remove duplicate inclusion of stat.h in intel-tpebs.c to clean up
redundant code.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Remove duplicate inclusion of debug.h in symbol.c to clean up redundant
code.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When compiling perf with CORESIGHT=1, an additional build option may
be used: CSTRACE_RAW=1, which will cause the CoreSight formatted trace
frames to be printed out during a perf --dump command. This is useful
when investigating issues with trace generation, decode or possible
data corruption.
e.g. for ETMv4 trace source into a formatted ETR sink a dump -
. ... CoreSight ETMV4I Trace data: size 0x28c150 bytes
Idx:0; ID:14; I_ASYNC : Alignment Synchronisation.
Idx:12; ID:14; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 };
Decoder Sync point TINFO
Idx:17; ID:14; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
Addr=0x0000000000000000;
becomes with CSTRACE_RAW=1:
. ... CoreSight ETMV4I Trace data: size 0x28c150 bytes
Frame Data; Index 0; ID_DATA[0x14]; 00 00 00 00 00 00 00 00
00 00 00 80 01 01
Idx:0; ID:14; I_ASYNC : Alignment Synchronisation.
Frame Data; Index 16; ID_DATA[0x14]; 00 9d 00 00 00 00 00 00
00 00 04 85 57 08 f2
Idx:12; ID:14; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 };
Decoder Sync point TINFO
Idx:17; ID:14; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
Addr=0x0000000000000000;
CSTRACE_RAW=1 has no effect on ETE + TRBE trace as there is no trace
formatting in the TRBE buffer.
This patch enhances the output so that for each packet the individual
bytes associated with the packet are printed.
Thus for ETMv4 this now becomes:
. ... CoreSight ETMV4I Trace data: size 0x28c150 bytes
Frame Data; Index 0; ID_DATA[0x14]; 00 00 00 00 00 00 00 00
00 00 00 80 01 01
Idx:0; ID:14;[0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x80];
I_ASYNC : Alignment Synchronisation.
Frame Data; Index 16; ID_DATA[0x14]; 00 9d 00 00 00 00 00 00
00 00 04 85 57 08 f2
Idx:12; ID:14; [0x01 0x01 0x00 ]; I_TRACE_INFO : Trace Info.; INFO=0x0
{ CC.0 }; Decoder Sync point TINFO
Idx:17; ID:14; [0x9d 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ];
I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
Addr=0x0000000000000000;
ETE trace output changes from:
Idx:0; ID:14; I_ASYNC : Alignment Synchronisation.
Idx:12; ID:14; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0, TSTATE.0 };
Decoder Sync point TINFO
Idx:15; ID:14; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
Addr=0xFFFF80007CF7F56C;
becoming:
Idx:0; ID:14;[0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x80];
I_ASYNC : Alignment Synchronisation.
Idx:12; ID:14; [0x01 0x01 0x00 ]; I_TRACE_INFO : Trace Info.; INFO=0x0
{ CC.0, TSTATE.0 }; Decoder Sync point TINFO
Idx:15; ID:14; [0x9d 0x5b 0x7a 0xf7 0x7c 0x00 0x80 0xff 0xff ];
I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
Addr=0xFFFF80007CF7F56C;
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Mike Leach <mike.leach@arm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Building perf with CORESIGHT=1 and the optional CSTRACE_RAW=1 enables
additional debug printing of raw trace data when using command:-
perf report --dump.
This raw trace prints the CoreSight formatted trace frames, which may be
used to investigate suspected issues with trace quality / corruption /
decode.
These frames are not present in ETE + TRBE trace.
This fix removes the unnecessary call to print these frames.
This fix also rationalises implementation - original code had helper
function that unnecessarily repeated initialisation calls that had
already been made.
Due to an addtional fault with the OpenCSD library, this call when ETE/TRBE
are being decoded will cause a segfault in perf. This fix also prevents
that problem for perf using older (<= 1.8.0 version) OpenCSD libraries.
Fixes: 68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data")
Reported-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Mike Leach <mike.leach@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add an extra "../" to the relative paths so that the uAPI headers
provided by tools can be found correctly.
Fixes: a724a8fce5e25b45 ("perf kvm stat: Fix build error")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "Read backward ring buffer" test crashes on big-endian (e.g. s390x)
due to a NULL dereference when the backward mmap path isn't enabled.
Reproducer:
# ./perf test -F 'Read backward ring buffer'
Segmentation fault (core dumped)
# uname -m
s390x
#
Root cause:
get_config_terms() stores into evsel_config_term::val.val (u64) while later
code reads boolean fields such as evsel_config_term::val.overwrite.
On big-endian the 1-byte boolean is left-aligned, so writing
evsel_config_term::val.val = 1 is read back as
evsel_config_term::val.overwrite = 0,
leaving backward mmap disabled and a NULL map being used.
Store values in the union member that matches the term type, e.g.:
/* for OVERWRITE */
new_term->val.overwrite = 1; /* not new_term->val.val = 1 */
to fix this. Improve add_config_term() and add two more parameters for
string and value. Function add_config_term() now creates a complete node
element of type evsel_config_term and handles all evsel_config_term::val
union members.
Impact:
Enables backward mmap on big-endian and prevents the crash.
No change on little-endian.
Output after:
# ./perf test -Fv 44
--- start ---
Using CPUID IBM,9175,705,ME1,3.8,002f
mmap size 1052672B
mmap size 8192B
---- end ----
44: Read backward ring buffer : Ok
#
Fixes: 159ca97cd97ce8cc ("perf parse-events: Refactor get_config_terms() to remove macros")
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use metricgroup__for_each_metric() rather than
pmu_metrics_table__for_each_metric() that combines the
default metric table with, a potentially empty, CPUID table.
Fixes: cee275edcdb1acfd ("perf metricgroup: Don't early exit if no CPUID table exists")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Header file declares unused macros, so remove.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
bpf_map__fprintf is unused so delete it, the header file declaring it
and the now unused static helper functions.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
dump_insn and arch_is_uncond_branch are declared in
intel-pt-insn-decoder.c which is unconditionally part of all perf
builds. Don't declare weak versions of these symbols that will be unused.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Function is only used by symsrc__init in symbol-elf.c, make static to
reduce scope. Switch to not passing the argument by value but as a
pointer.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
If the entry is NULL the value is meaningless so early return NULL to
avoid an increment of NULL. This was happening in calls from
has_stitched_lbr when running the "perf record LBR tests". The return
value isn't used in that case, so returning NULL as no effect.
Fixes: 42bbabed09ce ("perf tools: Add hw_idx in struct branch_stack")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
perf_event__synthesize_modules() allocates a single union perf_event and
reuses it across every kernel module callback.
After the first module is processed, perf_record_mmap2__read_build_id()
sets PERF_RECORD_MISC_MMAP_BUILD_ID in header.misc and writes that
module's build ID into the event.
On subsequent iterations the callback overwrites start, len, pid, and
filename for the next module but never clears the stale build ID fields
or the MMAP_BUILD_ID flag.
When perf_record_mmap2__read_build_id() runs for the second module it
sees the flag, reads the stale build ID into a dso_id, and
__dso__improve_id() permanently poisons the DSO with the wrong build ID.
Every module after the first therefore receives the first module's build
ID in its MMAP2 record.
On a system with the sunrpc and nfsd modules loaded, this causes perf
script and perf report to show [unknown] for all module symbols.
The latent bug has existed since commit d9f2ecbc5e47fca7 ("perf dso:
Move build_id to dso_id") introduced the PERF_RECORD_MISC_MMAP_BUILD_ID
check in perf_record_mmap2__read_build_id().
Commit 53b00ff358dc75b1 ("perf record: Make --buildid-mmap the default")
then exposed it to all users by making the MMAP2-with-build-ID path the
default. Both commits were merged in the same series.
Clear the MMAP_BUILD_ID flag and zero the build_id union before each
call to perf_record_mmap2__read_build_id() so that every module starts
with a clean slate.
Fixes: d9f2ecbc5e47fca7 ("perf dso: Move build_id to dso_id")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The fileloc is a copy of a pointer to a string but in places like
symbol_disassemble__llvm this string appears to be freed setting up
potential use-after-frees:
llvm.c:
```
dl = disasm_line__new(args);
if (dl == NULL)
goto err;
annotation_line__add(&dl->al, ¬es->src->source);
free(args->fileloc);
```
disasm.c:
```
static void annotation_line__init(struct annotation_line *al,
struct annotate_args *args,
int nr)
{
al->offset = args->offset;
al->line = strdup(args->line);
al->line_nr = args->line_nr;
al->fileloc = args->fileloc;
al->data_nr = nr;
}
struct disasm_line *disasm_line__new(struct annotate_args *args)
{
struct disasm_line *dl = NULL;
struct annotation *notes = symbol__annotation(args->ms->sym);
int nr = notes->src->nr_events;
dl = zalloc(disasm_line_size(nr));
if (!dl)
return NULL;
annotation_line__init(&dl->al, args, nr);
```
Fix this by making the fileloc a copy of the underlying string in its
init/exit.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add support for parsing an optional layout parameter in the --symfs
command line option. The format is:
--symfs <directory[,layout]>
Where layout can be:
- 'hierarchy': matches full path (default)
- 'flat': only matches base name
When debugging symbol files from a copy of the filesystem (e.g., from a
container or remote machine), the debug files are often stored in a
flat directory structure with only filenames, not the full original
paths. In this case, using 'flat' layout allows perf to find debug
symbols by matching only the filename rather than the full path.
For example, given a binary path like:
/build/output/lib/foo.so
With 'perf report --symfs /debug/files,flat', perf will look for:
/debug/files/foo.so
Instead of:
/debug/files/build/output/lib/foo.so
This is particularly useful when:
- Extracting debug files from containers with different directory layouts
- Working with build systems that flatten directory structures
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
A copy-paste of a similar issue fixed by Peter Collingbourne in:
https://lore.kernel.org/linux-perf-users/20260304190613.2507582-1-pcc@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The hashmap__new() function never returns NULL, it returns error
pointers. Fix the error checking to match.
Additionally, set src->samples to NULL to prevent any later code from
accidentally using the error pointer.
Fixes: d3e7cad6f36d9e80 ("perf annotate: Add a hashmap for symbol histogram")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tianyou Li <tianyou.li@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These #defines have been removed from the kernel headers in favour of
the string based PMU format attributes. Usages were previously removed
from the recording side of cs-etm in Perf. Finish the removal by
removing usages from the decode side too.
It's a straight replacement of the old #defines with the new register
bit definitions. Except cs_etm__setup_timeless_decoding() which wasn't
looking at the saved metadata and was instead hard coding an access to
'attr.config'. This was vulnerable to the same issue of .config being
moved to .config2 etc that the original removal of ETM_OPT_* tried to
fix. So fix that too.
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If a branch target points to one past the end of a function, the branch
should be treated as a branch to another function.
This can happen e.g. with a tail call to a function that is laid out
immediately after the caller.
Fixes: 751b1783da784299 ("perf annotate: Mark jumps to outher functions with the call arrow")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://linux-review.googlesource.com/id/Ide471112e82d68177e0faf08ca411d9fcf0a7bdf
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is consistent with what llvm-objdump does (see [1]) and allows
the LLVM disassembler to disassemble instructions not in the base
instruction set.
[1] https://reviews.llvm.org/D127741
Link: https://linux-review.googlesource.com/id/I52e4fef18d2e12b45f875231fa9d3efff2538fd4
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
linux/string.h provides strstarts that matches the starts_with
function. For style and consistency reasons remove the starts_with
functions and use strstarts.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add an extra "../" to the relative paths so that the uAPI headers
provided by tools can be found correctly.
Fixes: a724a8fce5e2 ("perf kvm stat: Fix build error")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Some system calls never return because it'd terminate the calling
thread. Let's hook the task exit path and update the duration of the
last syscall.
Before:
$ sudo perf trace -as --bpf-summary -- true |& grep exit
(nothing)
After:
$ sudo perf trace -as --bpf-summary -- true |& grep exit
exit_group 1 0 0.004 0.004 0.004 0.004 0.00%
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Use metricgroup__for_each_metric rather than
pmu_metrics_table__for_each_metric that combines the default metric
table with, a potentially empty, CPUID table.
Fixes: cee275edcdb1 ("perf metricgroup: Don't early exit if no CPUID table exists")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The "Read backward ring buffer" test crashes on big-endian (e.g. s390x)
due to a NULL dereference when the backward mmap path isn't enabled.
Reproducer:
# ./perf test -F 'Read backward ring buffer'
Segmentation fault (core dumped)
# uname -m
s390x
#
Root cause:
get_config_terms() stores into evsel_config_term::val.val (u64) while later
code reads boolean fields such as evsel_config_term::val.overwrite.
On big-endian the 1-byte boolean is left-aligned, so writing
evsel_config_term::val.val = 1 is read back as
evsel_config_term::val.overwrite = 0,
leaving backward mmap disabled and a NULL map being used.
Store values in the union member that matches the term type, e.g.:
/* for OVERWRITE */
new_term->val.overwrite = 1; /* not new_term->val.val = 1 */
to fix this. Improve add_config_term() and add two more parameters for
string and value. Function add_config_term() now creates a complete node
element of type evsel_config_term and handles all evsel_config_term::val
union members.
Impact:
Enables backward mmap on big-endian and prevents the crash.
No change on little-endian.
Output after:
# ./perf test -Fv 44
--- start ---
Using CPUID IBM,9175,705,ME1,3.8,002f
mmap size 1052672B
mmap size 8192B
---- end ----
44: Read backward ring buffer : Ok
#
Fixes: 159ca97cd97c ("perf parse-events: Refactor get_config_terms() to remove macros")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf sched stats' tool with record/report/diff workflows
using schedstat counters
- Add a faster libdw based addr2line implementation and allow selecting
it or its alternatives via 'perf config addr2line.style='
- Data-type profiling fixes and improvements including the ability to
select fields using 'perf report''s -F/-fields, e.g.:
'perf report --fields overhead,type'
- Add 'perf test' regression tests for Data-type profiling with C and
Rust workloads
- Fix srcline printing with inlines in callchains, make sure this has
coverage in 'perf test'
- Fix printing of leaf IP in LBR callchains
- Fix display of metrics without sufficient permission in 'perf stat'
- Print all machines in 'perf kvm report -vvv', not just the host
- Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1
code
- Fix 'perf report's histogram entry collapsing with '-F' option
- Use system's cacheline size instead of a hardcoded value in 'perf
report'
- Allow filtering conversion by time range in 'perf data'
- Cover conversion to CTF using 'perf data' in 'perf test'
- Address newer glibc const-correctness (-Werror=discarded-qualifiers)
issues
- Fixes and improvements for ARM's CoreSight support, simplify ARM SPE
event config in 'perf mem', update docs for 'perf c2c' including the
ARM events it can be used with
- Build support for generating metrics from arch specific python
script, add extra AMD, Intel, ARM64 metrics using it
- Add AMD Zen 6 events and metrics
- Add JSON file with OpenHW Risc-V CVA6 hardware counters
- Add 'perf kvm' stats live testing
- Add more 'perf stat' tests to 'perf test'
- Fix segfault in `perf lock contention -b/--use-bpf`
- Fix various 'perf test' cases for s390
- Build system cleanups, bump minimum shellcheck version to 0.7.2
- Support building the capstone based annotation routines as a plugin
- Allow passing extra Clang flags via EXTRA_BPF_FLAGS
* tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits)
perf test script: Add python script testing support
perf test script: Add perl script testing support
perf script: Allow the generated script to be a path
perf test: perf data --to-ctf testing
perf test: Test pipe mode with data conversion --to-json
perf json: Pipe mode --to-ctf support
perf json: Pipe mode --to-json support
perf check: Add libbabeltrace to the listed features
perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
tools build: Fix feature test for rust compiler
perf libunwind: Fix calls to thread__e_machine()
perf stat: Add no-affinity flag
perf evlist: Reduce affinity use and move into iterator, fix no affinity
perf evlist: Missing TPEBS close in evlist__close()
perf evlist: Special map propagation for tool events that read on 1 CPU
perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
Revert "perf tool_pmu: More accurately set the cpus for tool events"
tools build: Emit dependencies file for test-rust.bin
tools build: Make test-rust.bin be removed by the 'clean' target
...
|
|
In pipe mode the environment may not be fully initialized so be robust
to fields being NULL.
Add default handling of attr events, use the feature events to populate
the ctf writer environment.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|