| author    | Linus Torvalds <torvalds@linux-foundation.org> | 2025-03-31 08:52:33 -0700 |
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-03-31 08:52:33 -0700 |
| commit    | 802f0d58d52e8e34e08718479475ccdff0caffa0 |
| tree      | 305f3be98d12b0c6881a6c59eb92e795e6088e51 /tools/perf/util/python.c |
| parent    | 4e82c87058f45e79eeaa4d5bcc3b38dd3dce7209 |
| parent    | 35d13f841a3d8159ef20d5e32a9ed3faa27875bc |
Merge tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"perf record:
- Introduce latency profiling using scheduler information.
Latency profiling shows the impact on wall-clock time rather than
CPU time. By tracking context switches, it can weight samples and
find which part of the code contributed more to the execution
latency.
The value (period) of each sample is weighted by dividing it by the
number of parallel executions at that moment. The parallelism is
tracked in perf report with sched-switch records. This reduces the
portion of the work that runs in parallel and in turn increases the
portion of serial execution (a small sketch of this weighting follows
the examples below).
For now it's limited to per-process profiling; in other words,
system-wide profiling is not supported. You can add the --latency
option to enable this.
$ perf record --latency -- make -C tools/perf
I've run the above command for the perf build, which internally adds
the -j option to make with the number of CPUs in the system. Normally
it'd show something like below:
$ perf report -F overhead,comm
...
#
# Overhead Command
# ........ ...............
#
78.97% cc1
6.54% python3
4.21% shellcheck
3.28% ld
1.80% as
1.37% cc1plus
0.80% sh
0.62% clang
0.56% gcc
0.44% perl
0.39% make
...
cc1 takes around 80% of the overhead as it's the actual compiler.
However, it runs in parallel, so its contribution to latency may be
less than that. Now perf report will show both overhead and latency
(if --latency was given at record time) like below:
$ perf report -s comm
...
#
# Overhead Latency Command
# ........ ........ ...............
#
78.97% 48.66% cc1
6.54% 25.68% python3
4.21% 0.39% shellcheck
3.28% 13.70% ld
1.80% 2.56% as
1.37% 3.08% cc1plus
0.80% 0.98% sh
0.62% 0.61% clang
0.56% 0.33% gcc
0.44% 1.71% perl
0.39% 0.83% make
...
You can see the latency of cc1 goes down to around 50%, while python3
and ld contribute a lot more than their overhead suggests. You can use
the --latency option in perf report to get the same result ordered by
latency:
$ perf report --latency -s comm
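The weighting described above amounts to dividing each sample's period
by the number of the workload's tasks running at that instant, then
summing per command. Here is a minimal, self-contained sketch of the
idea (the sample data below are made up for illustration; this is not
perf's actual implementation):
from collections import defaultdict

# Made-up samples: (command, period, parallelism at sample time).
samples = [
    ("cc1", 100, 4), ("cc1", 100, 4), ("cc1", 100, 4), ("cc1", 100, 4),
    ("ld", 100, 1),   # the link step runs alone
]

overhead = defaultdict(int)    # classic cpu-time view
latency = defaultdict(float)   # wall-clock (latency) view
for comm, period, parallelism in samples:
    overhead[comm] += period
    latency[comm] += period / parallelism

total_o = sum(overhead.values())
total_l = sum(latency.values())
for comm in overhead:
    print(f"{overhead[comm] / total_o:8.2%} {latency[comm] / total_l:9.2%}  {comm}")
With these numbers cc1 gets 80% of the overhead but only 50% of the
latency, mirroring the effect shown in the reports above.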
perf report:
- As a side effect of the latency profiling work, it adds a new output
field 'latency' and a sort key 'parallelism'. Below is a result from
my system with 64 CPUs. The build was well parallelized but contained
some serial portions (a quick numeric check of the overhead/latency
relationship follows the table below).
$ perf report -s parallelism
...
#
# Overhead Latency Parallelism
# ........ ........ ...........
#
16.95% 1.54% 62
13.38% 1.24% 61
12.50% 70.47% 1
11.81% 1.06% 63
7.59% 0.71% 60
4.33% 12.20% 2
3.41% 0.33% 59
2.05% 0.18% 64
1.75% 1.09% 9
1.64% 1.85% 5
...
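Since every sample in a given parallelism bucket is divided by the same
factor, the Latency column of this view is just the Overhead column
divided by the parallelism value and then renormalized. A quick,
approximate cross-check (only the rows shown above are included, so the
percentages will not match the full report exactly):
# (parallelism, overhead %) taken from the rows shown above
rows = [
    (62, 16.95), (61, 13.38), (1, 12.50), (63, 11.81), (60, 7.59),
    (2, 4.33), (59, 3.41), (64, 2.05), (9, 1.75), (5, 1.64),
]
weighted = [(parallelism, overhead / parallelism) for parallelism, overhead in rows]
total = sum(w for _, w in weighted)
for parallelism, w in weighted:
    print(f"parallelism {parallelism:2d}: latency ~ {100 * w / total:5.1f}%")
The serial bucket (parallelism 1) ends up dominating the latency column
even though its overhead is only 12.50%.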
- Support Fedora mini-debuginfo, which is an LZMA-compressed symbol
table inside the ".gnu_debugdata" ELF section (a rough sketch of
unpacking such a section follows below).
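For context, ".gnu_debugdata" holds an entire xz/LZMA-compressed ELF
image whose symbol table supplements the stripped binary. A rough
Python sketch of unpacking such a section, assuming pyelftools is
available (perf's own support is implemented in C and does not work
this way):
import io
import lzma
import sys

from elftools.elf.elffile import ELFFile  # pip install pyelftools

def mini_debuginfo_symbols(path):
    # Return the symbol names found in a binary's .gnu_debugdata section.
    with open(path, "rb") as f:
        sec = ELFFile(f).get_section_by_name(".gnu_debugdata")
        if sec is None:
            return []  # no mini-debuginfo in this binary
        inner = ELFFile(io.BytesIO(lzma.decompress(sec.data())))
        symtab = inner.get_section_by_name(".symtab")
        return [sym.name for sym in symtab.iter_symbols()] if symtab else []

if __name__ == "__main__":
    for name in mini_debuginfo_symbols(sys.argv[1]):
        print(name)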
perf annotate:
- Add the --code-with-type option to enable data-type profiling with
the usual annotate output.
Instead of focusing on the data structure, it shows the code
annotation together with the data type it accesses when the
instruction refers to a memory location (and the target data type
could be resolved). Currently it only works with --stdio.
$ perf annotate --stdio --code-with-type
...
Percent | Source code & Disassembly of vmlinux for cpu/mem-loads,ldlat=30/pp (18 samples, percent: local period)
----------------------------------------------------------------------------------------------------------------------
: 0 0xffffffff81050610 <__fdget>:
0.00 : ffffffff81050610: callq 0xffffffff81c01b80 <__fentry__> # data-type: (stack operation)
0.00 : ffffffff81050615: pushq %rbp # data-type: (stack operation)
0.00 : ffffffff81050616: movq %rsp, %rbp
0.00 : ffffffff81050619: pushq %r15 # data-type: (stack operation)
0.00 : ffffffff8105061b: pushq %r14 # data-type: (stack operation)
0.00 : ffffffff8105061d: pushq %rbx # data-type: (stack operation)
0.00 : ffffffff8105061e: subq $0x10, %rsp
0.00 : ffffffff81050622: movl %edi, %ebx
0.00 : ffffffff81050624: movq %gs:0x7efc4814(%rip), %rax # 0x14e40 <current_task> # data-type: struct task_struct* +0
0.00 : ffffffff8105062c: movq 0x8d0(%rax), %r14 # data-type: struct task_struct +0x8d0 (files)
0.00 : ffffffff81050633: movl (%r14), %eax # data-type: struct files_struct +0 (count.counter)
0.00 : ffffffff81050636: cmpl $0x1, %eax
0.00 : ffffffff81050639: je 0xffffffff810506a9 <__fdget+0x99>
0.00 : ffffffff8105063b: movq 0x20(%r14), %rcx # data-type: struct files_struct +0x20 (fdt)
0.00 : ffffffff8105063f: movl (%rcx), %eax # data-type: struct fdtable +0 (max_fds)
0.00 : ffffffff81050641: cmpl %ebx, %eax
0.00 : ffffffff81050643: jbe 0xffffffff810506ef <__fdget+0xdf>
0.00 : ffffffff81050649: movl %ebx, %r15d
5.56 : ffffffff8105064c: movq 0x8(%rcx), %rdx # data-type: struct fdtable +0x8 (fd)
...
The "# data-type:" part was added with this change. The first few
entries are not very interesting. But later you can it accesses a
couple of fields in the task_struct, files_struct and fdtable.
perf trace:
- Support syscall tracing for different ABIs. For example, it can
transparently trace system calls for 32-bit applications on a 64-bit
kernel.
- Add the --summary-mode=total option to show a global syscall
summary. The default is 'thread', which shows a per-thread syscall
summary.
Python support:
- Add more interfaces to the 'perf' module to parse events and to
configure, enable or disable the event list properly, so that basic
functionality can be implemented purely in Python. There is example
code for these new interfaces in python/tracepoint.py (a rough usage
sketch also follows at the end of this section).
- Add mypy and pylint support to enable build-time checking. Fix some
code based on the findings from these tools.
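As a rough illustration of how these interfaces fit together (method
names are taken from the python.c changes in this pull, but
python/tracepoint.py in the tree is the authoritative example and the
exact usage may differ):
import perf

cpus = perf.cpu_map()          # all online CPUs
threads = perf.thread_map(-1)  # any thread

# parse_events() can now take optional cpu/thread maps and returns an evlist.
evlist = perf.parse_events("sched:sched_switch", cpus, threads)

evlist.config()                # apply default record options
evlist.open()
evlist.mmap()
evlist.enable()

# ... run or observe the workload here ...

evlist.disable()
for cpu in evlist.all_cpus():  # new: union of all evsel CPU maps
    event = evlist.read_on_cpu(cpu)
    while event:
        print(event)
        event = evlist.read_on_cpu(cpu)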
Internals:
- Introduce the io_dir__readdir() API to make directory traversal
(usually of procfs or sysfs) efficient with a smaller memory
footprint.
JSON vendor events:
- Add events and metrics for ARM Neoverse N3 and V3
- Update events and metrics on various Intel CPUs
- Add/update events for a number of SiFive processors"
* tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (229 commits)
perf bpf-filter: Fix a parsing error with comma
perf report: Fix a memory leak for perf_env on AMD
perf trace: Fix wrong size to bpf_map__update_elem call
perf tools: annotate asm_pure_loop.S
perf python: Fix setup.py mypy errors
perf test: Address attr.py mypy error
perf build: Add pylint build tests
perf build: Add mypy build tests
perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
tools/build: Don't pass test log files to linker
perf bench sched pipe: fix enforced blocking reads in worker_thread
perf tools: Fix is_compat_mode build break in ppc64
perf build: filter all combinations of -flto for libperl
perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
perf trace: Fix evlist memory leak
perf trace: Fix BTF memory leak
perf trace: Make syscall table stable
perf syscalltbl: Mask off ABI type for MIPS system calls
perf build: Remove Makefile.syscalls
...
Diffstat (limited to 'tools/perf/util/python.c')
| -rw-r--r-- | tools/perf/util/python.c | 160 |
1 file changed, 133 insertions, 27 deletions
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index b4bc57859f73..f3c05da25b4a 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -9,10 +9,12 @@
 #include <event-parse.h>
 #endif
 #include <perf/mmap.h>
+#include "callchain.h"
 #include "evlist.h"
 #include "evsel.h"
 #include "event.h"
 #include "print_binary.h"
+#include "record.h"
 #include "strbuf.h"
 #include "thread_map.h"
 #include "trace-event.h"
@@ -20,13 +22,6 @@
 #include "util/sample.h"
 #include <internal/lib.h>
 
-#define _PyUnicode_FromString(arg) \
-  PyUnicode_FromString(arg)
-#define _PyUnicode_FromFormat(...) \
-  PyUnicode_FromFormat(__VA_ARGS__)
-#define _PyLong_FromLong(arg) \
-  PyLong_FromLong(arg)
-
 PyMODINIT_FUNC PyInit_perf(void);
 
 #define member_def(type, member, ptype, help) \
@@ -47,7 +42,7 @@ struct pyrf_event {
 };
 
 #define sample_members \
-    sample_member_def(sample_ip, ip, T_ULONGLONG, "event type"), \
+    sample_member_def(sample_ip, ip, T_ULONGLONG, "event ip"), \
     sample_member_def(sample_pid, pid, T_INT, "event pid"), \
     sample_member_def(sample_tid, tid, T_INT, "event tid"), \
     sample_member_def(sample_time, time, T_ULONGLONG, "event timestamp"), \
@@ -270,6 +265,12 @@ static PyMemberDef pyrf_sample_event__members[] = {
     { .name = NULL, },
 };
 
+static void pyrf_sample_event__delete(struct pyrf_event *pevent)
+{
+    perf_sample__exit(&pevent->sample);
+    Py_TYPE(pevent)->tp_free((PyObject*)pevent);
+}
+
 static PyObject *pyrf_sample_event__repr(const struct pyrf_event *pevent)
 {
     PyObject *ret;
@@ -336,23 +337,14 @@ get_tracepoint_field(struct pyrf_event *pevent, PyObject *attr_name)
 {
     const char *str = _PyUnicode_AsString(PyObject_Str(attr_name));
     struct evsel *evsel = pevent->evsel;
+    struct tep_event *tp_format = evsel__tp_format(evsel);
     struct tep_format_field *field;
 
-    if (!evsel->tp_format) {
-        struct tep_event *tp_format;
-
-        tp_format = trace_event__tp_format_id(evsel->core.attr.config);
-        if (IS_ERR_OR_NULL(tp_format))
-            return NULL;
-
-        evsel->tp_format = tp_format;
-    }
-
-    field = tep_find_any_field(evsel->tp_format, str);
-    if (!field)
+    if (IS_ERR_OR_NULL(tp_format))
         return NULL;
 
-    return tracepoint_field(pevent, field);
+    field = tep_find_any_field(tp_format, str);
+    return field ? tracepoint_field(pevent, field) : NULL;
 }
 
 #endif /* HAVE_LIBTRACEEVENT */
@@ -428,6 +420,9 @@ static int pyrf_event__setup_types(void)
     pyrf_sample_event__type.tp_new =
     pyrf_context_switch_event__type.tp_new =
     pyrf_throttle_event__type.tp_new = PyType_GenericNew;
+
+    pyrf_sample_event__type.tp_dealloc = (destructor)pyrf_sample_event__delete,
+
     err = PyType_Ready(&pyrf_mmap_event__type);
     if (err < 0)
         goto out;
@@ -481,6 +476,11 @@ static PyObject *pyrf_event__new(const union perf_event *event)
           event->header.type == PERF_RECORD_SWITCH_CPU_WIDE))
         return NULL;
 
+    // FIXME this better be dynamic or we need to parse everything
+    // before calling perf_mmap__consume(), including tracepoint fields.
+    if (sizeof(pevent->event) < event->header.size)
+        return NULL;
+
     ptype = pyrf_event__type[event->header.type];
     pevent = PyObject_New(struct pyrf_event, ptype);
     if (pevent != NULL)
@@ -802,6 +802,28 @@ static PyMethodDef pyrf_evsel__methods[] = {
     { .ml_name = NULL, }
 };
 
+#define evsel_member_def(member, ptype, help) \
+    { #member, ptype, \
+      offsetof(struct pyrf_evsel, evsel.member), \
+      0, help }
+
+#define evsel_attr_member_def(member, ptype, help) \
+    { #member, ptype, \
+      offsetof(struct pyrf_evsel, evsel.core.attr.member), \
+      0, help }
+
+static PyMemberDef pyrf_evsel__members[] = {
+    evsel_member_def(tracking, T_BOOL, "tracking event."),
+    evsel_attr_member_def(type, T_UINT, "attribute type."),
+    evsel_attr_member_def(size, T_UINT, "attribute size."),
+    evsel_attr_member_def(config, T_ULONGLONG, "attribute config."),
+    evsel_attr_member_def(sample_period, T_ULONGLONG, "attribute sample_period."),
+    evsel_attr_member_def(sample_type, T_ULONGLONG, "attribute sample_type."),
+    evsel_attr_member_def(read_format, T_ULONGLONG, "attribute read_format."),
+    evsel_attr_member_def(wakeup_events, T_UINT, "attribute wakeup_events."),
+    { .name = NULL, },
+};
+
 static const char pyrf_evsel__doc[] = PyDoc_STR("perf event selector list object.");
 
 static PyTypeObject pyrf_evsel__type = {
@@ -811,6 +833,7 @@ static PyTypeObject pyrf_evsel__type = {
     .tp_dealloc = (destructor)pyrf_evsel__delete,
     .tp_flags   = Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE,
     .tp_doc     = pyrf_evsel__doc,
+    .tp_members = pyrf_evsel__members,
     .tp_methods = pyrf_evsel__methods,
     .tp_init    = (initproc)pyrf_evsel__init,
     .tp_str     = pyrf_evsel__str,
@@ -851,6 +874,16 @@ static void pyrf_evlist__delete(struct pyrf_evlist *pevlist)
     Py_TYPE(pevlist)->tp_free((PyObject*)pevlist);
 }
 
+static PyObject *pyrf_evlist__all_cpus(struct pyrf_evlist *pevlist)
+{
+    struct pyrf_cpu_map *pcpu_map = PyObject_New(struct pyrf_cpu_map, &pyrf_cpu_map__type);
+
+    if (pcpu_map)
+        pcpu_map->cpus = perf_cpu_map__get(pevlist->evlist.core.all_cpus);
+
+    return (PyObject *)pcpu_map;
+}
+
 static PyObject *pyrf_evlist__mmap(struct pyrf_evlist *pevlist,
                                    PyObject *args, PyObject *kwargs)
 {
@@ -984,20 +1017,22 @@ static PyObject *pyrf_evlist__read_on_cpu(struct pyrf_evlist *pevlist,
         evsel = evlist__event2evsel(evlist, event);
         if (!evsel) {
+            Py_DECREF(pyevent);
             Py_INCREF(Py_None);
             return Py_None;
         }
 
         pevent->evsel = evsel;
 
-        err = evsel__parse_sample(evsel, event, &pevent->sample);
-
-        /* Consume the even only after we parsed it out. */
         perf_mmap__consume(&md->core);
 
-        if (err)
+        err = evsel__parse_sample(evsel, &pevent->event, &pevent->sample);
+        if (err) {
+            Py_DECREF(pyevent);
             return PyErr_Format(PyExc_OSError,
                                 "perf: can't parse sample, err=%d", err);
+        }
+
         return pyevent;
     }
 end:
@@ -1019,8 +1054,53 @@ static PyObject *pyrf_evlist__open(struct pyrf_evlist *pevlist,
     return Py_None;
 }
 
+static PyObject *pyrf_evlist__config(struct pyrf_evlist *pevlist)
+{
+    struct record_opts opts = {
+        .sample_time = true,
+        .mmap_pages = UINT_MAX,
+        .user_freq = UINT_MAX,
+        .user_interval = ULLONG_MAX,
+        .freq = 4000,
+        .target = {
+            .uses_mmap = true,
+            .default_per_cpu = true,
+        },
+        .nr_threads_synthesize = 1,
+        .ctl_fd = -1,
+        .ctl_fd_ack = -1,
+        .no_buffering = true,
+        .no_inherit = true,
+    };
+    struct evlist *evlist = &pevlist->evlist;
+
+    evlist__config(evlist, &opts, &callchain_param);
+    Py_INCREF(Py_None);
+    return Py_None;
+}
+
+static PyObject *pyrf_evlist__disable(struct pyrf_evlist *pevlist)
+{
+    evlist__disable(&pevlist->evlist);
+    Py_INCREF(Py_None);
+    return Py_None;
+}
+
+static PyObject *pyrf_evlist__enable(struct pyrf_evlist *pevlist)
+{
+    evlist__enable(&pevlist->evlist);
+    Py_INCREF(Py_None);
+    return Py_None;
+}
+
 static PyMethodDef pyrf_evlist__methods[] = {
     {
+        .ml_name  = "all_cpus",
+        .ml_meth  = (PyCFunction)pyrf_evlist__all_cpus,
+        .ml_flags = METH_NOARGS,
+        .ml_doc   = PyDoc_STR("CPU map union of all evsel CPU maps.")
+    },
+    {
         .ml_name  = "mmap",
         .ml_meth  = (PyCFunction)pyrf_evlist__mmap,
         .ml_flags = METH_VARARGS | METH_KEYWORDS,
@@ -1056,6 +1136,24 @@ static PyMethodDef pyrf_evlist__methods[] = {
         .ml_flags = METH_VARARGS | METH_KEYWORDS,
         .ml_doc   = PyDoc_STR("reads an event.")
     },
+    {
+        .ml_name  = "config",
+        .ml_meth  = (PyCFunction)pyrf_evlist__config,
+        .ml_flags = METH_NOARGS,
+        .ml_doc   = PyDoc_STR("Apply default record options to the evlist.")
+    },
+    {
+        .ml_name  = "disable",
+        .ml_meth  = (PyCFunction)pyrf_evlist__disable,
+        .ml_flags = METH_NOARGS,
+        .ml_doc   = PyDoc_STR("Disable the evsels in the evlist.")
+    },
+    {
+        .ml_name  = "enable",
+        .ml_meth  = (PyCFunction)pyrf_evlist__enable,
+        .ml_flags = METH_NOARGS,
+        .ml_doc   = PyDoc_STR("Enable the evsels in the evlist.")
+    },
     { .ml_name = NULL, }
 };
@@ -1254,6 +1352,8 @@ static PyObject *pyrf_evsel__from_evsel(struct evsel *evsel)
     evsel__init(&pevsel->evsel, &evsel->core.attr, evsel->core.idx);
 
     evsel__clone(&pevsel->evsel, evsel);
+    if (evsel__is_group_leader(evsel))
+        evsel__set_leader(&pevsel->evsel, &pevsel->evsel);
 
     return (PyObject *)pevsel;
 }
@@ -1281,12 +1381,18 @@ static PyObject *pyrf__parse_events(PyObject *self, PyObject *args)
     struct evlist evlist = {};
     struct parse_events_error err;
     PyObject *result;
+    PyObject *pcpus = NULL, *pthreads = NULL;
+    struct perf_cpu_map *cpus;
+    struct perf_thread_map *threads;
 
-    if (!PyArg_ParseTuple(args, "s", &input))
+    if (!PyArg_ParseTuple(args, "s|OO", &input, &pcpus, &pthreads))
         return NULL;
 
+    threads = pthreads ? ((struct pyrf_thread_map *)pthreads)->threads : NULL;
+    cpus = pcpus ? ((struct pyrf_cpu_map *)pcpus)->cpus : NULL;
+
     parse_events_error__init(&err);
-    evlist__init(&evlist, NULL, NULL);
+    evlist__init(&evlist, cpus, threads);
     if (parse_events(&evlist, input, &err)) {
         parse_events_error__print(&err, input);
         PyErr_SetFromErrno(PyExc_OSError);
