<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/tools/perf/util/util.c, branch v6.3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>perf tools: Add reallocarray_as_needed()</title>
<updated>2022-07-20T14:08:37+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-07-11T09:32:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10d34700223b14ed94a6399cf026142956f87965'/>
<id>10d34700223b14ed94a6399cf026142956f87965</id>
<content type='text'>
Add helper reallocarray_as_needed() to reallocate an array to a larger
size and initialize the extra entries to an arbitrary value.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add helper reallocarray_as_needed() to reallocate an array to a larger
size and initialize the extra entries to an arbitrary value.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Remove also guest kcore_dir with host kcore_dir</title>
<updated>2022-07-20T14:08:30+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-07-11T09:32:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=386e0d83d351a4461525b06b845ffcd58235381e'/>
<id>386e0d83d351a4461525b06b845ffcd58235381e</id>
<content type='text'>
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Remove these also when
removing kcore_dir.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Remove these also when
removing kcore_dir.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Print warning when HAVE_DEBUGINFOD_SUPPORT is not set and user tries to use debuginfod support</title>
<updated>2022-04-20T16:36:36+00:00</updated>
<author>
<name>Martin Liška</name>
<email>mliska@suse.cz</email>
</author>
<published>2022-04-20T13:32:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c60664dea70a76cdffb4c4c21b2a09153b41a950'/>
<id>c60664dea70a76cdffb4c4c21b2a09153b41a950</id>
<content type='text'>
When one requests debuginfod, either via --debuginfod option, or with a
perf-config value, complain when perf is not built with it.

Signed-off-by: Martin Liška &lt;mliska@suse.cz&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When one requests debuginfod, either via --debuginfod option, or with a
perf-config value, complain when perf is not built with it.

Signed-off-by: Martin Liška &lt;mliska@suse.cz&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Try chroot'ed filename when opening dso/symbol</title>
<updated>2022-02-10T18:32:25+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-02-02T07:08:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=67fd18924647bdcafab18c2945a79ddea73653de'/>
<id>67fd18924647bdcafab18c2945a79ddea73653de</id>
<content type='text'>
Currently it doesn't handle tasks in chroot properly.  As filenames in
MMAP records base on their root directory, it's different than what perf
tool can see from outside.

Add filename_with_chroot() helper to deal with those cases.  The
function returns a new filename only if it's in a different root
directory.  Since it needs to access /proc for the process, it only
works until the task exits.

With this change, I can see symbols in my program like below.

  # perf record -o- chroot myroot myprog 3 | perf report -i-
  ...
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .............................
  #
      99.83%  myprog   myprog             [.] loop
       0.04%  chroot   [kernel.kallsyms]  [k] fxregs_fixup
       0.04%  chroot   [kernel.kallsyms]  [k] rsm_load_seg_32
  ...

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20220202070828.143303-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently it doesn't handle tasks in chroot properly.  As filenames in
MMAP records base on their root directory, it's different than what perf
tool can see from outside.

Add filename_with_chroot() helper to deal with those cases.  The
function returns a new filename only if it's in a different root
directory.  Since it needs to access /proc for the process, it only
works until the task exits.

With this change, I can see symbols in my program like below.

  # perf record -o- chroot myroot myprog 3 | perf report -i-
  ...
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .............................
  #
      99.83%  myprog   myprog             [.] loop
       0.04%  chroot   [kernel.kallsyms]  [k] fxregs_fixup
       0.04%  chroot   [kernel.kallsyms]  [k] rsm_load_seg_32
  ...

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20220202070828.143303-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Disable debuginfod by default</title>
<updated>2022-01-15T20:41:25+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2021-12-09T20:04:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9bce13ea88f85344b765abe5d3dabdd0f44dc177'/>
<id>9bce13ea88f85344b765abe5d3dabdd0f44dc177</id>
<content type='text'>
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.

  # DEBUGINFOD_PROGRESS=1 ./perf record -a
  ^C[ perf record: Woken up 1 times to write data ]
  Downloading from https://debuginfod.fedoraproject.org/ 447069
  Downloading from https://debuginfod.fedoraproject.org/ 1502175
  Downloading \^Z

Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable id.

  Default without debuginfo processing:
  # perf record -a

  Using system debuginfod setup:
  # perf record -a --debuginfod

  Using custom debuginfd url:
  # perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'

Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.

  # DEBUGINFOD_PROGRESS=1 ./perf record -a
  ^C[ perf record: Woken up 1 times to write data ]
  Downloading from https://debuginfod.fedoraproject.org/ 447069
  Downloading from https://debuginfod.fedoraproject.org/ 1502175
  Downloading \^Z

Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable id.

  Default without debuginfo processing:
  # perf record -a

  Using system debuginfod setup:
  # perf record -a --debuginfod

  Using custom debuginfd url:
  # perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'

Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf report: Fix memory leaks around perf_tip()</title>
<updated>2021-11-18T13:18:03+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2021-11-18T07:38:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d9fc706108c15f8bc2d4ccccf8e50f74830fabd9'/>
<id>d9fc706108c15f8bc2d4ccccf8e50f74830fabd9</id>
<content type='text'>
perf_tip() may allocate memory or use a literal, this means memory
wasn't freed if allocated. Change the API so that literals aren't used.

At the same time add missing frees for system_path. These issues were
spotted using leak sanitizer.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lore.kernel.org/lkml/20211118073804.2149974-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_tip() may allocate memory or use a literal, this means memory
wasn't freed if allocated. Change the API so that literals aren't used.

At the same time add missing frees for system_path. These issues were
spotted using leak sanitizer.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lore.kernel.org/lkml/20211118073804.2149974-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Support CAP_PERFMON capability</title>
<updated>2020-04-16T15:19:08+00:00</updated>
<author>
<name>Alexey Budankov</name>
<email>alexey.budankov@linux.intel.com</email>
</author>
<published>2020-04-02T08:47:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6b3e0e2e04615df128b2d38fa1dd1fcb84f2504c'/>
<id>6b3e0e2e04615df128b2d38fa1dd1fcb84f2504c</id>
<content type='text'>
Extend error messages to mention CAP_PERFMON capability as an option to
substitute CAP_SYS_ADMIN capability for secure system performance
monitoring and observability operations. Make
perf_event_paranoid_check() and __cmd_ftrace() to be aware of
CAP_PERFMON capability.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to perf_events subsystem remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure perf_events monitoring is discouraged with respect to CAP_PERFMON
capability.

Committer testing:

Using a libcap with this patch:

  diff --git a/libcap/include/uapi/linux/capability.h b/libcap/include/uapi/linux/capability.h
  index 78b2fd4c8a95..89b5b0279b60 100644
  --- a/libcap/include/uapi/linux/capability.h
  +++ b/libcap/include/uapi/linux/capability.h
  @@ -366,8 +366,9 @@ struct vfs_ns_cap_data {

   #define CAP_AUDIT_READ       37

  +#define CAP_PERFMON	     38

  -#define CAP_LAST_CAP         CAP_AUDIT_READ
  +#define CAP_LAST_CAP         CAP_PERFMON

   #define cap_valid(x) ((x) &gt;= 0 &amp;&amp; (x) &lt;= CAP_LAST_CAP)

Note that using '38' in place of 'cap_perfmon' works to some degree with
an old libcap, its only when cap_get_flag() is called that libcap
performs an error check based on the maximum value known for
capabilities that it will fail.

This makes determining the default of perf_event_attr.exclude_kernel to
fail, as it can't determine if CAP_PERFMON is in place.

Using 'perf top -e cycles' avoids the default check and sets
perf_event_attr.exclude_kernel to 1.

As root, with a libcap supporting CAP_PERFMON:

  # groupadd perf_users
  # adduser perf -g perf_users
  # mkdir ~perf/bin
  # cp ~acme/bin/perf ~perf/bin/
  # chgrp perf_users ~perf/bin/perf
  # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" ~perf/bin/perf
  # getcap ~perf/bin/perf
  /home/perf/bin/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
  # ls -la ~perf/bin/perf
  -rwxr-xr-x. 1 root perf_users 16968552 Apr  9 13:10 /home/perf/bin/perf

As the 'perf' user in the 'perf_users' group:

  $ perf top -a --stdio
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $

Either add the cap_ipc_lock capability to the perf binary or reduce the
ring buffer size to some smaller value:

  $ perf top -m10 -a --stdio
  rounding mmap pages size to 64K (16 pages)
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $ perf top -m4 -a --stdio
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $ perf top -m2 -a --stdio
   PerfTop: 762 irqs/sec  kernel:49.7%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 4 CPUs)
  ------------------------------------------------------------------------------------------------------

     9.83%  perf                [.] __symbols__insert
     8.58%  perf                [.] rb_next
     5.91%  [kernel]            [k] module_get_kallsym
     5.66%  [kernel]            [k] kallsyms_expand_symbol.constprop.0
     3.98%  libc-2.29.so        [.] __GI_____strtoull_l_internal
     3.66%  perf                [.] rb_insert_color
     2.34%  [kernel]            [k] vsnprintf
     2.30%  [kernel]            [k] string_nocheck
     2.16%  libc-2.29.so        [.] _IO_getdelim
     2.15%  [kernel]            [k] number
     2.13%  [kernel]            [k] format_decode
     1.58%  libc-2.29.so        [.] _IO_feof
     1.52%  libc-2.29.so        [.] __strcmp_avx2
     1.50%  perf                [.] rb_set_parent_color
     1.47%  libc-2.29.so        [.] __libc_calloc
     1.24%  [kernel]            [k] do_syscall_64
     1.17%  [kernel]            [k] __x86_indirect_thunk_rax

  $ perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.552 MB perf.data (74 samples) ]
  $ perf evlist
  cycles
  $ perf evlist -v
  cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  $ perf report | head -20
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 74  of event 'cycles'
  # Event count (approx.): 15694834
  #
  # Overhead  Command          Shared Object               Symbol
  # ........  ...............  ..........................  ......................................
  #
      19.62%  perf             [kernel.vmlinux]            [k] strnlen_user
      13.88%  swapper          [kernel.vmlinux]            [k] intel_idle
      13.83%  ksoftirqd/0      [kernel.vmlinux]            [k] pfifo_fast_dequeue
      13.51%  swapper          [kernel.vmlinux]            [k] kmem_cache_free
       6.31%  gnome-shell      [kernel.vmlinux]            [k] kmem_cache_free
       5.66%  kworker/u8:3+ix  [kernel.vmlinux]            [k] delay_tsc
       4.42%  perf             [kernel.vmlinux]            [k] __set_cpus_allowed_ptr
       3.45%  kworker/2:1-eve  [kernel.vmlinux]            [k] shmem_truncate_range
       2.29%  gnome-shell      libgobject-2.0.so.0.6000.7  [.] g_closure_ref
  $

Signed-off-by: Alexey Budankov &lt;alexey.budankov@linux.intel.com&gt;
Reviewed-by: James Morris &lt;jamorris@linux.microsoft.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Igor Lubashev &lt;ilubashe@akamai.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/a66d5648-2b8e-577e-e1f2-1d56c017ab5e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extend error messages to mention CAP_PERFMON capability as an option to
substitute CAP_SYS_ADMIN capability for secure system performance
monitoring and observability operations. Make
perf_event_paranoid_check() and __cmd_ftrace() to be aware of
CAP_PERFMON capability.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to perf_events subsystem remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure perf_events monitoring is discouraged with respect to CAP_PERFMON
capability.

Committer testing:

Using a libcap with this patch:

  diff --git a/libcap/include/uapi/linux/capability.h b/libcap/include/uapi/linux/capability.h
  index 78b2fd4c8a95..89b5b0279b60 100644
  --- a/libcap/include/uapi/linux/capability.h
  +++ b/libcap/include/uapi/linux/capability.h
  @@ -366,8 +366,9 @@ struct vfs_ns_cap_data {

   #define CAP_AUDIT_READ       37

  +#define CAP_PERFMON	     38

  -#define CAP_LAST_CAP         CAP_AUDIT_READ
  +#define CAP_LAST_CAP         CAP_PERFMON

   #define cap_valid(x) ((x) &gt;= 0 &amp;&amp; (x) &lt;= CAP_LAST_CAP)

Note that using '38' in place of 'cap_perfmon' works to some degree with
an old libcap, its only when cap_get_flag() is called that libcap
performs an error check based on the maximum value known for
capabilities that it will fail.

This makes determining the default of perf_event_attr.exclude_kernel to
fail, as it can't determine if CAP_PERFMON is in place.

Using 'perf top -e cycles' avoids the default check and sets
perf_event_attr.exclude_kernel to 1.

As root, with a libcap supporting CAP_PERFMON:

  # groupadd perf_users
  # adduser perf -g perf_users
  # mkdir ~perf/bin
  # cp ~acme/bin/perf ~perf/bin/
  # chgrp perf_users ~perf/bin/perf
  # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" ~perf/bin/perf
  # getcap ~perf/bin/perf
  /home/perf/bin/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
  # ls -la ~perf/bin/perf
  -rwxr-xr-x. 1 root perf_users 16968552 Apr  9 13:10 /home/perf/bin/perf

As the 'perf' user in the 'perf_users' group:

  $ perf top -a --stdio
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $

Either add the cap_ipc_lock capability to the perf binary or reduce the
ring buffer size to some smaller value:

  $ perf top -m10 -a --stdio
  rounding mmap pages size to 64K (16 pages)
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $ perf top -m4 -a --stdio
  Error:
  Failed to mmap with 1 (Operation not permitted)
  $ perf top -m2 -a --stdio
   PerfTop: 762 irqs/sec  kernel:49.7%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 4 CPUs)
  ------------------------------------------------------------------------------------------------------

     9.83%  perf                [.] __symbols__insert
     8.58%  perf                [.] rb_next
     5.91%  [kernel]            [k] module_get_kallsym
     5.66%  [kernel]            [k] kallsyms_expand_symbol.constprop.0
     3.98%  libc-2.29.so        [.] __GI_____strtoull_l_internal
     3.66%  perf                [.] rb_insert_color
     2.34%  [kernel]            [k] vsnprintf
     2.30%  [kernel]            [k] string_nocheck
     2.16%  libc-2.29.so        [.] _IO_getdelim
     2.15%  [kernel]            [k] number
     2.13%  [kernel]            [k] format_decode
     1.58%  libc-2.29.so        [.] _IO_feof
     1.52%  libc-2.29.so        [.] __strcmp_avx2
     1.50%  perf                [.] rb_set_parent_color
     1.47%  libc-2.29.so        [.] __libc_calloc
     1.24%  [kernel]            [k] do_syscall_64
     1.17%  [kernel]            [k] __x86_indirect_thunk_rax

  $ perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.552 MB perf.data (74 samples) ]
  $ perf evlist
  cycles
  $ perf evlist -v
  cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  $ perf report | head -20
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 74  of event 'cycles'
  # Event count (approx.): 15694834
  #
  # Overhead  Command          Shared Object               Symbol
  # ........  ...............  ..........................  ......................................
  #
      19.62%  perf             [kernel.vmlinux]            [k] strnlen_user
      13.88%  swapper          [kernel.vmlinux]            [k] intel_idle
      13.83%  ksoftirqd/0      [kernel.vmlinux]            [k] pfifo_fast_dequeue
      13.51%  swapper          [kernel.vmlinux]            [k] kmem_cache_free
       6.31%  gnome-shell      [kernel.vmlinux]            [k] kmem_cache_free
       5.66%  kworker/u8:3+ix  [kernel.vmlinux]            [k] delay_tsc
       4.42%  perf             [kernel.vmlinux]            [k] __set_cpus_allowed_ptr
       3.45%  kworker/2:1-eve  [kernel.vmlinux]            [k] shmem_truncate_range
       2.29%  gnome-shell      libgobject-2.0.so.0.6000.7  [.] g_closure_ref
  $

Signed-off-by: Alexey Budankov &lt;alexey.budankov@linux.intel.com&gt;
Reviewed-by: James Morris &lt;jamorris@linux.microsoft.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Igor Lubashev &lt;ilubashe@akamai.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/a66d5648-2b8e-577e-e1f2-1d56c017ab5e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf util: Factor out sysctl__nmi_watchdog_enabled()</title>
<updated>2020-03-10T17:46:19+00:00</updated>
<author>
<name>Kan Liang</name>
<email>kan.liang@linux.intel.com</email>
</author>
<published>2020-02-24T21:59:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2a14c1bf017f48a17c8c0ba26a22625363e77cc7'/>
<id>2a14c1bf017f48a17c8c0ba26a22625363e77cc7</id>
<content type='text'>
The NMI watchdog status is required for metric group constraint
examination.  Factor out sysctl__nmi_watchdog_enabled() to retrieve the
NMI watchdog status.

Users may count more than one metric group each time. If so, the NMI
watchdog status may be retrieved several times. To reduce the overhead,
cache the NMI watchdog status.

Replace the NMI watchdog status checking in print_footer() by
sysctl__nmi_watchdog_enabled().

Suggested-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Link: http://lore.kernel.org/lkml/1582581564-184429-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The NMI watchdog status is required for metric group constraint
examination.  Factor out sysctl__nmi_watchdog_enabled() to retrieve the
NMI watchdog status.

Users may count more than one metric group each time. If so, the NMI
watchdog status may be retrieved several times. To reduce the overhead,
cache the NMI watchdog status.

Replace the NMI watchdog status checking in print_footer() by
sysctl__nmi_watchdog_enabled().

Suggested-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Link: http://lore.kernel.org/lkml/1582581564-184429-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Put a copy of kcore into the perf.data directory</title>
<updated>2019-11-06T18:43:05+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-10-04T08:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eeb399b531a1576e36016f8a7f0c50d10194e190'/>
<id>eeb399b531a1576e36016f8a7f0c50d10194e190</id>
<content type='text'>
Add a new 'perf record' option '--kcore' which will put a copy of
/proc/kcore, kallsyms and modules into a perf.data directory. Note, that
without the --kcore option, output goes to a file as previously.  The
tools' -o and -i options work with either a file name or directory name.

Example:

  $ sudo perf record --kcore uname

  $ sudo tree perf.data
  perf.data
  ├── kcore_dir
  │   ├── kallsyms
  │   ├── kcore
  │   └── modules
  └── data

  $ sudo perf script -v
  build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
  build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
  Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
  Using CPUID GenuineIntel-6-8E-A
  Using perf.data/kcore_dir/kcore for kernel data
  Using perf.data/kcore_dir/kallsyms for symbols
             perf 19058 506778.423729:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423733:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423734:          7 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423736:        117 cycles:  ffffffffa2caa54a native_write_msr+0xa (vmlinux)
             perf 19058 506778.423738:       2092 cycles:  ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
             perf 19058 506778.423740:      37380 cycles:  ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux)
            uname 19058 506778.423751:     582673 cycles:  ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux)
            uname 19058 506778.423892:    2241841 cycles:  ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux)
            uname 19058 506778.424430:    2457397 cycles:  ffffffffa3019232 check_memory_region+0x52 (vmlinux)

Committer testing:

  # rm -rf perf.data*
  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  # ls -l perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data
  # perf record --kcore uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  ls[root@quaco ~]# ls -lad perf.data*
  drwx------. 3 root root  4096 Oct 21 11:08 perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # perf evlist -v -i perf.data/data
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new 'perf record' option '--kcore' which will put a copy of
/proc/kcore, kallsyms and modules into a perf.data directory. Note, that
without the --kcore option, output goes to a file as previously.  The
tools' -o and -i options work with either a file name or directory name.

Example:

  $ sudo perf record --kcore uname

  $ sudo tree perf.data
  perf.data
  ├── kcore_dir
  │   ├── kallsyms
  │   ├── kcore
  │   └── modules
  └── data

  $ sudo perf script -v
  build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
  build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
  Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
  Using CPUID GenuineIntel-6-8E-A
  Using perf.data/kcore_dir/kcore for kernel data
  Using perf.data/kcore_dir/kallsyms for symbols
             perf 19058 506778.423729:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423733:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423734:          7 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423736:        117 cycles:  ffffffffa2caa54a native_write_msr+0xa (vmlinux)
             perf 19058 506778.423738:       2092 cycles:  ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
             perf 19058 506778.423740:      37380 cycles:  ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux)
            uname 19058 506778.423751:     582673 cycles:  ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux)
            uname 19058 506778.423892:    2241841 cycles:  ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux)
            uname 19058 506778.424430:    2457397 cycles:  ffffffffa3019232 check_memory_region+0x52 (vmlinux)

Committer testing:

  # rm -rf perf.data*
  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  # ls -l perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data
  # perf record --kcore uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  ls[root@quaco ~]# ls -lad perf.data*
  drwx------. 3 root root  4096 Oct 21 11:08 perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # perf evlist -v -i perf.data/data
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf data: Rename directory "header" file to "data"</title>
<updated>2019-11-06T18:43:05+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-10-04T08:31:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9b70b9db4e0cc03d224795a18088fdb916dec823'/>
<id>9b70b9db4e0cc03d224795a18088fdb916dec823</id>
<content type='text'>
In preparation to support a single file directory format, rename "header"
to "data" because "header" is a mis-leading name when there is only 1 file.
Note, in the multi-file case, the "header" file also contains data.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation to support a single file directory format, rename "header"
to "data" because "header" is a mis-leading name when there is only 1 file.
Note, in the multi-file case, the "header" file also contains data.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
