<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/trace/trace_syscalls.c, branch v6.13-rc7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>tracing/perf: Add might_fault check to syscall probes</title>
<updated>2024-10-09T21:09:46+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2024-10-09T01:07:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cdb537ac417938408ee819992f432c410f2d01a2'/>
<id>cdb537ac417938408ee819992f432c410f2d01a2</id>
<content type='text'>
Add a might_fault() check to validate that the perf sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-8-mathieu.desnoyers@efficios.com
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a might_fault() check to validate that the perf sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-8-mathieu.desnoyers@efficios.com
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/ftrace: Add might_fault check to syscall probes</title>
<updated>2024-10-09T21:09:36+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2024-10-09T01:07:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a3204c740a59bebb3b37a294d83f4e353303a52c'/>
<id>a3204c740a59bebb3b37a294d83f4e353303a52c</id>
<content type='text'>
Add a might_fault() check to validate that the ftrace sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-7-mathieu.desnoyers@efficios.com
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a might_fault() check to validate that the ftrace sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-7-mathieu.desnoyers@efficios.com
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/perf: disable preemption in syscall probe</title>
<updated>2024-10-09T21:07:25+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2024-10-09T01:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=65e7462a16cea593025ca3b34c5d74e69b027ee0'/>
<id>65e7462a16cea593025ca3b34c5d74e69b027ee0</id>
<content type='text'>
In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that perf can handle this change by
explicitly disabling preemption within the perf system call tracepoint
probes to respect the current expectations within perf ring buffer code.

This change does not yet allow perf to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-4-mathieu.desnoyers@efficios.com
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that perf can handle this change by
explicitly disabling preemption within the perf system call tracepoint
probes to respect the current expectations within perf ring buffer code.

This change does not yet allow perf to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-4-mathieu.desnoyers@efficios.com
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/ftrace: disable preemption in syscall probe</title>
<updated>2024-10-09T21:07:00+00:00</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2024-10-09T01:07:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13d750c2c03e9861e15268574ed2c239cca9c9d5'/>
<id>13d750c2c03e9861e15268574ed2c239cca9c9d5</id>
<content type='text'>
In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that ftrace can handle this change by
explicitly disabling preemption within the ftrace system call tracepoint
probes to respect the current expectations within ftrace ring buffer
code.

This change does not yet allow ftrace to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-3-mathieu.desnoyers@efficios.com
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that ftrace can handle this change by
explicitly disabling preemption within the ftrace system call tracepoint
probes to respect the current expectations within ftrace ring buffer
code.

This change does not yet allow ftrace to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.

Cc: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/20241009010718.2050182-3-mathieu.desnoyers@efficios.com
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Use fake pt_regs when doing bpf syscall tracepoint tracing</title>
<updated>2024-09-11T20:27:27+00:00</updated>
<author>
<name>Yonghong Song</name>
<email>yonghong.song@linux.dev</email>
</author>
<published>2024-09-10T21:40:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=376bd59e2a0404b09767cc991cf5aed394cf0cf2'/>
<id>376bd59e2a0404b09767cc991cf5aed394cf0cf2</id>
<content type='text'>
Salvatore Benedetto reported an issue that when doing syscall tracepoint
tracing the kernel stack is empty. For example, using the following
command line
  bpftrace -e 'tracepoint:syscalls:sys_enter_read { print("Kernel Stack\n"); print(kstack()); }'
  bpftrace -e 'tracepoint:syscalls:sys_exit_read { print("Kernel Stack\n"); print(kstack()); }'
the output for both commands is
===
  Kernel Stack
===

Further analysis shows that pt_regs used for bpf syscall tracepoint
tracing is from the one constructed during user-&gt;kernel transition.
The call stack looks like
  perf_syscall_enter+0x88/0x7c0
  trace_sys_enter+0x41/0x80
  syscall_trace_enter+0x100/0x160
  do_syscall_64+0x38/0xf0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

The ip address stored in pt_regs is from user space hence no kernel
stack is printed.

To fix the issue, kernel address from pt_regs is required.
In kernel repo, there are already a few cases like this. For example,
in kernel/trace/bpf_trace.c, several perf_fetch_caller_regs(fake_regs_ptr)
instances are used to supply ip address or use ip address to construct
call stack.

Instead of allocate fake_regs in the stack which may consume
a lot of bytes, the function perf_trace_buf_alloc() in
perf_syscall_{enter, exit}() is leveraged to create fake_regs,
which will be passed to perf_call_bpf_{enter,exit}().

For the above bpftrace script, I got the following output with this patch:
for tracepoint:syscalls:sys_enter_read
===
  Kernel Stack

        syscall_trace_enter+407
        syscall_trace_enter+407
        do_syscall_64+74
        entry_SYSCALL_64_after_hwframe+75
===
and for tracepoint:syscalls:sys_exit_read
===
Kernel Stack

        syscall_exit_work+185
        syscall_exit_work+185
        syscall_exit_to_user_mode+305
        do_syscall_64+118
        entry_SYSCALL_64_after_hwframe+75
===

Reported-by: Salvatore Benedetto &lt;salvabenedetto@meta.com&gt;
Suggested-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240910214037.3663272-1-yonghong.song@linux.dev
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Salvatore Benedetto reported an issue that when doing syscall tracepoint
tracing the kernel stack is empty. For example, using the following
command line
  bpftrace -e 'tracepoint:syscalls:sys_enter_read { print("Kernel Stack\n"); print(kstack()); }'
  bpftrace -e 'tracepoint:syscalls:sys_exit_read { print("Kernel Stack\n"); print(kstack()); }'
the output for both commands is
===
  Kernel Stack
===

Further analysis shows that pt_regs used for bpf syscall tracepoint
tracing is from the one constructed during user-&gt;kernel transition.
The call stack looks like
  perf_syscall_enter+0x88/0x7c0
  trace_sys_enter+0x41/0x80
  syscall_trace_enter+0x100/0x160
  do_syscall_64+0x38/0xf0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

The ip address stored in pt_regs is from user space hence no kernel
stack is printed.

To fix the issue, kernel address from pt_regs is required.
In kernel repo, there are already a few cases like this. For example,
in kernel/trace/bpf_trace.c, several perf_fetch_caller_regs(fake_regs_ptr)
instances are used to supply ip address or use ip address to construct
call stack.

Instead of allocate fake_regs in the stack which may consume
a lot of bytes, the function perf_trace_buf_alloc() in
perf_syscall_{enter, exit}() is leveraged to create fake_regs,
which will be passed to perf_call_bpf_{enter,exit}().

For the above bpftrace script, I got the following output with this patch:
for tracepoint:syscalls:sys_enter_read
===
  Kernel Stack

        syscall_trace_enter+407
        syscall_trace_enter+407
        do_syscall_64+74
        entry_SYSCALL_64_after_hwframe+75
===
and for tracepoint:syscalls:sys_exit_read
===
Kernel Stack

        syscall_exit_work+185
        syscall_exit_work+185
        syscall_exit_to_user_mode+305
        do_syscall_64+118
        entry_SYSCALL_64_after_hwframe+75
===

Reported-by: Salvatore Benedetto &lt;salvabenedetto@meta.com&gt;
Suggested-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240910214037.3663272-1-yonghong.song@linux.dev
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Change syscall_nr type to int in struct syscall_tp_t</title>
<updated>2023-10-13T19:39:36+00:00</updated>
<author>
<name>Artem Savkov</name>
<email>asavkov@redhat.com</email>
</author>
<published>2023-10-13T05:42:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba8ea72388a192c10f1ee5f5a4a32332e7cced76'/>
<id>ba8ea72388a192c10f1ee5f5a4a32332e7cced76</id>
<content type='text'>
linux-rt-devel tree contains a patch (b1773eac3f29c ("sched: Add support
for lazy preemption")) that adds an extra member to struct trace_entry.
This causes the offset of args field in struct trace_event_raw_sys_enter
be different from the one in struct syscall_trace_enter:

struct trace_event_raw_sys_enter {
        struct trace_entry         ent;                  /*     0    12 */

        /* XXX last struct has 3 bytes of padding */
        /* XXX 4 bytes hole, try to pack */

        long int                   id;                   /*    16     8 */
        long unsigned int          args[6];              /*    24    48 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        char                       __data[];             /*    72     0 */

        /* size: 72, cachelines: 2, members: 4 */
        /* sum members: 68, holes: 1, sum holes: 4 */
        /* paddings: 1, sum paddings: 3 */
        /* last cacheline: 8 bytes */
};

struct syscall_trace_enter {
        struct trace_entry         ent;                  /*     0    12 */

        /* XXX last struct has 3 bytes of padding */

        int                        nr;                   /*    12     4 */
        long unsigned int          args[];               /*    16     0 */

        /* size: 16, cachelines: 1, members: 3 */
        /* paddings: 1, sum paddings: 3 */
        /* last cacheline: 16 bytes */
};

This, in turn, causes perf_event_set_bpf_prog() fail while running bpf
test_profiler testcase because max_ctx_offset is calculated based on the
former struct, while off on the latter:

  10488         if (is_tracepoint || is_syscall_tp) {
  10489                 int off = trace_event_get_offsets(event-&gt;tp_event);
  10490
  10491                 if (prog-&gt;aux-&gt;max_ctx_offset &gt; off)
  10492                         return -EACCES;
  10493         }

What bpf program is actually getting is a pointer to struct
syscall_tp_t, defined in kernel/trace/trace_syscalls.c. This patch fixes
the problem by aligning struct syscall_tp_t with struct
syscall_trace_(enter|exit) and changing the tests to use these structs
to dereference context.

Signed-off-by: Artem Savkov &lt;asavkov@redhat.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Link: https://lore.kernel.org/bpf/20231013054219.172920-1-asavkov@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
linux-rt-devel tree contains a patch (b1773eac3f29c ("sched: Add support
for lazy preemption")) that adds an extra member to struct trace_entry.
This causes the offset of args field in struct trace_event_raw_sys_enter
be different from the one in struct syscall_trace_enter:

struct trace_event_raw_sys_enter {
        struct trace_entry         ent;                  /*     0    12 */

        /* XXX last struct has 3 bytes of padding */
        /* XXX 4 bytes hole, try to pack */

        long int                   id;                   /*    16     8 */
        long unsigned int          args[6];              /*    24    48 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        char                       __data[];             /*    72     0 */

        /* size: 72, cachelines: 2, members: 4 */
        /* sum members: 68, holes: 1, sum holes: 4 */
        /* paddings: 1, sum paddings: 3 */
        /* last cacheline: 8 bytes */
};

struct syscall_trace_enter {
        struct trace_entry         ent;                  /*     0    12 */

        /* XXX last struct has 3 bytes of padding */

        int                        nr;                   /*    12     4 */
        long unsigned int          args[];               /*    16     0 */

        /* size: 16, cachelines: 1, members: 3 */
        /* paddings: 1, sum paddings: 3 */
        /* last cacheline: 16 bytes */
};

This, in turn, causes perf_event_set_bpf_prog() fail while running bpf
test_profiler testcase because max_ctx_offset is calculated based on the
former struct, while off on the latter:

  10488         if (is_tracepoint || is_syscall_tp) {
  10489                 int off = trace_event_get_offsets(event-&gt;tp_event);
  10490
  10491                 if (prog-&gt;aux-&gt;max_ctx_offset &gt; off)
  10492                         return -EACCES;
  10493         }

What bpf program is actually getting is a pointer to struct
syscall_tp_t, defined in kernel/trace/trace_syscalls.c. This patch fixes
the problem by aligning struct syscall_tp_t with struct
syscall_trace_(enter|exit) and changing the tests to use these structs
to dereference context.

Signed-off-by: Artem Savkov &lt;asavkov@redhat.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Link: https://lore.kernel.org/bpf/20231013054219.172920-1-asavkov@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: bpf: use struct trace_entry in struct syscall_tp_t</title>
<updated>2023-08-01T17:53:28+00:00</updated>
<author>
<name>Yauheni Kaliuta</name>
<email>ykaliuta@redhat.com</email>
</author>
<published>2023-08-01T07:52:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d3c4db86c71164ed9ae17b4026e1acde460a18ab'/>
<id>d3c4db86c71164ed9ae17b4026e1acde460a18ab</id>
<content type='text'>
bpf tracepoint program uses struct trace_event_raw_sys_enter as
argument where trace_entry is the first field. Use the same instead
of unsigned long long since if it's amended (for example by RT
patch) it accesses data with wrong offset.

Signed-off-by: Yauheni Kaliuta &lt;ykaliuta@redhat.com&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20230801075222.7717-1-ykaliuta@redhat.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bpf tracepoint program uses struct trace_event_raw_sys_enter as
argument where trace_entry is the first field. Use the same instead
of unsigned long long since if it's amended (for example by RT
patch) it accesses data with wrong offset.

Signed-off-by: Yauheni Kaliuta &lt;ykaliuta@redhat.com&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20230801075222.7717-1-ykaliuta@redhat.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Remove unused __bad_type_size() method</title>
<updated>2022-11-18T01:21:06+00:00</updated>
<author>
<name>Qiujun Huang</name>
<email>hqjagain@gmail.com</email>
</author>
<published>2022-11-17T16:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b8752064e30697e3982418f4274cc63cfc6f3027'/>
<id>b8752064e30697e3982418f4274cc63cfc6f3027</id>
<content type='text'>
__bad_type_size() is unused after
commit 04ae87a52074("ftrace: Rework event_create_dir()").
So, remove it.

Link: https://lkml.kernel.org/r/D062EC2E-7DB7-4402-A67E-33C3577F551E@gmail.com

Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Qiujun Huang &lt;hqjagain@gmail.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__bad_type_size() is unused after
commit 04ae87a52074("ftrace: Rework event_create_dir()").
So, remove it.

Link: https://lkml.kernel.org/r/D062EC2E-7DB7-4402-A67E-33C3577F551E@gmail.com

Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Qiujun Huang &lt;hqjagain@gmail.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Make tp_printk work on syscall tracepoints</title>
<updated>2022-04-26T21:58:52+00:00</updated>
<author>
<name>Jeff Xie</name>
<email>xiehuan09@gmail.com</email>
</author>
<published>2022-04-10T14:50:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cb1c45fb68b8a4285ccf750842b1136f26cfe267'/>
<id>cb1c45fb68b8a4285ccf750842b1136f26cfe267</id>
<content type='text'>
Currently the tp_printk option has no effect on syscall tracepoint.
When adding the kernel option parameter tp_printk, then:

echo 1 &gt; /sys/kernel/debug/tracing/events/syscalls/enable

When running any application, no trace information is printed on the
terminal.

Now added printk for syscall tracepoints.

Link: https://lkml.kernel.org/r/20220410145025.681144-1-xiehuan09@gmail.com

Signed-off-by: Jeff Xie &lt;xiehuan09@gmail.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the tp_printk option has no effect on syscall tracepoint.
When adding the kernel option parameter tp_printk, then:

echo 1 &gt; /sys/kernel/debug/tracing/events/syscalls/enable

When running any application, no trace information is printed on the
terminal.

Now added printk for syscall tracepoints.

Link: https://lkml.kernel.org/r/20220410145025.681144-1-xiehuan09@gmail.com

Signed-off-by: Jeff Xie &lt;xiehuan09@gmail.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Have syscall trace events use trace_event_buffer_lock_reserve()</title>
<updated>2022-01-13T21:23:05+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2022-01-07T22:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3e2a56e6f639492311e0a8533f0a7aed60816308'/>
<id>3e2a56e6f639492311e0a8533f0a7aed60816308</id>
<content type='text'>
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.

Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org

Cc: stable@vger.kernel.org
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tom Zanussi &lt;zanussi@kernel.org&gt;
Reviewed-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.

Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org

Cc: stable@vger.kernel.org
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tom Zanussi &lt;zanussi@kernel.org&gt;
Reviewed-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
