summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2026-02-10Merge tag 'bpf-next-7.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support associating BPF program with struct_ops (Amery Hung) - Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung) - Fix live registers marking for indirect jumps (Anton Protopopov) - Introduce execution context detection BPF helpers (Changwoo Min) - Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda) - Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng) - Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman) - In preparation for ASAN support in BPF arenas teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis) - Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai) - Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen) - Fix ORC stack unwind from kprobe_multi (Jiri Olsa) - Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa) - Require frozen map for calculating map hash (KP Singh) - Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi) - Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang) - Make kfuncs return trusted pointers by default (Matt Bobrowski) - Introduce "fsession" support where single BPF program is executed upon entry and exit from traced kernel function (Menglong Dong) - Allow bpf_timer and bpf_wq use in all programs types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov) - Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan) - Allow BPF arena calls from non-sleepable context (Puranjay Mohan) - Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan) - In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin) - Use CFI compatible destructor kfunc type (Sami Tolvanen) - Add bitwise tracking for BPF_END in the verifier (Tianci Cao) - Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang) - Make BPF selftests work with 64k page size (Yonghong Song) * tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits) selftests/bpf: Fix outdated test on storage->smap selftests/bpf: Choose another percpu variable in bpf for btf_dump test selftests/bpf: Remove test_task_storage_map_stress_lookup selftests/bpf: Update task_local_storage/task_storage_nodeadlock test selftests/bpf: Update task_local_storage/recursion test selftests/bpf: Update sk_storage_omem_uncharge test bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} bpf: Support lockless unlink when freeing map or local storage bpf: Prepare for bpf_selem_unlink_nofail() bpf: Remove unused percpu counter from bpf_local_storage_map_free bpf: Remove cgroup local storage percpu counter bpf: Remove task local storage percpu counter bpf: Change local_storage->lock and b->lock to rqspinlock bpf: Convert bpf_selem_unlink to failable bpf: Convert bpf_selem_link_map to failable bpf: Convert bpf_selem_unlink_map to failable bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage selftests/xsk: fix number of Tx frags in invalid packet selftests/xsk: properly handle batch ending in the middle of a packet bpf: Prevent reentrance into call_rcu_tasks_trace() ...
2026-02-09Merge tag 'for-7.0/block-20260206' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...
2026-02-05tracing: Fix ftrace event field alignmentsSteven Rostedt
The fields of ftrace specific events (events used to save ftrace internal events like function traces and trace_printk) are generated similarly to how normal trace event fields are generated. That is, the fields are added to a trace_events_fields array that saves the name, offset, size, alignment and signness of the field. It is used to produce the output in the format file in tracefs so that tooling knows how to parse the binary data of the trace events. The issue is that some of the ftrace event structures are packed. The function graph exit event structures are one of them. The 64 bit calltime and rettime fields end up 4 byte aligned, but the algorithm to show to userspace shows them as 8 byte aligned. The macros that create the ftrace events has one for embedded structure fields. There's two macros for theses fields: __field_desc() and __field_packed() The difference of the latter macro is that it treats the field as packed. Rename that field to __field_desc_packed() and create replace the __field_packed() to be a normal field that is packed and have the calltime and rettime use those. This showed up on 32bit architectures for function graph time fields. It had: ~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format [..] field:unsigned long func; offset:8; size:4; signed:0; field:unsigned int depth; offset:12; size:4; signed:0; field:unsigned int overrun; offset:16; size:4; signed:0; field:unsigned long long calltime; offset:24; size:8; signed:0; field:unsigned long long rettime; offset:32; size:8; signed:0; Notice that overrun is at offset 16 with size 4, where in the structure calltime is at offset 20 (16 + 4), but it shows the offset at 24. That's because it used the alignment of unsigned long long when used as a declaration and not as a member of a structure where it would be aligned by word size (in this case 4). By using the proper structure alignment, the format has it at the correct offset: ~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format [..] field:unsigned long func; offset:8; size:4; signed:0; field:unsigned int depth; offset:12; size:4; signed:0; field:unsigned int overrun; offset:16; size:4; signed:0; field:unsigned long long calltime; offset:20; size:8; signed:0; field:unsigned long long rettime; offset:28; size:8; signed:0; Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reported-by: "jempty.liang" <imntjempty@163.com> Link: https://patch.msgid.link/20260204113628.53faec78@gandalf.local.home Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()") Closes: https://lore.kernel.org/all/20260130015740.212343-1-imntjempty@163.com/ Closes: https://lore.kernel.org/all/20260202123342.2544795-1-imntjempty@163.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-02-03block: don't use strcpy to copy blockdev nameJohannes Thumshirn
0-day bot flagged the use of strcpy() in blk_trace_setup(), because the source buffer can theoretically be bigger than the destination buffer. While none of the current callers pass a string bigger than BLKTRACE_BDEV_SIZE, use strscpy() to prevent eventual future misuse and silence the checker warnings. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202602020718.GUEIRyG9-lkp@intel.com/ Fixes: 113cbd62824a ("blktrace: pass blk_user_trace2 to setup functions") Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-02ftrace: Fix direct_functions leak in update_ftrace_direct_delJiri Olsa
Alexei reported memory leak in update_ftrace_direct_del. We miss cleanup of the replaced direct_functions in the success path in update_ftrace_direct_del, adding that. Fixes: 8d2c1233f371 ("ftrace: Add update_ftrace_direct_del function") Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Closes: https://lore.kernel.org/bpf/aX_BxG5EJTJdCMT9@krava/T/#m7c13f5a95f862ed7ab78e905fbb678d635306a0c Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20260202075849.1684369-1-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-28bpf,x86: Use single ftrace_ops for direct callsJiri Olsa
Using single ftrace_ops for direct calls update instead of allocating ftrace_ops object for each trampoline. With single ftrace_ops object we can use update_ftrace_direct_* api that allows multiple ip sites updates on single ftrace_ops object. Adding HAVE_SINGLE_FTRACE_DIRECT_OPS config option to be enabled on each arch that supports this. At the moment we can enable this only on x86 arch, because arm relies on ftrace_ops object representing just single trampoline image (stored in ftrace_ops::direct_call). Archs that do not support this will continue to use *_ftrace_direct api. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-10-jolsa@kernel.org
2026-01-28ftrace: Factor ftrace_ops ops_func interfaceJiri Olsa
We are going to remove "ftrace_ops->private == bpf_trampoline" setup in following changes. Adding ip argument to ftrace_ops_func_t callback function, so we can use it to look up the trampoline. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-9-jolsa@kernel.org
2026-01-28ftrace: Add update_ftrace_direct_mod functionJiri Olsa
Adding update_ftrace_direct_mod function that modifies all entries (ip -> direct) provided in hash argument to direct ftrace ops and updates its attachments. The difference to current modify_ftrace_direct is: - hash argument that allows to modify multiple ip -> direct entries at once This change will allow us to have simple ftrace_ops for all bpf direct interface users in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-7-jolsa@kernel.org
2026-01-28ftrace: Add update_ftrace_direct_del functionJiri Olsa
Adding update_ftrace_direct_del function that removes all entries (ip -> addr) provided in hash argument to direct ftrace ops and updates its attachments. The difference to current unregister_ftrace_direct is - hash argument that allows to unregister multiple ip -> direct entries at once - we can call update_ftrace_direct_del multiple times on the same ftrace_ops object, becase we do not need to unregister all entries at once, we can do it gradualy with the help of ftrace_update_ops function This change will allow us to have simple ftrace_ops for all bpf direct interface users in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-6-jolsa@kernel.org
2026-01-28ftrace: Add update_ftrace_direct_add functionJiri Olsa
Adding update_ftrace_direct_add function that adds all entries (ip -> addr) provided in hash argument to direct ftrace ops and updates its attachments. The difference to current register_ftrace_direct is - hash argument that allows to register multiple ip -> direct entries at once - we can call update_ftrace_direct_add multiple times on the same ftrace_ops object, becase after first registration with register_ftrace_function_nolock, it uses ftrace_update_ops to update the ftrace_ops object This change will allow us to have simple ftrace_ops for all bpf direct interface users in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-5-jolsa@kernel.org
2026-01-28ftrace: Export some of hash related functionsJiri Olsa
We are going to use these functions in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-4-jolsa@kernel.org
2026-01-28ftrace: Make alloc_and_copy_ftrace_hash direct friendlyJiri Olsa
Make alloc_and_copy_ftrace_hash to copy also direct address for each hash entry. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-3-jolsa@kernel.org
2026-01-28ftrace,bpf: Remove FTRACE_OPS_FL_JMP ftrace_ops flagJiri Olsa
At the moment the we allow the jmp attach only for ftrace_ops that has FTRACE_OPS_FL_JMP set. This conflicts with following changes where we use single ftrace_ops object for all direct call sites, so all could be be attached via just call or jmp. We already limit the jmp attach support with config option and bit (LSB) set on the trampoline address. It turns out that's actually enough to limit the jmp attach for architecture and only for chosen addresses (with LSB bit set). Each user of register_ftrace_direct or modify_ftrace_direct can set the trampoline bit (LSB) to indicate it has to be attached by jmp. The bpf trampoline generation code uses trampoline flags to generate jmp-attach specific code and ftrace inner code uses the trampoline bit (LSB) to handle return from jmp attachment, so there's no harm to remove the FTRACE_OPS_FL_JMP bit. The fexit/fmodret performance stays the same (did not drop), current code: fentry : 77.904 ± 0.546M/s fexit : 62.430 ± 0.554M/s fmodret : 66.503 ± 0.902M/s with this change: fentry : 80.472 ± 0.061M/s fexit : 63.995 ± 0.127M/s fmodret : 67.362 ± 0.175M/s Fixes: 25e4e3565d45 ("ftrace: Introduce FTRACE_OPS_FL_JMP") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-2-jolsa@kernel.org
2026-01-24bpf: support fsession for bpf_session_is_returnMenglong Dong
If fsession exists, we will use the bit (1 << BPF_TRAMP_IS_RETURN_SHIFT) in ((u64 *)ctx)[-1] to store the "is_return" flag. The logic of bpf_session_is_return() for fsession is implemented in the verifier by inline following code: bool bpf_session_is_return(void *ctx) { return (((u64 *)ctx)[-1] >> BPF_TRAMP_IS_RETURN_SHIFT) & 1; } Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Co-developed-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260124062008.8657-5-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-24bpf: change prototype of bpf_session_{cookie,is_return}Menglong Dong
Add the function argument of "void *ctx" to bpf_session_cookie() and bpf_session_is_return(), which is a preparation of the next patch. The two kfunc is seldom used now, so it will not introduce much effect to change their function prototype. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20260124062008.8657-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-24bpf: use the least significant byte for the nr_args in trampolineMenglong Dong
For now, ((u64 *)ctx)[-1] is used to store the nr_args in the trampoline. However, 1 byte is enough to store such information. Therefore, we use only the least significant byte of ((u64 *)ctx)[-1] to store the nr_args, and reserve the rest for other usages. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-23function_graph: Fix args pointer mismatch in print_graph_retval()Donglin Peng
When funcgraph-args and funcgraph-retaddr are both enabled, many kernel functions display invalid parameters in trace logs. The issue occurs because print_graph_retval() passes a mismatched args pointer to print_function_args(). Fix this by retrieving the correct args pointer using the FGRAPH_ENTRY_ARGS() macro. Link: https://patch.msgid.link/20260112021601.1300479-1-dolinux.peng@gmail.com Fixes: f83ac7544fbf ("function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-23tracing: Avoid possible signed 64-bit truncationIan Rogers
64-bit truncation to 32-bit can result in the sign of the truncated value changing. The cmp_mod_entry is used in bsearch and so the truncation could result in an invalid search order. This would only happen were the addresses more than 2GB apart and so unlikely, but let's fix the potentially broken compare anyway. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260108002625.333331-1-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-23tracing: Fix crash on synthetic stacktrace field usageSteven Rostedt
When creating a synthetic event based on an existing synthetic event that had a stacktrace field and the new synthetic event used that field a kernel crash occurred: ~# cd /sys/kernel/tracing ~# echo 's:stack unsigned long stack[];' > dynamic_events ~# echo 'hist:keys=prev_pid:s0=common_stacktrace if prev_state & 3' >> events/sched/sched_switch/trigger ~# echo 'hist:keys=next_pid:s1=$s0:onmatch(sched.sched_switch).trace(stack,$s1)' >> events/sched/sched_switch/trigger The above creates a synthetic event that takes a stacktrace when a task schedules out in a non-running state and passes that stacktrace to the sched_switch event when that task schedules back in. It triggers the "stack" synthetic event that has a stacktrace as its field (called "stack"). ~# echo 's:syscall_stack s64 id; unsigned long stack[];' >> dynamic_events ~# echo 'hist:keys=common_pid:s2=stack' >> events/synthetic/stack/trigger ~# echo 'hist:keys=common_pid:s3=$s2,i0=id:onmatch(synthetic.stack).trace(syscall_stack,$i0,$s3)' >> events/raw_syscalls/sys_exit/trigger The above makes another synthetic event called "syscall_stack" that attaches the first synthetic event (stack) to the sys_exit trace event and records the stacktrace from the stack event with the id of the system call that is exiting. When enabling this event (or using it in a historgram): ~# echo 1 > events/synthetic/syscall_stack/enable Produces a kernel crash! BUG: unable to handle page fault for address: 0000000000400010 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 6 UID: 0 PID: 1257 Comm: bash Not tainted 6.16.3+deb14-amd64 #1 PREEMPT(lazy) Debian 6.16.3-1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:trace_event_raw_event_synth+0x90/0x380 Code: c5 00 00 00 00 85 d2 0f 84 e1 00 00 00 31 db eb 34 0f 1f 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 <49> 8b 04 24 48 83 c3 01 8d 0c c5 08 00 00 00 01 cd 41 3b 5d 40 0f RSP: 0018:ffffd2670388f958 EFLAGS: 00010202 RAX: ffff8ba1065cc100 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: fffff266ffda7b90 RDI: ffffd2670388f9b0 RBP: 0000000000000010 R08: ffff8ba104e76000 R09: ffffd2670388fa50 R10: ffff8ba102dd42e0 R11: ffffffff9a908970 R12: 0000000000400010 R13: ffff8ba10a246400 R14: ffff8ba10a710220 R15: fffff266ffda7b90 FS: 00007fa3bc63f740(0000) GS:ffff8ba2e0f48000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000400010 CR3: 0000000107f9e003 CR4: 0000000000172ef0 Call Trace: <TASK> ? __tracing_map_insert+0x208/0x3a0 action_trace+0x67/0x70 event_hist_trigger+0x633/0x6d0 event_triggers_call+0x82/0x130 trace_event_buffer_commit+0x19d/0x250 trace_event_raw_event_sys_exit+0x62/0xb0 syscall_exit_work+0x9d/0x140 do_syscall_64+0x20a/0x2f0 ? trace_event_raw_event_sched_switch+0x12b/0x170 ? save_fpregs_to_fpstate+0x3e/0x90 ? _raw_spin_unlock+0xe/0x30 ? finish_task_switch.isra.0+0x97/0x2c0 ? __rseq_handle_notify_resume+0xad/0x4c0 ? __schedule+0x4b8/0xd00 ? restore_fpregs_from_fpstate+0x3c/0x90 ? switch_fpu_return+0x5b/0xe0 ? do_syscall_64+0x1ef/0x2f0 ? do_fault+0x2e9/0x540 ? __handle_mm_fault+0x7d1/0xf70 ? count_memcg_events+0x167/0x1d0 ? handle_mm_fault+0x1d7/0x2e0 ? do_user_addr_fault+0x2c3/0x7f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The reason is that the stacktrace field is not labeled as such, and is treated as a normal field and not as a dynamic event that it is. In trace_event_raw_event_synth() the event is field is still treated as a dynamic array, but the retrieval of the data is considered a normal field, and the reference is just the meta data: // Meta data is retrieved instead of a dynamic array str_val = (char *)(long)var_ref_vals[val_idx]; // Then when it tries to process it: len = *((unsigned long *)str_val) + 1; It triggers a kernel page fault. To fix this, first when defining the fields of the first synthetic event, set the filter type to FILTER_STACKTRACE. This is used later by the second synthetic event to know that this field is a stacktrace. When creating the field of the new synthetic event, have it use this FILTER_STACKTRACE to know to create a stacktrace field to copy the stacktrace into. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Link: https://patch.msgid.link/20260122194824.6905a38e@gandalf.local.home Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-21bpf: support bpf_get_func_arg() for BPF_TRACE_RAW_TPMenglong Dong
For now, bpf_get_func_arg() and bpf_get_func_arg_cnt() is not supported by the BPF_TRACE_RAW_TP, which is not convenient to get the argument of the tracepoint, especially for the case that the position of the arguments in a tracepoint can change. The target tracepoint BTF type id is specified during loading time, therefore we can get the function argument count from the function prototype instead of the stack. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20260121044348.113201-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-20bpf: Fix memory access flags in helper prototypesZesen Liu
After commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking"), the verifier started relying on the access type flags in helper function prototypes to perform memory access optimizations. Currently, several helper functions utilizing ARG_PTR_TO_MEM lack the corresponding MEM_RDONLY or MEM_WRITE flags. This omission causes the verifier to incorrectly assume that the buffer contents are unchanged across the helper call. Consequently, the verifier may optimize away subsequent reads based on this wrong assumption, leading to correctness issues. For bpf_get_stack_proto_raw_tp, the original MEM_RDONLY was incorrect since the helper writes to the buffer. Change it to ARG_PTR_TO_UNINIT_MEM which correctly indicates write access to potentially uninitialized memory. Similar issues were recently addressed for specific helpers in commit ac44dcc788b9 ("bpf: Fix verifier assumptions of bpf_d_path's output buffer") and commit 2eb7648558a7 ("bpf: Specify access type of bpf_sysctl_get_name args"). Fix these prototypes by adding the correct memory access flags. Fixes: 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") Co-developed-by: Shuran Liu <electronlsr@gmail.com> Signed-off-by: Shuran Liu <electronlsr@gmail.com> Co-developed-by: Peili Gao <gplhust955@gmail.com> Signed-off-by: Peili Gao <gplhust955@gmail.com> Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Zesen Liu <ftyghome@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260120-helper_proto-v3-1-27b0180b4e77@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-16bpf: Add __force annotations to silence sparse warningsMykyta Yatsenko
Add __force annotations to casts that convert between __user and kernel address spaces. These casts are intentional: - In bpf_send_signal_common(), the value is stored in si_value.sival_ptr which is typed as void __user *, but the value comes from a BPF program parameter. - In the bpf_*_dynptr() kfuncs, user pointers are cast to const void * before being passed to copy helper functions that correctly handle the user address space through copy_from_user variants. Without __force, sparse reports: warning: cast removes address space '__user' of expression Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260115184509.3585759-1-mykyta.yatsenko5@gmail.com Closes: https://lore.kernel.org/oe-kbuild-all/202601131740.6C3BdBaB-lkp@intel.com/
2026-01-15arm64/ftrace,bpf: Fix partial regs after bpf_prog_runJiri Olsa
Mahe reported issue with bpf_override_return helper not working when executed from kprobe.multi bpf program on arm. The problem is that on arm we use alternate storage for pt_regs object that is passed to bpf_prog_run and if any register is changed (which is the case of bpf_override_return) it's not propagated back to actual pt_regs object. Fixing this by introducing and calling ftrace_partial_regs_update function to propagate the values of changed registers (ip and stack). Reported-by: Mahe Tardy <mahe.tardy@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-1-jolsa@kernel.org
2026-01-15ftrace: Do not over-allocate ftrace memoryGuenter Roeck
The pg_remaining calculation in ftrace_process_locs() assumes that ENTRIES_PER_PAGE multiplied by 2^order equals the actual capacity of the allocated page group. However, ENTRIES_PER_PAGE is PAGE_SIZE / ENTRY_SIZE (integer division). When PAGE_SIZE is not a multiple of ENTRY_SIZE (e.g. 4096 / 24 = 170 with remainder 16), high-order allocations (like 256 pages) have significantly more capacity than 256 * 170. This leads to pg_remaining being underestimated, which in turn makes skip (derived from skipped - pg_remaining) larger than expected, causing the WARN(skip != remaining) to trigger. Extra allocated pages for ftrace: 2 with 654 skipped WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:7295 ftrace_process_locs+0x5bf/0x5e0 A similar problem in ftrace_allocate_records() can result in allocating too many pages. This can trigger the second warning in ftrace_process_locs(). Extra allocated pages for ftrace WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:7276 ftrace_process_locs+0x548/0x580 Use the actual capacity of a page group to determine the number of pages to allocate. Have ftrace_allocate_pages() return the number of allocated pages to avoid having to calculate it. Use the actual page group capacity when validating the number of unused pages due to skipped entries. Drop the definition of ENTRIES_PER_PAGE since it is no longer used. Cc: stable@vger.kernel.org Fixes: 4a3efc6baff93 ("ftrace: Update the mcount_loc check of skipped entries") Link: https://patch.msgid.link/20260113152243.3557219-1-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5Alexei Starovoitov
Cross-merge BPF and other fixes after downstream PR. No conflicts. Adjacent: Auto-merging MAINTAINERS Auto-merging Makefile Auto-merging kernel/bpf/verifier.c Auto-merging kernel/sched/ext.c Auto-merging mm/memcontrol.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-07trace: ftrace_dump_on_oops[] is not exported, make it staticBen Dooks
The ftrace_dump_on_oops string is not used outside of trace.c so make it static to avoid the export warning from sparse: kernel/trace/trace.c:141:6: warning: symbol 'ftrace_dump_on_oops' was not declared. Should it be static? Fixes: dd293df6395a2 ("tracing: Move trace sysctls into trace.c") Link: https://patch.msgid.link/20260106231054.84270-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-07tracing: Add recursion protection in kernel stack trace recordingSteven Rostedt
A bug was reported about an infinite recursion caused by tracing the rcu events with the kernel stack trace trigger enabled. The stack trace code called back into RCU which then called the stack trace again. Expand the ftrace recursion protection to add a set of bits to protect events from recursion. Each bit represents the context that the event is in (normal, softirq, interrupt and NMI). Have the stack trace code use the interrupt context to protect against recursion. Note, the bug showed an issue in both the RCU code as well as the tracing stacktrace code. This only handles the tracing stack trace side of the bug. The RCU fix will be handled separately. Link: https://lore.kernel.org/all/20260102122807.7025fc87@gandalf.local.home/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Link: https://patch.msgid.link/20260105203141.515cd49f@gandalf.local.home Reported-by: Yao Kai <yaokai34@huawei.com> Tested-by: Yao Kai <yaokai34@huawei.com> Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-07ring-buffer: Avoid softlockup in ring_buffer_resize() during memory freeWupeng Ma
When user resize all trace ring buffer through file 'buffer_size_kb', then in ring_buffer_resize(), kernel allocates buffer pages for each cpu in a loop. If the kernel preemption model is PREEMPT_NONE and there are many cpus and there are many buffer pages to be freed, it may not give up cpu for a long time and finally cause a softlockup. To avoid it, call cond_resched() after each cpu buffer free as Commit f6bd2c92488c ("ring-buffer: Avoid softlockup in ring_buffer_resize()") does. Detailed call trace as follow: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 24-....: (14837 ticks this GP) idle=521c/1/0x4000000000000000 softirq=230597/230597 fqs=5329 rcu: (t=15004 jiffies g=26003221 q=211022 ncpus=96) CPU: 24 UID: 0 PID: 11253 Comm: bash Kdump: loaded Tainted: G EL 6.18.2+ #278 NONE pc : arch_local_irq_restore+0x8/0x20 arch_local_irq_restore+0x8/0x20 (P) free_frozen_page_commit+0x28c/0x3b0 __free_frozen_pages+0x1c0/0x678 ___free_pages+0xc0/0xe0 free_pages+0x3c/0x50 ring_buffer_resize.part.0+0x6a8/0x880 ring_buffer_resize+0x3c/0x58 __tracing_resize_ring_buffer.part.0+0x34/0xd8 tracing_resize_ring_buffer+0x8c/0xd0 tracing_entries_write+0x74/0xd8 vfs_write+0xcc/0x288 ksys_write+0x74/0x118 __arm64_sys_write+0x24/0x38 Cc: <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251228065008.2396573-1-mawupeng1@huawei.com Signed-off-by: Wupeng Ma <mawupeng1@huawei.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-01-07tracing: Drop unneeded assignment to soft_modeJulia Lawall
soft_mode is not read in the enable case, so drop the assignment. Drop also the comment text that refers to the assignment and realign the comment. Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251226110531.4129794-1-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-21bpf: move recursion detection logic to helpersPuranjay Mohan
BPF programs detect recursion by doing atomic inc/dec on a per-cpu active counter from the trampoline. Create two helpers for operations on this active counter, this makes it easy to changes the recursion detection logic in future. This commit makes no functional changes. Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251219184422.2899902-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-19Merge tag 'trace-v6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Add Documentation/core-api/tracepoint.rst to TRACING in MAINTAINERS file Updates to the tracepoint.rst document should be reviewed by the tracing maintainers. - Fix warning triggered by perf attaching to synthetic events The synthetic events do not add a function to be registered when perf attaches to them. This causes a warning when perf registers a synthetic event and passes a NULL pointer to the tracepoint register function. Ideally synthetic events should be updated to work with perf, but as that's a feature and not a bug fix, simply now return -ENODEV when perf tries to register an event that has a NULL pointer for its function. This no longer causes a kernel warning and simply causes the perf code to fail with an error message. - Fix 32bit overflow in option flag test The option's flags changed from 32 bits in size to 64 bits in size. Fix one of the places that shift 1 by the option bit number to to be 1ULL. - Fix the output of printing the direct jmp functions The enabled_functions that shows how functions are being attached by ftrace wasn't updated to accommodate the new direct jmp trampolines that set the LSB of the pointer, and outputs garbage. Update the output to handle the direct jmp trampolines. * tag 'trace-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Fix address for jmp mode in t_show() tracing: Fix UBSAN warning in __remove_instance() tracing: Do not register unsupported perf events MAINTAINERS: add tracepoint core-api doc files to TRACING
2025-12-17ftrace: Fix address for jmp mode in t_show()Menglong Dong
The address from ftrace_find_rec_direct() is printed directly in t_show(). This can mislead symbol offsets if it has the "jmp" bit in the last bit. Fix this by printing the address that returned by ftrace_jmp_get(). Link: https://patch.msgid.link/20251217030053.80343-1-dongml2@chinatelecom.cn Fixes: 25e4e3565d45 ("ftrace: Introduce FTRACE_OPS_FL_JMP") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-17tracing: Fix UBSAN warning in __remove_instance()Darrick J. Wong
xfs/558 triggers the following UBSAN warning: ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in kernel/trace/trace.c:10510:10 shift exponent 32 is too large for 32-bit type 'int' CPU: 1 UID: 0 PID: 888674 Comm: rmdir Not tainted 6.19.0-rc1-xfsx #rc1 PREEMPT(lazy) dbf607ef4c142c563f76d706e71af9731d7b9c90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x70 ubsan_epilogue+0x5/0x2b __ubsan_handle_shift_out_of_bounds.cold+0x5e/0x113 __remove_instance.part.0.constprop.0.cold+0x18/0x26f instance_rmdir+0xf3/0x110 tracefs_syscall_rmdir+0x4d/0x90 vfs_rmdir+0x139/0x230 do_rmdir+0x143/0x230 __x64_sys_rmdir+0x1d/0x20 do_syscall_64+0x44/0x230 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f7ae8e51f17 Code: f0 ff ff 73 01 c3 48 8b 0d de 2e 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 54 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 2e 0e 00 f7 d8 64 89 02 b8 RSP: 002b:00007ffd90743f08 EFLAGS: 00000246 ORIG_RAX: 0000000000000054 RAX: ffffffffffffffda RBX: 00007ffd907440f8 RCX: 00007f7ae8e51f17 RDX: 00007f7ae8f3c5c0 RSI: 00007ffd90744a21 RDI: 00007ffd90744a21 RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 00007f7ae8f35ac0 R11: 0000000000000246 R12: 00007ffd90744a21 R13: 0000000000000001 R14: 00007f7ae8f8b000 R15: 000055e5283e6a98 </TASK> ---[ end trace ]--- whilst tearing down an ftrace instance. TRACE_FLAGS_MAX_SIZE is now 64bit, so the mask comparison expression must be typecast to a u64 value to avoid an overflow. AFAICT, ZEROED_TRACE_FLAGS is already cast to ULL so this is ok. Link: https://patch.msgid.link/20251216174950.GA7705@frogsfrogsfrogs Fixes: bbec8e28cac592 ("tracing: Allow tracer to add more than 32 options") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-17tracing: Do not register unsupported perf eventsSteven Rostedt
Synthetic events currently do not have a function to register perf events. This leads to calling the tracepoint register functions with a NULL function pointer which triggers: ------------[ cut here ]------------ WARNING: kernel/tracepoint.c:175 at tracepoint_add_func+0x357/0x370, CPU#2: perf/2272 Modules linked in: kvm_intel kvm irqbypass CPU: 2 UID: 0 PID: 2272 Comm: perf Not tainted 6.18.0-ftest-11964-ge022764176fc-dirty #323 PREEMPTLAZY Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:tracepoint_add_func+0x357/0x370 Code: 28 9c e8 4c 0b f5 ff eb 0f 4c 89 f7 48 c7 c6 80 4d 28 9c e8 ab 89 f4 ff 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc <0f> 0b 49 c7 c6 ea ff ff ff e9 ee fe ff ff 0f 0b e9 f9 fe ff ff 0f RSP: 0018:ffffabc0c44d3c40 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff9380aa9e4060 RCX: 0000000000000000 RDX: 000000000000000a RSI: ffffffff9e1d4a98 RDI: ffff937fcf5fd6c8 RBP: 0000000000000001 R08: 0000000000000007 R09: ffff937fcf5fc780 R10: 0000000000000003 R11: ffffffff9c193910 R12: 000000000000000a R13: ffffffff9e1e5888 R14: 0000000000000000 R15: ffffabc0c44d3c78 FS: 00007f6202f5f340(0000) GS:ffff93819f00f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d3162281a8 CR3: 0000000106a56003 CR4: 0000000000172ef0 Call Trace: <TASK> tracepoint_probe_register+0x5d/0x90 synth_event_reg+0x3c/0x60 perf_trace_event_init+0x204/0x340 perf_trace_init+0x85/0xd0 perf_tp_event_init+0x2e/0x50 perf_try_init_event+0x6f/0x230 ? perf_event_alloc+0x4bb/0xdc0 perf_event_alloc+0x65a/0xdc0 __se_sys_perf_event_open+0x290/0x9f0 do_syscall_64+0x93/0x7b0 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? trace_hardirqs_off+0x53/0xc0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Instead, have the code return -ENODEV, which doesn't warn and has perf error out with: # perf record -e synthetic:futex_wait Error: The sys_perf_event_open() syscall returned with 19 (No such device) for event (synthetic:futex_wait). "dmesg | grep -i perf" may provide additional information. Ideally perf should support synthetic events, but for now just fix the warning. The support can come later. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://patch.msgid.link/20251216182440.147e4453@gandalf.local.home Fixes: 4b147936fa509 ("tracing: Add support for 'synthetic' events") Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-17Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Fix BPF builds due to -fms-extensions. selftests (Alexei Starovoitov), bpftool (Quentin Monnet). - Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n (Geert Uytterhoeven) - Fix livepatch/BPF interaction and support reliable unwinding through BPF stack frames (Josh Poimboeuf) - Do not audit capability check in arm64 JIT (Ondrej Mosnacek) - Fix truncated dmabuf BPF iterator reads (T.J. Mercier) - Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu) - Fix warnings in libbpf when built with -Wdiscarded-qualifiers under C23 (Mikhail Gavrilov) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: add regression test for bpf_d_path() bpf: Fix verifier assumptions of bpf_d_path's output buffer selftests/bpf: Add test for truncated dmabuf_iter reads bpf: Fix truncated dmabuf iterator reads x86/unwind/orc: Support reliable unwinding through BPF stack frames bpf: Add bpf_has_frame_pointer() bpf, arm64: Do not audit capability check in do_jit() libbpf: Fix -Wdiscarded-qualifiers under C23 bpftool: Fix build warnings due to MS extensions net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT selftests/bpf: Add -fms-extensions to bpf build flags
2025-12-10bpf: Fix verifier assumptions of bpf_d_path's output bufferShuran Liu
Commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") started distinguishing read vs write accesses performed by helpers. The second argument of bpf_d_path() is a pointer to a buffer that the helper fills with the resulting path. However, its prototype currently uses ARG_PTR_TO_MEM without MEM_WRITE. Before 37cce22dbd51, helper accesses were conservatively treated as potential writes, so this mismatch did not cause issues. Since that commit, the verifier may incorrectly assume that the buffer contents are unchanged across the helper call and base its optimizations on this wrong assumption. This can lead to misbehaviour in BPF programs that read back the buffer, such as prefix comparisons on the returned path. Fix this by marking the second argument of bpf_d_path() as ARG_PTR_TO_MEM | MEM_WRITE so that the verifier correctly models the write to the caller-provided buffer. Fixes: 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") Co-developed-by: Zesen Liu <ftyg@live.com> Signed-off-by: Zesen Liu <ftyg@live.com> Co-developed-by: Peili Gao <gplhust955@gmail.com> Signed-off-by: Peili Gao <gplhust955@gmail.com> Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Shuran Liu <electronlsr@gmail.com> Reviewed-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20251206141210.3148-2-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-06Merge tag 'trace-v6.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix accounting of stop_count in file release On opening the trace file, if "pause-on-trace" option is set, it will increment the stop_count. On file release, it checks if stop_count is set, and if so it decrements it. Since this code was originally written, the stop_count can be incremented by other use cases. This makes just checking the stop_count not enough to know if it should be decremented. Add a new iterator flag called "PAUSE" and have it set if the open disables tracing and only decrement the stop_count if that flag is set on close. - Remove length field in trace_seq_printf() of print_synth_event() When printing the synthetic event that has a static length array field, the vsprintf() of the trace_seq_printf() triggered a "(efault)" in the output. That's because the print_fmt replaced the "%.*s" with "%s" causing the arguments to be off. - Fix a bunch of typos * tag 'trace-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix typo in trace_seq.c tracing: Fix typo in trace_probe.c tracing: Fix multiple typos in trace_osnoise.c tracing: Fix multiple typos in trace_events_user.c tracing: Fix typo in trace_events_trigger.c tracing: Fix typo in trace_events_hist.c tracing: Fix typo in trace_events_filter.c tracing: Fix multiple typos in trace_events.c tracing: Fix multiple typos in trace.c tracing: Fix typo in ring_buffer_benchmark.c tracing: Fix multiple typos in ring_buffer.c tracing: Fix typo in fprobe.c tracing: Fix typo in fpgraph.c tracing: Fix fixed array of synthetic event tracing: Fix enabling of tracing on file release
2025-12-05tracing: Fix typo in trace_seq.cMaurice Hieronymus
Fix typo "wont" to "won't". Link: https://patch.msgid.link/20251121221835.28032-15-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_probe.cMaurice Hieronymus
Fix typo "separater" to "separator". Link: https://patch.msgid.link/20251121221835.28032-14-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_osnoise.cMaurice Hieronymus
Fix multiple typos in comments: "Anotate" -> "Annotate" "infor" -> "info" "timestemp" -> "timestamp" "tread" -> "thread" "varaibles" -> "variables" "wast" -> "waste" Link: https://patch.msgid.link/20251121221835.28032-13-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_events_user.cMaurice Hieronymus
Fix multiple typos in comments: "ambigious" -> "ambiguous" "explictly" -> "explicitly" "Uknown" -> "Unknown" Link: https://patch.msgid.link/20251121221835.28032-12-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_trigger.cMaurice Hieronymus
Fix typo "componenents" to "components". Link: https://patch.msgid.link/20251121221835.28032-11-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_hist.cMaurice Hieronymus
Fix typo "tigger" to "trigger". Link: https://patch.msgid.link/20251121221835.28032-10-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_filter.cMaurice Hieronymus
Fix typo "singe" to "single". Link: https://patch.msgid.link/20251121221835.28032-9-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_events.cMaurice Hieronymus
Fix multiple typos in comments: "appened" -> "appended" "paranthesis" -> "parenthesis" "parethesis" -> "parenthesis" "wont" -> "won't" Link: https://patch.msgid.link/20251121221835.28032-8-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace.cMaurice Hieronymus
Fix multiple typos in comments: "alse" -> "also" "enabed" -> "enabled" "instane" -> "instance" "outputing" -> "outputting" "seperated" -> "separated" Link: https://patch.msgid.link/20251121221835.28032-7-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in ring_buffer_benchmark.cMaurice Hieronymus
Fix typo "overwite" to "overwrite". Link: https://patch.msgid.link/20251121221835.28032-6-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in ring_buffer.cMaurice Hieronymus
Fix multiple typos in comments: "ording" -> "ordering" "scatch" -> "scratch" "wont" -> "won't" Link: https://patch.msgid.link/20251121221835.28032-5-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in fprobe.cMaurice Hieronymus
Fix typo "funciton" to "function". Link: https://patch.msgid.link/20251121221835.28032-4-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in fpgraph.cMaurice Hieronymus
Fix typo "reservered" to "reserved". Link: https://patch.msgid.link/20251121221835.28032-3-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>