|
ops.exit() may be called even if loading fails before ops.init() finishes
successfully, because ops.exit() allows rich exit info communication. Add
an SCX_EFLAG_INITIALIZED flag to scx_exit_info.flags to indicate whether
ops.init() finished successfully.
This enables BPF schedulers to distinguish between exit scenarios and
handle cleanup appropriately based on initialization state.
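A sketch of how a BPF scheduler's exit path might use the new flag (the
callback name and the cleanup being skipped are illustrative):

  void BPF_STRUCT_OPS(mysched_exit, struct scx_exit_info *ei)
  {
          if (!(ei->flags & SCX_EFLAG_INITIALIZED)) {
                  /* ops.init() never completed; skip teardown of state
                   * that is only set up on a fully successful init.
                   */
                  return;
          }
          /* full cleanup path */
  }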
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Convert warned_zero_slice and warned_deprecated_rq in the scx_sched struct
to single-bit bitfields. While this doesn't reduce the struct size
immediately, it prepares for future bitfield additions.
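The shape of the conversion, roughly (the underlying type and surrounding
fields are illustrative):

  struct scx_sched {
          ...
          u32     warned_zero_slice:1;
          u32     warned_deprecated_rq:1;
          ...
  };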
v2: Update patch description.
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
task_can_run_on_remote_rq() takes @sch but uses scx_root when
incrementing SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE, which is inconsistent and
gets in the way of implementing multiple scheduler support. Use @sch
instead. As scx_root is currently the only possible scheduler instance,
this causes no behavior change.
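The change is presumably a one-liner of this shape (illustrative):

  -       __scx_add_event(scx_root, SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE, 1);
  +       __scx_add_event(sch, SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE, 1);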
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The find_user_dsq() function is called from contexts that are already
under RCU read lock protection. Switch from rhashtable_lookup_fast() to
rhashtable_lookup() to avoid redundant RCU locking.
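After the change, find_user_dsq() would read roughly as follows (field and
parameter names assumed from context):

  static struct scx_dispatch_q *find_user_dsq(struct scx_sched *sch, u64 dsq_id)
  {
          /* callers already hold the RCU read lock */
          return rhashtable_lookup(&sch->dsq_hash, &dsq_id, dsq_hash_params);
  }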
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently, signed load instructions into arena memory are unsupported.
The compiler is free to generate these, and on GCC-14 we see a
corresponding error when it happens. The hurdle in supporting them is
deciding which unused opcode to use to mark them for the JIT's own
consumption. After much thinking, it appears 0xc0 / BPF_NOSPEC can be
combined with load instructions to identify signed arena loads. Use
this to recognize and JIT them appropriately, and remove the
verifier-side limitation on the program when the JIT supports them.
Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20250923110157.18326-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
scx_bpf_cpu_curr() has been introduced to retrieve the current task of a
given runqueue, allowing schedulers to interact with that task.
The kfunc assumes that it is always called in an RCU context, but this
is not always guaranteed and some BPF schedulers can trigger the
following warning:
WARNING: suspicious RCU usage
sched_ext: BPF scheduler "cosmos_1.0.2_gd0e71ca_x86_64_unknown_linux_gnu_debug" enabled
6.17.0-rc1 #1-NixOS Not tainted
-----------------------------
kernel/sched/ext.c:6415 suspicious rcu_dereference_check() usage!
...
Call Trace:
<IRQ>
dump_stack_lvl+0x6f/0xb0
lockdep_rcu_suspicious.cold+0x4e/0x96
scx_bpf_cpu_curr+0x7e/0x80
bpf_prog_c68b2b6b6b1b0ff8_sched_timerfn+0xce/0x1dc
bpf_timer_cb+0x7b/0x130
__hrtimer_run_queues+0x1ea/0x380
hrtimer_run_softirq+0x8c/0xd0
handle_softirqs+0xc9/0x3b0
__irq_exit_rcu+0x96/0xc0
irq_exit_rcu+0xe/0x20
sysvec_apic_timer_interrupt+0x73/0x80
</IRQ>
<TASK>
To address this, mark the kfunc with KF_RCU_PROTECTED, so the verifier
can enforce its usage only inside RCU-protected sections.
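With KF_RCU_PROTECTED, a BPF scheduler calling the kfunc from a non-RCU
context (such as a timer callback) must wrap it in an explicit RCU
section, along these lines:

  struct task_struct *p;

  bpf_rcu_read_lock();
  p = scx_bpf_cpu_curr(cpu);
  if (p) {
          /* inspect or act on the remote rq's current task */
  }
  bpf_rcu_read_unlock();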
Note: this also requires commit 1512231b6cc86 ("bpf: Enforce RCU protection
for KF_RCU_PROTECTED"), currently in bpf-next, to actually enforce the
KF_RCU_PROTECTED semantics.
Fixes: 20b158094a1ad ("sched_ext: Introduce scx_bpf_cpu_curr()")
Cc: Christian Loehle <christian.loehle@arm.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Since the dynamic_events interface on tracefs is compatible with
kprobe_events and uprobe_events, it should also check the lockdown
status and reject writes when it is set.
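The check presumably mirrors the one used by kprobe_events and
uprobe_events, i.e. something like this early in the dynamic_events
open/write path:

  ret = security_locked_down(LOCKDOWN_TRACEFS);
  if (ret)
          return ret;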
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/175824455687.45175.3734166065458520748.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
When configuring osnoise CPUs via the write() syscall, the following KASAN
splat may be observed:
BUG: KASAN: slab-out-of-bounds in _parse_integer_limit+0x103/0x130
Read of size 1 at addr ffff88810121e3a1 by task test/447
CPU: 1 UID: 0 PID: 447 Comm: test Not tainted 6.17.0-rc6-dirty #288 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x55/0x70
print_report+0xcb/0x610
kasan_report+0xb8/0xf0
_parse_integer_limit+0x103/0x130
bitmap_parselist+0x16d/0x6f0
osnoise_cpus_write+0x116/0x2d0
vfs_write+0x21e/0xcc0
ksys_write+0xee/0x1c0
do_syscall_64+0xa8/0x2a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
This issue can be reproduced by the code below:

  const char *cpulist = "1";
  int fd = open("/sys/kernel/debug/tracing/osnoise/cpus", O_WRONLY);

  write(fd, cpulist, strlen(cpulist));
bitmap_parselist() is called to parse the cpulist; it requires that the
'buf' parameter be terminated with a '\0' or '\n'. Fix this issue by
appending a '\0' to 'buf' in osnoise_cpus_write().
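The fix presumably amounts to allocating one extra byte and terminating
the buffer before parsing (variable names illustrative):

  cpulist = kmalloc(count + 1, GFP_KERNEL);
  if (!cpulist)
          return -ENOMEM;
  if (copy_from_user(cpulist, ubuf, count)) {
          kfree(cpulist);
          return -EFAULT;
  }
  cpulist[count] = '\0';  /* bitmap_parselist() needs a terminator */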
Cc: <mhiramat@kernel.org>
Cc: <mathieu.desnoyers@efficios.com>
Cc: <tglozar@redhat.com>
Link: https://lore.kernel.org/20250916063948.3154627-1-wangliang74@huawei.com
Fixes: 17f89102fe23 ("tracing/osnoise: Allow arbitrarily long CPU string")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(),
the wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, while queue_work() again makes use
of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_wq is a per-CPU workqueue, yet nothing in its name says so, and that
CPU affinity constraint is very often not required by users. Make it clear
by adding a system_percpu_wq.
queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new per-CPU wq; if a user still sticks to the old name, a warning will be
printed along with a redirect to the new wq.
This patch adds the new system_percpu_wq, except for the mm, fs and net
subsystems, which are handled in separate patches.
The old wq will be kept for a few release cycles.
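After the change, the schedule_work() wrapper would presumably queue on
the new name, e.g.:

  static inline bool schedule_work(struct work_struct *work)
  {
          return queue_work(system_percpu_wq, work);
  }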
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/20250905091040.109772-2-marco.crivellari@suse.com
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Implementation of the new bpf_task_work_schedule kfuncs, which let a BPF
program schedule task_work callbacks for a target task:
* bpf_task_work_schedule_signal() - schedules with TWA_SIGNAL
* bpf_task_work_schedule_resume() - schedules with TWA_RESUME
Each map value should embed a struct bpf_task_work, which the kernel side
pairs with struct bpf_task_work_kern, containing a pointer to struct
bpf_task_work_ctx that maintains the metadata needed to schedule the
concrete callback.
A small state machine and refcounting scheme ensures safe reuse and
teardown. State transitions:
         ______________________________
        |                              |
        v                              |
    [standby] ---> [pending] --> [scheduling] --> [scheduled]
        ^               |________________|_________
        |                                          |
        |                                          v
        |                                      [running]
        |__________________________________________|
All states may transition into FREED state:
[pending] [scheduling] [scheduled] [running] [standby] -> [freed]
A FREED terminal state coordinates with map-value
deletion (bpf_task_work_cancel_and_free()).
Scheduling itself is deferred via irq_work to keep the kfunc callable
from NMI context.
Lifetime is guarded with refcount_t + RCU Tasks Trace.
Main components:
* struct bpf_task_work_context – Metadata and state management per task
work.
* enum bpf_task_work_state – A state machine to serialize work
scheduling and execution.
* bpf_task_work_schedule() – The central helper that initiates
scheduling.
* bpf_task_work_acquire_ctx() – Attempts to take ownership of the context
pointed to by the passed struct bpf_task_work, allocating a new context
if none exists yet.
* bpf_task_work_callback() – Invoked when the actual task_work runs.
* bpf_task_work_irq() – An intermediate step (runs in softirq context)
to enqueue task work.
* bpf_task_work_cancel_and_free() – Cleanup for deleted BPF map entries.
Flow of successful task work scheduling:
1) bpf_task_work_schedule_* is called from BPF code.
2) Transition state from STANDBY to PENDING, mark context as owned by
this task work scheduler
3) irq_work_queue() schedules bpf_task_work_irq().
4) Transition state from PENDING to SCHEDULING (noop if transition
successful)
5) bpf_task_work_irq() attempts task_work_add(). If successful, state
transitions to SCHEDULED.
6) Task work calls bpf_task_work_callback(), which transitions the state
to RUNNING.
7) BPF callback is executed
8) Context is cleaned up, refcounts released, context state set back to
STANDBY.
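A rough sketch of how a BPF program might use the kfunc, assuming a
prototype along the lines of this series (map layout, callback prototype
and invocation context are illustrative):

  struct elem {
          struct bpf_task_work tw;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 128);
          __type(key, int);
          __type(value, struct elem);
  } tw_map SEC(".maps");

  static int tw_cb(struct bpf_map *map, void *key, void *value)
  {
          /* runs from task_work in the context of the target task */
          return 0;
  }

  /* in a program holding a valid task pointer and map value @e */
  bpf_task_work_schedule_resume(task, &e->tw, &tw_map, tw_cb, NULL);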
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250923112404.668720-8-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Calculating a BPF map key from a pointer to its value is already
duplicated in a couple of helpers, and the next patch introduces another
use case as well.
This patch extracts that functionality into a separate function.
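For reference, the duplicated pattern recovers a key from a value pointer
roughly as follows (a sketch of the existing logic, not the new helper's
exact body):

  if (map->map_type == BPF_MAP_TYPE_ARRAY) {
          struct bpf_array *array = container_of(map, struct bpf_array, map);

          idx = ((char *)value - array->value) / array->elem_size;
          key = &idx;
  } else {
          key = value - round_up(map->key_size, 8);
  }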
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250923112404.668720-7-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds necessary plumbing in verifier, syscall and maps to
support handling new kfunc bpf_task_work_schedule and kernel structure
bpf_task_work. The idea is similar to how we already handle bpf_wq and
bpf_timer.
verifier changes validate calls to bpf_task_work_schedule to make sure
it is safe and expected invariants hold.
btf part is required to detect bpf_task_work structure inside map value
and store its offset, which will be used in the next patch to calculate
key and value addresses.
arraymap and hashtab changes are needed to handle freeing of the
bpf_task_work: run the code needed to deinitialize it, for example
cancelling the task_work callback if possible.
The use of bpf_task_work and proper implementation for kfuncs are
introduced in the next patch.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250923112404.668720-6-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The verifier currently enforces a zero return value for all async
callbacks—a constraint originally introduced for bpf_timer. That
restriction is too narrow for other async use cases.
Relax the rule by allowing non-zero return codes from async callbacks in
general, while preserving the zero-return requirement for bpf_timer to
maintain its existing semantics.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250923112404.668720-5-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Extract the cleanup of known embedded structs into a dedicated helper.
Remove duplication and introduce a single source of truth for freeing
special embedded structs in hashtab.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250923112404.668720-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Refactor the verifier by pulling the common logic from
process_timer_func() into a dedicated helper. This allows reusing the
process_async_func() helper to verify the bpf_task_work struct in the
next patch.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: syzbot@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250923112404.668720-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Reduce code duplication in detection of the known special field types in
map values. This refactoring helps to avoid copying a chunk of code in
the next patch of the series.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250923112404.668720-2-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The saved_cmdlines_buffer struct is already zeroed by memset(). It's
redundant to initialize s->cmdline_idx to 0.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250825123200.306272-1-liaoyuanhong@vivo.com
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Replace the opencoded for_each_cpu(cpu, cpu_online_mask) loop with the
more readable and equivalent for_each_online_cpu(cpu) macro.
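The shape of the change (illustrative):

  -       for_each_cpu(cpu, cpu_online_mask)
  +       for_each_online_cpu(cpu)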
Link: https://lore.kernel.org/20250811064158.2456-1-wangfushuai@baidu.com
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Remove array_size() calls and replace vmalloc() with vmalloc_array() in
tracing_map_sort_entries(). vmalloc_array() is optimized better, uses
fewer instructions, and handles overflow more concisely[1].
[1]: https://lore.kernel.org/lkml/abc66ec5-85a4-47e1-9759-2f60ab111971@vivo.com/
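The shape of the change (variable names illustrative):

  -       entries = vmalloc(array_size(sizeof(sort_entry *), n_entries));
  +       entries = vmalloc_array(n_entries, sizeof(sort_entry *));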
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250817084725.59477-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Currently the syscall trace events show each value as hexadecimal, but
without adding "0x" it can be confusing:
sys_write(fd: 4, buf: 0x55c4a1fa9270, count: 44)
Looks like the above write wrote 44 bytes, when in reality it wrote 68
bytes.
Add a "0x" to all values greater than or equal to 10 to remove the
ambiguity. For values less than 10, leave off the "0x", as that just adds
noise to the output.
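With the change, the example above reads:
sys_write(fd: 4, buf: 0x55c4a1fa9270, count: 0x2c)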
Also change the iterator to check whether "i" is nonzero and print the
", " delimiter at the start of each argument, folding that logic into the
trace_seq_printf() call at the end.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Takaya Saeki <takayas@google.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Link: https://lore.kernel.org/20250923130713.764558957@kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The syscall events are pseudo events that hook to the raw syscalls. The
ftrace_syscall_enter/exit() callback is called by the raw_syscall
enter/exit tracepoints respectively whenever any of the syscall events are
enabled.
The trace_array has an array of syscall "files" that correspond to the
system calls based on their __NR_SYSCALL number. The array is read and if
there's a pointer to a trace_event_file then it is considered enabled and
if it is NULL that syscall event is considered disabled.
Currently it uses rcu_dereference_sched() to get this pointer and
rcu_assign_pointer() or RCU_INIT_POINTER() to write to it. This is
unnecessary, as the file pointer will not go away outside the
synchronization of the tracepoint logic itself, and this code adds no
extra RCU synchronization of its own.
Replace these functions with a simple READ_ONCE() and WRITE_ONCE() which
is all they need. This will also allow this code to not depend on
preemption being disabled as system call tracepoints are now allowed to
fault.
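The enter-path change would look roughly like this (the exit path is
analogous):

  -       trace_file = rcu_dereference_sched(tr->enter_syscall_files[syscall_nr]);
  +       trace_file = READ_ONCE(tr->enter_syscall_files[syscall_nr]);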
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Takaya Saeki <takayas@google.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Link: https://lore.kernel.org/20250923130713.594320290@kernel.org
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
kern_path_locked() is now only used to prepare for removing an object
from the filesystem (and that is the only credible reason for wanting a
positive locked dentry). Thus it corresponds to kern_path_create() and
so should have a corresponding name.
Unfortunately the name "kern_path_create" is somewhat misleading as it
doesn't actually create anything. The recently added
simple_start_creating() provides a better pattern I believe. The
"start" can be matched with "end" to bracket the creating or removing.
So this patch changes names:
kern_path_locked -> start_removing_path
kern_path_create -> start_creating_path
user_path_create -> start_creating_user_path
user_path_locked_at -> start_removing_user_path_at
done_path_create -> end_creating_path
and also introduces end_removing_path() which is identical to
end_creating_path().
__start_removing_path() (which was __kern_path_locked()) is enhanced to
call mnt_want_write() for consistency with start_creating_path().
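Assuming the renamed helpers keep kern_path_locked()'s calling
convention, a removal site then reads:

  struct path parent;
  struct dentry *dentry;

  dentry = start_removing_path(name, &parent);
  if (IS_ERR(dentry))
          return PTR_ERR(dentry);
  /* ... vfs_unlink() or similar ... */
  end_removing_path(&parent, dentry);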
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
audit_alloc_mark() and audit_get_nd() both need to perform a path
lookup getting the parent dentry (which must exist) and the final
target (following a LAST_NORM name) which sometimes doesn't need to
exist.
They don't need the parent to be locked, but use kern_path_locked() or
kern_path_locked_negative() anyway. This is somewhat misleading to the
casual reader.
This patch introduces a more targeted function, kern_path_parent(),
which returns without holding any locks. On success, "path" will be set
to the parent, which must be found, and the return value is the dentry
of the target, which might be negative.
This will clear the way to rename kern_path_locked() which is
otherwise only used to prepare for removing something.
It also allows us to remove kern_path_locked_negative(), which is
transformed into the new kern_path_parent().
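Usage would then follow this pattern (assuming the semantics described
above):

  struct path parent;
  struct dentry *dentry;

  dentry = kern_path_parent(pathname, &parent);
  if (IS_ERR(dentry))
          return PTR_ERR(dentry);
  /* parent is pinned but unlocked; dentry may be negative */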
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Torture-test updates:
* rcutorture: Fix jitter.sh spin time
* torture: Add --do-normal parameter to torture.sh help text
* torture: Announce kernel boot status at torture-test startup
* rcutorture: Suppress "Writer stall state" reports during boot
* rcutorture: Delay rcutorture readers and writers until boot completes
* torture: Delay CPU-hotplug operations until boot completes
* rcutorture: Delay forward-progress testing until boot completes
* rcutorture,refscale: Use kcalloc() instead of kzalloc()
* refperf: Remove redundant kfree() after torture_stop_kthread()
* refperf: Set reader_tasks to NULL after kfree()
|
|
SRCU updates:
* Create srcu_read_{,un}lock_fast_notrace()
* Add srcu_read_lock_fast_notrace() and srcu_read_unlock_fast_notrace()
* Add guards for notrace variants of SRCU-fast readers
* Document srcu_read_{,un}lock_fast() use of implicit RCU readers
* Document srcu_flip() memory-barrier D relation to SRCU-fast
* Remove preempt_disable/enable() in Tiny SRCU srcu_gp_start_if_needed()
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(),
the wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, while queue_work() again makes use
of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds the WQ_UNBOUND flag to sync_wq, making explicit that
this workqueue can be unbound and that it does not benefit from per-CPU
work.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(),
the wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, while queue_work() again makes use
of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This patch adds a new WQ_PERCPU flag to explicitly request the use of
the per-CPU behavior. Both flags coexist for one release cycle to allow
callers to transition their calls.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
All existing users have been updated accordingly.
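For callers, the conversion is mechanical (workqueue name illustrative):

  -       wq = alloc_workqueue("my_wq", 0, 0);
  +       wq = alloc_workqueue("my_wq", WQ_PERCPU, 0);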
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(),
the wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, while queue_work() again makes use
of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_wq is a per-CPU workqueue, yet nothing in its name says so, and that
CPU affinity constraint is very often not required by users. Make it clear
by adding a system_percpu_wq.
The old wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The only user of msi_post_free() - powerpc/pseries - has been changed to
use msi_teardown().
Remove this unused callback.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250916061007.964005-1-namcao@linutronix.de
|
|
The timer drivers could be converted into modules. The various functions
that register a clocksource or clockevent already export their symbols
for module use, but sched_clock_register() does not.
Export the symbol so that drivers using this function can be built as
modules.
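The export itself is presumably the usual one-liner next to the function
definition:

  EXPORT_SYMBOL_GPL(sched_clock_register);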
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Will McVicker <willmcvicker@google.com>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20250602151853.1942521-8-daniel.lezcano@linaro.org
|
|
The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit()
path is very broken.
sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct
itself. If tsk != current and tsk is not a leader, this process can exit/exec
and task_lock(tsk->group_leader) may use the already freed task_struct.
Another problem is that sys_prlimit64() can race with mt-exec which changes
->group_leader. In this case do_prlimit() may take the wrong lock, or (worse)
->group_leader may change between task_lock() and task_unlock().
Change sys_prlimit64() to take tasklist_lock when necessary. This is not
nice, but I don't see a better fix for -stable.
Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com
Fixes: 18c91bb2d872 ("prlimit: do not grab the tasklist_lock")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch extends the BPF_PROG_LOAD command by adding three new fields
to `union bpf_attr` in the user-space API:
- signature: A pointer to the signature blob.
- signature_size: The size of the signature blob.
- keyring_id: The serial number of a loaded kernel keyring (e.g.,
the user or session keyring) containing the trusted public keys.
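The new fields would sit in the BPF_PROG_LOAD section of union bpf_attr,
roughly like this (exact types and placement per the patch):

  union bpf_attr {
          struct { /* used by BPF_PROG_LOAD */
                  ...
                  __aligned_u64   signature;
                  __u32           signature_size;
                  __s32           keyring_id;
          };
          ...
  };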
When a BPF program is loaded with a signature, the kernel:
1. Retrieves the trusted keyring using the provided `keyring_id`.
2. Verifies the supplied signature against the BPF program's
instruction buffer.
3. If the signature is valid and was generated by a key in the trusted
keyring, the program load proceeds.
4. If no signature is provided, the load proceeds as before, allowing
for backward compatibility. LSMs can choose to restrict unsigned
programs and implement a security policy.
5. If signature verification fails for any reason,
the program is not loaded.
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/r/20250921160120.9711-2-kpsingh@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The is_prs_invalid helper function is redundant as it serves a similar
purpose to is_partition_invalid. It can be fully replaced by the existing
is_partition_invalid function, so this patch removes the is_prs_invalid
helper.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
If the parent is not a valid partition, an error will be returned before
any partition update command is processed. This means the
WARN_ON_ONCE(!is_partition_valid(parent)) can never be triggered, so
it is safe to remove.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The nodelist_parse function already handles empty nodemask input
appropriately, making it unnecessary to handle this case separately
during the node mask update process.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fix from Tejun Heo:
"This contains a fix for sched_ext idle CPU selection that likely fixes
a substantial performance regression.
The scx_bpf_select_cpu_dfl/and() kfuncs were incorrectly detecting all
tasks as migration-disabled when called outside ops.select_cpu(),
causing them to always return -EBUSY instead of finding idle CPUs.
The fix properly distinguishes between genuinely migration-disabled
tasks vs. the current task whose migration is temporarily disabled by
BPF execution"
* tag 'sched_ext-for-6.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: idle: Handle migration-disabled tasks in BPF code
|
|
When scx_bpf_select_cpu_dfl()/and() kfuncs are invoked outside of
ops.select_cpu() we can't rely on @p->migration_disabled to determine if
migration is disabled for the task @p.
In fact, migration is always disabled for the current task while running
BPF code: __bpf_prog_enter() disables migration and __bpf_prog_exit()
re-enables it.
To handle this, when @p->migration_disabled == 1, check whether @p is
the current task. If so, migration was not disabled before entering the
callback, otherwise migration was disabled.
This ensures correct idle CPU selection in all cases. The behavior of
ops.select_cpu() remains unchanged, because this callback is never
invoked for the current task and migration-disabled tasks are always
excluded.
Example: without this change scx_bpf_select_cpu_and() called from
ops.enqueue() always returns -EBUSY; with this change applied, it
correctly returns idle CPUs.
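The check described above boils down to something like:

  static bool is_bpf_migration_disabled(const struct task_struct *p)
  {
          /*
           * BPF execution disables migration for the current task, so a
           * single disable count on @p == current comes from the BPF
           * trampoline rather than from the task itself.
           */
          if (p == current)
                  return p->migration_disabled > 1;
          else
                  return p->migration_disabled;
  }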
Fixes: 06efc9fe0b8de ("sched_ext: idle: Handle migration-disabled tasks in idle selection")
Cc: stable@vger.kernel.org # v6.16+
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add ns_debug() that asserts that the correct operations are used for the
namespace type.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Simply derive the ns operations from the namespace type.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
vhost_task_create() creates a task and keeps a reference to its
task_struct. That task may exit early via a signal and its task_struct
will be released.
A pending vhost_task_wake() will then attempt to wake the task and
access a task_struct which is no longer there.
Acquire a reference on the task_struct while creating the thread and
release the reference while the struct vhost_task itself is removed.
If the task exits early due to a signal, then the vhost_task_wake() will
still access a valid task_struct. The wake is safe and will be skipped
in this case.
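In outline, the fix pins the task_struct for the lifetime of the
vhost_task (a sketch; surrounding code elided):

  /* at creation, once the task has been set up */
  vtsk->task = get_task_struct(tsk);

  /* when the struct vhost_task itself is freed */
  put_task_struct(vtsk->task);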
Fixes: f9010dbdce911 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
Reported-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/all/aKkLEtoDXKxAAWju@google.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Message-Id: <20250918181144.Ygo8BZ-R@linutronix.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Sean Christopherson <seanjc@google.com>
|
|
Patch series "Improvements to Victim Process Thawing and OOM Reaper
Traversal Order", v10.
This patch series focuses on optimizing victim process thawing and
refining the traversal order of the OOM reaper. Since __thaw_task() is
used to thaw a single thread of the victim, thawing only one thread cannot
guarantee the exit of the OOM victim when it is frozen. Patch 1 thaw the
entire process of the OOM victim to ensure that OOM victims are able to
terminate themselves. Even if the oom_reaper is delayed, patch 2 is still
beneficial for reaping processes with a large address space footprint, and
it also greatly improves process_mrelease.
This patch (of 10):
OOM killer is a mechanism that selects and kills processes when the system
runs out of memory to reclaim resources and keep the system stable. But
the oom victim cannot terminate on its own when it is frozen, even if the
OOM victim task is thawed through __thaw_task(). This is because
__thaw_task() can only thaw a single OOM victim thread, and cannot thaw
the entire OOM victim process.
In addition, freezing_slow_path() determines whether a task is an OOM
victim by checking the task's TIF_MEMDIE flag. When a task is identified
as an OOM victim, the freezer bypasses both PM freezing and cgroup
freezing states to thaw it.
Historically, TIF_MEMDIE was a "this is the oom victim & it has access to
memory reserves" flag. It had thread vs. process problems, and
tsk_is_oom_victim() was introduced later to get rid of them and other
issues, as well as to guarantee that the oom victim's mm can be identified
reliably for the oom_reaper.
Therefore, thaw_process() is introduced to unfreeze all threads within the
OOM victim process, ensuring that every thread is properly thawed. The
freezer now uses tsk_is_oom_victim() to determine OOM victim status,
allowing all victim threads to be unfrozen as necessary.
With this change, the entire OOM victim process will be thawed when an OOM
event occurs, ensuring that the victim can terminate on its own.
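thaw_process() plausibly iterates over every thread of the victim, along
these lines:

  void thaw_process(struct task_struct *p)
  {
          struct task_struct *t;

          rcu_read_lock();
          for_each_thread(p, t)
                  __thaw_task(t);
          rcu_read_unlock();
  }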
Link: https://lkml.kernel.org/r/20250915162946.5515-1-zhongjinji@honor.com
Link: https://lkml.kernel.org/r/20250915162946.5515-2-zhongjinji@honor.com
Signed-off-by: zhongjinji <zhongjinji@honor.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When memory block is removed via try_remove_memory(), it eventually
reaches release_mem_region_adjustable(). The current implementation
assumes that when a busy memory resource is split into two, all child
resources remain in the lower address range.
This simplification causes problems when child resources actually belong
to the upper split. For example:
* Initial memory layout:
lsmem
RANGE SIZE STATE REMOVABLE BLOCK
0x0000000000000000-0x00000002ffffffff 12G online yes 0-95
* /proc/iomem
00000000-2dfefffff : System RAM
158834000-1597b3fff : Kernel code
1597b4000-159f50fff : Kernel data
15a13c000-15a218fff : Kernel bss
2dff00000-2ffefffff : Crash kernel
2fff00000-2ffffffff : System RAM
* After offlining and removing range
0x150000000-0x157ffffff
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED
(output according to upcoming lsmem changes with the configured column:
s390)
RANGE SIZE STATE BLOCK CONFIGURED
0x0000000000000000-0x000000014fffffff 5.3G online 0-41 yes
0x0000000150000000-0x0000000157ffffff 128M offline 42 no
0x0000000158000000-0x00000002ffffffff 6.6G online 43-95 yes
The iomem resource gets split into two entries, but kernel code, kernel
data, and kernel bss remain attached to the lower resource [0–5376M]
instead of the correct upper resource [5504M–12288M].
As a result, WARN_ON() triggers in release_mem_region_adjustable()
("Usecase: split into two entries - we need a new resource")
------------[ cut here ]------------
WARNING: CPU: 5 PID: 858 at kernel/resource.c:1486
release_mem_region_adjustable+0x210/0x280
Modules linked in:
CPU: 5 UID: 0 PID: 858 Comm: chmem Not tainted 6.17.0-rc2-11707-g2c36aaf3ba4e
Hardware name: IBM 3906 M04 704 (z/VM 7.3.0)
Krnl PSW : 0704d00180000000 0000024ec0dae0e4
(release_mem_region_adjustable+0x214/0x280)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000000 00000002ffffafc0 fffffffffffffff0 0000000000000000
000000014fffffff 0000024ec2257608 0000000000000000 0000024ec2301758
0000024ec22680d0 00000000902c9140 0000000150000000 00000002ffffafc0
000003ffa61d8d18 0000024ec21fb478 0000024ec0dae014 000001cec194fbb0
Krnl Code: 0000024ec0dae0d8: af000000 mc 0,0
0000024ec0dae0dc: a7f4ffc1 brc 15,0000024ec0dae05e
#0000024ec0dae0e0: af000000 mc 0,0
>0000024ec0dae0e4: a5defffd llilh %r13,65533
0000024ec0dae0e8: c04000c6064c larl %r4,0000024ec266ed80
0000024ec0dae0ee: eb1d400000f8 laa %r1,%r13,0(%r4)
0000024ec0dae0f4: 07e0 bcr 14,%r0
0000024ec0dae0f6: a7f4ffc0 brc 15,0000024ec0dae076
[<0000024ec0dae0e4>] release_mem_region_adjustable+0x214/0x280
([<0000024ec0dadf3c>] release_mem_region_adjustable+0x6c/0x280)
[<0000024ec10a2130>] try_remove_memory+0x100/0x140
[<0000024ec10a4052>] __remove_memory+0x22/0x40
[<0000024ec18890f6>] config_mblock_store+0x326/0x3e0
[<0000024ec11f7056>] kernfs_fop_write_iter+0x136/0x210
[<0000024ec1121e86>] vfs_write+0x236/0x3c0
[<0000024ec11221b8>] ksys_write+0x78/0x110
[<0000024ec1b6bfbe>] __do_syscall+0x12e/0x350
[<0000024ec1b782ce>] system_call+0x6e/0x90
Last Breaking-Event-Address:
[<0000024ec0dae014>] release_mem_region_adjustable+0x144/0x280
---[ end trace 0000000000000000 ]---
Also, resource adjustment doesn't happen and stale resources still cover
[0-12288M]. Later, memory re-add fails in register_memory_resource() with
-EBUSY.
i.e: /proc/iomem is still:
00000000-2dfefffff : System RAM
158834000-1597b3fff : Kernel code
1597b4000-159f50fff : Kernel data
15a13c000-15a218fff : Kernel bss
2dff00000-2ffefffff : Crash kernel
2fff00000-2ffffffff : System RAM
Enhance release_mem_region_adjustable() to reassign child resources to the
correct parent after a split. Children are now assigned based on their
actual range: if they fall within the lower split, they are kept in the
lower parent; if they fall within the upper split, they are moved to the
upper parent.
Kernel code/data/bss regions are not offlined, so they will always reside
entirely within one parent and never span across both.
Output after the enhancement:
* Initial state /proc/iomem (before removal of memory block):
00000000-2dfefffff : System RAM
1f94f8000-1fa477fff : Kernel code
1fa478000-1fac14fff : Kernel data
1fae00000-1faedcfff : Kernel bss
2dff00000-2ffefffff : Crash kernel
2fff00000-2ffffffff : System RAM
* Offline and remove 0x1e8000000-0x1efffffff memory range
* /proc/iomem
00000000-1e7ffffff : System RAM
1f0000000-2dfefffff : System RAM
1f94f8000-1fa477fff : Kernel code
1fa478000-1fac14fff : Kernel data
1fae00000-1faedcfff : Kernel bss
2dff00000-2ffefffff : Crash kernel
2fff00000-2ffffffff : System RAM
Link: https://lkml.kernel.org/r/20250912123021.3219980-1-sumanthk@linux.ibm.com
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
dma_common_contiguous_remap() is used to remap an "allocated contiguous
region". Within a single allocation, there is no need to use nth_page()
anymore.
Neither the buddy, nor hugetlb, nor CMA will hand out problematic page
ranges.
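The shape of the change (illustrative):

  -               pages[i] = nth_page(page, i);
  +               pages[i] = page + i;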
Link: https://lkml.kernel.org/r/20250901150359.867252-24-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Inside the small stack_not_used() function there are several ifdefs for
the stack growing-up vs. regular versions. Instead, just implement this
function twice: once for growing-up stacks and once for the regular case.
Add comments like /* !CONFIG_DEBUG_STACK_USAGE */ to clarify what the
ifdefs are doing.
[linus.walleij@linaro.org: rebased, function moved elsewhere in the kernel]
Link: https://lkml.kernel.org/r/20250829-fork-cleanups-for-dynstack-v1-2-3bbaadce1f00@linaro.org
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-13-pasha.tatashin@soleen.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: task_stack: Stack handling cleanups".
These are some small cleanups for the fork code, split off from Pasha's
dynamic stack patch series; they are generally nice on their own, so
let's propose them for merging.
This patch (of 2):
No need to zero a cached stack if the memcg charge fails, so move the
charging attempt before the memset operation.
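In outline (names follow the existing fork.c cached-stack path):

  if (memcg_charge_kernel_stack(s)) {
          vfree(s->addr);
          return -ENOMEM;
  }
  /* Zero the cached stack only once the charge has succeeded. */
  memset(stack, 0, THREAD_SIZE);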
Link: https://lkml.kernel.org/r/20250829-fork-cleanups-for-dynstack-v1-0-3bbaadce1f00@linaro.org
Link: https://lkml.kernel.org/r/20250829-fork-cleanups-for-dynstack-v1-1-3bbaadce1f00@linaro.org
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-6-pasha.tatashin@soleen.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Correct several typos found in comments across various files in the
kernel/time directory.
No functional changes are introduced by these corrections.
Signed-off-by: Haofeng Li <lihaofeng@kylinos.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The check for scancode 0xe0 is always false because the scan code is
masked with 0x7f earlier on, so values greater than 0x7f can never occur.
Remove the redundant check.
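For reference, the context (illustrative):

  scancode &= 0x7f;               /* high bit is stripped here... */
  ...
  if (scancode == 0xe0)           /* ...so this can never be true */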
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
|
|
strcpy() is deprecated; use the new helper function kdb_strdup_dequote()
instead. In addition to string duplication similar to kdb_strdup(), it
also trims surrounding quotes from the input string if present.
kdb_strdup_dequote() also checks for a trailing quote in the input
string which was previously not checked.
Link: https://github.com/KSPP/linux/issues/88
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
|
|
strcpy() is deprecated; use memcpy() instead.
We can safely use memcpy() because we already know the length of the
source string 'cp' and that it is guaranteed to be NUL-terminated within
the first KDB_GREP_STRLEN bytes.
Link: https://github.com/KSPP/linux/issues/88
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
|
|
strcpy() is deprecated and its behavior is undefined when the source and
destination buffers overlap. Use memmove() instead to avoid any
undefined behavior.
Adjust comments for clarity.
Link: https://github.com/KSPP/linux/issues/88
Fixes: 5d5314d6795f ("kdb: core for kgdb back end (1 of 2)")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
|