|
Till now, the runtime PM workqueue has been flagged as freezable, so it
does not process work items during system-wide PM transitions like
system suspend and resume. The original reason to do that was to
reduce the likelihood of runtime PM getting in the way of system-wide
PM processing, but now it is mostly an optimization because (1) runtime
suspend of devices is prevented by bumping up their runtime PM usage
counters in device_prepare() and (2) device drivers are expected to
disable runtime PM for the devices handled by them before they embark
on system-wide PM activities that may change the state of the hardware
or otherwise interfere with runtime PM. However, it prevents
asynchronous runtime resume of devices from working during system-wide
PM transitions, which is confusing because synchronous runtime resume
is not prevented at the same time, and it also sometimes turns out to
be problematic.
For example, it has been reported that blk_queue_enter() may deadlock
during a system suspend transition because of the pm_request_resume()
usage in it [1]. It may also deadlock during a system resume transition
in a similar way. That happens because the asynchronous runtime resume
of the given device is not processed due to the freezing of the runtime
PM workqueue. While it may be better to address this particular issue
in the block layer, the very presence of it means that similar problems
may be expected to occur elsewhere.
For this reason, remove the WQ_FREEZABLE flag from the runtime PM
workqueue and make device_suspend_late() use the generic variant of
pm_runtime_disable() that will carry out runtime resume of the device
synchronously if there is pending resume work for it.
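As a minimal sketch of the workqueue side of this change (assuming pm_wq
is created via alloc_workqueue() as in kernel/power/main.c; the exact
hunk may differ):
static int __init pm_start_workqueue(void)
{
	/* Was: alloc_workqueue("pm", WQ_FREEZABLE, 0); dropping the flag
	 * lets pm_wq process work items across system-wide PM transitions. */
	pm_wq = alloc_workqueue("pm", 0, 0);

	return pm_wq ? 0 : -ENOMEM;
}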
Also update the comment before the pm_runtime_disable() call in
device_suspend_late() to document the fact that runtime PM
should not be expected to work for the device until the end of
device_resume_early(), and update the related documentation.
This change may uncover some latent issues related to queuing up
asynchronous runtime resume work items during system suspend or
hibernation, even though it is not expected to. Any such issues
should be limited to the interference between runtime resume and
system-wide PM callbacks in cases where device drivers start
to handle system-wide PM before disabling runtime PM, as described
above.
Link: https://lore.kernel.org/linux-pm/20251126101636.205505-2-yang.yang@vivo.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/12794222.O9o76ZdvQC@rafael.j.wysocki
|
|
There are three layers of logic in the scheduler that
deal with 'has_blocked' (load) handling in the NOHZ code:
(1) nohz.has_blocked and
(2) rq->has_blocked_load, which deal with NOHZ idle balancing,
(3) and cfs_rq_has_blocked(), which is part of the layer
that passes the SMP load-balancing signal to the
NOHZ layers.
The 'has_blocked' and 'has_blocked_load' names are used
in a mixed fashion, sometimes within the same function.
Standardize on 'has_blocked_load' to make it all easy
to read and easy to grep.
No change in functionality.
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://patch.msgid.link/aS6yvxyc3JfMxxQW@gmail.com
|
|
for wrapped-signed arithmetic
We have to be careful with vruntime comparisons and subtraction,
due to the possibility of wrapping, so we have macros like:
#define vruntime_gt(field, lse, rse) ({ (s64)((lse)->field - (rse)->field) > 0; })
Which is used like this:
if (vruntime_gt(min_vruntime, se, rse))
se->min_vruntime = rse->min_vruntime;
Replace this with an easier-to-read pattern that uses the regular
arithmetic operators:
if (vruntime_cmp(se->min_vruntime, ">", rse->min_vruntime))
se->min_vruntime = rse->min_vruntime;
Also replace vruntime subtractions with vruntime_op():
- delta = (s64)(sea->vruntime - seb->vruntime) +
- (s64)(cfs_rqb->zero_vruntime_fi - cfs_rqa->zero_vruntime_fi);
+ delta = vruntime_op(sea->vruntime, "-", seb->vruntime) +
+ vruntime_op(cfs_rqb->zero_vruntime_fi, "-", cfs_rqa->zero_vruntime_fi);
In the vruntime_cmp() and vruntime_op() macros, use __builtin_strcmp(),
because __HAVE_ARCH_STRCMP might turn off the compiler optimizations we
rely on here to catch usage bugs.
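To see why the signed-difference idiom is needed in the first place,
here is a standalone demo (plain C, not kernel code) of comparing
wrapped u64-style values:
#include <stdint.h>
#include <stdio.h>

/* Same idiom as the kernel macros above, in standalone form. */
#define wrap_gt(a, b)	((int64_t)((a) - (b)) > 0)

int main(void)
{
	uint64_t a = UINT64_MAX - 10;	/* just before the wrap */
	uint64_t b = a + 20;		/* logically later, but wrapped */

	printf("naive  b > a : %d\n", b > a);		/* 0 - wrong */
	printf("wrap_gt(b, a): %d\n", wrap_gt(b, a));	/* 1 - correct */
	return 0;
}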
No change in functionality.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
functions
The ::avg_vruntime field is a misnomer: it says it's an
'average vruntime', but in reality it's the momentary sum
of the weighted vruntimes of all queued tasks, which is
at least a division away from being an average.
This is clear from comments about the math of fair scheduling:
* \Sum (v_i - v0) * w_i := cfs_rq->avg_vruntime
This confusion is increased by the cfs_avg_vruntime() function,
which does perform the division and returns a true average.
The sum of all weighted vruntimes should be named thusly,
so rename the field to ::sum_w_vruntime. (As arguably
::sum_weighted_vruntime would be a bit of a mouthful.)
Understanding the scheduler is hard enough already, without
extra layers of obfuscated naming. ;-)
Also rename related helper functions:
sum_vruntime_add() => sum_w_vruntime_add()
sum_vruntime_sub() => sum_w_vruntime_sub()
sum_vruntime_update() => sum_w_vruntime_update()
With the notable exception of cfs_avg_vruntime(), which
was named accurately.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-7-mingo@kernel.org
|
|
The ::avg_load field is a long-standing misnomer: it says it's an
'average load', but in reality it's the momentary sum of the load
of all currently runnable tasks. We'd have to also perform a
division by nr_running (or use time-decay) to arrive at any sort
of average value.
This is clear from comments about the math of fair scheduling:
* \Sum w_i := cfs_rq->avg_load
The sum of all weights is ... the sum of all weights, not
the average of all weights.
To make it doubly confusing, there's also an ::avg_load
in the load-balancing struct sg_lb_stats, which *is* a
true average.
The second part of the field's name is a minor misnomer
as well: it says 'load', and it is indeed a load_weight
structure as it shares code with the load-balancer - but
it's only in an SMP load-balancing context where
load = weight, in the fair scheduling context the primary
purpose is the weighting of different nice levels.
So rename the field to ::sum_weight instead, which makes
the terminology of the EEVDF math match up with our
implementation of it:
* \Sum w_i := cfs_rq->sum_weight
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-6-mingo@kernel.org
|
|
- Fix vertical alignment
- Fix typos
- Fix capitalization
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-3-mingo@kernel.org
|
|
Join two identical #ifdef blocks:
#ifdef CONFIG_FAIR_GROUP_SCHED
...
#endif
#ifdef CONFIG_FAIR_GROUP_SCHED
...
#endif
Also mark nested #ifdef blocks in the usual fashion, to make
it more apparent where in a nested hierarchy of #ifdefs we
are at a glance.
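For illustration, the usual marking convention looks like this (the
nested config name is just an example):
#ifdef CONFIG_FAIR_GROUP_SCHED
# ifdef CONFIG_SMP
...
# endif /* CONFIG_SMP */
#endif /* CONFIG_FAIR_GROUP_SCHED */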
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://patch.msgid.link/20251201064647.1851919-2-mingo@kernel.org
|
|
Add some checks to the sched_change pattern to validate assumptions
around changing classes.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.771691954@infradead.org
|
|
The task_tick_fair() function does:
- update the hierarchical runtimes
- drive NUMA-balancing
- update load-balance statistics
- drive force-idle preemption
All but the very first can be limited to the periodic tick. Let hrtick
only update accounting and drive preemption, not load-balancing and
other bits.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20250918080205.563385766@infradead.org
|
|
With fair switched to rcu_dereference_all() validation, having IRQs or
preemption disabled is sufficient; remove the rcu_read_lock()
clutter.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.647502625@infradead.org
|
|
With the {rcu,sched,bh} RCU flavours being unified, it doesn't really
make sense to check for just the rcu one. Switch to the _all family of
verification helpers, which includes all three of the listed flavours.
Notably, this will enable us to remove some superfluous
rcu_read_lock() regions when we know they are inside preempt/IRQ
disabled regions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
rcu_dereference_sched_domain()
Remove 'check' from the name for being surplus to requirements.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
While poking at this code recently I noted we do a pointless
unlock+lock cycle in sched_balance_newidle(). We drop the rq->lock (so
we can balance) but then instantly grab the same rq->lock again in
sched_balance_update_blocked_averages().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.532469061@infradead.org
|
|
Nine (and a half) instances of the same pattern are just silly; fold the lot.
Notably, the half instance in enqueue_load_avg() is right after setting
cfs_rq->avg.load_sum to cfs_rq->avg.load_avg * get_pelt_divider(&cfs_rq->avg).
Since get_pelt_divider() >= PELT_MIN_DIVIDER, this ends up being a no-op
change.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20251127154725.413564507@infradead.org
|
|
Commit af65320948b8 ("bpf: Bump iter seq size to support BTF
representation of large data structures") increased the fixed buffer
size from PAGE_SIZE to PAGE_SIZE << 3, but the docs for the function
didn't get updated at the same time. Update them.
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251207091005.2829703-1-tjmercier@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
- Fix CPU hotplug callbacks to disable interrupts on UP kernels
* tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Make atomic hotplug callbacks run with interrupts disabled on UP
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
- Fix NULL pointer dereference crash in the Intel PMU driver
- Fix missing read event generation on task exit
- Fix AMD uncore driver init error handling
- Fix whitespace noise
* tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
perf/core: Fix missing read event generation on task exit
perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error
perf/uprobes: Remove <space><Tab> whitespace noise
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
- Fix error code in the irqchip/mchp-eic driver
- Fix setup_percpu_irq() affinity assumptions
- Remove the unused irq_domain_add_tree() function
* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
irqdomain: Delete irq_domain_add_tree()
genirq: Allow NULL affinity for setup_percpu_irq()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"There are no significant series in this small merge. Please see the
individual changelogs for details"
[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]
* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memfd_luo: add CONFIG_SHMEM dependency
mm: shmem: avoid build warning for CONFIG_SHMEM=n
ocfs2: fix memory leak in ocfs2_merge_rec_left()
ocfs2: invalidate inode if i_mode is zero after block read
ocfs2: avoid -Wflex-array-member-not-at-end warning
ocfs2: convert remaining read-only checks to ocfs2_emergency_state
ocfs2: add ocfs2_emergency_state helper and apply to setattr
checkpatch: add uninitialized pointer with __free attribute check
args: fix documentation to reflect the correct numbers
ocfs2: fix kernel BUG in ocfs2_find_victim_chain
liveupdate: luo_core: fix redundant bound check in luo_ioctl()
ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
fs/fat: remove unnecessary wrapper fat_max_cache()
ocfs2: replace deprecated strcpy with strscpy
ocfs2: check tl_used after reading it from truncate log inode
liveupdate: luo_file: don't use invalid list iterator
|
|
Chris reported that the recent affinity management changes result in
overwriting the already initialized thread flags.
Use set_bit() to set the affinity bit instead of assigning the bit value to
the flags.
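As a sketch of the difference (the hunk is illustrative, not the actual
patch; IRQTF_AFFINITY and ->thread_flags are the genirq names):
-	action->thread_flags = IRQTF_AFFINITY;		/* clobbers flags set earlier */
+	set_bit(IRQTF_AFFINITY, &action->thread_flags);	/* sets only this bit */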
Fixes: 801afdfbfcd9 ("genirq: Fix interrupt threads affinity vs. cpuset isolated partitions")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://patch.msgid.link/87ecp0e4cf.ffs@tglx
Closes: https://lore.kernel.org/all/20251212014848.3509622-1-clm@meta.com
|
|
move_local_task_to_local_dsq() is used when moving a task from a non-local
DSQ to a local DSQ on the same CPU. It directly manipulates the local DSQ
without going through dispatch_enqueue() and was missing the post-enqueue
handling that triggers preemption when SCX_ENQ_PREEMPT is set or the idle
task is running.
The function is used by move_task_between_dsqs() which backs
scx_bpf_dsq_move() and may be called while the CPU is busy.
Add a local_dsq_post_enq() call to move_local_task_to_local_dsq(). As the
dispatch path doesn't need post-enqueue handling, add an SCX_RQ_IN_BALANCE
early exit to keep consume_dispatch_q() behavior unchanged and avoid
triggering an unnecessary resched when scx_bpf_dsq_move() is used from the
dispatch path.
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Factor out local_dsq_post_enq() which performs post-enqueue handling for
local DSQs - triggering resched_curr() if SCX_ENQ_PREEMPT is specified or if
the current CPU is idle. No functional change.
This will be used by the next patch to fix move_local_task_to_local_dsq().
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
scx_enable() calls scx_bypass(true) to initialize in bypass mode and then
scx_bypass(false) on success to exit. If scx_enable() fails during task
initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error -
it jumps to err_disable while bypass is still active. scx_disable_workfn()
then calls scx_bypass(true/false) for its own bypass, leaving the bypass depth
at 1 instead of 0. This causes the system to remain permanently in bypass mode
after a failed scx_enable().
Failures after task initialization is complete - e.g. scx_tryset_enable_state()
at the end - already call scx_bypass(false) before reaching the error path and
are not affected. This only affects a subset of failure modes.
Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and
having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This
is a temporary measure as the bypass depth will be moved into the sched
instance, which will make this tracking unnecessary.
Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Chris Mason <clm@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/stable/286e6f7787a81239e1ce2989b52391ce%40kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The new memfd code fails to link without SHMEM:
aarch64-linux-ld: mm/memfd_luo.o: in function `memfd_luo_retrieve_folios':
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0xdc): undefined reference to `shmem_add_to_page_cache'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x11c): undefined reference to `shmem_inode_acct_blocks'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x134): undefined reference to `shmem_recalc_inode'
Add a Kconfig dependency to disallow that configuration.
Link: https://lkml.kernel.org/r/20251204100203.1034394-1-arnd@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The kernel test robot reported a Smatch warning:
kernel/liveupdate/luo_core.c:402 luo_ioctl() warn: unsigned 'nr' is
never less than zero.
This occurs because 'nr' is unsigned and LIVEUPDATE_CMD_BASE is currently
defined as 0, making the check (nr < LIVEUPDATE_CMD_BASE) always false.
Remove the explicit lower bound check. The logic remains correct because
'nr' is unsigned; if nr is less than LIVEUPDATE_CMD_BASE, the expression
(nr - LIVEUPDATE_CMD_BASE) will wrap around to a large positive value.
This will inevitably be larger than ARRAY_SIZE(luo_ioctl_ops) and be
caught by the upper bound check.
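A standalone demo of the unsigned-wrap reasoning above (the constants
are illustrative stand-ins, not the LUO definitions):
#include <stdio.h>

#define CMD_BASE 8u	/* stand-in for LIVEUPDATE_CMD_BASE */
#define NR_OPS   4u	/* stand-in for ARRAY_SIZE(luo_ioctl_ops) */

int main(void)
{
	unsigned int nr = 3;			/* below CMD_BASE */
	unsigned int idx = nr - CMD_BASE;	/* wraps to a huge value */

	/* The single upper-bound check also rejects "negative" indices. */
	if (idx >= NR_OPS)
		printf("rejected: idx=%u\n", idx);
	else
		printf("dispatching op %u\n", idx);
	return 0;
}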
Link: https://lkml.kernel.org/r/20251130010919.1488230-1-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511280300.6pvBmXUS-lkp@intel.com/
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If we exit a list_for_each_entry() without hitting a break then the list
iterator points to an offset from the list_head. It's a non-NULL but
invalid pointer and dereferencing it isn't allowed.
Introduce a new "found" variable to test instead.
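The resulting shape is roughly the following (struct and field names are
hypothetical, for illustration only):
	struct luo_file *h;
	bool found = false;

	list_for_each_entry(h, &luo_files, list) {
		if (h->token == token) {
			found = true;
			break;		/* h is valid only on this path */
		}
	}
	if (!found)			/* never dereference h here */
		return -ENOENT;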
Link: https://lkml.kernel.org/r/aSlMc4SS09Re4_xn@stanley.mountain
Fixes: 3ee1d673194e ("liveupdate: luo_file: implement file systems callbacks")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202511280420.y9O4fyhX-lkp@intel.com/
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Use the MSI parent domain API instead of the legacy API for setup and
teardown of PCI MSI IRQs
- Select POSIX_CPU_TIMERS_TASK_WORK now that VIRT_XFER_TO_GUEST_WORK
has been implemented for s390
- Fix a KVM bug which can lead to guest memory corruption
- Fix KASAN shadow memory mapping for hotplugged memory
- Minor bug fixes and improvements
* tag 's390-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/bug: Add missing alignment
s390/bug: Add missing CONFIG_BUG ifdef again
KVM: s390: Fix gmap_helper_zap_one_page() again
s390/pci: Migrate s390 IRQ logic to IRQ domain API
genirq: Change hwirq parameter to irq_hw_number_t
s390: Select POSIX_CPU_TIMERS_TASK_WORK
s390: Unmap early KASAN shadow on memory offlining
s390/vmem: Support 2G page splitting for KASAN shadow freeing
s390/boot: Use entire page for PTEs
s390/vmur: Use scnprintf() instead of sprintf()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
- last minute fix for missing parenthesis in recently merged code (Hans
de Goede)
- removal of excessive, non-fatal warnings (Dave Kleikamp)
* tag 'dma-mapping-6.19-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: Fix DMA_BIT_MASK() macro being broken
dma/pool: eliminate alloc_pages warning in atomic_pool_expand
|
|
Commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type
tracking") started distinguishing read vs write accesses performed by
helpers.
The second argument of bpf_d_path() is a pointer to a buffer that the
helper fills with the resulting path. However, its prototype currently
uses ARG_PTR_TO_MEM without MEM_WRITE.
Before 37cce22dbd51, helper accesses were conservatively treated as
potential writes, so this mismatch did not cause issues. Since that
commit, the verifier may incorrectly assume that the buffer contents
are unchanged across the helper call and base its optimizations on this
wrong assumption. This can lead to misbehaviour in BPF programs that
read back the buffer, such as prefix comparisons on the returned path.
Fix this by marking the second argument of bpf_d_path() as
ARG_PTR_TO_MEM | MEM_WRITE so that the verifier correctly models the
write to the caller-provided buffer.
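The fix amounts to one flag in the helper prototype (sketch; the
surrounding proto fields are elided and assumed):
static const struct bpf_func_proto bpf_d_path_proto = {
	...
	.arg2_type	= ARG_PTR_TO_MEM | MEM_WRITE,	/* was: ARG_PTR_TO_MEM */
	...
};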
Fixes: 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking")
Co-developed-by: Zesen Liu <ftyg@live.com>
Signed-off-by: Zesen Liu <ftyg@live.com>
Co-developed-by: Peili Gao <gplhust955@gmail.com>
Signed-off-by: Peili Gao <gplhust955@gmail.com>
Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com>
Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com>
Signed-off-by: Shuran Liu <electronlsr@gmail.com>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20251206141210.3148-2-electronlsr@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex updates from Ingo Molnar:
- Standardize on ktime_t in restart_block::time as well (Thomas
Weißschuh)
- Futex selftests:
- Add robust list testcases (André Almeida)
- Formatting fixes/cleanups (Carlos Llamas)
* tag 'locking-futex-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Store time as ktime_t in restart block
selftests/futex: Create test for robust list
selftests/futex: Skip tests if shmget unsupported
selftests/futex: Add newline to ksft_exit_fail_msg()
selftests/futex: Remove unused test_futex_mpol()
|
|
This patch improves the verifier to correctly compute bounds for the
sign-extension compiler pattern composed of a left shift by 32 bits
followed by an arithmetic (sign) right shift by 32 bits. The pattern
handling in the verifier was limited to positive value bounds and would
reset bound computation for negative values. The new code allows both
positive and negative values for sign extension without compromising
bound computation, allowing the verifier to pass such programs.
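The pattern in question, as a standalone C demo (not verifier code):
#include <stdint.h>
#include <stdio.h>

/* A left shift by 32 followed by an arithmetic right shift by 32
 * sign-extends the low 32 bits of a 64-bit value. */
static int64_t sext32(uint64_t r)
{
	r <<= 32;			/* keep only the low 32 bits */
	return (int64_t)r >> 32;	/* arithmetic shift replicates bit 31 */
}

int main(void)
{
	printf("%lld\n", (long long)sext32(0xfffffffeULL));	/* -2 */
	printf("%lld\n", (long long)sext32(5ULL));		/*  5 */
	return 0;
}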
This change is required by GCC, which generates such patterns; the
issue was detected in the context of systemd, as described in the
following GCC bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119731
Three new tests were added in verifier_subreg.c.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Link: https://lore.kernel.org/r/20251202180220.11128-2-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
After commit 9216477449f3 ("bpf: cpumap: Add the possibility to attach
an eBPF program to cpumap"), __cpu_map_entry_alloc() may fail with
errors other than -ENOMEM, such as -EBADF or -EINVAL.
However, __cpu_map_entry_alloc() returns NULL on all failures, and
cpu_map_update_elem() unconditionally converts this NULL into -ENOMEM.
As a result, user space always receives -ENOMEM regardless of the actual
underlying error.
Examples of unexpected behavior:
- Nonexistent fd : -ENOMEM (should be -EBADF)
- Non-BPF fd : -ENOMEM (should be -EINVAL)
- Bad attach type : -ENOMEM (should be -EINVAL)
Change __cpu_map_entry_alloc() to return ERR_PTR(err) instead of NULL
and have cpu_map_update_elem() propagate this error.
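As a sketch of the resulting error propagation (the call site is
illustrative, not the exact kernel code):
	rcpu = __cpu_map_entry_alloc(map, &cpumap_value, key_cpu);
	if (IS_ERR(rcpu))
		return PTR_ERR(rcpu);	/* was: if (!rcpu) return -ENOMEM; */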
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20251208131449.73036-2-enjuk@amazon.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
If there is a large number (hundreds) of dmabufs allocated, the text
output generated from dmabuf_iter_seq_show can exceed common user buffer
sizes (e.g. PAGE_SIZE), necessitating multiple start/stop cycles to
iterate through all dmabufs. However, the dmabuf iterator currently
returns NULL in dmabuf_iter_seq_start for all non-zero pos values, which
results in the truncation of the output before all dmabufs are handled.
After dma_buf_iter_begin / dma_buf_iter_next, the refcount of the buffer
is elevated so that the BPF iterator program can run without holding any
locks. When a stop occurs, instead of immediately dropping the reference
on the buffer, stash a pointer to the buffer in seq->priv until
either start is called or the iterator is released. This also enables
the resumption of iteration without first walking through the list of
dmabufs based on the pos value.
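A sketch of the stop-side handoff (the private-state layout is assumed,
not taken from the actual patch):
static void dmabuf_iter_seq_stop(struct seq_file *seq, void *v)
{
	struct dma_buf **stash = seq->private;

	/* Keep the elevated refcount; the next seq_start() resumes from
	 * this buffer, and the iterator's release drops it otherwise. */
	*stash = v;
}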
Fixes: 76ea95534995 ("bpf: Add dmabuf iterator")
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Link: https://lore.kernel.org/r/20251204000348.1413593-1-tjmercier@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce a bpf_has_frame_pointer() helper that unwinders can call to
determine whether a given instruction pointer is within the valid frame
pointer region of a BPF JIT program or trampoline (i.e., after the
prologue, before the epilogue).
This will enable livepatch (with the ORC unwinder) to reliably unwind
through BPF JIT frames.
Acked-by: Song Liu <song@kernel.org>
Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/fd2bc5b4e261a680774b28f6100509fd5ebad2f0.1764818927.git.jpoimboe@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
|
|
On SMP systems the CPU hotplug callbacks in the "starting" range are
invoked while the CPU is brought up and interrupts are still
disabled. Callbacks which are added later are invoked via the
hotplug-thread on the target CPU and interrupts are explicitly disabled.
In the UP case callbacks which are added later are invoked directly without
the thread indirection. This is in principle okay since there is just one
CPU, but those callbacks are invoked from code that runs with interrupts
enabled. That's incorrect, as those callbacks assume an interrupts-disabled
context.
Disable interrupts before invoking the callbacks on UP if the state is
atomic and interrupts are expected to be disabled. The "save" part is
required because this is also invoked early in the boot process while
interrupts are disabled and must not be enabled prematurely.
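A sketch of the UP path described above (the callback-invocation helper
name is hypothetical, not the actual cpuhp identifier):
	unsigned long flags;
	int ret;

	local_irq_save(flags);		/* atomic callbacks expect IRQs off */
	ret = invoke_atomic_callback(cpu, state);	/* hypothetical */
	local_irq_restore(flags);	/* restores, never enables prematurely */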
Fixes: 06ddd17521bf1 ("sched/smp: Always define is_percpu_thread() and scheduler_ipi()")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251127144723.ev9DuXXR@linutronix.de
|
|
setup_percpu_irq() was forgotten when the percpu_devid infrastructure was
updated to deal with CPU affinities.
In order to keep users of this legacy API working, provide sensible
defaults by setting the affinity to cpu_online_mask if none was provided by
the caller.
Fixes: bdf4e2ac295fe ("genirq: Allow per-cpu interrupt sharing for non-overlapping affinities")
Reported-by: Daniel Thompson <danielt@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251205091814.3944205-1-maz@kernel.org
Closes: https://lore.kernel.org/r/aTFozefMQRg7lYxh@aspen.lan
|
|
For events with inherit_stat enabled, a "read" event will be generated
to collect per task event counts on task exit.
The call chain is as follows:
do_exit
-> perf_event_exit_task
-> perf_event_exit_task_context
-> perf_event_exit_event
-> perf_remove_from_context
-> perf_child_detach
-> sync_child_event
-> perf_event_read_event
However, the child event context detaches the task too early in
perf_event_exit_task_context, which causes sync_child_event to never
generate the read event in this case, since child_event->ctx->task is
always set to TASK_TOMBSTONE by then. Fix that by moving the context
lock section back to ensure ctx->task is not yet set to TASK_TOMBSTONE
when the read event is generated.
Because perf_event_free_task calls perf_event_exit_task_context with
exit = false to tear down all child events from the context of a task
that never lived, accessing the task PID there can lead to a
use-after-free. To fix that, let sync_child_event take the task as an
argument and move the call to the only place it should be triggered,
avoiding the effect of setting ctx->task to TASK_TOMBSTONE, and add a
task parameter to perf_event_exit_event so that sync_child_event is
triggered properly when needed.
This bug can be reproduced by running "perf record -s" and attaching to
any program that generates perf events in its child tasks. If we check
the result with "perf report -T", the last line of the report will leave
an empty table like "# PID TID", which is expected to contain the
per-task event counts by design.
Fixes: ef54c1a476ae ("perf: Rework perf_event_exit_event()")
Signed-off-by: Thaumy Cheng <thaumy.love@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Link: https://patch.msgid.link/20251209041600.963586-1-thaumy.love@gmail.com
|
|
Recent commit 68e83f347266 ("tools: ynl-gen: add regeneration comment")
added a hint how to regenerate the code to the headers. Update
the new headers from this release cycle to also include it.
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251207004740.1657799-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make the rescuer process more work on the last pwq when there is
nothing left to rescue for the whole workqueue, in order to help the
regular workers in case the memory-pressure relief is temporary, and to
reduce the chance of a relapse.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Previously, the rescuer scanned for all matching work items at once and
processed them within a single rescuer thread, which could cause one
blocking work item to stall all others.
Make the rescuer process work items one-by-one instead of slurping all
matches in a single pass.
Break the rescuer loop after finding and processing the first matching
work item, then restart the search to pick up the next. This gives
normal worker threads a chance to pick up the other items, instead of
those items waiting on the rescuer's queue, and prevents a blocking
work item from stalling the rest once memory pressure is relieved.
Introduce a dummy cursor work item to avoid potentially O(N^2)
rescans of the work list. The marker records the resume position for
the next scan, eliminating redundant traversals.
Also introduce RESCUER_BATCH to control the maximum number of work items
the rescuer processes in each turn, and move on to other PWQs when the
limit is reached.
Cc: ying chen <yc1082463@gmail.com>
Reported-by: ying chen <yc1082463@gmail.com>
Fixes: e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Make send_mayday() operate on a PWQ directly instead of taking a work
item, so that rescuer_thread() now calls send_mayday(pwq) instead of
open-coding the mayday list manipulation.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Commit 406100f3da08 ("cpuset: fix race between hotplug work and later CPU
offline") added a check for empty effective_cpus in partitions for cgroup
v2. However, this check did not account for remote partitions, which were
introduced later.
After commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug processing
synchronous"), cpuset hotplug handling is now synchronous. This eliminates
the race condition with subsequent CPU offline operations that the original
check aimed to fix.
Instead of extending the check to support remote partitions, this patch
removes the redundant effective_cpus checks. Additionally, it adds a
check and warning to verify that all generated sched domains consist of
active CPUs, preventing partition_sched_domains from being invoked with
offline CPUs.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Use the new css_is_online() helper that has been introduced to check css
online state, instead of testing the CSS_ONLINE flag directly. This
improves readability and centralizes the state check logic.
No functional changes intended.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
On x86-64, this_cpu_cmpxchg() uses CMPXCHG without the LOCK prefix,
which means it is only safe for the local CPU and not across multiple
CPUs.
Recently, commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi
safe") made css_rstat_updated lockless and used a lockless list to allow
reentrancy. Since css_rstat_updated can be invoked from process context,
IRQ and NMI, it uses this_cpu_cmpxchg() to select the winner which will
insert the lockless lnode into the global per-cpu lockless list.
However, the commit missed one case where the lockless node of a cgroup
can be accessed and modified by another CPU doing the flushing: the
llist_del_first_init() call in css_process_update_tree().
On a cursory look, it can be questioned how css_process_update_tree()
can see a lockless node in the global lockless list while the updater is
at this_cpu_cmpxchg() and before the llist_add() call in
css_rstat_updated(). This can indeed happen in the presence of
IRQs/NMIs.
Consider this scenario: an updater for cgroup C's stats on CPU A in
process context is past the llist_on_list() check and before
this_cpu_cmpxchg() in css_rstat_updated() when it gets interrupted by an
IRQ/NMI. In the IRQ/NMI context, a new updater calls css_rstat_updated()
for the same cgroup C and successfully inserts rstatc_pcpu->lnode.
Now, concurrently, CPU B is running the flusher; it calls
llist_del_first_init() for CPU A and gets the rstatc_pcpu->lnode of
cgroup C which was added by the IRQ/NMI updater.
Now imagine CPU B calling init_llist_node() on cgroup C's
rstatc_pcpu->lnode of CPU A while, on CPU A, the process context updater
calls this_cpu_cmpxchg(rstatc_pcpu->lnode) concurrently.
The CMPXCHG without LOCK on CPU A is not safe in this situation, and
thus the LOCK prefix is needed.
In Meta's fleet running the kernel with commit 36df6e3dbd7e, we are
observing on some machines that the memcg stats get skewed by more than
the actual memory on the system. On close inspection, we noticed that
the lockless node of a workload for a specific CPU was in a bad state,
and thus all the updates on that CPU for that cgroup were being lost.
To confirm that this skew was indeed due to this CMPXCHG without LOCK in
css_rstat_updated(), we created a repro (using AI) at [1] which shows
that CMPXCHG without LOCK creates almost the same lnode corruption as
seen in Meta's fleet, while with LOCK CMPXCHG the issue does not
reproduce.
Link: http://lore.kernel.org/efiagdwmzfwpdzps74fvcwq3n4cs36q33ij7eebcpssactv3zu@se4hqiwxcfxq [1]
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: stable@vger.kernel.org # v6.17+
Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe")
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
NULL next
Early on when trying to get sched_ext and proxy-exec working together,
I kept tripping over a NULL pointer dereference in put_prev_task_scx()
on the line:
if (sched_class_above(&ext_sched_class, next->sched_class)) {
This was due to put_prev_task() passing a NULL next, calling:
prev->sched_class->put_prev_task(rq, prev, NULL);
put_prev_task_scx() already guards against a NULL next in the
switch_class case, but doesn't seem to have a guard for the
sched_class_above() check.
I can't say I understand why this doesn't usually trip without
proxy-exec. And in newer kernels there are far fewer put_prev_task()
calls, and I can't easily reproduce the issue now even with proxy-exec.
But we still have one put_prev_task() call left in core.c that
seems like it could trip this, so I wanted to send this out for
consideration.
tj: put_prev_task() can be called with NULL @next; however, when @p is
queued, that doesn't happen, so this condition shouldn't currently be
triggerable. The connection isn't straightforward or necessarily reliable,
so add the NULL check even if it can't currently be triggered.
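Based on the line quoted above, the added guard presumably takes this
shape (sketch):
	if (next && sched_class_above(&ext_sched_class, next->sched_class)) {
		...
	}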
Link: http://lkml.kernel.org/r/20251206022218.1541878-1-jstultz@google.com
Signed-off-by: John Stultz <jstultz@google.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Use kthread_destroy_worker() to release the sch->helper object, fixing
the following kmemleak report:
unreferenced object 0xffff888121ec7b00 (size 128):
comm "scx_simple", pid 1197, jiffies 4295884415
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de .............N..
ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff ................
backtrace (crc 587b3352):
kmemleak_alloc+0x62/0xa0
__kmalloc_cache_noprof+0x28d/0x3e0
kthread_create_worker_on_node+0xd5/0x1f0
scx_enable.isra.210+0x6c2/0x25b0
bpf_scx_reg+0x12/0x20
bpf_struct_ops_link_create+0x2c3/0x3b0
__sys_bpf+0x3102/0x4b00
__x64_sys_bpf+0x79/0xc0
x64_sys_call+0x15d9/0x1dd0
do_syscall_64+0xf0/0x470
entry_SYSCALL_64_after_hwframe+0x77/0x7f
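A minimal sketch of the pairing (the create call's arguments are
assumed; kthread_destroy_worker() both stops and frees the worker):
	/* enable path */
	sch->helper = kthread_create_worker(0, "sched_ext_helper");

	/* teardown path - the worker struct was leaked before */
	kthread_destroy_worker(sch->helper);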
Fixes: bff3b5aec1b7 ("sched_ext: Move disable machinery into scx_sched")
Cc: stable@vger.kernel.org # v6.16+
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
atomic_pool_expand() iteratively tries the allocation while decrementing
the page order. There is no need to issue a warning if an attempted
allocation fails.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Fixes: d7e673ec2c8e ("dma-pool: Only allocate from CMA when in same memory zone")
[mszyprow: fixed typo]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20251202152810.142370-1-dave.kleikamp@oracle.com
|
|
The irqdomain implementation internally represents hardware IRQs as
irq_hw_number_t, which is defined as unsigned long int. When providing
an irq_hw_number_t to the generic_handle_domain() functions that expect
an unsigned int hwirq, this can lead to a loss of information. Change
the hwirq parameter to irq_hw_number_t to support the full range of
hwirqs.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Tobias Schumacher <ts@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
fixes a build issue and does some cleanup in ib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
enhances the 64-bit math code on behalf of a PWM driver and beefs up
the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
makes BPF symbol names, sizes, and line numbers available to the GDB
debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
adds a sysctl which can be used to cause additional info dumping when
the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
adds a general base64 encoder/decoder to lib/ and migrates several
users away from their private implementations
- "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
reworks the KEXEC Handover interfaces in preparation for Live Update
Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin)
is a major new feature targeted at cloud environments. Quoting the
cover letter:
This series introduces the Live Update Orchestrator, a kernel
subsystem designed to facilitate live kernel updates using a
kexec-based reboot. This capability is critical for cloud
environments, allowing hypervisors to be updated with minimal
downtime for running virtual machines. LUO achieves this by
preserving the state of selected resources, such as memory,
devices and their dependencies, across the kernel transition.
As a key feature, this series includes support for preserving
memfd file descriptors, which allows critical in-memory data, such
as guest RAM or any other large memory region, to be maintained in
RAM across the kexec reboot.
Mike Rapoport merits a mention here, for his extensive review and
testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
moves the kexec and kdump sysfs entries from /sys/kernel/ to
/sys/kernel/kexec/ and adds back-compatibility symlinks which can
hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport)
fixes a BUG which was being hit during KHO restoration of vmalloc()
regions
* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
calibrate: update header inclusion
Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
vmcoreinfo: track and log recoverable hardware errors
kho: fix restoring of contiguous ranges of order-0 pages
kho: kho_restore_vmalloc: fix initialization of pages array
MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
init: replace simple_strtoul with kstrtoul to improve lpj_setup
KHO: fix boot failure due to kmemleak access to non-PRESENT pages
Documentation/ABI: new kexec and kdump sysfs interface
Documentation/ABI: mark old kexec sysfs deprecated
kexec: move sysfs entries to /sys/kernel/kexec
test_kho: always print restore status
kho: free chunks using free_page() instead of kfree()
selftests/liveupdate: add kexec test for multiple and empty sessions
selftests/liveupdate: add simple kexec-based selftest for LUO
selftests/liveupdate: add userspace API selftests
docs: add documentation for memfd preservation via LUO
mm: memfd_luo: allow preserving memfd
liveupdate: luo_file: add private argument to store runtime state
mm: shmem: export some functions to internal.h
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix accounting of stop_count in file release
On opening the trace file, if the "pause-on-trace" option is set, it
will increment the stop_count. On file release, it checks if stop_count
is set, and if so it decrements it. Since this code was originally
written, the stop_count has gained other users, which makes just
checking the stop_count not enough to know if it should be
decremented.
Add a new iterator flag called "PAUSE", have it set if the open
disables tracing, and only decrement the stop_count on close if that
flag is set.
- Remove length field in trace_seq_printf() of print_synth_event()
When printing the synthetic event that has a static length array
field, the vsprintf() of the trace_seq_printf() triggered a
"(efault)" in the output. That's because the print_fmt replaced the
"%.*s" with "%s" causing the arguments to be off.
- Fix a bunch of typos
* tag 'trace-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix typo in trace_seq.c
tracing: Fix typo in trace_probe.c
tracing: Fix multiple typos in trace_osnoise.c
tracing: Fix multiple typos in trace_events_user.c
tracing: Fix typo in trace_events_trigger.c
tracing: Fix typo in trace_events_hist.c
tracing: Fix typo in trace_events_filter.c
tracing: Fix multiple typos in trace_events.c
tracing: Fix multiple typos in trace.c
tracing: Fix typo in ring_buffer_benchmark.c
tracing: Fix multiple typos in ring_buffer.c
tracing: Fix typo in fprobe.c
tracing: Fix typo in fpgraph.c
tracing: Fix fixed array of synthetic event
tracing: Fix enabling of tracing on file release
|