<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/perf, branch v7.0-rc7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Convert more 'alloc_obj' cases to default GFP_KERNEL arguments</title>
<updated>2026-02-22T04:03:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T04:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=32a92f8c89326985e05dce8b22d3f0aa07a3e1bd'/>
<id>32a92f8c89326985e05dce8b22d3f0aa07a3e1bd</id>
<content type='text'>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm-cmn: Reject unsupported hardware configurations</title>
<updated>2026-02-03T19:43:52+00:00</updated>
<author>
<name>Robin Murphy</name>
<email>robin.murphy@arm.com</email>
</author>
<published>2026-02-03T14:07:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36c0de02575ce59dfd879eb4ef63d53a68bbf9ce'/>
<id>36c0de02575ce59dfd879eb4ef63d53a68bbf9ce</id>
<content type='text'>
So far we've been fairly lax about accepting both unknown CMN models
(at least with a warning), and unknown revisions of those which we
do know, as although things do frequently change between releases,
typically enough remains the same to be somewhat useful for at least
some basic bringup checks. However, we also make assumptions of the
maximum supported sizes and numbers of things in various places, and
there's no guarantee that something new might not be bigger and lead
to nasty array overflows. Make sure we only try to run on things that
actually match our assumptions and so will not risk memory corruption.

We have at least always failed on completely unknown node types, so
update that error message for clarity and consistency too.

Cc: stable@vger.kernel.org
Fixes: 7819e05a0dce ("perf/arm-cmn: Revamp model detection")
Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So far we've been fairly lax about accepting both unknown CMN models
(at least with a warning), and unknown revisions of those which we
do know, as although things do frequently change between releases,
typically enough remains the same to be somewhat useful for at least
some basic bringup checks. However, we also make assumptions of the
maximum supported sizes and numbers of things in various places, and
there's no guarantee that something new might not be bigger and lead
to nasty array overflows. Make sure we only try to run on things that
actually match our assumptions and so will not risk memory corruption.

We have at least always failed on completely unknown node types, so
update that error message for clarity and consistency too.

Cc: stable@vger.kernel.org
Fixes: 7819e05a0dce ("perf/arm-cmn: Revamp model detection")
Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: arm_spe: Properly set hw.state on failures</title>
<updated>2026-02-03T19:41:50+00:00</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@arm.com</email>
</author>
<published>2026-02-03T14:40:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=283182c1c239f6873d1a50e9e710c1a699f2256b'/>
<id>283182c1c239f6873d1a50e9e710c1a699f2256b</id>
<content type='text'>
When arm_spe_pmu_next_off() fails to calculate a valid limit, it returns
zero to indicate that tracing should not start.  However, the caller
arm_spe_perf_aux_output_begin() does not propagate this failure by
updating hwc-&gt;state, cause the error to be silently ignored by upper
layers.

Because hwc-&gt;state remains zero after a failure, arm_spe_pmu_start()
continues to programs filter registers unnecessarily.  The driver
still reports success to the perf core, so the core assumes the SPE
event was enabled and proceeds to enable other events.  This breaks
event group semantics: SPE is already stopped while other events in the
same group are enabled.

Fix this by updating arm_spe_perf_aux_output_begin() to return a status
code indicating success (0) or failure (-EIO).  Both the interrupt
handler and arm_spe_pmu_start() check the return value and call
arm_spe_pmu_stop() to set PERF_HES_STOPPED in hwc-&gt;state.

In the interrupt handler, the period (e.g., period_left) needs to be
updated, so PERF_EF_UPDATE is passed to arm_spe_pmu_stop().  When the
error occurs during event start, the trace unit is not yet enabled, so
a flag '0' is used to drain buffer and update state only.

Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: Leo Yan &lt;leo.yan@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When arm_spe_pmu_next_off() fails to calculate a valid limit, it returns
zero to indicate that tracing should not start.  However, the caller
arm_spe_perf_aux_output_begin() does not propagate this failure by
updating hwc-&gt;state, cause the error to be silently ignored by upper
layers.

Because hwc-&gt;state remains zero after a failure, arm_spe_pmu_start()
continues to programs filter registers unnecessarily.  The driver
still reports success to the perf core, so the core assumes the SPE
event was enabled and proceeds to enable other events.  This breaks
event group semantics: SPE is already stopped while other events in the
same group are enabled.

Fix this by updating arm_spe_perf_aux_output_begin() to return a status
code indicating success (0) or failure (-EIO).  Both the interrupt
handler and arm_spe_pmu_start() check the return value and call
arm_spe_pmu_stop() to set PERF_HES_STOPPED in hwc-&gt;state.

In the interrupt handler, the period (e.g., period_left) needs to be
updated, so PERF_EF_UPDATE is passed to arm_spe_pmu_stop().  When the
error occurs during event start, the trace unit is not yet enabled, so
a flag '0' is used to drain buffer and update state only.

Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: Leo Yan &lt;leo.yan@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/cxlpmu: Replace IRQF_ONESHOT with IRQF_NO_THREAD</title>
<updated>2026-01-28T13:06:03+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2026-01-28T09:55:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ab26d9c85554c4ff1d95ca8341522880ed9219d6'/>
<id>ab26d9c85554c4ff1d95ca8341522880ed9219d6</id>
<content type='text'>
Passing IRQF_ONESHOT ensures that the interrupt source is masked until
the secondary (threaded) handler is done. If only a primary handler is
used then the flag makes no sense because the interrupt can not fire
(again) while its handler is running.
The flag also disallows force-threading of the primary handler and the
irq-core will warn about this.

The intention here was probably not allowing forced-threading.

Replace IRQF_ONESHOT with IRQF_NO_THREAD.

Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Passing IRQF_ONESHOT ensures that the interrupt source is masked until
the secondary (threaded) handler is done. If only a primary handler is
used then the flag makes no sense because the interrupt can not fire
(again) while its handler is running.
The flag also disallows force-threading of the primary handler and the
irq-core will warn about this.

The intention here was probably not allowing forced-threading.

Replace IRQF_ONESHOT with IRQF_NO_THREAD.

Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_dsu: Allow standard cycles events</title>
<updated>2026-01-06T21:27:40+00:00</updated>
<author>
<name>Robin Murphy</name>
<email>robin.murphy@arm.com</email>
</author>
<published>2025-12-15T13:05:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=79448fa1f495c3e3b7119e53bedc3cce273aa95f'/>
<id>79448fa1f495c3e3b7119e53bedc3cce273aa95f</id>
<content type='text'>
Since we do not use the divide-by-64 option, there should be no
significant difference between the dedicated cycle counter and the
standard cycles event. Since using the latter on DSU-120 now has
the side-effect of allowing multiple cycles events to be scheduled
simultaneously (beneficial for multiple cycle-based metrics), there
seems little reason not to allow the same on older DSUs as well.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since we do not use the divide-by-64 option, there should be no
significant difference between the dedicated cycle counter and the
standard cycles event. Since using the latter on DSU-120 now has
the side-effect of allowing multiple cycles events to be scheduled
simultaneously (beneficial for multiple cycle-based metrics), there
seems little reason not to allow the same on older DSUs as well.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_dsu: Support DSU-120</title>
<updated>2026-01-06T21:27:40+00:00</updated>
<author>
<name>Robin Murphy</name>
<email>robin.murphy@arm.com</email>
</author>
<published>2025-12-15T13:04:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=85c0dbd8b6e2ca5e672560c7cd86801bffa0d884'/>
<id>85c0dbd8b6e2ca5e672560c7cd86801bffa0d884</id>
<content type='text'>
DSU-120 has the same system register interface as previous DSUs, but
no longer offers a dedicated cycle counter. While this is not directly
discoverable via PMCR, the PMCCNTR register is still defined to exist
with RAZ/WI behaviour, allowing for a straightforward heuristic.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DSU-120 has the same system register interface as previous DSUs, but
no longer offers a dedicated cycle counter. While this is not directly
discoverable via PMCR, the PMCCNTR register is still defined to exist
with RAZ/WI behaviour, allowing for a straightforward heuristic.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_dsu: Support DSU-110</title>
<updated>2026-01-06T21:27:39+00:00</updated>
<author>
<name>Robin Murphy</name>
<email>robin.murphy@arm.com</email>
</author>
<published>2025-12-15T13:04:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0113affc91014a14251890c3af8f2bade1c20222'/>
<id>0113affc91014a14251890c3af8f2bade1c20222</id>
<content type='text'>
DSU-110 sneakily made all the event counters 64-bit, perhaps related
to no longer having AArch32 EL1 to worry about. While the DSU version
itself is not easily discoverable, the size of a counter certainly is.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DSU-110 sneakily made all the event counters 64-bit, perhaps related
to no longer having AArch32 EL1 to worry about. While the DSU version
itself is not easily discoverable, the size of a counter certainly is.

Signed-off-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers: perf: use bitmap_empty() where appropriate</title>
<updated>2026-01-05T20:39:40+00:00</updated>
<author>
<name>Yury Norov (NVIDIA)</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2025-12-16T01:20:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0c7c64146f707ffe7abd5d11c9828d8129903ab5'/>
<id>0c7c64146f707ffe7abd5d11c9828d8129903ab5</id>
<content type='text'>
bitmap_empty() is more verbose and efficient, as it stops traversing
bitmaps as soon as the 1st set bit found.

Switch perf code to using bitmap_empty() where appropriate, and
correspondingly use boolean types.

Signed-off-by: Yury Norov (NVIDIA) &lt;yury.norov@gmail.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bitmap_empty() is more verbose and efficient, as it stops traversing
bitmaps as soon as the 1st set bit found.

Switch perf code to using bitmap_empty() where appropriate, and
correspondingly use boolean types.

Signed-off-by: Yury Norov (NVIDIA) &lt;yury.norov@gmail.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
