<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/perf/arm_cspmu, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2026-04-14T23:48:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-04-14T23:48:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c43267e6794a36013fd495a4d81bf7f748fe4615'/>
<id>c43267e6794a36013fd495a4d81bf7f748fe4615</id>
<content type='text'>
Pull arm64 updates from Catalin Marinas:
 "The biggest changes are MPAM enablement in drivers/resctrl and new PMU
  support under drivers/perf.

  On the core side, FEAT_LSUI lets futex atomic operations with EL0
  permissions, avoiding PAN toggling.

  The rest is mostly TLB invalidation refactoring, further generic entry
  work, sysreg updates and a few fixes.

  Core features:

   - Add support for FEAT_LSUI, allowing futex atomic operations without
     toggling Privileged Access Never (PAN)

   - Further refactor the arm64 exception handling code towards the
     generic entry infrastructure

   - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
     through it

  Memory management:

   - Refactor the arm64 TLB invalidation API and implementation for
     better control over barrier placement and level-hinted invalidation

   - Enable batched TLB flushes during memory hot-unplug

   - Fix rodata=full block mapping support for realm guests (when
     BBML2_NOABORT is available)

  Perf and PMU:

   - Add support for a whole bunch of system PMUs featured in NVIDIA's
     Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
     for CPU/C2C memory latency PMUs)

   - Clean up iomem resource handling in the Arm CMN driver

   - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}

  MPAM (Memory Partitioning And Monitoring):

   - Add architecture context-switch and hiding of the feature from KVM

   - Add interface to allow MPAM to be exposed to user-space using
     resctrl

   - Add errata workaround for some existing platforms

   - Add documentation for using MPAM and what shape of platforms can
     use resctrl

  Miscellaneous:

   - Check DAIF (and PMR, where relevant) at task-switch time

   - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
     (only relevant to asynchronous or asymmetric tag check modes)

   - Remove a duplicate allocation in the kexec code

   - Remove redundant save/restore of SCS SP on entry to/from EL0

   - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
     descriptions

   - Add kselftest coverage for cmpbr_sigill()

   - Update sysreg definitions"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits)
  arm64: rsi: use linear-map alias for realm config buffer
  arm64: Kconfig: fix duplicate word in CMDLINE help text
  arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
  arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
  arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
  arm64: kexec: Remove duplicate allocation for trans_pgd
  ACPI: AGDI: fix missing newline in error message
  arm64: Check DAIF (and PMR) at task-switch time
  arm64: entry: Use split preemption logic
  arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
  arm64: entry: Consistently prefix arm64-specific wrappers
  arm64: entry: Don't preempt with SError or Debug masked
  entry: Split preemption from irqentry_exit_to_kernel_mode()
  entry: Split kernel mode logic from irqentry_{enter,exit}()
  entry: Move irqentry_enter() prototype later
  entry: Remove local_irq_{enable,disable}_exit_to_user()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull arm64 updates from Catalin Marinas:
 "The biggest changes are MPAM enablement in drivers/resctrl and new PMU
  support under drivers/perf.

  On the core side, FEAT_LSUI lets futex atomic operations with EL0
  permissions, avoiding PAN toggling.

  The rest is mostly TLB invalidation refactoring, further generic entry
  work, sysreg updates and a few fixes.

  Core features:

   - Add support for FEAT_LSUI, allowing futex atomic operations without
     toggling Privileged Access Never (PAN)

   - Further refactor the arm64 exception handling code towards the
     generic entry infrastructure

   - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
     through it

  Memory management:

   - Refactor the arm64 TLB invalidation API and implementation for
     better control over barrier placement and level-hinted invalidation

   - Enable batched TLB flushes during memory hot-unplug

   - Fix rodata=full block mapping support for realm guests (when
     BBML2_NOABORT is available)

  Perf and PMU:

   - Add support for a whole bunch of system PMUs featured in NVIDIA's
     Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
     for CPU/C2C memory latency PMUs)

   - Clean up iomem resource handling in the Arm CMN driver

   - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}

  MPAM (Memory Partitioning And Monitoring):

   - Add architecture context-switch and hiding of the feature from KVM

   - Add interface to allow MPAM to be exposed to user-space using
     resctrl

   - Add errata workaround for some existing platforms

   - Add documentation for using MPAM and what shape of platforms can
     use resctrl

  Miscellaneous:

   - Check DAIF (and PMR, where relevant) at task-switch time

   - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
     (only relevant to asynchronous or asymmetric tag check modes)

   - Remove a duplicate allocation in the kexec code

   - Remove redundant save/restore of SCS SP on entry to/from EL0

   - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
     descriptions

   - Add kselftest coverage for cmpbr_sigill()

   - Update sysreg definitions"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits)
  arm64: rsi: use linear-map alias for realm config buffer
  arm64: Kconfig: fix duplicate word in CMDLINE help text
  arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
  arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
  arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
  arm64: kexec: Remove duplicate allocation for trans_pgd
  ACPI: AGDI: fix missing newline in error message
  arm64: Check DAIF (and PMR) at task-switch time
  arm64: entry: Use split preemption logic
  arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
  arm64: entry: Consistently prefix arm64-specific wrappers
  arm64: entry: Don't preempt with SError or Debug masked
  entry: Split preemption from irqentry_exit_to_kernel_mode()
  entry: Split kernel mode logic from irqentry_{enter,exit}()
  entry: Move irqentry_enter() prototype later
  entry: Remove local_irq_{enable,disable}_exit_to_user()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()</title>
<updated>2026-04-06T14:55:16+00:00</updated>
<author>
<name>Chengwen Feng</name>
<email>fengchengwen@huawei.com</email>
</author>
<published>2026-04-01T08:16:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1ab03189793ffe60f184ce58ea7d0a4f0dcb7e06'/>
<id>1ab03189793ffe60f184ce58ea7d0a4f0dcb7e06</id>
<content type='text'>
Update arm_cspmu to use acpi_get_cpu_uid() instead of
get_acpi_id_for_cpu(), aligning with unified ACPI CPU UID interface.

No functional changes are introduced by this switch (valid inputs retain
original behavior).

Signed-off-by: Chengwen Feng &lt;fengchengwen@huawei.com&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Link: https://patch.msgid.link/20260401081640.26875-7-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update arm_cspmu to use acpi_get_cpu_uid() instead of
get_acpi_id_for_cpu(), aligning with unified ACPI CPU UID interface.

No functional changes are introduced by this switch (valid inputs retain
original behavior).

Signed-off-by: Chengwen Feng &lt;fengchengwen@huawei.com&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Link: https://patch.msgid.link/20260401081640.26875-7-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU</title>
<updated>2026-03-24T12:37:32+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2026-03-24T01:29:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3dd73022306bfdb29b1c33cb106fe337f46a6105'/>
<id>3dd73022306bfdb29b1c33cb106fe337f46a6105</id>
<content type='text'>
Adds PCIE-TGT PMU support in Tegra410 SOC. This PMU is
instanced in each root complex in the SOC and it captures
traffic originating from any source towards PCIE BAR and CXL
HDM range. The traffic can be filtered based on the
destination root port or target address range.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds PCIE-TGT PMU support in Tegra410 SOC. This PMU is
instanced in each root complex in the SOC and it captures
traffic originating from any source towards PCIE BAR and CXL
HDM range. The traffic can be filtered based on the
destination root port or target address range.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU</title>
<updated>2026-03-24T12:37:32+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2026-03-24T01:29:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf585ba14726788335c640512d11186dab573612'/>
<id>bf585ba14726788335c640512d11186dab573612</id>
<content type='text'>
Adds PCIE PMU support in Tegra410 SOC. This PMU is instanced
in each root complex in the SOC and can capture traffic from
PCIE device to various memory types. This PMU can filter traffic
based on the originating root port or BDF and the target memory
types (CPU DRAM, GPU Memory, CXL Memory, or remote Memory).

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds PCIE PMU support in Tegra410 SOC. This PMU is instanced
in each root complex in the SOC and can capture traffic from
PCIE device to various memory types. This PMU can filter traffic
based on the originating root port or BDF and the target memory
types (CPU DRAM, GPU Memory, CXL Memory, or remote Memory).

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: Add arm_cspmu_acpi_dev_get</title>
<updated>2026-03-24T12:37:32+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2026-03-24T01:29:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc86281fe4bd5d4a78be2f370e8319c9517e40ff'/>
<id>bc86281fe4bd5d4a78be2f370e8319c9517e40ff</id>
<content type='text'>
Add interface to get ACPI device associated with the
PMU. This ACPI device may contain additional properties
not covered by the standard properties.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add interface to get ACPI device associated with the
PMU. This ACPI device may contain additional properties
not covered by the standard properties.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU</title>
<updated>2026-03-24T12:37:32+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2026-03-24T01:29:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f5caf26fd6c71294d0fb254404ed66f8cff6f7f7'/>
<id>f5caf26fd6c71294d0fb254404ed66f8cff6f7f7</id>
<content type='text'>
The Unified Coherence Fabric (UCF) contains last level cache
and cache coherent interconnect in Tegra410 SOC. The PMU in
this device can be used to capture events related to access
to the last level cache and memory from different sources.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The Unified Coherence Fabric (UCF) contains last level cache
and cache coherent interconnect in Tegra410 SOC. The PMU in
this device can be used to capture events related to access
to the last level cache and memory from different sources.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: arm_cspmu: fix error handling in arm_cspmu_impl_unregister()</title>
<updated>2025-11-03T14:20:03+00:00</updated>
<author>
<name>Ma Ke</name>
<email>make24@iscas.ac.cn</email>
</author>
<published>2025-10-22T11:53:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=970e1e41805f0bd49dc234330a9390f4708d097d'/>
<id>970e1e41805f0bd49dc234330a9390f4708d097d</id>
<content type='text'>
driver_find_device() calls get_device() to increment the reference
count once a matching device is found. device_release_driver()
releases the driver, but it does not decrease the reference count that
was incremented by driver_find_device(). At the end of the loop, there
is no put_device() to balance the reference count. To avoid reference
count leakage, add put_device() to decrease the reference count.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: bfc653aa89cb ("perf: arm_cspmu: Separate Arm and vendor module")
Signed-off-by: Ma Ke &lt;make24@iscas.ac.cn&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
driver_find_device() calls get_device() to increment the reference
count once a matching device is found. device_release_driver()
releases the driver, but it does not decrease the reference count that
was incremented by driver_find_device(). At the end of the loop, there
is no put_device() to balance the reference count. To avoid reference
count leakage, add put_device() to decrease the reference count.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: bfc653aa89cb ("perf: arm_cspmu: Separate Arm and vendor module")
Signed-off-by: Ma Ke &lt;make24@iscas.ac.cn&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: nvidia: Add pmevfiltr2 support</title>
<updated>2025-11-03T13:35:07+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2025-09-30T00:26:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=decc3684c24112286c527188bb09dd6eaf720cc0'/>
<id>decc3684c24112286c527188bb09dd6eaf720cc0</id>
<content type='text'>
Support NVIDIA PMU that utilizes the optional event filter2 register.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support NVIDIA PMU that utilizes the optional event filter2 register.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: nvidia: Add revision id matching</title>
<updated>2025-11-03T13:35:07+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2025-09-30T00:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82dfd72bfb0362a3900179595032b65be11582da'/>
<id>82dfd72bfb0362a3900179595032b65be11582da</id>
<content type='text'>
Distinguish NVIDIA devices by revision and variant bits
in PMIIDR register in addition to product id.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Distinguish NVIDIA devices by revision and variant bits
in PMIIDR register in addition to product id.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/arm_cspmu: Add pmpidr support</title>
<updated>2025-11-03T13:35:07+00:00</updated>
<author>
<name>Besar Wicaksono</name>
<email>bwicaksono@nvidia.com</email>
</author>
<published>2025-09-30T00:26:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=04330be8dc7fddf36f4adb1271932788ad47e7ad'/>
<id>04330be8dc7fddf36f4adb1271932788ad47e7ad</id>
<content type='text'>
The PMIIDR value is composed by the values in PMPIDR registers.
We can use PMPIDR registers as alternative for device
identification for systems that do not implement PMIIDR.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The PMIIDR value is composed by the values in PMPIDR registers.
We can use PMPIDR registers as alternative for device
identification for systems that do not implement PMIIDR.

Reviewed-by: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Signed-off-by: Besar Wicaksono &lt;bwicaksono@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
