From 71396cfac97d0249fa7d8dcc8e649b6ba4c090e4 Mon Sep 17 00:00:00 2001 From: Ilkka Koskinen Date: Thu, 28 Aug 2025 15:35:19 -0700 Subject: perf/dwc_pcie: Support counting multiple lane events in parallel While Designware PCIe PMU allows to count only one time based event at a time, it allows to count all the lane events simultaneously. After the patch one is able to count a group of lane events: $ perf stat -e '{dwc_rootport/tx_memory_write,lane=1/,dwc_rootport/rx_memory_read,lane=0/}' dd if=/dev/nvme0n1 of=/dev/null bs=1M count=1 Earlier the events wouldn't have been counted successfully. Signed-off-by: Ilkka Koskinen Signed-off-by: Will Deacon --- Documentation/admin-guide/perf/dwc_pcie_pmu.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst index cb376f335f40..167f9281fbf5 100644 --- a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst @@ -16,8 +16,8 @@ provides the following two features: - one 64-bit counter for Time Based Analysis (RX/TX data throughput and time spent in each low-power LTSSM state) and -- one 32-bit counter for Event Counting (error and non-error events for - a specified lane) +- one 32-bit counter per event for Event Counting (error and non-error + events for a specified lane) Note: There is no interrupt for counter overflow. -- cgit v1.2.3 From e31c0eb10388e2af0502b77e61453efbc8a4f974 Mon Sep 17 00:00:00 2001 From: Yicong Yang Date: Thu, 14 Aug 2025 17:16:20 +0800 Subject: drivers/perf: hisi: Add support for HiSilicon NoC PMU Adds the support for HiSilicon NoC (Network on Chip) PMU which will be used to monitor the events on the system bus. The PMU device will be named after the SCL ID (either Super CPU cluster or Super IO cluster) and the index ID, just similar to other HiSilicon Uncore PMUs. Below PMU formats are provided besides the event: - ch: the transaction channel (data, request, response, etc) which can be used to filter the counting. - tt_en: tracetag filtering enable. Just as other HiSilicon Uncore PMUs the NoC PMU supports only counting the transactions with tracetag. The NoC PMU doesn't have an interrupt to indicate the overflow. However we have a 64 bit counter which is large enough and it's nearly impossible to overflow. Reviewed-by: Jonathan Cameron Signed-off-by: Yicong Yang Signed-off-by: Will Deacon --- Documentation/admin-guide/perf/hisi-pmu.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst index 48992a0b8e94..6f0ea4f641cc 100644 --- a/Documentation/admin-guide/perf/hisi-pmu.rst +++ b/Documentation/admin-guide/perf/hisi-pmu.rst @@ -112,6 +112,17 @@ uring channel. It is 2 bits. Some important codes are as follows: - 2'b00: default value, count the events which sent to the both uring and uring_ext channel; +6. ch: NoC PMU supports filtering the event counts of certain transaction +channel with this option. The current supported channels are as follows: + +- 3'b010: Request channel +- 3'b100: Snoop channel +- 3'b110: Response channel +- 3'b111: Data channel + +7. tt_en: NoC PMU supports counting only transactions that have tracetag set +if this option is set. See the 2nd list for more information about tracetag. + Users could configure IDs to count data come from specific CCL/ICL, by setting srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not -- cgit v1.2.3 From bad11557eed2592c017a06752e58df49080b4a6a Mon Sep 17 00:00:00 2001 From: Koichi Okuno Date: Tue, 9 Sep 2025 12:02:50 +0900 Subject: perf: Fujitsu: Add the Uncore PMU driver This adds a new dynamic PMU to the Perf Events framework to program and control the Uncore PMUs in Fujitsu chips. This driver exports formatting and event information to sysfs so it can be used by the perf user space tools with the syntaxes: perf stat -e pci_iod0_pci0/ea-pci/ ls perf stat -e pci_iod0_pci0/event=0x80/ ls perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls perf stat -e mac_iod0_mac0_ch0/event=0x80/ ls FUJITSU-MONAKA PMU Events Specification v1.1 URL: https://github.com/fujitsu/FUJITSU-MONAKA Reviewed-by: Yicong Yang Signed-off-by: Koichi Okuno Signed-off-by: Will Deacon --- .../admin-guide/perf/fujitsu_uncore_pmu.rst | 110 +++++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 111 insertions(+) create mode 100644 Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst b/Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst new file mode 100644 index 000000000000..46595b788d3a --- /dev/null +++ b/Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst @@ -0,0 +1,110 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +================================================ +Fujitsu Uncore Performance Monitoring Unit (PMU) +================================================ + +This driver supports the Uncore MAC PMUs and the Uncore PCI PMUs found +in Fujitsu chips. +Each MAC PMU on these chips is exposed as a uncore perf PMU with device name +mac_iod_mac_ch. +And each PCI PMU on these chips is exposed as a uncore perf PMU with device name +pci_iod_pci. + +The driver provides a description of its available events and configuration +options in sysfs, see /sys/bus/event_sources/devices/mac_iod_mac_ch/ +and /sys/bus/event_sources/devices/pci_iod_pci/. +This driver exports: +- formats, used by perf user space and other tools to configure events +- events, used by perf user space and other tools to create events + symbolically, e.g.: + perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls + perf stat -a -e pci_iod0_pci0/event=0x24/ ls +- cpumask, used by perf user space and other tools to know on which CPUs + to open the events + +This driver supports the following events for MAC: +- cycles + This event counts MAC cycles at MAC frequency. +- read-count + This event counts the number of read requests to MAC. +- read-count-request + This event counts the number of read requests including retry to MAC. +- read-count-return + This event counts the number of responses to read requests to MAC. +- read-count-request-pftgt + This event counts the number of read requests including retry with PFTGT + flag. +- read-count-request-normal + This event counts the number of read requests including retry without PFTGT + flag. +- read-count-return-pftgt-hit + This event counts the number of responses to read requests which hit the + PFTGT buffer. +- read-count-return-pftgt-miss + This event counts the number of responses to read requests which miss the + PFTGT buffer. +- read-wait + This event counts outstanding read requests issued by DDR memory controller + per cycle. +- write-count + This event counts the number of write requests to MAC (including zero write, + full write, partial write, write cancel). +- write-count-write + This event counts the number of full write requests to MAC (not including + zero write). +- write-count-pwrite + This event counts the number of partial write requests to MAC. +- memory-read-count + This event counts the number of read requests from MAC to memory. +- memory-write-count + This event counts the number of full write requests from MAC to memory. +- memory-pwrite-count + This event counts the number of partial write requests from MAC to memory. +- ea-mac + This event counts energy consumption of MAC. +- ea-memory + This event counts energy consumption of memory. +- ea-memory-mac-write + This event counts the number of write requests from MAC to memory. +- ea-ha + This event counts energy consumption of HA. + + 'ea' is the abbreviation for 'Energy Analyzer'. + +Examples for use with perf:: + + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls + +And, this driver supports the following events for PCI: +- pci-port0-cycles + This event counts PCI cycles at PCI frequency in port0. +- pci-port0-read-count + This event counts read transactions for data transfer in port0. +- pci-port0-read-count-bus + This event counts read transactions for bus usage in port0. +- pci-port0-write-count + This event counts write transactions for data transfer in port0. +- pci-port0-write-count-bus + This event counts write transactions for bus usage in port0. +- pci-port1-cycles + This event counts PCI cycles at PCI frequency in port1. +- pci-port1-read-count + This event counts read transactions for data transfer in port1. +- pci-port1-read-count-bus + This event counts read transactions for bus usage in port1. +- pci-port1-write-count + This event counts write transactions for data transfer in port1. +- pci-port1-write-count-bus + This event counts write transactions for bus usage in port1. +- ea-pci + This event counts energy consumption of PCI. + + 'ea' is the abbreviation for 'Energy Analyzer'. + +Examples for use with perf:: + + perf stat -e pci_iod0_pci0/ea-pci/ ls + +Given that these are uncore PMUs the driver does not support sampling, therefore +"perf record" will not work. Per-task perf sessions are not supported. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index 072b510385c4..47d9a3df6329 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -29,3 +29,4 @@ Performance monitor support cxl ampere_cspmu mrvl-pem-pmu + fujitsu_uncore_pmu -- cgit v1.2.3 From 272dd0e5e58d9c216771aa4a9dc1e36a662792da Mon Sep 17 00:00:00 2001 From: Yushan Wang Date: Fri, 29 Aug 2025 18:14:26 +0800 Subject: Documentation: hisi-pmu: Fix of minor format error The inline path of sysfs should be placed in literal blocks to make documentation look better. Acked-by: Jonathan Cameron Acked-by: Yicong Yang Signed-off-by: Yushan Wang Signed-off-by: Will Deacon --- Documentation/admin-guide/perf/hisi-pmu.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst index 6f0ea4f641cc..8df048c26498 100644 --- a/Documentation/admin-guide/perf/hisi-pmu.rst +++ b/Documentation/admin-guide/perf/hisi-pmu.rst @@ -18,9 +18,10 @@ HiSilicon SoC uncore PMU driver Each device PMU has separate registers for event counting, control and interrupt, and the PMU driver shall register perf PMU drivers like L3C, HHA and DDRC etc. The available events and configuration options shall -be described in the sysfs, see: +be described in the sysfs, see:: + +/sys/bus/event_source/devices/hisi_sccl{X}_ -/sys/bus/event_source/devices/hisi_sccl{X}_. The "perf list" command shall list the available events from sysfs. Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU -- cgit v1.2.3 From 6d2f913fda5683fbd4c3580262e10386c1263dfb Mon Sep 17 00:00:00 2001 From: Yushan Wang Date: Fri, 29 Aug 2025 18:14:27 +0800 Subject: Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU Some of HiSilicon V3 PMU hardware is divided into parts to fulfill the job of monitoring specific parts of a device. Add description on that as well as the newly added ext option for L3C PMU. Acked-by: Jonathan Cameron Signed-off-by: Yushan Wang Reviewed-by: Yicong Yang Signed-off-by: Will Deacon --- Documentation/admin-guide/perf/hisi-pmu.rst | 33 +++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst index 8df048c26498..c4c2cbbf88cb 100644 --- a/Documentation/admin-guide/perf/hisi-pmu.rst +++ b/Documentation/admin-guide/perf/hisi-pmu.rst @@ -124,6 +124,39 @@ channel with this option. The current supported channels are as follows: 7. tt_en: NoC PMU supports counting only transactions that have tracetag set if this option is set. See the 2nd list for more information about tracetag. +For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are +further divided into parts for finer granularity of tracing, each part has its +own dedicated PMU, and all such PMUs together cover the monitoring job of events +on particular uncore device. Such PMUs are described in sysfs with name format +slightly changed:: + +/sys/bus/event_source/devices/hisi_sccl{X}_ + +Z is the sub-id, indicating different PMUs for part of hardware device. + +Usage of most PMUs with different sub-ids are identical. Specially, L3C PMU +provides ``ext`` option to allow exploration of even finer granual statistics +of L3C PMU. L3C PMU driver uses that as hint of termination when delivering +perf command to hardware: + +- ext=0: Default, could be used with event names. +- ext=1 and ext=2: Must be used with event codes, event names are not supported. + +An example of perf command could be:: + + $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5 + +or:: + + $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5 + +As above, ``hisi_sccl0_l3c1_0`` locates PMU of Super CPU CLuster 0, L3 cache 1 +pipe0. + +First command locates the first part of L3C since ``ext=0`` is implied by +default. Second command issues the counting on another part of L3C with the +event ``0x1``. + Users could configure IDs to count data come from specific CCL/ICL, by setting srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not -- cgit v1.2.3