summaryrefslogtreecommitdiff
path: root/drivers/perf/hisilicon
AgeCommit message (Collapse)Author
2025-09-25drivers/perf: hisi: Add tt_core_deprecated for compatibilityYicong Yang
Previously tt_core is defined as config1:0-7 which may not cover all the CPUs sharing L3C on platforms with more than 8 CPUs in a L3C. In order to support such platforms extend tt_core to 16 bits, since no spare space in config1, tt_core was moved to config2:0-15. Though linux expects the users to retrieve the control encoding from sysfs first for each option, it's possible if user doesn't follow this and hardcoded tt_core in config1. So add an option tt_core_deprecated for config1:0-7 for backward compatibility. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Add support for L3C PMU v3Yicong Yang
This patch adds support for L3C PMU v3. The v3 L3C PMU supports an extended events space which can be controlled in up to 2 extra address spaces with separate overflow interrupts. The layout of the control/event registers are kept the same. The extended events with original ones together cover the monitoring job of all transactions on L3C. The extended events is specified with `ext=[1|2]` option for the driver to distinguish, like below: perf stat -e hisi_sccl0_l3c0_0/event=<event_id>,ext=1/ Currently only event option using config bit [7, 0]. There's still plenty unused space. Make ext using config [16, 17] and reserve bit [15, 8] for event option for future extension. With the capability of extra counters, number of counters for HiSilicon uncore PMU could reach up to 24, the usedmap is extended accordingly. The hw_perf_event::event_base is initialized to the base MMIO address of the event and will be used for later control, overflow handling and counts readout. We still make use of the Uncore PMU framework for handling the events and interrupt migration on CPU hotplug. The framework's cpuhp callback will handle the event migration and interrupt migration of orginial event, if PMU supports extended events then the interrupt of extended events is migrated to the same CPU choosed by the framework. A new HID of HISI0215 is used for this version of L3C PMU. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Co-developed-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Refactor the event configuration of L3C PMUYicong Yang
The event register is configured using hisi_pmu::base directly since only one address space is supported for L3C PMU. We need to extend if events configuration locates in different address space. In order to make preparation for such hardware, extract the event register configuration to separate function using hw_perf_event::event_base as each event's base address. Implement a private hisi_uncore_ops::get_event_idx() callback for initialize the event_base besides get the hardware index. No functional changes intended. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Extend the field of tt_coreYicong Yang
Currently the tt_core's using config1's bit [7, 0] and can not be extended. For some platforms there's more the 8 CPUs sharing the L3 cache. So make tt_core use config2's bit [15, 0] and the remaining bits in config2 is reserved for extension. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Extract the event filter check of L3C PMUYicong Yang
L3C PMU has 4 filter options which are sharing perf_event_attr::config1. Driver will check config1 to see whether a certain event has a filter setting. It'll be incorrect if we make use of other bits in config1 for non-filter options. So check whether each filter options are set directly in a separate function instead. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Simplify the probe process of each L3C PMU versionYicong Yang
Version 1 and 2 of L3C PMU also use different HID. Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Export hisi_uncore_pmu_isr()Yicong Yang
Currently Uncore PMU framework assume one PMU device only have one interrupt and will help register the interrupt handler. It cannot support a PMU with multiple interrupt resources. An uncore PMU may have multiple interrupts that can share the same handler. Export hisi_uncore_pmu_isr() to allow drivers register the irq handler by their own routine. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-22drivers/perf: hisi: Relax the event ID check in the frameworkYicong Yang
Event ID is only using the attr::config bit [7, 0] but we check the event range using the whole 64bit field. It blocks the usage of the rest field of attr::config. Relax the check by only using the bit [7, 0]. Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yushan Wang <wangyushan12@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18drivers/perf: hisi: Add support for HiSilicon MN PMU driverJunhao He
MN (Miscellaneous Node) is a hybrid node in ARM CHI. It broadcasts the following two types of requests: DVM operations and PCIe configuration. MN PMU devices exist on both SCCL and SICL, so we named the MN pmu driver after SCL (Super cluster) ID. The MN PMU driver using the HiSilicon uncore PMU framework. And only the event parameter is supported. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18drivers/perf: hisi: Add support for HiSilicon NoC PMUYicong Yang
Adds the support for HiSilicon NoC (Network on Chip) PMU which will be used to monitor the events on the system bus. The PMU device will be named after the SCL ID (either Super CPU cluster or Super IO cluster) and the index ID, just similar to other HiSilicon Uncore PMUs. Below PMU formats are provided besides the event: - ch: the transaction channel (data, request, response, etc) which can be used to filter the counting. - tt_en: tracetag filtering enable. Just as other HiSilicon Uncore PMUs the NoC PMU supports only counting the transactions with tracetag. The NoC PMU doesn't have an interrupt to indicate the overflow. However we have a 64 bit counter which is large enough and it's nearly impossible to overflow. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Support PMUs with no interruptYicong Yang
We'll have PMUs don't have an interrupt to indicate the counter overflow, but the Uncore PMU core assume all the PMUs have interrupt. So handle this case in the core. The existing PMUs won't be affected. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Relax the event number check of v2 PMUsJunhao He
The supported event number range of each Uncore PMUs is provided by each driver in hisi_pmu::check_event and out of range events will be rejected. A later version with expanded event number range needs to register the PMU with updated hisi_pmu::check_event even if it's the only update, which means the expanded events cannot be used unless the driver's updated. However the unsupported events won't be counted by the hardware so we can relax the event number check to allow the use the expanded events. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driverJunhao He
SLLC v3 PMU has the following changes compared to previous version: a) update the register layout b) update the definition of SRCID_CTRL and TGTID_CTRL registers. To be compatible with v2, we use maximum width (11 bits) and mask the extra length for themselves. c) remove latency events (driver does not need to be adapted). SLLC v3 PMU is identified with HID HISI0264. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU informationJunhao He
Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. Factor out SLLC register definition to struct hisi_sllc_pmu_regs. No functional changes intended. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driverJunhao He
HiSilicon DDRC v3 PMU has the different interrupt register offset compared to the v2. Add device information of v3 PMU with ACPI HID HISI0235. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Simplify the probe process for each DDRC versionJunhao He
Version 1 and 2 of DDRC PMU also use different HID. Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. In order to support this extend struct hisi_pmu_dev_info for version specific counter bits and event range. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-01-07drivers/perf: hisi: Set correct IRQ affinity for PMUs with no associationYicong Yang
For PMUs with no association, the hisi_pmu->on_cpu is initialized according to the NUMA locality but use a wrong CPU for the interrupt affinity. The CPU selected from cpumask_local_spread() can be different from the CPU by the cpuhp callback. Fix this by setting the IRQ affinity to hisi_pmu->on_cpu. Fixes: 6cd137088fdf ("drivers/perf: hisi: Refactor the detection of associated CPUs") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241223125134.57885-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Delete redundant blank line of DDRC PMUJunhao He
Do not add blank line at the end of a code block defined by braces. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-11-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Fix incorrect variable name "hha_pmu" in DDRC PMU driverJunhao He
In the callback function write_evtype(), the variable name of struct hisi_pmu should be "ddrc_pmu" instead of "hha_pmu". Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-10-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Export associated CPUs of each PMU through sysfsYicong Yang
Although the event of the uncore PMU can only be opened on a single CPU, some PMU does have the affinity on a range of CPUs. For example the L3C PMU is associated to the CPUs sharing the L3T it monitors. Users may infer this affinity by the PMU name which may have SCCL ID and CCL ID encoded (for L3C etc), but it's not that straightforward. So export this information by adding an "associated_cpus" sysfs attribute then user can get this directly. Reviewed-by: Jonathan Cameron <Joanthan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-9-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Provide a generic implementation of cpumask/identifierYicong Yang
Each type of HiSilicon Uncore PMU has the following sysfs attributes: - format: bitmask in perf_event_attr::config[012] of corresponding attribute - event: events name and corresponding event code - cpumask: range of CPUs the events can be opened on - identifier: the version of this PMU Different types of PMU have different implementations of the "format" and "event" but all share the same implementation of the "cpumask" and "identifier". Thus we can move cpumask and identifier to the hisi_uncore_pmu framework and drivers can use the generic implementation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Add a common function to retrieve topology from firmwareYicong Yang
Currently each type of uncore PMU driver uses almost the same routine and the same firmware interface (or properties) to retrieve the topology information, then reset the unused IDs to -1. Extract the common parts to the framework in hisi_uncore_pmu_init_topology(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Extract topology information to a separate structureYicong Yang
HiSilicon Uncore PMUs are identified by the IDs of the topology element on which the PMUs are located. Add a new separate struct hisi_pmu_toplogy to encapsulate this information. Add additional documentation on the meaning of each ID. - make sccl_id and sicl_id into a union since they're exclusive. It can also be accessed by scl_id if the SICL/SCCL distinction is not relevant - make index_id and sub_id signed so -1 may be used to indicate the PMU doesn't have this topology element or it could not be retrieved. This patch should have no functional changes. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Refactor the detection of associated CPUsYicong Yang
There are two type of PMUs supported currently: 1) PMUs locate on SCCL (Super CPU Cluster [1]), associated with certain CCL (CPU cluster [1])(e.g. L3C PMU) or not (e.g. DDRC PMU) 2) PMUs locate on the SICL (Super IO Cluster [1]), which has no association with certain CPU topology (e.g. CPA PMU) Currently the associated CPUs of the PMU is detected in the cpuhp online callback as below: - for type 1) the CPUs match PMU's sccl_id and ccl_id - for type 2) PMU's sccl_id is -1 and all online CPUs will be associated Since uncore PMUs are not bound to certain CPU context and event could be counting started by any online CPU, the associated CPUs are just a preference. Below disadvantages are observed in current implementation: - the PMU cannot be used if its associated CPUs are offline - SICL PMUs are associated to all the online CPUs implicitly without the consideration of locality So refactor the way we detect the associated CPUs in below aspects: - add a clear definition of hisi_pmu::associated_cpus - initialize hisi_pmu::on_cpu based on locality if no associated CPU found, otherwise update it from associated CPUs - drop the detection with a sccl_id of -1 for SICL PMUs [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1 Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Migrate to one online CPU if no associated one onlineYicong Yang
If the selected CPU hisi_pmu::on_cpu goes offline, driver will select a new online CPU from hisi_pmu::associated_cpus, or if no online CPU found the PMU context won't be migrated. However for uncore PMUs the associated CPUs are just a peference and it also works to schedule the events on any online CPUs. So add a fallback to choose an online CPU if no associated CPUs found. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Don't update the associated_cpus on CPU offlineYicong Yang
Event will be scheduled on CPU of hisi_pmu::on_cpu which is selected from the intersection of hisi_pmu::associated_cpus and online CPUs. So the associated_cpus don't need to be maintained with online CPUs. This will save one update operation if one associated CPU is offlined. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Define a symbol namespace for HiSilicon Uncore PMUsYicong Yang
The HiSilicon Uncore PMU framework implements some common functions and exported them to the drivers. Use a specific HISI_PMU namespace for the exported symbols to avoid pollute the generic ones. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-11-06perf: Switch back to struct platform_driver::remove()Uwe Kleine-König
After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/perf to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20241027180313.410964-2-u.kleine-koenig@baylibre.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]Yicong Yang
Currently users can get the Root Ports supported by the PCIe PMU by "bus" sysfs attributes which indicates the PCIe bus number where Root Ports are located. This maybe insufficient since Root Ports supported by different PCIe PMUs may be located on the same PCIe bus. So export the BDF range the Root Ports additionally. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Fix TLP headers bandwidth countingYicong Yang
We make the initial value of event ctrl register as HISI_PCIE_INIT_SET and modify according to the user options. This will make TLP headers bandwidth only counting never take effect since HISI_PCIE_INIT_SET configures to count the TLP payloads bandwidth. Fix this by making the initial value of event ctrl register as 0. Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Record hardware counts correctlyYicong Yang
Currently we set the period and record it as the initial value of the counter without checking it's set to the hardware successfully or not. However the counter maybe unwritable if the target event is unsupported by the device. In such case we will pass user a wrong count: [start counts when setting the period] hwc->prev_count = 0x8000000000000000 device.counter_value = 0 // the counter is not set as the period [when user reads the counter] event->count = device.counter_value - hwc->prev_count = 0x8000000000000000 // wrong. should be 0. Fix this by record the hardware counter counts correctly when setting the period. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-10perf: add missing MODULE_DESCRIPTION() macrosJeff Johnson
With ARCH=x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o Add the missing invocation of the MODULE_DESCRIPTION() macro to all files which have a MODULE_LICENSE(). This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although it did not produce a warning with the x86 allmodconfig configuration, may cause this warning with arm64 configurations. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-22Merge tag 'driver-core-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the small set of driver core and kernfs changes for 6.10-rc1. Nothing major here at all, just a small set of changes for some driver core apis, and minor fixups. Included in here are: - sysfs_bin_attr_simple_read() helper added and used - device_show_string() helper added and used All usages of these were acked by the various maintainers. Also in here are: - kernfs minor cleanup - removed unused functions - typo fix in documentation - pay attention to sysfs_create_link() failures in module.c finally All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: device property: Fix a typo in the description of device_get_child_node_count() kernfs: mount: Remove unnecessary ‘NULL’ values from knparent scsi: Use device_show_string() helper for sysfs attributes platform/x86: Use device_show_string() helper for sysfs attributes perf: Use device_show_string() helper for sysfs attributes IB/qib: Use device_show_string() helper for sysfs attributes hwmon: Use device_show_string() helper for sysfs attributes driver core: Add device_show_string() helper for sysfs attributes treewide: Use sysfs_bin_attr_simple_read() helper sysfs: Add sysfs_bin_attr_simple_read() helper module: don't ignore sysfs_create_link() failures driver core: Remove unused platform_notify, platform_notify_remove
2024-05-04perf: Use device_show_string() helper for sysfs attributesLukas Wunner
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/3a297850312b4ecb62d6872121de04496900f502.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-28drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()Hao Chen
pci_alloc_irq_vectors() allocates an irq vector. When devm_add_action() fails, the irq vector is not freed, which leads to a memory leak. Replace the devm_add_action with devm_add_action_or_reset to ensure the irq vector can be destroyed when it fails. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi: hns3: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/} Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Hao Chen <chenhao418@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi_pcie: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}' Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-hns3: Assign parents for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-6-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-uncore: Assign parents for event_source devicesJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-4-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-pcie: Assign parent for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-2-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_uncore: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-9-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_pcie: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-8-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()Junhao He
The function xxx_find_related_event() scan all working events to find related events. During this process, we also can find the idle counters. If not found related events, return the first idle counter to simplify the code. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Relax the check on related eventsJunhao He
If we use two events with the same filter and related event type (see the following example), the driver check whether they are related events and are in the same group, otherwise the function hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event cannot count but the 1st event is running, although the PCIe PMU has other idle counters. In this case, The perf event scheduler will make the two events to multiplex a counter, if the user use the formula (1st event_value / 2nd event_value) to calculate the bandwidth, he/she won't get the correct value, because they are not counting at the same period. This patch tries to fix this by making the related events to use different idle counters if they are not in the same event group. And finally, I'm going to say. The related events are best used in the same group [1]. There are two ways to know if they are related events. a) By event name, such as the latency events "xxx_latency, xxx_cnt" or bandwidth events "xxx_flux, xxx_time". b) By event type, such as "event=0xXXXX, event=0x1XXXX". Use group to count the related events: [1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}" example: 1st event: hisi_pcie0_core1/event=0x804,port=1 2nd event: hisi_pcie0_core1/event=0x10804,port=1 test cmd: perf stat -e hisi_pcie0_core1/event=0x804,port=1/ \ -e hisi_pcie0_core1/event=0x10804,port=1/ before patch: 25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%) 470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%) after patch: 24,147 hisi_pcie0_core1/event=0x804,port=1/ 474,558 hisi_pcie0_core1/event=0x10804,port=1/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/r/20240223103359.18669-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Check the target filter properlyJunhao He
The PMU can monitor traffic of certain target Root Port or downstream target Endpoint. User can specify the target filter by the "port" or "bdf" option respectively. The PMU can only monitor the Root Port or Endpoint on the same PCIe core so the value of "port" or "bdf" should be valid and will be checked by the driver. Currently at least and only one of "port" and "bdf" option must be set. If "port" filter is not set or is set explicitly to zero (default), driver will regard the user specifies a "bdf" option since "port" option is a bitmask of the target Root Ports and zero is not a valid value. If user not explicitly set "port" or "bdf" filter, the driver uses "bdf" default value (zero) to set target filter, but driver will skip the check of bdf=0, although it's a valid value (meaning 0000:000:00.0). Then the user just gets zero. Therefore, we need to check if both "port" and "bdf" are invalid, then return failure and report warning. Testing: before the patch: 0 hisi_pcie0_core1/rx_mrd_flux/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ after the patch: <not supported> hisi_pcie0_core1/rx_mrd_flux/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Add more events for counting TLP bandwidthYicong Yang
A typical PCIe transaction is consisted of various TLP packets in both direction. For counting bandwidth only memory read events are exported currently. Add memory write and completion counting events of both direction to complete the bandwidth counting. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Fix incorrect counting under metric modeYicong Yang
The metric counting shows incorrect results if the events in the metric group using the same event but different filter options. This is because we only judge the event code to decide whether the event in the metric group should share the same hardware counter, but ignore the settings of the filter. For example, on a platform of 2 ports 0x1 and 0x2 but only port 0x1 has a downstream PCIe NVME device. The metric counting shows both ports have the same counts because we misassign these two events to one same hardware counter: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/ 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.153863691 seconds time elapsed Fix this by using the whole config rather than the event only to judge whether two events are the same and should share the same hardware counter. With this patch, the metric counting in the above case tends to be corrected: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 0 hisi_pcie0_core1/event=0x0104,port=0x2/ 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.152875631 seconds time elapsed Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()Yicong Yang
Factor out retrieving of the register value for the corresponding event from hisi_pcie_config_event_ctrl() into a new function hisi_pcie_pmu_get_event_ctrl_val() allowing future reuse. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()Yicong Yang
hisi_pcie_pmu_{config,clear}_filter() are config/clear HISI_PCIE_EVENT_CTRL register which contains not only the filter but also the event code. The function names are bit misleading. Rename it to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflects their functions more accurately. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09Junhao He
HiSilicon UC PMU v2 suffers the erratum 162700402 that the PMU counter cannot be set due to the lack of clock under power saving mode. This will lead to error or inaccurate counts. The clock can be enabled by the PMU global enabling control. This patch tries to fix this by set the UC PMU enable before set event period to turn on the clock, and then restore the UC PMU configuration. The counter register can hold its value without a clock. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240227125231.53127-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>