| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Fix printf format warning for bprintf
sunrpc uses a trace_printk() that triggers a printf warning during
the compile. Move the __printf() attribute around for when debugging
is not enabled the warning will go away
- Remove redundant check for EVENT_FILE_FL_FREED in
event_filter_write()
The FREED flag is checked in the call to event_file_file() and then
checked again right afterward, which is unneeded
- Clean up event_file_file() and event_file_data() helpers
These helper functions played a different role in the past, but now
with eventfs, the READ_ONCE() isn't needed. Simplify the code a bit
and also add a warning to event_file_data() if the file or its data
is not present
- Remove updating file->private_data in tracing open
All access to the file private data is handled by the helper
functions, which do not use file->private_data. Stop updating it on
open
- Show ENUM names in function arguments via BTF in function tracing
When showing the function arguments when func-args option is set for
function tracing, if one of the arguments is found to be an enum,
show the name of the enum instead of its number
- Add new trace_call__##name() API for tracepoints
Tracepoints are enabled via static_branch() blocks, where when not
enabled, there's only a nop that is in the code where the execution
will just skip over it. When tracing is enabled, the nop is converted
to a direct jump to the tracepoint code. Sometimes more calculations
are required to be performed to update the parameters of the
tracepoint. In this case, trace_##name##_enabled() is called which is
a static_branch() that gets enabled only when the tracepoint is
enabled. This allows the extra calculations to also be skipped by the
nop:
if (trace_foo_enabled()) {
x = bar();
trace_foo(x);
}
Where the x=bar() is only performed when foo is enabled. The problem
with this approach is that there's now two static_branch() calls. One
for checking if the tracepoint is enabled, and then again to know if
the tracepoint should be called. The second one is redundant
Introduce trace_call__foo() that will call the foo() tracepoint
directly without doing a static_branch():
if (trace_foo_enabled()) {
x = bar();
trace_call__foo();
}
- Update various locations to use the new trace_call__##name() API
- Move snapshot code out of trace.c
Cleaning up trace.c to not be a "dump all", move the snapshot code
out of it and into a new trace_snapshot.c file
- Clean up some "%*.s" to "%*s"
- Allow boot kernel command line options to be called multiple times
Have options like:
ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo
Equal to:
ftrace_filter=foo,bar,zoo
- Fix ipi_raise event CPU field to be a CPU field
The ipi_raise target_cpus field is defined as a __bitmask(). There is
now a __cpumask() field definition. Update the field to use that
- Have hist_field_name() use a snprintf() and not a series of strcat()
It's safer to use snprintf() that a series of strcat()
- Fix tracepoint regfunc balancing
A tracepoint can define a "reg" and "unreg" function that gets called
before the tracepoint is enabled, and after it is disabled
respectively. But on error, after the "reg" func is called and the
tracepoint is not enabled, the "unreg" function is not called to tear
down what the "reg" function performed
- Fix output that shows what histograms are enabled
Event variables are displayed incorrectly in the histogram output
Instead of "sched.sched_wakeup.$var", it is showing
"$sched.sched_wakeup.var" where the '$' is in the incorrect location
- Some other simple cleanups
* tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (24 commits)
selftests/ftrace: Add test case for fully-qualified variable references
tracing: Fix fully-qualified variable reference printing in histograms
tracepoint: balance regfunc() on func_add() failure in tracepoint_add_func()
tracing: Rebuild full_name on each hist_field_name() call
tracing: Report ipi_raise target CPUs as cpumask
tracing: Remove duplicate latency_fsnotify() stub
tracing: Preserve repeated trace_trigger boot parameters
tracing: Append repeated boot-time tracing parameters
tracing: Remove spurious default precision from show_event_trigger/filter formats
cpufreq: Use trace_call__##name() at guarded tracepoint call sites
tracing: Remove tracing_alloc_snapshot() when snapshot isn't defined
tracing: Move snapshot code out of trace.c and into trace_snapshot.c
mm: damon: Use trace_call__##name() at guarded tracepoint call sites
btrfs: Use trace_call__##name() at guarded tracepoint call sites
spi: Use trace_call__##name() at guarded tracepoint call sites
i2c: Use trace_call__##name() at guarded tracepoint call sites
kernel: Use trace_call__##name() at guarded tracepoint call sites
tracepoint: Add trace_call__##name() API
tracing: trace_mmap.h: fix a kernel-doc warning
tracing: Pretty-print enum parameters in function arguments
...
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:
"Add support for new features:
* CPPC performance priority
* Dynamic EPP
* Raw EPP
* New unit tests for new features
Fixes for:
* PREEMPT_RT
* sysfs files being present when HW missing
* Broken/outdated documentation"
* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
cpufreq/amd-pstate-ut: Add a unit test for raw EPP
cpufreq/amd-pstate: Add support for raw EPP writes
cpufreq/amd-pstate: Add support for platform profile class
cpufreq/amd-pstate: add kernel command line to override dynamic epp
cpufreq/amd-pstate: Add dynamic energy performance preference
Documentation: amd-pstate: fix dead links in the reference section
cpufreq/amd-pstate: Cache the max frequency in cpudata
Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
amd-pstate-ut: Add module parameter to select testcases
amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
amd-pstate: Add sysfs support for floor_freq and floor_count
amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
x86/cpufeatures: Add AMD CPPC Performance Priority feature.
amd-pstate: Make certain freq_attrs conditionally visible
...
|
|
cpufreq_cpu_get() can sleep on PREEMPT_RT in presence of concurrent
writer(s), however amd-pstate depends on fetching the cpudata via the
policy's driver data which necessitates grabbing the reference.
Since schedutil governor can call "cpufreq_driver->update_perf()"
during sched_tick/enqueue/dequeue with rq_lock held and IRQs disabled,
fetching the policy object using the cpufreq_cpu_get() helper in the
scheduler fast-path leads to "BUG: scheduling while atomic" on
PREEMPT_RT [1].
Pass the cached cpufreq policy object in sg_policy to the update_perf()
instead of just the CPU. The CPU can be inferred using "policy->cpu".
The lifetime of cpufreq_policy object outlasts that of the governor and
the cpufreq driver (allocated when the CPU is onlined and only reclaimed
when the CPU is offlined / the CPU device is removed) which makes it
safe to be referenced throughout the governor's lifetime.
Closes:https://lore.kernel.org/all/20250731092316.3191-1-spasswolf@web.de/ [1]
Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Gary Guo <gary@garyguo.net> # Rust
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260316081849.19368-3-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
|
|
A recent change exposed a bug in the error path: if
freq_qos_add_request(boost_freq_req) fails, min_freq_req may remain a
valid pointer even though it was never successfully added. During policy
teardown, this leads to an unconditional call to
freq_qos_remove_request(), triggering a WARN.
The current design allocates all three freq_req objects together, making
the lifetime rules unclear and error handling fragile.
Simplify this by allocating the QoS freq_req objects at policy
allocation time. The policy itself is dynamically allocated, and two of
the three requests are always needed anyway. This ensures consistent
lifetime management and eliminates the inconsistent state in failure
paths.
Reported-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Fixes: 6e39ba4e5a82 ("cpufreq: Add boost_freq_req QoS request")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Link: https://patch.msgid.link/a293f29d841b86c51f34699c6e717e01858d8ada.1774933424.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The Power Management Quality of Service (PM QoS) allows to
aggregate constraints from multiple entities. It is currently
used to manage the min/max frequency of a given policy.
Frequency constraints can come for instance from:
- Thermal framework: acpi_thermal_cpufreq_init()
- Firmware: _PPC objects: acpi_processor_ppc_init()
- User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute
the resulting maximum allowed frequency.
When enabling boost frequencies, the same frequency request object
(policy->max_freq_req) as to handle requests from users is used.
As a result, when setting:
- scaling_max_freq
- boost
The last sysfs file used overwrites the request from the other
sysfs file.
To avoid this, create a per-policy boost_freq_req to save the boost
constraints instead of overwriting the last scaling_max_freq
constraint.
policy_set_boost() calls the cpufreq set_boost callback.
Update the newly added boost_freq_req request from there:
- whenever boost is toggled
- to cover all possible paths
In the existing .set_boost() callbacks:
- Don't update policy->max as this is done through the qos notifier
cpufreq_notifier_max() which calls cpufreq_set_policy().
- Remove freq_qos_update_request() calls as the qos request is now
done in policy_set_boost() and updates the new boost_freq_req
$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000
$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000
$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000
$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000
$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000
$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000
Note:
cpufreq_frequency_table_cpuinfo() updates policy->min
and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
\-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and
max to be set through B.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
policy->max_freq_req QoS constraint represents the maximal allowed
frequency than can be requested. It is set by:
- writing to policyX/scaling_max sysfs file
- toggling the cpufreq/boost sysfs file
Upon calling freq_qos_update_request(), a successful update
of the max_freq_req value triggers cpufreq_notifier_max(),
followed by cpufreq_set_policy() which update the requested
frequency for the policy.
If the new max_freq_req value is not different from the
original value, no frequency update is triggered.
In a specific sequence of toggling:
- cpufreq/boost sysfs file
- CPU hot-plugging
a CPU could end up with boost enabled but running at the
maximal non-boost frequency, cpufreq_notifier_max() not being
triggered. The following fixed that:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")
The following:
commit dd016f379ebc ("cpufreq: Introduce a more generic way to
set default per-policy boost flag")
also fixed the issue by correctly setting the max_freq_req
constraint of a policy that is re-activated. This makes the
first fix unnecessary.
As the original issue is fixed by another method,
this patch reverts:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.
Cc: Huang Rui <ray.huang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Perry Yuan <perry.yuan@amd.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://patch.msgid.link/20260323160052.17528-7-vineeth@bitbyteword.org
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> # cpufreq core
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
|
|
The commit 6db0f533d320 ("cpufreq: preserve freq_table_sorted
across suspend/hibernate") unintentionally made a change where
cpufreq_frequency_table_cpuinfo() isn't getting called anymore
for old policies getting re-initialized.
This leads to potentially invalid values of policy->max and
policy->cpuinfo_max_freq.
Fix the issue by reverting the original commit and adding the condition
for just the sorting function.
Fixes: 6db0f533d320 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
Link: https://patch.msgid.link/65ba5c45749267c82e8a87af3dc788b37a0b3f48.1773998611.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Include policy->cur in the debug message to explicitly show the frequency
transition (from current to target).
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260129121813.3874516-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"There's a little less than normal, probably due to LPC & Christmas/New
Year meaning that a few series weren't quite ready or reviewed in
time. It's still useful across the board, despite the only real
feature being support for the LS64 feature enabling 64-byte atomic
accesses to endpoints that support it.
ACPI:
- Add interrupt signalling support to the AGDI handler
- Add Catalin and myself to the arm64 ACPI MAINTAINERS entry
CPU features:
- Drop Kconfig options for PAN and LSE (these are detected at runtime)
- Add support for 64-byte single-copy atomic instructions (LS64/LS64V)
- Reduce MTE overhead when executing in the kernel on Ampere CPUs
- Ensure POR_EL0 value exposed via ptrace is up-to-date
- Fix error handling on GCS allocation failure
CPU frequency:
- Add CPU hotplug support to the FIE setup in the AMU driver
Entry code:
- Minor optimisations and cleanups to the syscall entry path
- Preparatory rework for moving to the generic syscall entry code
Hardware errata:
- Work around Spectre-BHB on TSV110 processors
- Work around broken CMO propagation on some systems with the SI-L1
interconnect
Miscellaneous:
- Disable branch profiling for arch/arm64/ to avoid issues with
noinstr
- Minor fixes and cleanups (kexec + ubsan, WARN_ONCE() instead of
WARN_ON(), reduction of boolean expression)
- Fix custom __READ_ONCE() implementation for LTO builds when
operating on non-atomic types
Perf and PMUs:
- Support for CMN-600AE
- Be stricter about supported hardware in the CMN driver
- Support for DSU-110 and DSU-120
- Support for the cycles event in the DSU driver (alongside the
dedicated cycles counter)
- Use IRQF_NO_THREAD instead of IRQF_ONESHOT in the cxlpmu driver
- Use !bitmap_empty() as a faster alternative to bitmap_weight()
- Fix SPE error handling when failing to resume profiling
Selftests:
- Add support for the FORCE_TARGETS option to the arm64 kselftests
- Avoid nolibc-specific my_syscall() function
- Add basic test for the LS64 HWCAP
- Extend fp-pidbench to cover additional workload patterns"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (43 commits)
perf/arm-cmn: Reject unsupported hardware configurations
perf: arm_spe: Properly set hw.state on failures
arm64/gcs: Fix error handling in arch_set_shadow_stack_status()
arm64: Fix non-atomic __READ_ONCE() with CONFIG_LTO=y
arm64: poe: fix stale POR_EL0 values for ptrace
kselftest/arm64: Raise default number of loops in fp-pidbench
kselftest/arm64: Add a no-SVE loop after SVE in fp-pidbench
perf/cxlpmu: Replace IRQF_ONESHOT with IRQF_NO_THREAD
arm64: mte: Set TCMA1 whenever MTE is present in the kernel
arm64/ptrace: Return early for ptrace_report_syscall_entry() error
arm64/ptrace: Split report_syscall()
arm64: Remove unused _TIF_WORK_MASK
kselftest/arm64: Add missing file in .gitignore
arm64: errata: Workaround for SI L1 downstream coherency issue
kselftest/arm64: Add HWCAP test for FEAT_LS64
arm64: Add support for FEAT_{LS64, LS64_V}
KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
...
|
|
cpufreq_cpu_get_raw() gets cpufreq policy only if the CPU is in
policy->cpus mask, which means the CPU is already online. But in some
cases, the policy is needed before the CPU is added to cpus mask. Add a
function to get the policy in these cases.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Beata Michalska <beata.michalska@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Optimize the error handling code in cpufreq_boost_trigger_state().
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog edit ]
Link: https://patch.msgid.link/20251202072727.1368285-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In cpufreq_boost_trigger_state(), if none of the the policies support
boost, policy_set_boost() will not be called and this function will
return 0.
But it is better to return an error to indicate that the platform
doesn't support boost.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251202072727.1368285-2-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
strcpy() is deprecated; assign the NUL terminator directly instead.
Link: https://github.com/KSPP/linux/issues/88
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/20251017153354.82009-2-thorsten.blum@linux.dev
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
During S3/S4 suspend and resume, cpufreq policies are not freed or
recreated; the freq_table and policy structure remain intact. However,
set_freq_table_sorted() currently resets policy->freq_table_sorted to
UNSORTED unconditionally, which is unnecessary since the table order
does not change across suspend/resume.
This patch adds a check to skip validation if policy->freq_table_sorted
is already ASCENDING or DESCENDING. This avoids unnecessary traversal
of the frequency table on S3/S4 resume or repeated online events,
reducing overhead while preserving correctness.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20251011072420.11495-1-zhangzihuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
commit 2a6c72738706 ("cpufreq: Initialize cpufreq-based
frequency-invariance later") postponed the frequency invariance
initialization to avoid disabling it in the error case.
This isn't locking safe, instead move the initialization up before
the subsys interface is registered (which will rebuild the
sched_domains) and add the corresponding disable on the error path.
Observed lockdep without this patch:
[ 0.989686] ======================================================
[ 0.989688] WARNING: possible circular locking dependency detected
[ 0.989690] 6.17.0-rc4-cix-build+ #31 Tainted: G S
[ 0.989691] ------------------------------------------------------
[ 0.989692] swapper/0/1 is trying to acquire lock:
[ 0.989693] ffff800082ada7f8 (sched_energy_mutex){+.+.}-{4:4}, at: rebuild_sched_domains_energy+0x30/0x58
[ 0.989705]
but task is already holding lock:
[ 0.989706] ffff000088c89bc8 (&policy->rwsem){+.+.}-{4:4}, at: cpufreq_online+0x7f8/0xbe0
[ 0.989713]
which lock already depends on the new lock.
Fixes: 2a6c72738706 ("cpufreq: Initialize cpufreq-based frequency-invariance later")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently, cpufreq allows drivers to implement both ->target() and
->target_index() callbacks, but that can lead to ambiguous or incorrect
behavior.
For this reason, prevent cpufreq drivers implementing both ->target()
and ->target_index() at the same time from registering.
This check can help to catch driver implementation mistakes early and
improve overall robustness, without affecting existing valid drivers.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20250908085738.31602-1-zhangzihuan@kylinos.cn
[ rjw: Subject adjustment and changelog rewrite ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Follow cleanup.h recommendations and always define and assign variables
in one statement when __free() is used.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/4691667.LvFx2qVVIh@rafael.j.wysocki
|
|
Change the 'ret' variable in store_scaling_setspeed() from unsigned int to
int, as it needs to store either negative error codes or zero returned
by kstrtouint().
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250902114545.651661-2-rongqianfeng@vivo.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since commit e0b3165ba521 ("cpufreq: add 'freq_table' in struct
cpufreq_policy"), freq_table has been stored in struct cpufreq_policy
instead of being maintained separately.
However, several helpers in freq_table.c still take both policy and
freq_table as parameters, even though policy->freq_table can always be
used. This leads to redundant function arguments and increases the
chance of inconsistencies.
This patch removes the unnecessary freq_table argument from these
functions and updates their callers to only pass policy. This makes
the code simpler, more consistent, and avoids duplication.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250902073323.48330-1-zhangzihuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpufreq drivers are supposed to use either ->setpolicy() or
->target()/->target_index().
Simplify the existing check by collapsing it into a single boolean
expression:
(!!driver->setpolicy == (driver->target_index || driver->target))
This is a readability/maintainability cleanup and keeps the semantics
unchanged.
No functional change intended.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250822070424.166795-3-zhangzihuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Most kernel code using strncasecmp()/strncmp() passes strlen("xxx")
as the length argument. cpufreq_parse_policy() previously used
CPUFREQ_NAME_LEN (16), which is longer than the actual strings
("performance" is 11 chars, "powersave" is 9 chars).
This patch switches to strlen() for the comparison, making the
matching slightly more permissive (e.g., "powersavexxx" will now
also match "powersave"). While this is unlikely to cause functional
issues, it aligns cpufreq with common kernel style and makes the
behavior more intuitive.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250822070424.166795-2-zhangzihuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When a cpufreq driver registers the first policy, it may attempt to
initialize the policy governor from `last_governor`. However, this is
meaningless for the first policy instance, because `last_governor` is
only updated when policies are removed (e.g. during CPU offline).
The `last_governor` mechanism is intended to restore the previously
used governor across CPU hotplug events. For the very first policy,
there is no "previous governor" to restore, so calling
get_governor(last_governor) is unnecessary and potentially confusing.
Skip looking up `last_governor` when registering the first policy.
Instead, it directly uses the default governor after all governors
have been registered and are available.
This avoids meaningless lookups, reduces unnecessary module reference
handling, and simplifies the initial policy path.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250725041450.68754-1-zhangzihuan@kylinos.cn
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge CPUFreq updates for 6.17 from Viresh Kumar:
"- tegra124: Allow building as a module (Aaron Kling).
- Minor cleanups for Rust cpufreq and cpumask APIs and fix MAINTAINERS
entry for cpu.rs (Abhinav Ananthu, Ritvik Gupta, and Lukas Bulwahn).
- Minor cleanups for miscellaneous cpufreq drivers (Arnd Bergmann, Dan
Carpenter, Krzysztof Kozlowski, Sven Peter, and Svyatoslav Ryhel)."
* tag 'cpufreq-arm-updates-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
drivers: cpufreq: add Tegra114 support
rust: cpumask: Replace `MaybeUninit` and `mem::zeroed` with `Opaque` APIs
cpufreq: tegra124: Allow building as a module
cpufreq: dt: Add register helper
cpufreq: Export disable_cpufreq()
cpufreq: armada-8k: Fix off by one in armada_8k_cpufreq_free_table()
cpufreq: armada-8k: make both cpu masks static
rust: cpufreq: use c_ types from kernel prelude
rust: cpufreq: Ensure C ABI compatibility in all unsafe
cpufreq: brcmstb-avs: Fully open-code compatible for grepping
cpufreq: apple: drop default ARCH_APPLE in Kconfig
MAINTAINERS: adjust file entry in CPU HOTPLUG
|
|
Detect the result of starting old governor in cpufreq_set_policy(). If it
fails, exit the governor and clear policy->governor.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpufreq_verify_current_freq()
Move the check of cpufreq_driver->get into cpufreq_verify_current_freq() in
case of calling it without check.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In cpufreq_policy_put_kobj(), policy->rwsem is used. But in
cpufreq_policy_alloc(), if freq_qos_add_notifier() returns an error, error
path via err_kobj_remove or err_min_qos_notifier will be reached and
cpufreq_policy_put_kobj() will be called before policy->rwsem is
initialized. Thus, the calling of init_rwsem() should be moved to where
before these two error paths can be reached.
Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpufreq-based invariance is enabled in cpufreq_register_driver(),
but never disabled after registration fails. Move the invariance
initialization to where all other initializations have been successfully
done to solve this problem.
Fixes: 874f63531064 ("cpufreq: report whether cpufreq supports Frequency Invariance (FI)")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-2-zhenglifeng1@huawei.com
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The has_target() checks in __cpufreq_offline() are duplicate.
Remove one of them and put the operations of exiting governor together
with storing last governor's name.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250623133402.3120230-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After commit c034b02e213d ("cpufreq: expose scaling_cur_freq sysfs file
for set_policy() drivers"), the file scaling_cur_freq is exposed to all
drivers.
No need to create this file separately. It's better to be contained in
cpufreq_attrs.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250623133402.3120230-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This is used by the tegra124-cpufreq driver.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
In store_scaling_setspeed(), sscanf is still used to read to sysfs.
Newer kstrtox provide more features including overflow protection,
better errorhandling and allows for other systems of numeration. It
is therefore better to update sscanf() to kstrtouint().
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070938.931396-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Setting the length of str_governor with a magic number could cause
overflow when max length increases, it is better to use the defined
macro in this case.
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070908.930879-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Lockdep reports a possible circular locking dependency [1] when
cpu_hotplug_lock is acquired inside store_local_boost(), after
policy->rwsem has already been taken by store().
However, the boost update is strictly per-policy and does not
access shared state or iterate over all policies.
Since policy->rwsem is already held, this is enough to serialize
against concurrent topology changes for the current policy.
Remove the cpus_read_lock() to resolve the lockdep warning and
avoid unnecessary locking.
[1]
======================================================
WARNING: possible circular locking dependency detected
6.15.0-rc6-debug-gb01fc4eca73c #1 Not tainted
------------------------------------------------------
power-profiles-/588 is trying to acquire lock:
ffffffffb3a7d910 (cpu_hotplug_lock){++++}-{0:0}, at: store_local_boost+0x56/0xd0
but task is already holding lock:
ffff8b6e5a12c380 (&policy->rwsem){++++}-{4:4}, at: store+0x37/0x90
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&policy->rwsem){++++}-{4:4}:
down_write+0x29/0xb0
cpufreq_online+0x7e8/0xa40
cpufreq_add_dev+0x82/0xa0
subsys_interface_register+0x148/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #1 (subsys mutex#3){+.+.}-{4:4}:
__mutex_lock+0xc2/0x930
subsys_interface_register+0x7f/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
__lock_acquire+0x10ed/0x1850
lock_acquire.part.0+0x69/0x1b0
cpus_read_lock+0x2a/0xc0
store_local_boost+0x56/0xd0
store+0x50/0x90
kernfs_fop_write_iter+0x132/0x200
vfs_write+0x2b3/0x590
ksys_write+0x74/0xf0
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x56/0x5e
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://patch.msgid.link/20250513015726.1497-1-ImanDevel@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Policy locking was added to cpufreq_policy_is_good_for_eas() by commit
4854649b1fb4 ("cpufreq/sched: Move cpufreq-specific EAS checks to
cpufreq") to address a theoretical race condition, but it turned out to
introduce a circular locking dependency between the policy rwsem and
sched_domains_mutex via cpuset_mutex. This leads to a board lockup on
OdroidN2 that is based on the ARM64 Amlogic Meson SoC.
Drop the policy locking from cpufreq_policy_is_good_for_eas() to address
this issue.
Fixes: 4854649b1fb4 ("cpufreq/sched: Move cpufreq-specific EAS checks to cpufreq")
Closes: https://lore.kernel.org/linux-pm/1bf3df62-0641-459f-99fc-fd511e564b84@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2806514.mvXUDI8C0e@rjwysocki.net
|
|
Doing cpufreq-specific EAS checks that require accessing policy
internals directly from sched_is_eas_possible() is a bit unfortunate,
so introduce cpufreq_ready_for_eas() in cpufreq, move those checks
into that new function and make sched_is_eas_possible() call it.
While at it, address a possible race between the EAS governor check
and governor change by doing the former under the policy rwsem.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://patch.msgid.link/2317800.iZASKD2KPV@rjwysocki.net
|
|
|
|
Commit 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and
policy->max") overlooked the fact that policy->min and policy->max were
accessed directly in cpufreq_frequency_table_target() and in the
functions called by it. Consequently, the changes made by that commit
led to problems with setting policy limits.
Address this by passing the target frequency limits to __resolve_freq()
and cpufreq_frequency_table_target() and propagating them to the
functions called by the latter.
Fixes: 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and policy->max")
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Closes: https://lore.kernel.org/linux-pm/aAplED3IA_J0eZN0@linaro.org/
Reported-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/5896780.DvuYhMxLoT@rjwysocki.net
|
|
If the global boost flag is enabled and policy boost flag is disabled, a
call to `cpufreq_boost_trigger_state(true)` must enable the policy's
boost state.
The current code misses that because of an optimization. Fix it.
Suggested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/852ff11c589e6300730d207baac195b2d9d8b95f.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If the global boost flag was enabled and policy boost flag was disabled
before a suspend resume cycle, cpufreq_online() will enable the policy
boost flag on resume.
While it is important for the policy boost flag to mirror the global
boost flag when a policy is first created, it should be avoided when the
policy is reinitialized (for example after a suspend resume cycle).
Though, if the global boost flag is disabled at this point of time, we
want to make sure policy boost flag is disabled too.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/de5c72a0af101049204305c73cd30eb3a3e7b4a0.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Introduce policy_set_boost() to update boost state of a cpufreq policy.
No intentional function change.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/1863178ac17340c810519c8593014b8e561797ea.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The policy specific boost value may already be set correctly in
cpufreq_boost_trigger_state(), don't update it again unnecessarily.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/3003fbdcc1850128fe7fb653d7ddb8afc4d66170.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
Since cpufreq_driver_resolve_freq() can run in parallel with
cpufreq_set_policy() and there is no synchronization between them,
the former may access policy->min and policy->max while the latter
is updating them and it may see intermediate values of them due
to the way the update is carried out. Also the compiler is free
to apply any optimizations it wants both to the stores in
cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq()
which may result in additional inconsistencies.
To address this, use WRITE_ONCE() when updating policy->min and
policy->max in cpufreq_set_policy() and use READ_ONCE() for reading
them in cpufreq_driver_resolve_freq(). Moreover, rearrange the update
in cpufreq_set_policy() to avoid storing intermediate values in
policy->min and policy->max with the help of the observation that
their new values are expected to be properly ordered upfront.
Also modify cpufreq_driver_resolve_freq() to take the possible reverse
ordering of policy->min and policy->max, which may happen depending on
the ordering of operations when this function and cpufreq_set_policy()
run concurrently, into account by always honoring the max when it
turns out to be less than the min (in case it comes from thermal
throttling or similar).
Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements")
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/5907080.DvuYhMxLoT@rjwysocki.net
|
|
A recent change has introduced a bug into cpufreq_get_policy(), but this
function is not used, so it's better to drop it altogether.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/2802770.mvXUDI8C0e@rjwysocki.net
|
|
Since cpufreq_update_limits() obtains a cpufreq policy pointer for the
given CPU and reference counts the corresponding policy object, it may
as well pass the policy pointer to the cpufreq driver's ->update_limits()
callback which allows that callback to avoid invoking cpufreq_cpu_get()
for the same CPU.
Accordingly, redefine ->update_limits() to take a policy pointer instead
of a CPU number and update both drivers implementing it, intel_pstate
and amd-pstate, as needed.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/8560367.NyiUUSuA9g@rjwysocki.net
|