summaryrefslogtreecommitdiff
path: root/drivers/cpufreq
AgeCommit message (Collapse)Author
2014-06-11intel_pstate: Improve initial busy calculationDoug Smythies
commit bf8102228a8bf053051f311e5486042fe0542894 upstream. This change makes the busy calculation using 64 bit math which prevents overflow for large values of aperf/mperf. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11intel_pstate: add sample time scalingDirk Brandewie
commit c4ee841f602e5eef8eab673295c49c5b49d7732b upstream. The PID assumes that samples are of equal time, which for a deferable timers this is not true when the system goes idle. This causes the PID to take a long time to converge to the min P state and depending on the pattern of the idle load can make the P state appear stuck. The hold-off value of three sample times before using the scaling is to give a grace period for applications that have high performance requirements and spend a lot of time idle, The poster child for this behavior is the ffmpeg benchmark in the Phoronix test suite. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11intel_pstate: Correct rounding in busy calculationDirk Brandewie
commit f0fe3cd7e12d8290c82284b5c8aee723cbd0371a upstream. Changing to fixed point math throughout the busy calculation in commit e66c1768 (Change busy calculation to use fixed point math.) Introduced some inaccuracies by rounding the busy value at two points in the calculation. This change removes roundings and moves the rounding to the output of the PID where the calculations are complete and the value returned as an integer. Fixes: e66c17683746 (intel_pstate: Change busy calculation to use fixed point math.) Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11intel_pstate: Remove C0 trackingDirk Brandewie
commit adacdf3f2b8e65aa441613cf61c4f598e9042690 upstream. Commit fcb6a15c (intel_pstate: Take core C0 time into account for core busy calculation) introduced a regression referenced below. The issue with "lockup" after suspend that this commit was addressing is now dealt with in the suspend path. Fixes: fcb6a15c2e7e (intel_pstate: Take core C0 time into account for core busy calculation) Link: https://bugzilla.kernel.org/show_bug.cgi?id=66581 Link: https://bugzilla.kernel.org/show_bug.cgi?id=75121 Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11intel_pstate: remove unneeded sample buffersDirk Brandewie
commit d37e2b764499e092ebc493d6f980827feb952e23 upstream. Remove unneeded sample buffers, intel_pstate operates on the most recent sample only. This save some memory and make the code more readable. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11cpufreq: remove race while accessing cur_policyBibek Basu
commit c5450db85b828d0c46ac8fc570fb8a51bf07ac40 upstream. While accessing cur_policy during executing events CPUFREQ_GOV_START, CPUFREQ_GOV_STOP, CPUFREQ_GOV_LIMITS, same mutex lock is not taken, dbs_data->mutex, which leads to race and data corruption while running continious suspend resume test. This is seen with ondemand governor with suspend resume test using rtcwake. Unable to handle kernel NULL pointer dereference at virtual address 00000028 pgd = ed610000 [00000028] *pgd=adf11831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: nvhost_vi CPU: 1 PID: 3243 Comm: rtcwake Not tainted 3.10.24-gf5cf9e5 #1 task: ee708040 ti: ed61c000 task.ti: ed61c000 PC is at cpufreq_governor_dbs+0x400/0x634 LR is at cpufreq_governor_dbs+0x3f8/0x634 pc : [<c05652b8>] lr : [<c05652b0>] psr: 600f0013 sp : ed61dcb0 ip : 000493e0 fp : c1cc14f0 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : eb725280 r6 : c1cc1560 r5 : eb575200 r4 : ebad7740 r3 : ee708040 r2 : ed61dca8 r1 : 001ebd24 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: ad61006a DAC: 00000015 [<c05652b8>] (cpufreq_governor_dbs+0x400/0x634) from [<c055f700>] (__cpufreq_governor+0x98/0x1b4) [<c055f700>] (__cpufreq_governor+0x98/0x1b4) from [<c0560770>] (__cpufreq_set_policy+0x250/0x320) [<c0560770>] (__cpufreq_set_policy+0x250/0x320) from [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) from [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) from [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) from [<c004b4b8>] (tegra_pm_notify+0x114/0x134) [<c004b4b8>] (tegra_pm_notify+0x114/0x134) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) from [<c00ad38c>] (enter_state+0xec/0x128) [<c00ad38c>] (enter_state+0xec/0x128) from [<c00ad400>] (pm_suspend+0x38/0xa4) [<c00ad400>] (pm_suspend+0x38/0xa4) from [<c00ac114>] (state_store+0x70/0xc0) [<c00ac114>] (state_store+0x70/0xc0) from [<c027b1e8>] (kobj_attr_store+0x14/0x20) [<c027b1e8>] (kobj_attr_store+0x14/0x20) from [<c019cd9c>] (sysfs_write_file+0x104/0x184) [<c019cd9c>] (sysfs_write_file+0x104/0x184) from [<c0143038>] (vfs_write+0xd0/0x19c) [<c0143038>] (vfs_write+0xd0/0x19c) from [<c0143414>] (SyS_write+0x4c/0x78) [<c0143414>] (SyS_write+0x4c/0x78) from [<c000f080>] (ret_fast_syscall+0x0/0x30) Code: e1a00006 eb084346 e59b0020 e5951024 (e5903028) ---[ end trace 0488523c8f6b0f9d ]--- Signed-off-by: Bibek Basu <bbasu@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11cpufreq: cpu0: drop wrong devm usageLucas Stach
commit e3beb0ac521d50d158a9d253373eae8421ac3998 upstream. This driver is using devres managed calls incorrectly, giving the cpu0 device as first parameter instead of the cpufreq platform device. This results in resources not being freed if the cpufreq platform device is unbound, for example if probing has to be deferred for a missing regulator. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07intel_pstate: remove setting P state to MAX on initDirk Brandewie
commit d40a63c45b506b0681918d7c62a15cc9d48c8681 upstream. Setting the P state of the core to max at init time is a hold over from early implementation of intel_pstate where intel_pstate disabled cpufreq and loaded VERY early in the boot sequence. This was to ensure that intel_pstate did not affect boot time. This in not needed now that intel_pstate is a cpufreq driver. Removing this covers the case where a CPU has gone through a manual CPU offline/online cycle and the P state is set to MAX on init and the CPU immediately goes idle. Due to HW coordination the P state request on the idle CPU will drag all cores to MAX P state until the load is reevaluated when to core goes non-idle. Reported-by: Patrick Marlier <patrick.marlier@gmail.com> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07intel_pstate: Set turbo VID for BayTrailDirk Brandewie
commit 21855ff5bcbdd075e1c99772827a84912ab083dd upstream. A documentation update exposed that there is a separate set of VID values that must be used in the turbo/boost P state range. Add enumerating and setting the correct VID for P states in the turbo range. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07MIPS/loongson2_cpufreq: Fix CPU clock rate settingAaro Koskinen
commit 8e8acb32960f42c81b1d50deac56a2c07bb6a18a upstream. Loongson2 has been using (incorrectly) kHz for cpu_clk rate. This has been unnoticed, as loongson2_cpufreq was the only place where the rate was set/get. After commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 (cpufreq: introduce cpufreq_generic_get() routine) things however broke, and now loops_per_jiffy adjustments are incorrect (1000 times too long). The patch fixes this by changing cpu_clk rate to Hz. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: cpufreq@vger.kernel.org Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Patchwork: https://patchwork.linux-mips.org/patch/6678/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13cpufreq: unicore32: fix typo issue for 'clk'Chen Gang
commit b4ddad95020e65cfbbf9aee63d3bcdf682794ade upstream. Need use 'clk' instead of 'mclk', which is the original removed local variable. The related original commit: "652ed95 cpufreq: introduce cpufreq_generic_get() routine" The related error with allmodconfig for unicore32: CC drivers/cpufreq/unicore2-cpufreq.o drivers/cpufreq/unicore2-cpufreq.c: In function ‘ucv2_target’: drivers/cpufreq/unicore2-cpufreq.c:48: error: ‘struct cpufreq_policy’ has no member named ‘mclk’ make[2]: *** [drivers/cpufreq/unicore2-cpufreq.o] Error 1 make[1]: *** [drivers/cpufreq] Error 2 make: *** [drivers] Error 2 Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13cpufreq: at32ap: don't declare local variable as staticViresh Kumar
commit 0ca97886fece9e1acd71ade4ca3a250945c8fc8b upstream. Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva <lxoliva@fsfla.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13cpufreq: loongson2_cpufreq: don't declare local variable as staticViresh Kumar
commit 05ed672292dc9d37db4fafdd49baa782158cd795 upstream. Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva <lxoliva@fsfla.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-13cpufreq: Skip current frequency initialization for ->setpolicy driversRafael J. Wysocki
After commit da60ce9f2fac (cpufreq: call cpufreq_driver->get() after calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled by intel_pstate, because that driver may return 0 from its ->get() callback if it has not run long enough to collect enough samples on the given CPU. That didn't happen before commit da60ce9f2fac which added policy->cur initialization to __cpufreq_add_dev() to help reduce code duplication in other cpufreq drivers. However, the code added by commit da60ce9f2fac need not be executed for cpufreq drivers having the ->setpolicy callback defined, because the subsequent invocation of cpufreq_set_policy() will use that callback to initialize the policy anyway and it doesn't need policy->cur to be initialized upfront. The analogous code in cpufreq_update_policy() is also unnecessary for cpufreq drivers having ->setpolicy set and may be skipped for them as well. Since intel_pstate provides ->setpolicy, skipping the upfront policy->cur initialization for cpufreq drivers with that callback set will cover intel_pstate and the problem it's been having after commit da60ce9f2fac will be addressed. Fixes: da60ce9f2fac (cpufreq: call cpufreq_driver->get() after calling ->init()) References: https://bugzilla.kernel.org/show_bug.cgi?id=71931 Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06cpufreq: Initialize governor for a new policy under policy->rwsemViresh Kumar
policy->rwsem is used to lock access to all parts of code modifying struct cpufreq_policy, but it's not used on a new policy created by __cpufreq_add_dev(). Because of that, if cpufreq_update_policy() is called in a tight loop on one CPU in parallel with offline/online of another CPU, then the following crash can be triggered: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = c0003000 [00000020] *pgd=80000000004003, *pmd=00000000 Internal error: Oops: 206 [#1] PREEMPT SMP ARM PC is at __cpufreq_governor+0x10/0x1ac LR is at cpufreq_update_policy+0x114/0x150 ---[ end trace f23a8defea6cd706 ]--- Kernel panic - not syncing: Fatal exception CPU0: stopping CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396 [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68) [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44) [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc) [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78) [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74) [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c) [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180) [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68) [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30) Fix that by taking locks at appropriate places in __cpufreq_add_dev() as well. Reported-by: Saravana Kannan <skannan@codeaurora.org> Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06cpufreq: Initialize policy before making it available for others to useViresh Kumar
Policy must be fully initialized before it is being made available for use by others. Otherwise cpufreq_cpu_get() would be able to grab a half initialized policy structure that might not have affected_cpus (for example) populated. Then, anybody accessing those fields will get a wrong value and that will lead to unpredictable results. In order to fix this, do all the necessary initialization before we make the policy structure available via cpufreq_cpu_get(). That will guarantee that any code accessing fields of the policy will get correct data from them. Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditionsAaron Plattner
If a module calls cpufreq_get while cpufreq is initializing, it's possible for it to be called after cpufreq_driver is set but before cpufreq_cpu_data is written during subsys_interface_register. This happens because cpufreq_get doesn't take the cpufreq_driver_lock around its use of cpufreq_cpu_data. Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather than reading it out of cpufreq_cpu_data directly. cpufreq_cpu_get() takes the appropriate locks to prevent this race from happening. Since it's possible for policy to be NULL if the caller passes in an invalid CPU number or calls the function before cpufreq is initialized, delete the BUG_ON(!policy) and simply return 0. Don't try to return -ENOENT because that's negative and the function returns an unsigned integer. References: https://bbs.archlinux.org/viewtopic.php?id=177934 Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-26intel_pstate: Change busy calculation to use fixed point math.Dirk Brandewie
Commit fcb6a15c2e (intel_pstate: Take core C0 time into account for core busy calculation) introduced a regression on some processor SKUs supported by intel_pstate. This was due to the truncation caused by using integer math to calculate core busy and C0 percentages. On a i7-4770K processor operating at 800Mhz going to 100% utilization the percent busy of the CPU using integer math is 22%, but it actually is 22.85%. This value scaled to the current frequency returned 97 which the PID interpreted as no error and did not adjust the P state. Tested on i7-4770K, i7-2600, i5-3230M. Fixes: fcb6a15c2e7e (intel_pstate: Take core C0 time into account for core busy calculation) References: https://lkml.org/lkml/2014/2/19/626 References: https://bugzilla.kernel.org/show_bug.cgi?id=70941 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-21intel_pstate: Add support for Baytrail turbo P statesDirk Brandewie
A documentation update exposed the existance of the turbo ratio register. Update baytrail support to use the turbo range. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-21intel_pstate: Use LFM bus ratio as min ratio/P stateDirk Brandewie
LFM (max efficiency ratio) is the max frequency at minimum voltage supported by the processor. Using LFM as the minimum P state increases performmance without affecting power. By not using P states below LFM we avoid using P states that are less power efficient. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-19cpufreq: powernow-k8: Initialize per-cpu data-structures properlySrivatsa S. Bhat
The powernow-k8 driver maintains a per-cpu data-structure called powernow_data that is used to perform the frequency transitions. It initializes this data structure only for the policy->cpu. So, accesses to this data structure by other CPUs results in various problems because they would have been uninitialized. Specifically, if a cpu (!= policy->cpu) invokes the drivers' ->get() function, it returns 0 as the KHz value, since its per-cpu memory doesn't point to anything valid. This causes problems during suspend/resume since cpufreq_update_policy() tries to enforce this (0 KHz) as the current frequency of the CPU, and this madness gets propagated to adjust_jiffies() as well. Eventually, lots of things start breaking down, including the r8169 ethernet card, in one particularly interesting case reported by Pierre Ossman. Fix this by initializing the per-cpu data-structures of all the CPUs in the policy appropriately. References: https://bugzilla.kernel.org/show_bug.cgi?id=70311 Reported-by: Pierre Ossman <pierre@ossman.eu> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: All applicable <stable@vger.kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-19cpufreq: remove sysfs link when a cpu != policy->cpu, is removedviresh kumar
Commit 42f921a (cpufreq: remove sysfs files for CPUs which failed to come back after resume) tried to do this but missed this piece of code to fix. Currently we are getting this on suspend/resume: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84() sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq' Modules linked in: brcmfmac brcmutil CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10cd964 #12 [<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14) [<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc) [<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88) [<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40) [<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84) [<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8) [<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814) [<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c) [<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84) [<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44) [<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140) [<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0) [<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc) [<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8) [<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc) [<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20) [<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48) [<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c) [<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180) [<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70) [<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30) ---[ end trace 76969904b614c18f ]--- Fix this by removing sysfs link for cpufreq directory when cpu removed isn't policy->cpu. Revamps: 42f921a (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Reported-and-tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-13intel_pstate: Remove energy reporting from pstate_sample tracepointDirk Brandewie
Remove the reporting of energy since it does not provide any useful information about the state of the driver and will be a maintainance headache going forward since the RAPL energy units register is not architectural and subject to change between micro-architectures References: https://bugzilla.kernel.org/show_bug.cgi?id=69831 Fixes: b69880f9ccf7 (intel_pstate: Add trace point to report internal state.) Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-04intel_pstate: Take core C0 time into account for core busy calculationDirk Brandewie
Take non-idle time into account when calculating core busy time. This ensures that intel_pstate will notice a decrease in load. References: https://bugzilla.kernel.org/show_bug.cgi?id=66581 Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-31Merge tag 'pm+acpi-3.14-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes and cleanups from Rafael Wysocki: - ACPI device hotplug fix preventing ACPI drivers from binding to device objects that acpi_bus_trim() has been called for and the devices represented by them may not be operational. - Recent cpufreq changes related to the "boost" (turbo) feature broke the acpi-cpufreq error code path causing a NULL pointer dereference to occur on some systems. Fix from Konrad Rzeszutek Wilk. - The log level of a CPU initialization error message added recently needs to be reduced, because the particular BIOS issue indicated by it turns out to be widespread and doesn't really matter for the majority of systems having it. From Jiang Liu. - The regulator API needs to be told to stay away from things on systems with ACPI BIOSes or it may conflict with the BIOS' own handling of voltage regulators. Fix from Mark Brown that works around a 3.13 regression in lm90 on PCs occuring if the regulator API is enabled. - Prevent the Exynos4 devfreq driver from being built on multiplatform, because it depends on things that aren't available during such builds. From Sachin Kamat. - Upstream ACPICA doesn't use the bool type as defined in the kernel, so modify the kernel's ACPICA code to follow the upstream in that respect (only one variable definition is affected) to reduce divergences between the two. From Lv Zheng. - Make the ACPI device PM code use ACPI_COMPANION() instead of its own routine doing the same thing (and invokng ACPI_COMPANION() in the process). - Modify some routines in the ACPI processor driver to follow the common convention and return negative integers on errors. From Hanjun Guo. * tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Clear match_driver flag in acpi_bus_trim() ACPI / init: Flag use of ACPI and ACPI idioms for power supplies to regulator API acpi-cpufreq: De-register CPU notifier and free struct msr on error. ACPICA: Remove bool usage from ACPICA. PM / devfreq: Disable Exynos4 driver build on multiplatform ACPI / PM: Use ACPI_COMPANION() to get ACPI companions of devices ACPI / scan: reduce log level of "ACPI: \_PR_.CPU4: failed to get CPU APIC ID" ACPI / processor: Return specific error value when mapping lapic id
2014-01-29Merge branches 'pm-cpufreq' and 'pm-devfreq'Rafael J. Wysocki
* pm-cpufreq: acpi-cpufreq: De-register CPU notifier and free struct msr on error. * pm-devfreq: PM / devfreq: Disable Exynos4 driver build on multiplatform
2014-01-28acpi-cpufreq: De-register CPU notifier and free struct msr on error.Konrad Rzeszutek Wilk
If cpufreq_register_driver() fails we would free the acpi driver related structures but not free the ones allocated by acpi_cpufreq_boost_init() function. This meant that as the driver error-ed out and a CPU online/offline event came we would crash and burn as one of the CPU notifiers would point to garbage. Fixes: cfc9c8ed03e4 (acpi-cpufreq: Adjust the code to use the common boost attribute) Acked-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-24Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: "This time, the biggest change is the work of representing hardware thermal properties in device tree infrastructure. This work includes the introduction of a device tree bindings for describing the hardware thermal behavior and limits, and also a parser to read and interpret the data, and build thermal zones and thermal binding parameters. It also contains three examples on how to use the new representation on sensor devices, using three different drivers to accomplish it. One driver is in thermal subsystem, the TI SoC thermal, and the other two drivers are in hwmon subsystem. Actually, this would be the first step of the complete work because we still need to check other potential drivers to be converted and then validate the proposed API. But the reason why I include it in this pull request is that, first, this change does not hurt any others without using this approach, second, the principle and concept of this change would not break after converting the remaining drivers. BTW, as you can see, there are several points in this change that do not belong to thermal subsystem. Because it has been suggested by Guenter R that in such cases, it is recommended to send the complete series via one single subsystem. Specifics: - representing hardware thermal properties in device tree infrastructure - fix a regression that the imx thermal driver breaks system suspend. - introduce ACPI INT3403 thermal driver to retrieve temperature data from the INT3403 ACPI device object present on some systems. - introduce debug statement for thermal core and step_wise governor. - assorted fixes and cleanups for thermal core, cpu cooling, exynos thrmal, intel powerclamp and imx thermal driver" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (34 commits) thermal: remove const flag from .ops of imx thermal Thermal: update thermal zone device after setting emul_temp intel_powerclamp: Fix cstate counter detection. thermal: imx: add necessary clk operation Thermal cpu cooling: return error if no valid cpu frequency entry thermal: fix cpu_cooling max_level behavior thermal: rcar-thermal: Enable driver compilation with COMPILE_TEST thermal: debug: add debug statement for core and step_wise thermal: imx_thermal: add module device table drivers: thermal: Mark function as static in x86_pkg_temp_thermal.c thermal:samsung: fix compilation warning thermal: imx: correct suspend/resume flow thermal: exynos: fix error return code Thermal: ACPI INT3403 thermal driver MAINTAINERS: add thermal bindings entry in thermal domain arm: dts: make OMAP4460 bandgap node to belong to OCP arm: dts: make OMAP443x bandgap node to belong to OCP arm: dts: add cooling properties on omap5 cpu node arm: dts: add omap5 thermal data arm: dts: add omap5 CORE thermal data ...
2014-01-24Merge tag 'pm+acpi-3.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...
2014-01-23Merge tag 'cleanup-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Olof Johansson: "This is the branch where we usually queue up cleanup efforts, moving drivers out of the architecture directory, header file restructuring, etc. Sometimes they tangle with new development so it's hard to keep it strictly to cleanups. Some of the things included in this branch are: * Atmel SAMA5 conversion to common clock * Reset framework conversion for tegra platforms - Some of this depends on tegra clock driver reworks that are shared with Mike Turquette's clk tree. * Tegra DMA refactoring, which are shared branches with the DMA tree. * Removal of some header files on exynos to prepare for multiplatform" * tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (169 commits) ARM: mvebu: move Armada 370/XP specific definitions to armada-370-xp.h ARM: mvebu: remove prototypes of non-existing functions from common.h ARM: mvebu: move ARMADA_XP_MAX_CPUS to armada-370-xp.h serial: sh-sci: Rework baud rate calculation serial: sh-sci: Compute overrun_bit without using baud rate algo serial: sh-sci: Remove unused GPIO request code serial: sh-sci: Move overrun_bit and error_mask fields out of pdata serial: sh-sci: Support resources passed through platform resources serial: sh-sci: Don't check IRQ in verify port operation serial: sh-sci: Set the UPF_FIXED_PORT flag serial: sh-sci: Remove duplicate interrupt check in verify port op serial: sh-sci: Simplify baud rate calculation algorithms serial: sh-sci: Remove baud rate calculation algorithm 5 serial: sh-sci: Sort headers alphabetically ARM: EXYNOS: Kill exynos_pm_late_initcall() ARM: EXYNOS: Consolidate selection of PM_GENERIC_DOMAINS for Exynos4 ARM: at91: switch Calao QIL-A9260 board to DT clk: at91: fix pmc_clk_ids data type attriubte PM / devfreq: use inclusion <mach/map.h> instead of <plat/map-s5p.h> ARM: EXYNOS: remove <mach/regs-clock.h> for exynos ...
2014-01-17cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQLukasz Majewski
Add a special driver data flag (CPUFREQ_BOOST_FREQ) to indicate a frequency that can be only enabled for BOOST mode. This frequency will be used only for limited time, since running with it for too long may cause the target device to overheat. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: exynos: Extend Exynos cpufreq driver to support boostLukasz Majewski
The cpufreq_driver's boost_supported flag is true only when boost support is explicitly enabled. Boost related attributes are exported only under the same condition. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq / boost: Kconfig: Support for software-managed BOOSTLukasz Majewski
Add CONFIG_CPU_FREQ_BOOST_SW Kconfig option such that software-managed boost is enabled only after selecting "EXYNOS Frequency Overclocking - Software". It also depends on the thermal subsystem to be compiled in, which is necessary for disabling boost and cooling down the device when overheating is detected. Software-managed boost _MUST_ _NOT_ be enabled without thermal subsystem with properly defined overheating temperature thresholds. This option doesn't affect the x86's hardware-driven boost support in the acpi-cpufreq driver. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17acpi-cpufreq: Adjust the code to use the common boost attributeLukasz Majewski
Modify acpi-cpufreq's hardware-based boost solution to work with the common cpufreq boost framework. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: Add boost frequency support in coreLukasz Majewski
This commit adds boost frequency support in cpufreq core (Hardware & Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency above its normal operation limits. Such mode shall be only used for a short time. Overclocking (boost) support is essentially provided by platform dependent cpufreq driver. This commit unifies support for SW and HW (Intel) overclocking solutions in the core cpufreq driver. Previously the "boost" sysfs attribute was defined in the ACPI processor driver code. By default boost is disabled. One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost. It only shows up when cpufreq driver supports overclocking. Under the hood frequencies dedicated for boosting are marked with a special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table. It is the user's concern to enable/disable overclocking with a proper call to sysfs. The cpufreq_boost_trigger_state() function is defined non static on purpose. It is used later with thermal subsystem to provide automatic enable/disable of the BOOST feature. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17intel_pstate: Add trace point to report internal state.Dirk Brandewie
Add perf trace event "power:pstate_sample" to report driver state to aid in diagnosing issues reported against intel_pstate. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: introduce cpufreq_generic_get() routineViresh Kumar
CPUFreq drivers that use clock frameworks interface,i.e. clk_get_rate(), to get CPUs clk rate, have similar sort of code used in most of them. This patch adds a generic ->get() which will do the same thing for them. All those drivers are required to now is to set .get to cpufreq_generic_get() and set their clk pointer in policy->clk during ->init(). Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: create sysfs entries when cpufreq_stats is a moduleViresh Kumar
When cpufreq_stats is compiled in as a module, cpufreq driver would have already been registered. And so the CPUFREQ_CREATE_POLICY notifiers wouldn't be called for it. Hence no sysfs entries for stats. :( This patch calls cpufreq_stats_create_table() for each online CPU from cpufreq_stats_init() and so if policy is already created for CPUx then we will register sysfs stats for it. When its not compiled as module, we will return early as policy wouldn't be found for any of the CPUs. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: free table and remove sysfs entry in a single routineViresh Kumar
We don't have code paths now where we need to do these two things separately, so it is better do them in a single routine. Just as they are allocated in a single routine. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: remove hotplug notifiersViresh Kumar
Either CPUs are hot-unplugged or suspend/resume occurs, cpufreq core will send notifications to cpufreq-stats and stats structure and sysfs entries would be correctly handled.. And so we don't actually need hotcpu notifiers in cpufreq-stats anymore. We were only handling cpu hot-unplug events here and that are already taken care of by POLICY notifiers. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properlyViresh Kumar
There are several problems with cpufreq stats in the way it handles cpufreq_unregister_driver() and suspend/resume.. - We must not lose data collected so far when suspend/resume happens and so stats directories must not be removed/allocated during these operations, which is done currently. - cpufreq_stat has registered notifiers with both cpufreq and hotplug. It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY and removes this directory with a notifier from hotplug core. In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver), stats directories per cpu aren't removed as CPUs are still online. The only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for all CPUs except the last of each policy. And pointer to stat information is stored in the entry for last CPU in the per-cpu cpufreq_stats_table. But policy structure would be freed inside cpufreq core and so that will result in memory leak inside cpufreq stats (as we are never freeing memory for stats). Now if we again insert the module cpufreq_register_driver() will be called and we will again allocate stats data and put it on for first CPU of every policy. In case we only have a single CPU per policy, we will return with a error from cpufreq_stats_create_table() due to this code: if (per_cpu(cpufreq_stats_table, cpu)) return -EBUSY; And so probably cpufreq stats directory would not show up anymore (as it was added inside last policies->kobj which doesn't exist anymore). I haven't tested it, though. Also the values in stats files wouldn't be refreshed as we are using the earlier stats structure. - CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for scenarios where we don't really want cpufreq_stat_notifier_policy() to get called. For example whenever we are changing anything related to a policy: min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq stats is notified. Where we don't do any useful stuff other than simply returning with -EBUSY from cpufreq_stats_create_table(). And so this isn't the right notifier that cpufreq stats.. Due to all above reasons this patch does following changes: - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY, which are only called when policy is created/destroyed. They aren't called for suspend/resume paths.. - Use these notifiers in cpufreq_stat_notifier_policy() to create/destory stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume shouldn't be a problem for cpufreq_stats. - Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence, so that we don't free stats structure. Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17cpufreq: speedstep: remove unused speedstep_get_statePaul Bolle
The only caller of speedstep_get_state() was removed in commit d4019f0a92ab ("cpufreq: move freq change notifications to cpufreq core"). So building speedstep-smi.o now triggers a GCC warning: drivers/cpufreq/speedstep-smi.c:148:12: warning: 'speedstep_get_state' defined but not used [-Wunused-function] Remove this unused function. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-14Merge back earlier 'pm-cpufreq' material.Rafael J. Wysocki
2014-01-06intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.Dirk Brandewie
KVM environments do not support APERF/MPERF MSRs. intel_pstate cannot operate without these registers. The previous validity checks in intel_pstate_msrs_not_valid() are insufficent in nested KVMs. References: https://bugzilla.redhat.com/show_bug.cgi?id=1046317 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06powernow-k6: reorder frequenciesMikulas Patocka
This patch reorders reported frequencies from the highest to the lowest, just like in other frequency drivers. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06powernow-k6: correctly initialize default parametersMikulas Patocka
The powernow-k6 driver used to read the initial multiplier from the powernow register. However, there is a problem with this: * If there was a frequency transition before, the multiplier read from the register corresponds to the current multiplier. * If there was no frequency transition since reset, the field in the register always reads as zero, regardless of the current multiplier that is set using switches on the mainboard and that the CPU is running at. The zero value corresponds to multiplier 4.5, so as a consequence, the powernow-k6 driver always assumes multiplier 4.5. For example, if we have 550MHz CPU with bus frequency 100MHz and multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5 and bus frequency is 122MHz. The powernow-k6 driver then sets the multiplier to 4.5, underclocking the CPU to 450MHz, but reports the current frequency as 550MHz. There is no reliable way how to read the initial multiplier. I modified the driver so that it contains a table of known frequencies (based on parameters of existing CPUs and some common overclocking schemes) and sets the multiplier according to the frequency. If the frequency is unknown (because of unusual overclocking or underclocking), the user must supply the bus speed and maximum multiplier as module parameters. This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06powernow-k6: disable cache when changing frequencyMikulas Patocka
I found out that a system with k6-3+ processor is unstable during network server load. The system locks up or the network card stops receiving. The reason for the instability is the CPU frequency scaling. During frequency transition the processor is in "EPM Stop Grant" state. The documentation says that the processor doesn't respond to inquiry requests in this state. Consequently, coherency of processor caches and bus master devices is not maintained, causing the system instability. This patch flushes the cache during frequency transition. It fixes the instability. Other minor changes: * u64 invalue changed to unsigned long because the variable is 32-bit * move the logic to set the multiplier to a separate function powernow_k6_set_cpu_multiplier * preserve lower 5 bits of the powernow port instead of 4 (the voltage field has 5 bits) * mask interrupts when reading the multiplier, so that the port is not open during other activity (running other kernel code with the port open shouldn't cause any misbehavior, but we should better be safe and keep the port closed) This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06cpufreq: exynos: Convert exynos-cpufreq to platform driverLukasz Majewski
To make the driver multiplatform-friendly, unconditional initialization in an initcall is replaced with a platform driver probed only if respective platform device is registered. Tested at: Exynos4210 (TRATS) and Exynos4412 (TRATS2) Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Tomasz Figa <t.figa@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Sachin Kamat <sachin.kamat@linaro.org> Tested-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06cpufreq: Make sure CPU is running on a freq from freq-tableViresh Kumar
Sometimes boot loaders set CPU frequency to a value outside of frequency table present with cpufreq core. In such cases CPU might be unstable if it has to run on that frequency for long duration of time and so its better to set it to a frequency which is specified in freq-table. This also makes cpufreq stats inconsistent as cpufreq-stats would fail to register because current frequency of CPU isn't found in freq-table. Because we don't want this change to affect boot process badly, we go for the next freq which is >= policy->cur ('cur' must be set by now, otherwise we will end up setting freq to lowest of the table as 'cur' is initialized to zero). In case current frequency doesn't match any frequency from freq-table, we throw warnings to user, so that user can get this fixed in their bootloaders or freq-tables. Reported-by: Carlos Hernandez <ceh@ti.com> Reported-and-tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06cpufreq: Mark ARM drivers with CPUFREQ_NEED_INITIAL_FREQ_CHECK flagViresh Kumar
Sometimes boot loaders set CPU frequency to a value outside of frequency table present with cpufreq core. In such cases CPU might be unstable if it has to run on that frequency for long duration of time and so its better to set it to a frequency which is specified in frequency table. On some systems we can't really say what frequency we're running at the moment and so for these we shouldn't check if we are running at a frequency present in frequency table. And so we really can't force this for all the cpufreq drivers. Hence we are created another flag here: CPUFREQ_NEED_INITIAL_FREQ_CHECK that will be marked by platforms which want to go for this check at boot time. Initially this is done for all ARM platforms but others may follow if required. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>