summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)Author
2021-06-18Merge branch 'sched/urgent' into sched/core, to resolve conflictsIngo Molnar
This commit in sched/urgent moved the cfs_rq_is_decayed() function: a7b359fc6a37: ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") and this fresh commit in sched/core modified it in the old location: 9e077b52d86a: ("sched/pelt: Check that *_avg are null when *_sum are") Merge the two variants. Conflicts: kernel/sched/fair.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-06-17thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressureLukasz Luba
The thermal pressure signal gives information to the scheduler about reduced CPU capacity due to thermal. It is based on a value stored in a per-cpu 'thermal_pressure' variable. The online CPUs will get the new value there, while the offline won't. Unfortunately, when the CPU is back online, the value read from per-cpu variable might be wrong (stale data). This might affect the scheduler decisions, since it sees the CPU capacity differently than what is actually available. Fix it by making sure that all online+offline CPUs would get the proper value in their per-cpu variable when thermal framework sets capping. Fixes: f12e4f66ab6a3 ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210614191030.22241-1-lukasz.luba@arm.com
2021-06-06Merge tag 'x86_urgent_for_v5.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "A bunch of x86/urgent stuff accumulated for the last two weeks so lemme unload it to you. It should be all totally risk-free, of course. :-) - Fix out-of-spec hardware (1st gen Hygon) which does not implement MSR_AMD64_SEV even though the spec clearly states so, and check CPUID bits first. - Send only one signal to a task when it is a SEGV_PKUERR si_code type. - Do away with all the wankery of reserving X amount of memory in the first megabyte to prevent BIOS corrupting it and simply and unconditionally reserve the whole first megabyte. - Make alternatives NOP optimization work at an arbitrary position within the patched sequence because the compiler can put single-byte NOPs for alignment anywhere in the sequence (32-bit retpoline), vs our previous assumption that the NOPs are only appended. - Force-disable ENQCMD[S] instructions support and remove update_pasid() because of insufficient protection against FPU state modification in an interrupt context, among other xstate horrors which are being addressed at the moment. This one limits the fallout until proper enablement. - Use cpu_feature_enabled() in the idxd driver so that it can be build-time disabled through the defines in disabled-features.h. - Fix LVT thermal setup for SMI delivery mode by making sure the APIC LVT value is read before APIC initialization so that softlockups during boot do not happen at least on one machine. - Mark all legacy interrupts as legacy vectors when the IO-APIC is disabled and when all legacy interrupts are routed through the PIC" * tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Check SME/SEV support in CPUID first x86/fault: Don't send SIGSEGV twice on SEGV_PKUERR x86/setup: Always reserve the first 1M of RAM x86/alternative: Optimize single-byte NOPs at an arbitrary position x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() dmaengine: idxd: Use cpu_feature_enabled() x86/thermal: Fix LVT thermal setup for SMI delivery mode x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing
2021-05-31x86/thermal: Fix LVT thermal setup for SMI delivery modeBorislav Petkov
There are machines out there with added value crap^WBIOS which provide an SMI handler for the local APIC thermal sensor interrupt. Out of reset, the BSP on those machines has something like 0x200 in that APIC register (timestamps left in because this whole issue is timing sensitive): [ 0.033858] read lvtthmr: 0x330, val: 0x200 which means: - bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled - bits [10:8] have 010b which means SMI delivery mode. Now, later during boot, when the kernel programs the local APIC, it soft-disables it temporarily through the spurious vector register: setup_local_APIC: ... /* * If this comes from kexec/kcrash the APIC might be enabled in * SPIV. Soft disable it before doing further initialization. */ value = apic_read(APIC_SPIV); value &= ~APIC_SPIV_APIC_ENABLED; apic_write(APIC_SPIV, value); which means (from the SDM): "10.4.7.2 Local APIC State After It Has Been Software Disabled ... * The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored." And this happens too: [ 0.124111] APIC: Switch to symmetric I/O mode setup [ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0 [ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0 This results in CPU 0 soft lockups depending on the placement in time when the APIC soft-disable happens. Those soft lockups are not 100% reproducible and the reason for that can only be speculated as no one tells you what SMM does. Likely, it confuses the SMM code that the APIC is disabled and the thermal interrupt doesn't doesn't fire at all, leading to CPU 0 stuck in SMM forever... Now, before 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()") due to how the APIC_LVTTHMR was read before APIC initialization in mcheck_intel_therm_init(), it would read the value with the mask bit 16 clear and then intel_init_thermal() would replicate it onto the APs and all would be peachy - the thermal interrupt would remain enabled. But that commit moved that reading to a later moment in intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too late and with its interrupt mask bit set. Thus, revert back to the old behavior of reading the thermal LVT register before the APIC gets initialized. Fixes: 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()") Reported-by: James Feeney <james@nurealm.net> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@zn.tnic
2021-05-27thermal/drivers/qcom: Fix error code in adc_tm5_get_dt_channel_data()Yang Yingliang
Return -EINVAL when args is invalid instead of 'ret' which is set to zero by a previous successful call to a function. Fixes: ca66dca5eda6 ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210527092640.2070555-1-yangyingliang@huawei.com
2021-05-24thermal/ti-soc-thermal: Fix kernel-docYang Li
Fix function name in ti-bandgap.c kernel-doc comment to remove a warning. drivers/thermal/ti-soc-thermal/ti-bandgap.c:787: warning: expecting prototype for ti_bandgap_alert_init(). Prototype was for ti_bandgap_talert_init() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Suman Anna <s-anna@ti.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1621851963-36548-1-git-send-email-yang.lee@linux.alibaba.com
2021-05-14thermal/drivers/intel: Initialize RW trip to THERMAL_TEMP_INVALIDSrinivas Pandruvada
After commit 81ad4276b505 ("Thermal: Ignore invalid trip points") all user_space governor notifications via RW trip point is broken in intel thermal drivers. This commits marks trip_points with value of 0 during call to thermal_zone_device_register() as invalid. RW trip points can be 0 as user space will set the correct trip temperature later. During driver init, x86_package_temp and all int340x drivers sets RW trip temperature as 0. This results in all these trips marked as invalid by the thermal core. To fix this initialize RW trips to THERMAL_TEMP_INVALID instead of 0. Cc: <stable@vger.kernel.org> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210430122343.1789899-1-srinivas.pandruvada@linux.intel.com
2021-05-05Merge tag 'thermal-v5.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Remove duplicate error message for the amlogic driver (Tang Bin) - Fix spellos in comments for the tegra and sun8i (Bhaskar Chowdhury) - Add the missing fifth node on the rcar_gen3 sensor (Niklas Söderlund) - Remove duplicate include in ti-bandgap (Zhang Yunkai) - Assign error code in the error path in the function thermal_of_populate_bind_params() (Jia-Ju Bai) - Fix spelling mistake in a comment 'disabed' -> 'disabled' (Colin Ian King) - Use the device name instead of auto-numbering for a better identification of the cooling device (Daniel Lezcano) - Improve a bit the division accuracy in the power allocator governor (Jeson Gao) - Enable the missing third sensor on msm8976 (Konrad Dybcio) - Add QCom tsens driver co-maintainer (Thara Gopinath) - Fix memory leak and use after free errors in the core code (Daniel Lezcano) - Add the MDM9607 compatible bindings (Konrad Dybcio) - Fix trivial spello in the copyright name for Hisilicon (Hao Fang) - Fix negative index array access when converting the frequency to power in the energy model (Brian-sy Yang) - Add support for Gen2 new PMIC support for Qcom SPMI (David Collins) - Update maintainer file for CPU cooling device section (Lukasz Luba) - Fix missing put_device on error in the Qcom tsens driver (Guangqing Zhu) - Add compatible DT binding for sm8350 (Robert Foss) - Add support for the MDM9607's tsens driver (Konrad Dybcio) - Remove duplicate error messages in thermal_mmio and the bcm2835 driver (Ruiqi Gong) - Add the Thermal Temperature Cooling driver (Zhang Rui) - Remove duplicate error messages in the Hisilicon sensor driver (Ye Bin) - Use the devm_platform_ioremap_resource_byname() function instead of a couple of corresponding calls (dingsenjie) - Sort the headers alphabetically in the ti-bandgap driver (Zhen Lei) - Add missing property in the DT thermal sensor binding (Rafał Miłecki) - Remove dead code in the ti-bandgap sensor driver (Lin Ruizhe) - Convert the BRCM DT bindings to the yaml schema (Rafał Miłecki) - Replace the thermal_notify_framework() call by a call to the thermal_zone_device_update() function. Remove the function as well as the corresponding documentation (Thara Gopinath) - Add support for the ipq8064-tsens sensor along with a set of cleanups and code preparation (Ansuel Smith) - Add a lockless __thermal_cdev_update() function to improve the locking scheme in the core code and governors (Lukasz Luba) - Fix multiple cooling device notification changes (Lukasz Luba) - Remove unneeded variable initialization (Colin Ian King) * tag 'thermal-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (55 commits) thermal/drivers/mtk_thermal: Remove redundant initializations of several variables thermal/core/power allocator: Use the lockless __thermal_cdev_update() function thermal/core/fair share: Use the lockless __thermal_cdev_update() function thermal/core/fair share: Lock the thermal zone while looping over instances thermal/core/power_allocator: Update once cooling devices when temp is low thermal/core/power_allocator: Maintain the device statistics from going stale thermal/core: Create a helper __thermal_cdev_update() without a lock dt-bindings: thermal: tsens: Document ipq8064 bindings thermal/drivers/tsens: Add support for ipq8064-tsens thermal/drivers/tsens: Drop unused define for msm8960 thermal/drivers/tsens: Replace custom 8960 apis with generic apis thermal/drivers/tsens: Fix bug in sensor enable for msm8960 thermal/drivers/tsens: Use init_common for msm8960 thermal/drivers/tsens: Add VER_0 tsens version thermal/drivers/tsens: Convert msm8960 to reg_field thermal/drivers/tsens: Don't hardcode sensor slope Documentation: driver-api: thermal: Remove thermal_notify_framework from documentation thermal/core: Remove thermal_notify_framework iwlwifi: mvm: tt: Replace thermal_notify_framework dt-bindings: thermal: brcm,ns-thermal: Convert to the json-schema ...
2021-04-22thermal/drivers/mtk_thermal: Remove redundant initializations of several ↵Colin Ian King
variables Several variables are being initialized with values that is never read and being updated later with a new value. The initializations are redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422120412.246291-1-colin.king@canonical.com
2021-04-22thermal/core/power allocator: Use the lockless __thermal_cdev_update() functionLukasz Luba
Use the new helper function and avoid unnecessery second lock/unlock, which was present in old approach with thermal_cdev_update(). Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422153624.6074-4-lukasz.luba@arm.com
2021-04-22thermal/core/fair share: Use the lockless __thermal_cdev_update() functionLukasz Luba
Use the new helper function and avoid unnecessery second lock/unlock, which was present in old approach with thermal_cdev_update(). Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422153624.6074-3-lukasz.luba@arm.com
2021-04-22thermal/core/fair share: Lock the thermal zone while looping over instancesLukasz Luba
The tz->lock must be hold during the looping over the instances in that thermal zone. This lock was missing in the governor code since the beginning, so it's hard to point into a particular commit. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com
2021-04-22thermal/core/power_allocator: Update once cooling devices when temp is lowLukasz Luba
The cooling device state change generates an event, also when there is no need, because temperature is low and device is not throttled. Avoid to unnecessary update the cooling device which means also not sending event. The cooling device state has not changed because the temperature is still below the first activation trip point value, so we can do this. Add a tracking mechanism to make sure it updates cooling devices only once - when the temperature dropps below first trip point. Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422114308.29684-4-lukasz.luba@arm.com
2021-04-22thermal/core/power_allocator: Maintain the device statistics from going staleLukasz Luba
When the temperature is below the first activation trip point the cooling devices are not checked, so they cannot maintain fresh statistics. It leads into the situation, when temperature crosses first trip point, the statistics are stale and show state for very long period. This has impact on IPA algorithm calculation and wrong decisions. Thus, check the cooling devices even when the temperature is low, to refresh these statistics. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422114308.29684-3-lukasz.luba@arm.com
2021-04-22thermal/core: Create a helper __thermal_cdev_update() without a lockLukasz Luba
There is a need to have a helper function which updates cooling device state from the governors code. With this change governor can use lock and unlock while calling helper function. This avoid unnecessary second time lock/unlock which was in previous solution present in governor implementation. This new helper function must be called with mutex 'cdev->lock' hold. The changed been discussed and part of code presented in thread: https://lore.kernel.org/linux-pm/20210419084536.25000-1-lukasz.luba@arm.com/ Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210422114308.29684-2-lukasz.luba@arm.com
2021-04-22thermal/drivers/tsens: Add support for ipq8064-tsensAnsuel Smith
Add support for tsens present in ipq806x SoCs based on generic msm8960 tsens driver. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-9-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Drop unused define for msm8960Ansuel Smith
Drop unused define for msm8960 replaced by generic api and reg_field. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-8-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Replace custom 8960 apis with generic apisAnsuel Smith
Rework calibrate function to use common function. Derive the offset from a missing hardcoded slope table and the data from the nvmem calib efuses. Drop custom get_temp function and use generic api. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-7-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Fix bug in sensor enable for msm8960Ansuel Smith
Device based on tsens VER_0 contains a hardware bug that results in some problem with sensor enablement. Sensor id 6-11 can't be enabled selectively and all of them must be enabled in one step. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-6-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Use init_common for msm8960Ansuel Smith
Use init_common and drop custom init for msm8960. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-5-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Add VER_0 tsens versionAnsuel Smith
VER_0 is used to describe device based on tsens version before v0.1. These device are devices based on msm8960 for example apq8064 or ipq806x. Add support for VER_0 in tsens.c and set the right tsens feat in tsens-8960.c file. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-4-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Convert msm8960 to reg_fieldAnsuel Smith
Convert msm9860 driver to reg_field to use the init_common function. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-3-ansuelsmth@gmail.com
2021-04-22thermal/drivers/tsens: Don't hardcode sensor slopeAnsuel Smith
Function compute_intercept_slope hardcode the sensor slope to SLOPE_DEFAULT. Change this and use the default value only if a slope is not defined. This is needed for tsens VER_0 that has a hardcoded slope table. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210420183343.2272-2-ansuelsmth@gmail.com
2021-04-22thermal/core: Remove thermal_notify_frameworkThara Gopinath
thermal_notify_framework just updates for a single trip point where as thermal_zone_device_update does other bookkeeping like updating the temperature of the thermal zone and setting the next trip point. The only driver that was using thermal_notify_framework was updated in the previous patch to use thermal_zone_device_update instead. Since there are no users for thermal_notify_framework remove it. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210122023406.3500424-3-thara.gopinath@linaro.org
2021-04-21thermal/drivers/ti-soc-thermal/bandgap Remove unused variable 'val'Lin Ruizhe
The function ti_bandgap_restore_ctxt() restores the context at resume time. It checks if the sensor has a counter, reads the register but does nothing with the value. The block was probably omitted by the commit b87ea759a4cc. Remove the unused variable as well as the block using it as we can consider it as dead code. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: b87ea759a4cc ("staging: omap-thermal: fix context restore function") Signed-off-by: Lin Ruizhe <linruizhe@huawei.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210421084256.57591-1-linruizhe@huawei.com
2021-04-20thermal/drivers/ti-soc-thermal/ti-bandgap: Rearrange all the included header ↵Zhen Lei
files alphabetically For the sake of lisibility, reorder the header files alphabetically. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210406091912.2583-2-thunder.leizhen@huawei.com
2021-04-20thermal/drivers/tegra: Use devm_platform_ioremap_resource_bynamedingsenjie
Use the devm_platform_ioremap_resource_byname() helper instead of calling platform_get_resource_byname() and devm_ioremap_resource() separately. Signed-off-by: dingsenjie <dingsenjie@yulong.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210414063943.96244-1-dingsenjie@163.com
2021-04-20thermal/drivers/hisi: Remove redundant dev_err call in hisi_thermal_probe()Ye Bin
There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210409075224.2109503-1-yebin10@huawei.com
2021-04-20thermal/drivers/intel: Introduce tcc cooling driverZhang Rui
On Intel processors, the core frequency can be reduced below OS request, when the current temperature reaches the TCC (Thermal Control Circuit) activation temperature. The default TCC activation temperature is specified by MSR_IA32_TEMPERATURE_TARGET. However, it can be adjusted by specifying an offset in degrees C, using the TCC Offset bits in the same MSR register. This patch introduces a cooling devices driver that utilizes the TCC Offset feature. The bigger the current cooling state is, the lower the effective TCC activation temperature is, so that the processors can be throttled earlier before system critical overheats. Note that, on different platforms, the behavior might be different on how fast the setting takes effect, and how much the CPU frequency is reduced. This patch has been tested on a KabyLake mobile platform from me, and also on a CometLake platform from Doug. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210412125901.12549-1-rui.zhang@intel.com
2021-04-20thermal/drivers/bcm2835: Remove redundant dev_err call in ↵Ruiqi Gong
bcm2835_thermal_probe() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210408100329.7585-1-gongruiqi1@huawei.com
2021-04-20thermal/drivers/thermal_mmio: Remove redundant dev_err call in ↵Ruiqi Gong
thermal_mmio_probe() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Acked-by: Talel Shenhar <talel@amazon.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210408100144.7494-1-gongruiqi1@huawei.com
2021-04-20thermal/drivers/qcom/tsens-v0_1: Add support for MDM9607Konrad Dybcio
MDM9607 TSENS IP is very similar to the one of MSM8916, with minor adjustments to various tuning values. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319220802.198215-2-konrad.dybcio@somainline.org
2021-04-15thermal/drivers/tsens: Fix missing put_device errorGuangqing Zhu
Fixes coccicheck error: drivers/thermal/qcom/tsens.c:759:4-10: ERROR: missing put_device; call of_find_device_by_node on line 715, but without a corresponding object release within this function. Fixes: a7ff82976122 ("drivers: thermal: tsens: Merge tsens-common.c into tsens.c") Signed-off-by: Guangqing Zhu <zhuguangqing83@gmail.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210404125431.12208-1-zhuguangqing83@gmail.com
2021-04-15thermal/drivers/qcom-spmi-temp-alarm: Add support for GEN2 rev 1 PMIC ↵David Collins
peripherals Add support for TEMP_ALARM GEN2 PMIC peripherals with digital major revision 1. This revision utilizes a different temperature threshold mapping than earlier revisions. Signed-off-by: David Collins <collinsd@codeaurora.org> Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/69c90a004b3f5b7ae282f5ec5ca2920a48f23e02.1596040416.git.gurus@codeaurora.org
2021-04-15thermal/drivers/cpufreq_cooling: Fix slab OOB issuebrian-sy yang
Slab OOB issue is scanned by KASAN in cpu_power_to_freq(). If power is limited below the power of OPP0 in EM table, it will cause slab out-of-bound issue with negative array index. Return the lowest frequency if limited power cannot found a suitable OPP in EM table to fix this issue. Backtrace: [<ffffffd02d2a37f0>] die+0x104/0x5ac [<ffffffd02d2a5630>] bug_handler+0x64/0xd0 [<ffffffd02d288ce4>] brk_handler+0x160/0x258 [<ffffffd02d281e5c>] do_debug_exception+0x248/0x3f0 [<ffffffd02d284488>] el1_dbg+0x14/0xbc [<ffffffd02d75d1d4>] __kasan_report+0x1dc/0x1e0 [<ffffffd02d75c2e0>] kasan_report+0x10/0x20 [<ffffffd02d75def8>] __asan_report_load8_noabort+0x18/0x28 [<ffffffd02e6fce5c>] cpufreq_power2state+0x180/0x43c [<ffffffd02e6ead80>] power_actor_set_power+0x114/0x1d4 [<ffffffd02e6fac24>] allocate_power+0xaec/0xde0 [<ffffffd02e6f9f80>] power_allocator_throttle+0x3ec/0x5a4 [<ffffffd02e6ea888>] handle_thermal_trip+0x160/0x294 [<ffffffd02e6edd08>] thermal_zone_device_check+0xe4/0x154 [<ffffffd02d351cb4>] process_one_work+0x5e4/0xe28 [<ffffffd02d352f44>] worker_thread+0xa4c/0xfac [<ffffffd02d360124>] kthread+0x33c/0x358 [<ffffffd02d289940>] ret_from_fork+0xc/0x18 Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power") Signed-off-by: brian-sy yang <brian-sy.yang@mediatek.com> Signed-off-by: Michael Kao <michael.kao@mediatek.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Cc: stable@vger.kernel.org #v5.7 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com
2021-04-15thermal/drivers/hisi: Use the correct HiSilicon copyrightHao Fang
s/Hisilicon/HiSilicon/g. It should use capital S, according to https://www.hisilicon.com/en/terms-of-use. Signed-off-by: Hao Fang <fanghao11@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1617086733-2705-1-git-send-email-fanghao11@huawei.com
2021-04-15thermal/drivers/cpuidle_cooling: Fix use after errorDaniel Lezcano
When the function successfully finishes it logs an information about the registration of the cooling device and use its name to build the message. Unfortunately it was freed right before: drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register() warn: 'name' was already freed. Fix this by freeing after the message happened. Fixes: 6fd1b186d900 ("thermal/drivers/cpuidle_cooling: Use device name instead of auto-numbering") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210319202522.891061-1-daniel.lezcano@linaro.org
2021-04-15thermal/drivers/devfreq_cooling: Fix wrong return on error pathDaniel Lezcano
The following error is reported by kbuild: smatch warnings: drivers/thermal/devfreq_cooling.c:433 of_devfreq_cooling_register_power() warn: passing zero to 'ERR_PTR' Fix the error code by the setting the 'err' variable instead of 'cdev'. Fixes: f8d354e821b2 ("thermal/drivers/devfreq_cooling: Use device name instead of auto-numbering") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202424.890968-1-daniel.lezcano@linaro.org
2021-04-15thermal/core: Fix memory leak in the error pathDaniel Lezcano
Fix the following error: smatch warnings: drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev' by freeing the cdev when exiting the function in the error path. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org
2021-03-17thermal/drivers/qcom/tsens_v1: Enable sensor 3 on MSM8976Konrad Dybcio
The sensor *is* in fact used and does report temperature. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210225213119.116550-1-konrad.dybcio@somainline.org
2021-03-17thermal/core: Add NULL pointer check before using cooling device statsManaf Meethalavalappu Pallikunhi
There is a possible chance that some cooling device stats buffer allocation fails due to very high cooling device max state value. Later cooling device update sysfs can try to access stats data for the same cooling device. It will lead to NULL pointer dereference issue. Add a NULL pointer check before accessing thermal cooling device stats data. It fixes the following bug [ 26.812833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 27.122960] Call trace: [ 27.122963] do_raw_spin_lock+0x18/0xe8 [ 27.122966] _raw_spin_lock+0x24/0x30 [ 27.128157] thermal_cooling_device_stats_update+0x24/0x98 [ 27.128162] cur_state_store+0x88/0xb8 [ 27.128166] dev_attr_store+0x40/0x58 [ 27.128169] sysfs_kf_write+0x50/0x68 [ 27.133358] kernfs_fop_write+0x12c/0x1c8 [ 27.133362] __vfs_write+0x54/0x160 [ 27.152297] vfs_write+0xcc/0x188 [ 27.157132] ksys_write+0x78/0x108 [ 27.162050] ksys_write+0xf8/0x108 [ 27.166968] __arm_smccc_hvc+0x158/0x4b0 [ 27.166973] __arm_smccc_hvc+0x9c/0x4b0 [ 27.186005] el0_svc+0x8/0xc Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1607367181-24589-1-git-send-email-manafm@codeaurora.org
2021-03-16thermal/core/power_allocator: Using round the division when re-divvying up powerjeson.gao
The division is used directly in re-divvying up power, the decimal part will be discarded, devices will get less than the extra_actor_power - 1. if using round the division to make the calculation more accurate. For example: actor0 received more than its max_power, it has the extra_power 759 actor1 received less than its max_power, it require extra_actor_power 395 actor2 received less than its max_power, it require extra_actor_power 365 actor1 and actor2 require the total capped_extra_power 760 using division in re-divvying up power actor1 would actually get the extra_actor_power 394 actor2 would actually get the extra_actor_power 364 if using round the division in re-divvying up power actor1 would actually get the extra_actor_power 394 actor2 would actually get the extra_actor_power 365 Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1615796737-4688-1-git-send-email-gao.yunxiao6@gmail.com
2021-03-15thermal/drivers/cpufreq_cooling: Remove unused listDaniel Lezcano
There is a list with the purpose of grouping the cpufreq cooling device together as described in the comments but actually it is unused, the code evolved since 2012 and the list was no longer needed. Delete the remaining unused list related code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-5-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/cpuidle_cooling: Use device name instead of auto-numberingDaniel Lezcano
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-idle-0 thermal-idle-1 thermal-idle-2 thermal-idle-3 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal prefix and changes the number by the device name. So the naming above becomes: idle-cpu0 idle-cpu1 idle-cpu2 idle-cpu3 etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210314111333.16551-4-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/devfreq_cooling: Use device name instead of auto-numberingDaniel Lezcano
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-devfreq-0 thermal-devfreq-1 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal' prefix and changes the number by the device name. So the naming above becomes: devfreq-5000000.gpu devfreq-1d84000.ufshc etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-3-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/cpufreq_cooling: Use device name instead of auto-numberingDaniel Lezcano
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-cpufreq-0 thermal-cpufreq-1 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal' prefix and changes the number by the device name. So the naming above becomes: cpufreq-cpu0 cpufreq-cpu4 etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-2-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/core: Use a char pointer for the cooling device nameDaniel Lezcano
We want to have any kind of name for the cooling devices as we do no longer want to rely on auto-numbering. Let's replace the cooling device's fixed array by a char pointer to be allocated dynamically when registering the cooling device, so we don't limit the length of the name. Rework the error path at the same time as we have to rollback the allocations in case of error. Tested with a dummy device having the name: "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch" A village on the island of Anglesey (Wales), known to have the longest name in Europe. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
2021-03-10thermal: thermal_of: Fix error return code of thermal_of_populate_bind_params()Jia-Ju Bai
When kcalloc() returns NULL to __tcbp or of_count_phandle_with_args() returns zero or -ENOENT to count, no error return code of thermal_of_populate_bind_params() is assigned. To fix these bugs, ret is assigned with -ENOMEM and -ENOENT in these cases, respectively. Fixes: a92bab8919e3 ("of: thermal: Allow multiple devices to share cooling map") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210310122423.3266-1-baijiaju1990@gmail.com
2021-03-10thermal:ti-soc-thermal: Remove duplicate include in ti-bandgapZhang Yunkai
'of_device.h' included in 'ti-bandgap.c' is duplicated. It is also included in the 25th line. Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210306123415.219441-1-zhang.yunkai@zte.com.cn
2021-03-10thermal: rcar_gen3_thermal: Add support for up to five TSC nodesNiklas Söderlund
Add support for up to five TSC nodes. The new THCODE values are taken from the example in the datasheet. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210309162419.2621359-1-niklas.soderlund+renesas@ragnatech.se