summaryrefslogtreecommitdiff
path: root/drivers/thermal/thermal_core.c
AgeCommit message (Collapse)Author
2019-11-14thermal: Fix deadlock in thermal thermal_zone_device_checkWei Wang
1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid a use-after-free issue. However, cancel_delayed_work_sync could be called insides the WQ causing deadlock. [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000 [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.642447] c0 1162 Call trace: [54109.642456] c0 1162 __switch_to+0x138/0x158 [54109.642467] c0 1162 __schedule+0xba4/0x1434 [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28 [54109.642492] c0 1162 wait_for_common+0x138/0x2e8 [54109.642511] c0 1162 flush_work+0x348/0x40c [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218 [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.642574] c0 1162 process_one_work+0x3cc/0x69c [54109.642583] c0 1162 worker_thread+0x49c/0x7c0 [54109.642593] c0 1162 kthread+0x17c/0x1b0 [54109.642602] c0 1162 ret_from_fork+0x10/0x18 [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000 [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.643077] c0 1162 Call trace: [54109.643085] c0 1162 __switch_to+0x138/0x158 [54109.643095] c0 1162 __schedule+0xba4/0x1434 [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28 [54109.643114] c0 1162 wait_for_common+0x138/0x2e8 [54109.643122] c0 1162 flush_work+0x348/0x40c [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218 [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.643167] c0 1162 process_one_work+0x3cc/0x69c [54109.643177] c0 1162 worker_thread+0x49c/0x7c0 [54109.643186] c0 1162 kthread+0x17c/0x1b0 [54109.643195] c0 1162 ret_from_fork+0x10/0x18 [54109.644500] c0 1162 cat D 0 7766 1 0x00000001 [54109.644515] c0 1162 Call trace: [54109.644524] c0 1162 __switch_to+0x138/0x158 [54109.644536] c0 1162 __schedule+0xba4/0x1434 [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0 [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0 [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20 [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360 [54109.644586] c0 1162 temp_show+0x30/0x78 [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0 [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4 [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88 [54109.644656] c0 1162 seq_read+0x1f4/0x73c [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318 [54109.644683] c0 1162 __vfs_read+0x50/0x1bc [54109.644692] c0 1162 vfs_read+0xa4/0x140 [54109.644701] c0 1162 SyS_read+0xbc/0x144 [54109.644708] c0 1162 el0_svc_naked+0x34/0x38 [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic] Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") Cc: stable@vger.kernel.org Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-11-07thermal: Initialize thermal subsystem earlierAmit Kucheria
Now that the thermal framework is built-in, in order to facilitate thermal mitigation as early as possible in the boot cycle, move the thermal framework initialization to core_initcall. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/f8ff0ab4a8e9c2eca5a26fb2256365b26cb326ce.1571656015.git.amit.kucheria@linaro.org
2019-11-07thermal: Remove netlink supportAmit Kucheria
There are no users of netlink messages for thermal inside the kernel. Remove the code and adjust the documentation. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/8ff02cf62186c7a54fff325fad40a2e9ca3affa6.1571656014.git.amit.kucheria@linaro.org
2019-09-24thermal: Add some error messagesAmit Kucheria
When registering a thermal zone device, we currently return -EINVAL in four cases. This makes it a little hard to debug the real cause of the failure. Print some error messages to make it easier for developer to figure out what happened. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24thermal: Fix use-after-free when unregistering thermal zone deviceIdo Schimmel
thermal_zone_device_unregister() cancels the delayed work that polls the thermal zone, but it does not wait for it to finish. This is racy with respect to the freeing of the thermal zone device, which can result in a use-after-free [1]. Fix this by waiting for the delayed work to finish before freeing the thermal zone device. Note that thermal_zone_device_set_polling() is never invoked from an atomic context, so it is safe to call cancel_delayed_work_sync() that can block. [1] [ +0.002221] ================================================================== [ +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0 [ +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17 [ +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701 [ +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check [ +0.000012] Call Trace: [ +0.000021] dump_stack+0xa9/0x10e [ +0.000020] print_address_description.cold.2+0x9/0x25e [ +0.000018] __kasan_report.cold.3+0x78/0x9d [ +0.000016] kasan_report+0xe/0x20 [ +0.000016] __mutex_lock+0x1076/0x11c0 [ +0.000014] step_wise_throttle+0x72/0x150 [ +0.000018] handle_thermal_trip+0x167/0x760 [ +0.000019] thermal_zone_device_update+0x19e/0x5f0 [ +0.000019] process_one_work+0x969/0x16f0 [ +0.000017] worker_thread+0x91/0xc40 [ +0.000014] kthread+0x33d/0x400 [ +0.000015] ret_from_fork+0x3a/0x50 [ +0.000020] Allocated by task 1: [ +0.000015] save_stack+0x19/0x80 [ +0.000015] __kasan_kmalloc.constprop.4+0xc1/0xd0 [ +0.000014] kmem_cache_alloc_trace+0x152/0x320 [ +0.000015] thermal_zone_device_register+0x1b4/0x13a0 [ +0.000015] mlxsw_thermal_init+0xc92/0x23d0 [ +0.000014] __mlxsw_core_bus_device_register+0x659/0x11b0 [ +0.000013] mlxsw_core_bus_device_register+0x3d/0x90 [ +0.000013] mlxsw_pci_probe+0x355/0x4b0 [ +0.000014] local_pci_probe+0xc3/0x150 [ +0.000013] pci_device_probe+0x280/0x410 [ +0.000013] really_probe+0x26a/0xbb0 [ +0.000013] driver_probe_device+0x208/0x2e0 [ +0.000013] device_driver_attach+0xfe/0x140 [ +0.000013] __driver_attach+0x110/0x310 [ +0.000013] bus_for_each_dev+0x14b/0x1d0 [ +0.000013] driver_register+0x1c0/0x400 [ +0.000015] mlxsw_sp_module_init+0x5d/0xd3 [ +0.000014] do_one_initcall+0x239/0x4dd [ +0.000013] kernel_init_freeable+0x42b/0x4e8 [ +0.000012] kernel_init+0x11/0x18b [ +0.000013] ret_from_fork+0x3a/0x50 [ +0.000015] Freed by task 581: [ +0.000013] save_stack+0x19/0x80 [ +0.000014] __kasan_slab_free+0x125/0x170 [ +0.000013] kfree+0xf3/0x310 [ +0.000013] thermal_release+0xc7/0xf0 [ +0.000014] device_release+0x77/0x200 [ +0.000014] kobject_put+0x1a8/0x4c0 [ +0.000014] device_unregister+0x38/0xc0 [ +0.000014] thermal_zone_device_unregister+0x54e/0x6a0 [ +0.000014] mlxsw_thermal_fini+0x184/0x35a [ +0.000014] mlxsw_core_bus_device_unregister+0x10a/0x640 [ +0.000013] mlxsw_devlink_core_bus_device_reload+0x92/0x210 [ +0.000015] devlink_nl_cmd_reload+0x113/0x1f0 [ +0.000014] genl_family_rcv_msg+0x700/0xee0 [ +0.000013] genl_rcv_msg+0xca/0x170 [ +0.000013] netlink_rcv_skb+0x137/0x3a0 [ +0.000012] genl_rcv+0x29/0x40 [ +0.000013] netlink_unicast+0x49b/0x660 [ +0.000013] netlink_sendmsg+0x755/0xc90 [ +0.000013] __sys_sendto+0x3de/0x430 [ +0.000013] __x64_sys_sendto+0xe2/0x1b0 [ +0.000013] do_syscall_64+0xa4/0x4d0 [ +0.000013] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ +0.000017] The buggy address belongs to the object at ffff8881e48e0008 which belongs to the cache kmalloc-2k of size 2048 [ +0.000012] The buggy address is located 1096 bytes inside of 2048-byte region [ffff8881e48e0008, ffff8881e48e0808) [ +0.000007] The buggy address belongs to the page: [ +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0 [ +0.000020] flags: 0x200000000010200(slab|head) [ +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0 [ +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000 [ +0.000007] page dumped because: kasan: bad access detected [ +0.000012] Memory state around the buggy address: [ +0.000012] ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000008] ^ [ +0.000012] ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000007] ================================================================== Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer") Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24thermal/drivers/core: Use put_device() if device_register() failsYue Hu
Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Clean up the rollback block also. Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27thermal/drivers/core: Use governor table to initializeDaniel Lezcano
Now that the governor table is in place and the macro allows to browse the table, declare the governor so the entry is added in the governor table in the init section. The [un]register_thermal_governors function does no longer need to use the exported [un]register thermal governor's specific function which in turn call the [un]register_thermal_governor. The governors are fully self-encapsulated. The cyclic dependency is no longer needed, remove it. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-16Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: - Remove the 'module' Kconfig option for thermal subsystem framework because the thermal framework are required to be ready as early as possible to avoid overheat at boot time (Daniel Lezcano) - Fix a bug that thermal framework pokes disabled thermal zones upon resume (Wei Wang) - A couple of cleanups and trivial fixes on int340x thermal drivers (Srinivas Pandruvada, Zhang Rui, Sumeet Pawnikar) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: drivers: thermal: processor_thermal: Downgrade error message mlxsw: Remove obsolete dependency on THERMAL=m hwmon/drivers/core: Simplify complex dependency thermal/drivers/core: Fix typo in the option name thermal/drivers/core: Remove depends on THERMAL in Kconfig thermal/drivers/core: Remove module unload code thermal/drivers/core: Remove the module Kconfig's option thermal: core: skip update disabled thermal zones after suspend thermal: make device_register's type argument const thermal: intel: int340x: processor_thermal_device: simplify to get driver data thermal/int3403_thermal: favor _TMP instead of PTYP
2019-05-14thermal: Introduce devm_thermal_of_cooling_device_registerGuenter Roeck
thermal_of_cooling_device_register() and thermal_cooling_device_register() are typically called from driver probe functions, and thermal_cooling_device_unregister() is called from remove functions. This makes both a perfect candidate for device managed functions. Introduce devm_thermal_of_cooling_device_register(). This function can also be used to replace thermal_cooling_device_register() by passing a NULL pointer as device node. The new function requires both struct device * and struct device_node * as parameters since the struct device_node * parameter is not always identical to dev->of_node. Don't introduce a device managed remove function since it is not needed at this point. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-07Merge branches 'thermal-core', 'thermal-built-it' and 'thermal-intel' into nextZhang Rui
2019-05-06thermal/drivers/core: Remove module unload codeDaniel Lezcano
Now the thermal core is no longer compiled as a module. Remove the unloading module code and move the unregister function to the __init section. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06thermal: core: skip update disabled thermal zones after suspendWei Wang
It is unnecessary to update disabled thermal zones post suspend and sometimes leads error/warning in bad behaved thermal drivers. Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06thermal: make device_register's type argument constJean-Francois Dagenais
...because it can be, the buffer is strlcpy'd into a local buffer in a thermal struct member. Signed-off-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-11-30Thermal: do not clear passive state during system sleepWei Wang
commit ff140fea847e ("Thermal: handle thermal zone device properly during system sleep") added PM hook to call thermal zone reset during sleep. However resetting thermal zone will also clear the passive state and thus cancel the polling queue which leads the passive cooling device state not being cleared properly after sleep. thermal_pm_notify => thermal_zone_device_reset set passive to 0 thermal_zone_trip_update will skip update passive as `old_target == instance->target'. monitor_thermal_zone => thermal_zone_device_set_polling will cancel tz->poll_queue, so the cooling device state will not be changed afterwards. Reported-by: Kame Wang <kamewang@google.com> Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-11-30thermal: remove unused function parameterLukasz Luba
Clean unused parameter from internal framework function. Signed-off-by: Lukasz Luba <l.luba@partner.samsung.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-10-10thermal: core: using power_efficient_wq for thermal workerJeson Gao
For SMP systems, thermal worker should use power_efficient_wq in power saving mode, that will make scheduler more flexible on selecting an active core for running work handler to avoid keeping work handler always running on a single core, that will save some power. Even if 'power_efficient_wq' relevant configs are disabled 'system_freezable_power_efficient_wq' is identical to system_freezable_wq, behavior is unchanged. Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-10-10thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfsDmitry Osipenko
This patch fixes use-after-free that was detected by KASAN. The bug is triggered on a CPUFreq driver module unload by freeing 'cdev' on device unregister and then using the freed structure during of the cdev's sysfs data destruction. The solution is to unregister the sysfs at first, then destroy sysfs data and finally release the cooling device. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-05-30drivers: thermal: Update license to SPDX formatLina Iyer
Update licences format for core thermal files. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-05-22thermal: Shorten name of sysfs callbacksViresh Kumar
The naming isn't consistent across all sysfs callbacks in the thermal core, some have a short name like type_show() and others have long names like thermal_cooling_device_weight_show(). This patch tries to make it consistent by shortening the name of sysfs callbacks. Some of the sysfs files are named similarly for both thermal zone and cooling device (like: type) and to avoid name clash between their show/store routines, the cooling device specific sysfs callbacks are prefixed with "cdev_". Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-04-02thermal: Add cooling device's statistics in sysfsViresh Kumar
This extends the sysfs interface for thermal cooling devices and exposes some pretty useful statistics. These statistics have proven to be quite useful specially while doing benchmarks related to the task scheduler, where we want to make sure that nothing has disrupted the test, specially the cooling device which may have put constraints on the CPUs. The information exposed here tells us to what extent the CPUs were constrained by the thermal framework. The write-only "reset" file is used to reset the statistics. The read-only "time_in_state_ms" file shows the time (in msec) spent by the device in the respective cooling states, and it prints one line per cooling state. The read-only "total_trans" file shows single positive integer value showing the total number of cooling state transitions the device has gone through since the time the cooling device is registered or the time when statistics were reset last. The read-only "trans_table" file shows a two dimensional matrix, where an entry <i,j> (row i, column j) represents the number of transitions from State_i to State_j. This is how the directory structure looks like for a single cooling device: $ ls -R /sys/class/thermal/cooling_device0/ /sys/class/thermal/cooling_device0/: cur_state max_state power stats subsystem type uevent /sys/class/thermal/cooling_device0/power: autosuspend_delay_ms runtime_active_time runtime_suspended_time control runtime_status /sys/class/thermal/cooling_device0/stats: reset time_in_state_ms total_trans trans_table This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and ARM 64-bit Hisilicon hikey960 board running Android. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-11thermal: core: Fix resources release in error paths in ↵Christophe Jaillet
thermal_zone_device_register() Reorder error handling code in order to fix some resources leaks in some cases: - 'tz' would leak if 'thermal_zone_create_device_groups()' fails - memory allocated by 'thermal_zone_create_device_groups()' would leak if 'device_register()' fails With this patch, we now have 2 error handling paths: one before 'device_register()', and one after it. This is needed because some resources are released in 'thermal_release()'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-11thermal: core: Use the new 'thermal_zone_destroy_device_groups()' helper ↵Christophe Jaillet
function Simplify code by using the new 'thermal_zone_destroy_device_groups()' helper function. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-08thermal: core: fix some format issues on critical shutdown stringIcenowy Zheng
The critical shutdown notice string used to have some spaces missing, which makes it not so pretty. Add the spaces to satisfy usual English space rules. Reported-by: Mingcong Bai <jeffbai@aosc.io> Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-23thermal: core: make thermal_emergency_poweroff staticColin Ian King
Making thermal_emergency_poweroff static fixes sparse warning: drivers/thermal/thermal_core.c:6: warning: symbol 'thermal_emergency_poweroff' was not declared. Should it be static? Fixes: ef1d87e06ab4 ("thermal: core: Add a back up thermal shutdown mechanism") Acked-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2017-05-05thermal: core: Add a back up thermal shutdown mechanismKeerthy
orderly_poweroff is triggered when a graceful shutdown of system is desired. This may be used in many critical states of the kernel such as when subsystems detects conditions such as critical temperature conditions. However, in certain conditions in system boot up sequences like those in the middle of driver probes being initiated, userspace will be unable to power off the system in a clean manner and leaves the system in a critical state. In cases like these, the /sbin/poweroff will return success (having forked off to attempt powering off the system. However, the system overall will fail to completely poweroff (since other modules will be probed) and the system is still functional with no userspace (since that would have shut itself off). However, there is no clean way of detecting such failure of userspace powering off the system. In such scenarios, it is necessary for a backup workqueue to be able to force a shutdown of the system when orderly shutdown is not successful after a configurable time period. Reported-by: Nishanth Menon <nm@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05thermal: core: Allow orderly_poweroff to be called only onceKeerthy
thermal_zone_device_check --> thermal_zone_device_update --> handle_thermal_trip --> handle_critical_trips --> orderly_poweroff The above sequence happens every 250/500 mS based on the configuration. The orderly_poweroff function is getting called every 250/500 mS. With a full fledged file system it takes at least 5-10 Seconds to power off gracefully. In that period due to the thermal_zone_device_check triggering periodically the thermal work queues bombard with orderly_poweroff calls multiple times eventually leading to failures in gracefully powering off the system. Make sure that orderly_poweroff is called only once. Signed-off-by: Keerthy <j-keerthy@ti.com> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-02-22Merge branches 'thermal-core', 'thermal-soc', 'thermal-intel' and ↵Zhang Rui
'ida-conversion' into next
2017-01-06thermal: core: move tz->device.groups cleanup to thermal_releaseJacob von Chorus
The device_unregister call in thermal_zone_device_unregister causes the thermal_zone_device structure to be freed before the call to free the dynamically allocated attribute groups. This leads to a kernel panic. Furthermore, the 4 calls to free the trip point attribute structures occur before the call to unregister the device, leading to a kernel panic when sysfs attempts to access the attributes to remove them. Here is an example of a kernel panic when the cpu thermal zones are removed upon cpu offline: BUG: unable to handle kernel NULL pointer dereference at (null) IP: strlen+0x0/0x20 <snip> Call Trace: ? kernfs_name_hash+0x17/0x80 kernfs_find_ns+0x3f/0xd0 kernfs_remove_by_name_ns+0x36/0xa0 remove_files.isra.1+0x36/0x70 sysfs_remove_group+0x44/0x90 sysfs_remove_groups+0x2e/0x50 device_remove_attrs+0x5e/0x90 device_del+0x1ea/0x350 device_unregister+0x1a/0x60 thermal_zone_device_unregister+0x1f2/0x210 pkg_thermal_cpu_offline+0x14f/0x1a0 [x86_pkg_temp_thermal] ? kzalloc.constprop.2+0x10/0x10 [x86_pkg_temp_thermal] cpuhp_invoke_callback+0x8d/0x3f0 cpuhp_down_callbacks+0x42/0x80 cpuhp_thread_fun+0x8b/0xf0 smpboot_thread_fn+0x110/0x160 kthread+0x101/0x140 ? sort_range+0x30/0x30 ? kthread_park+0x90/0x90 ret_from_fork+0x25/0x30 This patch moves the kfree calls to clean up the dynamic attributes to the thermal_class's thermal_zone_device release function. Cc: Zhang Rui <rui.zhang@intel.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Jacob von Chorus <jacobvonchorus@cwphoto.ca> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-01-04thermal core: convert ID allocation to IDAMatthew Wilcox
The thermal core does not use the ability to look up pointers by ID, so convert it from using an IDR to the more space-efficient IDA. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-12-13Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: - Thermal core code reorganization and cleanup. Two new files are created for thermal sysfs I/F code and thermal helper functions (Eduardo Valentin). - Sanitize hotplug and locking for x86_pkg_temp driver (Thomas Gleixner) - Update MAINTAINER file for pwm-fan driver and Samsung thermal driver (Lukasz Majewski) - Fix module auto-load for max77620, tango and db8500 thermal driver (Javier Martinez Canillas) - Fix a bug that thermal hwmon sysfs I/F returns wrong critical trip point temperature value (Krzysztof Kozlowski) - Add Skylake PCH 100 series support for intel_pch_thermal driver (OGAWA Hirofumi) - Small fixes and cleanups for platform thermal drivers (Julia Lawall, Luis Henriques, Leo Yan, Stephen Boyd, Shawn Lin, Javi Merino and Lukasz Luba) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (76 commits) MAINTAINERS: Samsung: Update maintainer for PWM FAN and SAMSUNG THERMAL thermal/x86 pkg temp: Convert to hotplug state machine thermal/x86_pkg_temp: Sanitize package management thermal/x86_pkg_temp: Move work into package struct thermal/x86_pkg_temp: Move work scheduled flag into package struct thermal/x86_pkg_temp: Sanitize locking thermal/x86_pkg_temp: Cleanup code some more thermal/x86_pkg_temp: Cleanup namespace thermal/x86_pkg_temp: Get rid of ref counting thermal/x86_pkg_temp: Sanitize callback (de)initialization thermal/x86_pkg_temp: Replace open coded cpu search thermal/x86_pkg_temp: Remove redundant package search thermal/x86_pkg_temp: Cleanup thermal interrupt handling thermal: hwmon: Properly report critical temperature in sysfs devfreq_cooling: pass a pointer to devfreq in the power model callbacks devfreq_cooling: make the structs devfreq_cooling_xxx visible for all dt-bindings: rockchip-thermal: fix the misleading description thermal: rockchip: improve the warning log thermal: db8500: Fix module autoload thermal: tango: Fix module autoload ...
2016-11-23thermal: core: move slop and offset helpers to thermal_helpers.cEduardo Valentin
Reorganize code to reflect better placement. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: use kzalloc(sizeof(*ptr),...)Eduardo Valentin
As a safety check, this patch changes thermal core to check for pointer content size, instead of type size, while allocating memory. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: improve kerneldoc entry of thermal_cooling_device_unregisterEduardo Valentin
Improve description and keep 80 columns limit. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: remove style warnings and checksEduardo Valentin
Removing several style issues in thermal code code. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: remove void function return statementsEduardo Valentin
Simply removing useless returns of void functions. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: standardize line breaking alignmentEduardo Valentin
Pass through the code to remove check suggested by checkpatch.pl (alignment to parenthesis): CHECK: Alignment should match open parenthesis Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: small style fix when checking for __find_governor()Eduardo Valentin
Remove style issue: CHECK: Comparison to NULL could be written "!__find_governor" + if (__find_governor(governor->name) == NULL) { Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: remove FSF address in the GPL noticeEduardo Valentin
Simplify the GPL notice by removing the FSF address. No need to track FSF location in this file. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: add a comment describing the device management sectionEduardo Valentin
comment describing the section with function to handle registration, unregistration, binding, and unbinding of thermal devices. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: add a comment describing the power actor sectionEduardo Valentin
Simply marking the power actor section and adding a comment describing it. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: add a comment describing the main update loopEduardo Valentin
Simply marking the main update loop section and adding a comment describing it. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move notify to the zone update sectionEduardo Valentin
moving the helper function to closer to similar functions. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: add inline to print_bind_err_msg()Eduardo Valentin
Given that this is simple wrapper, adding the inline flag. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move __bind() to where it is usedEduardo Valentin
Moving the helper to closer where it is used. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: fix couple of style issues on __bind() helperEduardo Valentin
Removing style issues on __bind() and its helpers. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move bind_tz() to where it is usedEduardo Valentin
Moving the helper to closer where it is used. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move bind_cdev() to where it is usedEduardo Valentin
Moving the helper to closer where it is used. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move __unbind() helper to where it is usedEduardo Valentin
Simply moving the helper to closer where it is actually used. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: small style fix on __unbind() helperEduardo Valentin
Simply aligning to parenthesis. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move idr handling to device management sectionEduardo Valentin
Given that idr is only used to get id for thermal devices (zones and cooling), makes sense to move the code closer. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>