From aae760ed21cd690fe8a6db9f3a177ad55d7e12ab Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Fri, 12 Jul 2013 03:45:37 +0530 Subject: cpufreq: Revert commit a66b2e to fix suspend/resume regression commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume) has unfortunately caused several things in the cpufreq subsystem to break subtly after a suspend/resume cycle. The intention of that patch was to retain the file permissions of the cpufreq related sysfs files across suspend/resume. To achieve that, the commit completely removed the calls to cpufreq_add_dev() and __cpufreq_remove_dev() during suspend/resume transitions. But the problem is that those functions do 2 kinds of things: 1. Low-level initialization/tear-down that are critical to the correct functioning of cpufreq-core. 2. Kobject and sysfs related initialization/teardown. Ideally we should have reorganized the code to cleanly separate these two responsibilities, and skipped only the sysfs related parts during suspend/resume. Since we skipped the entire callbacks instead (which also included some CPU and cpufreq-specific critical components), cpufreq subsystem started behaving erratically after suspend/resume. So revert the commit to fix the regression. We'll revisit and address the original goal of that commit separately, since it involves quite a bit of careful code reorganization and appears to be non-trivial. (While reverting the commit, note that another commit f51e1eb (cpufreq: Fix cpufreq regression after suspend/resume) already reverted part of the original set of changes. So revert only the remaining ones). Signed-off-by: Srivatsa S. Bhat Acked-by: Viresh Kumar Tested-by: Paul Bolle Cc: 3.10+ Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/cpufreq.c | 4 +++- drivers/cpufreq/cpufreq_stats.c | 6 ++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 0937b8d6c2a4..7dcfa6854833 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1942,13 +1942,15 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb, if (dev) { switch (action) { case CPU_ONLINE: + case CPU_ONLINE_FROZEN: cpufreq_add_dev(dev, NULL); break; case CPU_DOWN_PREPARE: - case CPU_UP_CANCELED_FROZEN: + case CPU_DOWN_PREPARE_FROZEN: __cpufreq_remove_dev(dev, NULL); break; case CPU_DOWN_FAILED: + case CPU_DOWN_FAILED_FROZEN: cpufreq_add_dev(dev, NULL); break; } diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c index cd9e81713a71..12225d19ffcb 100644 --- a/drivers/cpufreq/cpufreq_stats.c +++ b/drivers/cpufreq/cpufreq_stats.c @@ -353,13 +353,11 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb, cpufreq_update_policy(cpu); break; case CPU_DOWN_PREPARE: + case CPU_DOWN_PREPARE_FROZEN: cpufreq_stats_free_sysfs(cpu); break; case CPU_DEAD: - cpufreq_stats_free_table(cpu); - break; - case CPU_UP_CANCELED_FROZEN: - cpufreq_stats_free_sysfs(cpu); + case CPU_DEAD_FROZEN: cpufreq_stats_free_table(cpu); break; } -- cgit v1.2.3 From e5248a111bf4048a9f3fab1a9c94c4630a10592a Mon Sep 17 00:00:00 2001 From: Liu ShuoX Date: Thu, 11 Jul 2013 16:03:45 +0800 Subject: PM / Sleep: avoid 'autosleep' in shutdown progress Prevent automatic system suspend from happening during system shutdown by making try_to_suspend() check system_state and return immediately if it is not SYSTEM_RUNNING. This prevents the following breakage from happening (scenario from Zhang Yanmin): Kernel starts shutdown and calls all device driver's shutdown callback. When a driver's shutdown is called, the last wakelock is released and suspend-to-ram starts. However, as some driver's shut down callbacks already shut down devices and disabled runtime pm, the suspend-to-ram calls driver's suspend callback without noticing that device is already off and causes crash. [rjw: Changelog] Signed-off-by: Liu ShuoX Cc: 3.5+ Signed-off-by: Rafael J. Wysocki --- kernel/power/autosleep.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/power/autosleep.c b/kernel/power/autosleep.c index c6422ffeda9a..9012ecf7b814 100644 --- a/kernel/power/autosleep.c +++ b/kernel/power/autosleep.c @@ -32,7 +32,8 @@ static void try_to_suspend(struct work_struct *work) mutex_lock(&autosleep_lock); - if (!pm_save_wakeup_count(initial_count)) { + if (!pm_save_wakeup_count(initial_count) || + system_state != SYSTEM_RUNNING) { mutex_unlock(&autosleep_lock); goto out; } -- cgit v1.2.3 From 1258ca805f613025ec079d959d4a78acfb1f79d3 Mon Sep 17 00:00:00 2001 From: Chanwoo Choi Date: Thu, 11 Jul 2013 13:55:58 +0900 Subject: PM / Sleep: Fix comment typo in pm_wakeup.h Fix a comment typo (sorce -> source) in pm_wakeup.h. [rjw: Changelog] Signed-off-by: Chanwoo Choi Signed-off-by: Rafael J. Wysocki --- include/linux/pm_wakeup.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h index 569781faa504..a0f70808d7f4 100644 --- a/include/linux/pm_wakeup.h +++ b/include/linux/pm_wakeup.h @@ -36,8 +36,8 @@ * @last_time: Monotonic clock when the wakeup source's was touched last time. * @prevent_sleep_time: Total time this source has been preventing autosleep. * @event_count: Number of signaled wakeup events. - * @active_count: Number of times the wakeup sorce was activated. - * @relax_count: Number of times the wakeup sorce was deactivated. + * @active_count: Number of times the wakeup source was activated. + * @relax_count: Number of times the wakeup source was deactivated. * @expire_count: Number of times the wakeup source's timeout has expired. * @wakeup_count: Number of times the wakeup source might abort suspend. * @active: Status of the wakeup source. -- cgit v1.2.3 From 4a6c41083df3bf4ad828c45e5f8ee1a224d537c6 Mon Sep 17 00:00:00 2001 From: Paul Bolle Date: Sun, 14 Jul 2013 20:29:56 +0200 Subject: cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS The Kconfig symbol CPU_FREQ_S3C24XX_DEBUGFS was renamed to ARM_S3C24XX_CPUFREQ_DEBUGFS in commit f023f8dd59 ("cpufreq: s3c24xx: move cpufreq driver to drivers/cpufreq"). But that commit missed one instance of its macro CONFIG_CPU_FREQ_S3C24XX_DEBUGFS. Rename it too. Signed-off-by: Paul Bolle Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/s3c24xx-cpufreq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/cpufreq/s3c24xx-cpufreq.c b/drivers/cpufreq/s3c24xx-cpufreq.c index 3513e7477160..87781eb20d6d 100644 --- a/drivers/cpufreq/s3c24xx-cpufreq.c +++ b/drivers/cpufreq/s3c24xx-cpufreq.c @@ -49,7 +49,7 @@ static struct clk *clk_hclk; static struct clk *clk_pclk; static struct clk *clk_arm; -#ifdef CONFIG_CPU_FREQ_S3C24XX_DEBUGFS +#ifdef CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS struct s3c_cpufreq_config *s3c_cpufreq_getconfig(void) { return &cpu_cur; @@ -59,7 +59,7 @@ struct s3c_iotimings *s3c_cpufreq_getiotimings(void) { return &s3c24xx_iotiming; } -#endif /* CONFIG_CPU_FREQ_S3C24XX_DEBUGFS */ +#endif /* CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS */ static void s3c_cpufreq_getcur(struct s3c_cpufreq_config *cfg) { -- cgit v1.2.3 From 3715534adf899bfc842868865f13a001b8aa4692 Mon Sep 17 00:00:00 2001 From: Paul Bolle Date: Sun, 14 Jul 2013 14:02:19 +0200 Subject: cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig Kconfig symbol S3C24XX_PLL depends on ARM_S3C24XX. But that symbol doesn't exist. Commit f023f8dd59bf ("cpufreq: s3c24xx: move cpufreq driver to drivers/cpufreq"), which added this issue, makes it clear that ARM_S3C24XX_CPUFREQ was intended here. Signed-off-by: Paul Bolle Acked-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- arch/arm/mach-s3c24xx/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mach-s3c24xx/Kconfig b/arch/arm/mach-s3c24xx/Kconfig index 6d9252e081ce..7791ac76f945 100644 --- a/arch/arm/mach-s3c24xx/Kconfig +++ b/arch/arm/mach-s3c24xx/Kconfig @@ -208,7 +208,7 @@ config S3C24XX_GPIO_EXTRA128 config S3C24XX_PLL bool "Support CPUfreq changing of PLL frequency (EXPERIMENTAL)" - depends on ARM_S3C24XX + depends on ARM_S3C24XX_CPUFREQ help Compile in support for changing the PLL frequency from the S3C24XX series CPUfreq driver. The PLL takes time to settle -- cgit v1.2.3 From e8d05276f236ee6435e78411f62be9714e0b9377 Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Tue, 16 Jul 2013 22:46:48 +0200 Subject: cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()" caused a regression in CPU hotplug, because it lead to a deadlock between cpufreq governor worker thread and the CPU hotplug writer task. Lockdep splat corresponding to this deadlock is shown below: [ 60.277396] ====================================================== [ 60.277400] [ INFO: possible circular locking dependency detected ] [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted [ 60.277411] ------------------------------------------------------- [ 60.277417] bash/2225 is trying to acquire lock: [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [] flush_work+0x5/0x280 [ 60.277444] but task is already holding lock: [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x2b/0x60 [ 60.277465] which lock already depends on the new lock. [ 60.277472] the existing dependency chain (in reverse order) is: [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}: [ 60.277490] [] lock_acquire+0xa4/0x200 [ 60.277503] [] mutex_lock_nested+0x67/0x410 [ 60.277514] [] get_online_cpus+0x3c/0x60 [ 60.277522] [] gov_queue_work+0x2a/0xb0 [ 60.277532] [] cs_dbs_timer+0xc1/0xe0 [ 60.277543] [] process_one_work+0x1cd/0x6a0 [ 60.277552] [] worker_thread+0x121/0x3a0 [ 60.277560] [] kthread+0xdb/0xe0 [ 60.277569] [] ret_from_fork+0x7c/0xb0 [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}: [ 60.277592] [] lock_acquire+0xa4/0x200 [ 60.277600] [] mutex_lock_nested+0x67/0x410 [ 60.277608] [] cs_dbs_timer+0x8d/0xe0 [ 60.277616] [] process_one_work+0x1cd/0x6a0 [ 60.277624] [] worker_thread+0x121/0x3a0 [ 60.277633] [] kthread+0xdb/0xe0 [ 60.277640] [] ret_from_fork+0x7c/0xb0 [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}: [ 60.277661] [] __lock_acquire+0x1766/0x1d30 [ 60.277669] [] lock_acquire+0xa4/0x200 [ 60.277677] [] flush_work+0x3d/0x280 [ 60.277685] [] __cancel_work_timer+0x8a/0x120 [ 60.277693] [] cancel_delayed_work_sync+0x13/0x20 [ 60.277701] [] cpufreq_governor_dbs+0x529/0x6f0 [ 60.277709] [] cs_cpufreq_governor_dbs+0x17/0x20 [ 60.277719] [] __cpufreq_governor+0x48/0x100 [ 60.277728] [] __cpufreq_remove_dev.isra.14+0x80/0x3c0 [ 60.277737] [] cpufreq_cpu_callback+0x38/0x4c [ 60.277747] [] notifier_call_chain+0x5d/0x110 [ 60.277759] [] __raw_notifier_call_chain+0xe/0x10 [ 60.277768] [] _cpu_down+0x88/0x330 [ 60.277779] [] cpu_down+0x36/0x50 [ 60.277788] [] store_online+0x98/0xd0 [ 60.277796] [] dev_attr_store+0x18/0x30 [ 60.277806] [] sysfs_write_file+0xdb/0x150 [ 60.277818] [] vfs_write+0xbd/0x1f0 [ 60.277826] [] SyS_write+0x4c/0xa0 [ 60.277834] [] tracesys+0xd0/0xd5 [ 60.277842] other info that might help us debug this: [ 60.277848] Chain exists of: (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock [ 60.277864] Possible unsafe locking scenario: [ 60.277869] CPU0 CPU1 [ 60.277873] ---- ---- [ 60.277877] lock(cpu_hotplug.lock); [ 60.277885] lock(&j_cdbs->timer_mutex); [ 60.277892] lock(cpu_hotplug.lock); [ 60.277900] lock((&(&j_cdbs->work)->work)); [ 60.277907] *** DEADLOCK *** [ 60.277915] 6 locks held by bash/2225: [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [] vfs_write+0x1c3/0x1f0 [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [] sysfs_write_file+0x3c/0x150 [ 60.277954] #2: (s_active#61){.+.+.+}, at: [] sysfs_write_file+0xc3/0x150 [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [] cpu_hotplug_driver_lock+0x17/0x20 [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [] cpu_down+0x22/0x50 [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x2b/0x60 [ 60.278023] stack backtrace: [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011 [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38 [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2 [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00 [ 60.278081] Call Trace: [ 60.278091] [] dump_stack+0x19/0x1b [ 60.278101] [] print_circular_bug+0x2b6/0x2c5 [ 60.278111] [] __lock_acquire+0x1766/0x1d30 [ 60.278123] [] ? __kernel_text_address+0x58/0x80 [ 60.278134] [] lock_acquire+0xa4/0x200 [ 60.278142] [] ? flush_work+0x5/0x280 [ 60.278151] [] flush_work+0x3d/0x280 [ 60.278159] [] ? flush_work+0x5/0x280 [ 60.278169] [] ? mark_held_locks+0x94/0x140 [ 60.278178] [] ? __cancel_work_timer+0x77/0x120 [ 60.278188] [] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 60.278196] [] __cancel_work_timer+0x8a/0x120 [ 60.278206] [] cancel_delayed_work_sync+0x13/0x20 [ 60.278214] [] cpufreq_governor_dbs+0x529/0x6f0 [ 60.278225] [] cs_cpufreq_governor_dbs+0x17/0x20 [ 60.278234] [] __cpufreq_governor+0x48/0x100 [ 60.278244] [] __cpufreq_remove_dev.isra.14+0x80/0x3c0 [ 60.278255] [] cpufreq_cpu_callback+0x38/0x4c [ 60.278265] [] notifier_call_chain+0x5d/0x110 [ 60.278275] [] __raw_notifier_call_chain+0xe/0x10 [ 60.278284] [] _cpu_down+0x88/0x330 [ 60.278292] [] ? cpu_hotplug_driver_lock+0x17/0x20 [ 60.278302] [] cpu_down+0x36/0x50 [ 60.278311] [] store_online+0x98/0xd0 [ 60.278320] [] dev_attr_store+0x18/0x30 [ 60.278329] [] sysfs_write_file+0xdb/0x150 [ 60.278337] [] vfs_write+0xbd/0x1f0 [ 60.278347] [] ? fget_light+0x320/0x4b0 [ 60.278355] [] SyS_write+0x4c/0xa0 [ 60.278364] [] tracesys+0xd0/0xd5 [ 60.280582] smpboot: CPU 1 is now offline The intention of that commit was to avoid warnings during CPU hotplug, which indicated that offline CPUs were getting IPIs from the cpufreq governor's work items. But the real root-cause of that problem was commit a66b2e5 (cpufreq: Preserve sysfs files across suspend/resume) because it totally skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume path, and hence it never actually shut down the cpufreq governor's worker threads during CPU offline in the suspend/resume path. Reflecting back, the reason why we never suspected that commit as the root-cause earlier, was that the original issue was reported with just the halt command and nobody had brought in suspend/resume to the equation. The reason for _that_ in turn, as it turns out, is that earlier halt/shutdown was being done by disabling non-boot CPUs while tasks were frozen, just like suspend/resume.... but commit cf7df378a (reboot: migrate shutdown/reboot to boot cpu) which came somewhere along that very same time changed that logic: shutdown/halt no longer takes CPUs offline. Thus, the test-cases for reproducing the bug were vastly different and thus we went totally off the trail. Overall, it was one hell of a confusion with so many commits affecting each other and also affecting the symptoms of the problems in subtle ways. Finally, now since the original problematic commit (a66b2e5) has been completely reverted, revert this intermediate fix too (2f7021a8), to fix the CPU hotplug deadlock. Phew! Reported-by: Sergey Senozhatsky Reported-by: Bartlomiej Zolnierkiewicz Signed-off-by: Srivatsa S. Bhat Tested-by: Peter Wu Cc: 3.10+ Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/cpufreq_governor.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 464587697561..7b839a8db2a7 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -25,7 +25,6 @@ #include #include #include -#include #include "cpufreq_governor.h" @@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy, if (!all_cpus) { __gov_queue_work(smp_processor_id(), dbs_data, delay); } else { - get_online_cpus(); for_each_cpu(i, policy->cpus) __gov_queue_work(i, dbs_data, delay); - put_online_cpus(); } } EXPORT_SYMBOL_GPL(gov_queue_work); -- cgit v1.2.3