Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH] cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled

3 views
Skip to first unread message

ji...@marvell.com

unread,
Dec 27, 2013, 4:40:02 AM12/27/13
to
From: Jane Li <ji...@marvell.com>

When a CPU is hot removed we'll cancel all the delayed work items via
gov_cancel_work(). Sometimes the delayed work function determines that
it should adjust the delay for all other CPUs that the policy is
managing. If this scenario occurs, the canceling CPU will cancel its own
work but queue up the other CPUs works to run.

Commit 3617f2(cpufreq: Fix timer/workqueue corruption due to double
queueing) has tried to fix this, but reading governor_enabled is not
protected by cpufreq_governor_lock. Even though od_dbs_timer() checks
governor_enabled before gov_queue_work(), this scenario may occur. For
example:

CPU0 CPU1
---- ----
cpu_down()
... <work runs>
__cpufreq_remove_dev() od_dbs_timer()
__cpufreq_governor() policy->governor_enabled
policy->governor_enabled = false;
cpufreq_governor_dbs()
case CPUFREQ_GOV_STOP:
gov_cancel_work(dbs_data, policy);
cpu0 work is canceled
timer is canceled
cpu1 work is canceled
<waits for cpu1>
gov_queue_work(*, *, true);
cpu0 work queued
cpu1 work queued
cpu2 work queued
...
cpu1 work is canceled
cpu2 work is canceled
...

At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:

WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x14
Modules linked in:
CPU: 1 PID: 1205 Comm: sh Tainted: G W 3.10.0 #200
[<c01144f0>] (unwind_backtrace+0x0/0xf8) from [<c0111d98>] (show_stack+0x10/0x14)
[<c0111d98>] (show_stack+0x10/0x14) from [<c01272cc>] (warn_slowpath_common+0x4c/0x68)
[<c01272cc>] (warn_slowpath_common+0x4c/0x68) from [<c012737c>] (warn_slowpath_fmt+0x30/0x40)
[<c012737c>] (warn_slowpath_fmt+0x30/0x40) from [<c034c640>] (debug_print_object+0x94/0xbc)
[<c034c640>] (debug_print_object+0x94/0xbc) from [<c034c7f8>] (__debug_object_init+0xc8/0x3c0)
[<c034c7f8>] (__debug_object_init+0xc8/0x3c0) from [<c01360e0>] (init_timer_key+0x20/0x104)
[<c01360e0>] (init_timer_key+0x20/0x104) from [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c)
[<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c) from [<c04833a8>] (__cpufreq_governor+0x80/0x1b0)
[<c04833a8>] (__cpufreq_governor+0x80/0x1b0) from [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380)
[<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380) from [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c)
[<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c) from [<c014fb40>] (notifier_call_chain+0x44/0x84)
[<c014fb40>] (notifier_call_chain+0x44/0x84) from [<c012ae44>] (__cpu_notify+0x2c/0x48)
[<c012ae44>] (__cpu_notify+0x2c/0x48) from [<c068dd40>] (_cpu_down+0x80/0x258)
[<c068dd40>] (_cpu_down+0x80/0x258) from [<c068df40>] (cpu_down+0x28/0x3c)
[<c068df40>] (cpu_down+0x28/0x3c) from [<c068e4c0>] (store_online+0x30/0x74)
[<c068e4c0>] (store_online+0x30/0x74) from [<c03a7308>] (dev_attr_store+0x18/0x24)
[<c03a7308>] (dev_attr_store+0x18/0x24) from [<c0256fe0>] (sysfs_write_file+0x100/0x180)
[<c0256fe0>] (sysfs_write_file+0x100/0x180) from [<c01fec9c>] (vfs_write+0xbc/0x184)
[<c01fec9c>] (vfs_write+0xbc/0x184) from [<c01ff034>] (SyS_write+0x40/0x68)
[<c01ff034>] (SyS_write+0x40/0x68) from [<c010e200>] (ret_fast_syscall+0x0/0x48)

In gov_queue_work(), lock cpufreq_governor_lock before gov_queue_work,
and unlock it after __gov_queue_work(). In this way, governor_enabled
is guaranteed not changed in gov_queue_work().

Signed-off-by: Jane Li <ji...@marvell.com>
---
drivers/cpufreq/cpufreq.c | 1 -
drivers/cpufreq/cpufreq_governor.c | 6 +++++-
include/linux/cpufreq.h | 1 +
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..50601f6 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,7 +39,6 @@ static struct cpufreq_driver *cpufreq_driver;
static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data_fallback);
static DEFINE_RWLOCK(cpufreq_driver_lock);
-static DEFINE_MUTEX(cpufreq_governor_lock);
static LIST_HEAD(cpufreq_policy_list);

#ifdef CONFIG_HOTPLUG_CPU
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index e6be635..11b9f0b 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -119,8 +119,11 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
{
int i;

- if (!policy->governor_enabled)
+ mutex_lock(&cpufreq_governor_lock);
+ if (!policy->governor_enabled) {
+ mutex_unlock(&cpufreq_governor_lock);
return;
+ }

if (!all_cpus) {
/*
@@ -135,6 +138,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
}
+ mutex_unlock(&cpufreq_governor_lock);
}
EXPORT_SYMBOL_GPL(gov_queue_work);

diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dc196bb..4faafe7 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -254,6 +254,7 @@ struct cpufreq_driver {

int cpufreq_register_driver(struct cpufreq_driver *driver_data);
int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
+static DEFINE_MUTEX(cpufreq_governor_lock);

const char *cpufreq_get_current_driver(void);

--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Viresh Kumar

unread,
Dec 27, 2013, 4:50:02 AM12/27/13
to
On 27 December 2013 15:00, <ji...@marvell.com> wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> -static DEFINE_MUTEX(cpufreq_governor_lock);

> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index dc196bb..4faafe7 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -254,6 +254,7 @@ struct cpufreq_driver {
>
> int cpufreq_register_driver(struct cpufreq_driver *driver_data);
> int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
> +static DEFINE_MUTEX(cpufreq_governor_lock);

No way, this would never work. This would create separate locks
in each file that includes cpufreq.h. And so the locks you are talking
about wouldn't protect governor.

Have you actually tested this code? If this fixes the breakage you
saw? If this fixes it then you need to do better investigation of your
problem..

you actually need to remove the static keyword from cpufreq.c file.
Nothing else.

Jane Li

unread,
Dec 30, 2013, 1:30:02 AM12/30/13
to

On 12/27/2013 05:40 PM, Viresh Kumar wrote:
> On 27 December 2013 15:00, <ji...@marvell.com> wrote:
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> -static DEFINE_MUTEX(cpufreq_governor_lock);
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index dc196bb..4faafe7 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -254,6 +254,7 @@ struct cpufreq_driver {
>>
>> int cpufreq_register_driver(struct cpufreq_driver *driver_data);
>> int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
>> +static DEFINE_MUTEX(cpufreq_governor_lock);
> No way, this would never work. This would create separate locks
> in each file that includes cpufreq.h. And so the locks you are talking
> about wouldn't protect governor.

I checked my code, and found I lost following part.
My local code base is not exactly same as open source, and has included this.


diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index e6be635..c4d0ee6 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -19,6 +19,7 @@
#include <linux/export.h>
#include <linux/kernel_stat.h>
#include <linux/slab.h>
+#include <linux/cpufreq.h>

#include "cpufreq_governor.h"

> Have you actually tested this code? If this fixes the breakage you
> saw? If this fixes it then you need to do better investigation of your
> problem.

Yes, I test it. After adding cpufreq_governor_lock in gov_queue_work() and running same test, there is no debugobjects warning.

> you actually need to remove the static keyword from cpufreq.c file.
> Nothing else.

Modify cpufreq_governor_lock definition. Do you think it is OK?

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 16d7b4a..185c9f5 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,7 +39,7 @@ static struct cpufreq_driver *cpufreq_driver;
static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data_fallback);
static DEFINE_RWLOCK(cpufreq_driver_lock);
-static DEFINE_MUTEX(cpufreq_governor_lock);
+DEFINE_MUTEX(cpufreq_governor_lock);
static LIST_HEAD(cpufreq_policy_list);

#ifdef CONFIG_HOTPLUG_CPU
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index e6be635..485ee0d 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,8 @@

#include "cpufreq_governor.h"

+extern struct mutex cpufreq_governor_lock;
+
static struct attribute_group *get_sysfs_attr(struct dbs_data *dbs_data)
{
if (have_governor_per_policy())

Viresh Kumar

unread,
Dec 30, 2013, 1:40:02 AM12/30/13
to
On 30 December 2013 11:59, Jane Li <ji...@marvell.com> wrote:
> I checked my code, and found I lost following part.
> My local code base is not exactly same as open source, and has included
> this.
>
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c
> b/drivers/cpufreq/cpufreq_governor.c
> index e6be635..c4d0ee6 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -19,6 +19,7 @@
> #include <linux/export.h>
> #include <linux/kernel_stat.h>
> #include <linux/slab.h>
> +#include <linux/cpufreq.h>
>
> #include "cpufreq_governor.h"

This shouldn't be required as cpufreq_governor.h must be including
cpufreq.h..

> Yes, I test it. After adding cpufreq_governor_lock in gov_queue_work() and
> running same test, there is no debugobjects warning.

But it really can't work at all.. There should be a separate copy of lock
in every file that includes cpufreq.h.. And so this shouldn't have worked.

> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 16d7b4a..185c9f5 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -39,7 +39,7 @@ static struct cpufreq_driver *cpufreq_driver;
>
> static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
> static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data_fallback);
> static DEFINE_RWLOCK(cpufreq_driver_lock);
> -static DEFINE_MUTEX(cpufreq_governor_lock);
> +DEFINE_MUTEX(cpufreq_governor_lock);
>
> static LIST_HEAD(cpufreq_policy_list);
>
> #ifdef CONFIG_HOTPLUG_CPU
> diff --git a/drivers/cpufreq/cpufreq_governor.c
> b/drivers/cpufreq/cpufreq_governor.c
> index e6be635..485ee0d 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -22,6 +22,8 @@
>
> #include "cpufreq_governor.h"
>
> +extern struct mutex cpufreq_governor_lock;
> +

Looks fine.

Jane Li

unread,
Jan 1, 2014, 10:20:01 PM1/1/14
to
>> Yes, I test it. After adding cpufreq_governor_lock in gov_queue_work() and
>> running same test, there is no debugobjects warning.
> But it really can't work at all.. There should be a separate copy of lock
> in every file that includes cpufreq.h.. And so this shouldn't have worked.
>
Oh.. I understand what you mean now. My patch is not right and cannot fix this issue.

By default, the debugobjects warning sometimes occurs after five minutes, and sometimes occurs after twenty hours. With this patch, I test more than fifty hours, and warning did not occurs. It shows that my test time is not long enough and miss the right one.


I have updated PATCH v2, please review again. Thanks.
0 new messages