[Patch] sysfs: add lockdep class support to s_active

Amerigo Wang

Feb 5, 2010, 1:30:01 AM

We recently hit a lockdep warning from sysfs during s2ram or cpu hotplug.
As several people have reported, it looks something like this:

[ 6967.926563] ACPI: Preparing to enter system sleep state S3
[ 6967.956156] Disabling non-boot CPUs ...
[ 6967.970401]
[ 6967.970408] =============================================
[ 6967.970419] [ INFO: possible recursive locking detected ]
[ 6967.970431] 2.6.33-rc2-git6 #27
[ 6967.970439] ---------------------------------------------
[ 6967.970450] pm-suspend/22147 is trying to acquire lock:
[ 6967.970460] (s_active){++++.+}, at: [<c10d2941>]
sysfs_hash_and_remove+0x3d/0x4f
[ 6967.970493]
[ 6967.970497] but task is already holding lock:
[ 6967.970506] (s_active){++++.+}, at: [<c10d4110>]
sysfs_get_active_two+0x16/0x36
[...]

Eric already provided a patch for this[1], but it does not fully fix the
problem. Based on his work and Peter's suggestion, I wrote this patch;
hopefully it fixes the warning completely.

This patch puts sysfs s_active into two classes, one for PM and the other
for everything else, so that lockdep can distinguish them.

1. http://lkml.org/lkml/2010/1/10/282


Reported-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Reported-by: Larry Finger <Larry....@lwfinger.net>
Reported-by: Miles Lane <miles...@gmail.com>
Reported-by: Heiko Carstens <heiko.c...@de.ibm.com>
Signed-off-by: WANG Cong <amw...@redhat.com>
Cc: Eric W. Biederman <ebie...@xmission.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Tejun Heo <t...@kernel.org>
Cc: Greg Kroah-Hartman <gre...@suse.de>


---
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index dc30d9e..72a8d0b 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -24,6 +24,8 @@

#include "sysfs.h"

+static struct lock_class_key sysfs_classes[SYSFS_NR_CLASSES];
+
/* used in crash dumps to help with debugging */
static char last_sysfs_file[PATH_MAX];
void sysfs_printk_last_file(void)
@@ -504,11 +506,16 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
struct sysfs_addrm_cxt acxt;
struct sysfs_dirent *sd;
int rc;
+ int class;

sd = sysfs_new_dirent(attr->name, mode, type);
if (!sd)
return -ENOMEM;
sd->s_attr.attr = (void *)attr;
+ class = SYSFS_ATTR_NORMAL;
+ if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+ class = sd->s_attr.attr->class;
+ lockdep_set_class(&sd->s_active, &sysfs_classes[class]);

sysfs_addrm_start(&acxt, dir_sd);
rc = sysfs_add_one(&acxt, sd);
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index cfa8308..2b91b74 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -20,6 +20,12 @@
struct kobject;
struct module;

+enum sysfs_attr_lock_class {
+ SYSFS_ATTR_NORMAL,
+ SYSFS_ATTR_PM_CONTROL,
+ SYSFS_NR_CLASSES,
+};
+
/* FIXME
* The *owner field is no longer used.
* x86 tree has been cleaned up. The owner
@@ -29,6 +35,7 @@ struct attribute {
const char *name;
struct module *owner;
mode_t mode;
+ enum sysfs_attr_lock_class class;
};

struct attribute_group {
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 46c5a26..67a6fe7 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -54,13 +54,14 @@ extern int hibernation_platform_enter(void);
extern int pfn_is_nosave(unsigned long);

#define power_attr(_name) \
-static struct kobj_attribute _name##_attr = { \
- .attr = { \
- .name = __stringify(_name), \
- .mode = 0644, \
- }, \
- .show = _name##_show, \
- .store = _name##_store, \
+static struct kobj_attribute _name##_attr = { \
+ .attr = { \
+ .name = __stringify(_name), \
+ .mode = 0644, \
+ .class = SYSFS_ATTR_PM_CONTROL, \
+ }, \
+ .show = _name##_show, \
+ .store = _name##_store, \
}

/* Preferred image size in bytes (default 500 MB) */

Cong Wang

Feb 5, 2010, 1:40:01 AM

Oops! I missed one part, please ignore this patch...

Amerigo Wang

Feb 5, 2010, 1:50:02 AM

1. http://lkml.org/lkml/2010/1/10/282

---
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 699f371..d7de269 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -354,7 +354,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)

atomic_set(&sd->s_count, 1);
atomic_set(&sd->s_active, 0);
- sysfs_dirent_init_lockdep(sd);

sd->s_name = name;
sd->s_mode = mode;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index dc30d9e..97e397a 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -24,6 +24,8 @@

#include "sysfs.h"

+static struct lock_class_key sysfs_classes[SYSFS_NR_CLASSES];
+
/* used in crash dumps to help with debugging */
static char last_sysfs_file[PATH_MAX];
void sysfs_printk_last_file(void)
@@ -504,11 +506,16 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
struct sysfs_addrm_cxt acxt;
struct sysfs_dirent *sd;
int rc;
+ int class;

sd = sysfs_new_dirent(attr->name, mode, type);
if (!sd)
return -ENOMEM;
sd->s_attr.attr = (void *)attr;
+ class = SYSFS_ATTR_NORMAL;
+ if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+ class = sd->s_attr.attr->class;
+ lockdep_set_class_and_name(sd, &sysfs_classes[class], "s_active");

sysfs_addrm_start(&acxt, dir_sd);
rc = sysfs_add_one(&acxt, sd);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cdd9377..dde4d73 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -88,17 +88,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
return sd->s_flags & SYSFS_TYPE_MASK;
}

-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define sysfs_dirent_init_lockdep(sd) \
-do { \
- static struct lock_class_key __key; \
- \
- lockdep_init_map(&sd->dep_map, "s_active", &__key, 0); \
-} while(0)
-#else
-#define sysfs_dirent_init_lockdep(sd) do {} while(0)
-#endif
-
/*
* Context structure to be used while adding/removing nodes.
*/

Xiaotian Feng

Feb 5, 2010, 2:20:02 AM
On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang <amw...@redhat.com> wrote:
> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> As reported by several people, it is something like:
>
> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
> [ 6967.956156] Disabling non-boot CPUs ...
> [ 6967.970401]
> [ 6967.970408] =============================================
> [ 6967.970419] [ INFO: possible recursive locking detected ]
> [ 6967.970431] 2.6.33-rc2-git6 #27
> [ 6967.970439] ---------------------------------------------
> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
> sysfs_hash_and_remove+0x3d/0x4f
> [ 6967.970493]
> [ 6967.970497] but task is already holding lock:
> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
> sysfs_get_active_two+0x16/0x36
> [...]
>
> Eric already provides a patch for this[1], but it still can't fix the
> problem. Based on his work and Peter's suggestion, I write this patch,
> hopefully we can fix the warning completely.
>
> This patch put sysfs s_active into two classes, one is for PM, the other
> is for the rest, so lockdep will distinguish them.

I think this patch does not address the root cause; we have a similar
warning that is not related to PM.
It was reported by Nick when he was trying to switch the elevator. It is
reproducible with "echo deadline >/sys/block/sdx/queue/scheduler"
while the kernel is using cfq.

[ INFO: possible recursive locking detected ]

2.6.33-rc6 #1
---------------------------------------------
sh/889 is trying to acquire lock:
(s_active){++++.+}, at: [<7820a975>] sysfs_addrm_finish+0x27/0x4e

but task is already holding lock:

(s_active){++++.+}, at: [<7820ab82>] sysfs_get_active_two+0x18/0x3e

other info that might help us debug this:
4 locks held by sh/889:
#0: (&buffer->mutex){+.+.+.}, at: [<7820984e>] sysfs_write_file+0x20/0x99
#1: (s_active){++++.+}, at: [<7820ab82>] sysfs_get_active_two+0x18/0x3e
#2: (s_active){++++.+}, at: [<7820ab91>] sysfs_get_active_two+0x27/0x3e
#3: (&q->sysfs_lock){+.+.+.}, at: [<78289e95>] queue_attr_store+0x2e/0x68

stack backtrace:
Pid: 889, comm: sh Not tainted 2.6.33-rc6 #1
Call Trace:
[<784a6966>] ? printk+0xf/0x11
[<781752a1>] print_deadlock_bug+0x99/0xa3
[<781753c6>] check_deadlock+0x11b/0x140
[<781763e5>] validate_chain+0x4ec/0x4f9
[<78176a68>] __lock_acquire+0x676/0x6cf
[<78176b64>] lock_acquire+0xa3/0xbc
[<7820a975>] ? sysfs_addrm_finish+0x27/0x4e
[<7820a37a>] sysfs_deactivate+0x6c/0xa4
[<7820a975>] ? sysfs_addrm_finish+0x27/0x4e
[<7820a975>] sysfs_addrm_finish+0x27/0x4e
[<7820aa3a>] sysfs_remove_dir+0x62/0x72
[<7829d6dd>] kobject_del+0x11/0x32
[<78283406>] __elv_unregister_queue+0x18/0x20
[<78283c66>] elevator_switch+0x6d/0x11b
[<78283d92>] elv_iosched_store+0x7e/0x9b
[<78289eb8>] queue_attr_store+0x51/0x68
[<78209894>] sysfs_write_file+0x66/0x99
[<781cd460>] vfs_write+0x8a/0x108
[<781cd578>] sys_write+0x3c/0x63
[<78125b90>] sysenter_do_call+0x12/0x36

Cong Wang

Feb 5, 2010, 2:30:03 AM


Well, the four reports that I got are all PM-related;
this one is new to me.

I think adding another class for io_scheduler would fix this.

Thanks.


Eric W. Biederman

Feb 5, 2010, 4:00:02 AM
Amerigo Wang <amw...@redhat.com> writes:

> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> As reported by several people, it is something like:
>
> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
> [ 6967.956156] Disabling non-boot CPUs ...
> [ 6967.970401]
> [ 6967.970408] =============================================
> [ 6967.970419] [ INFO: possible recursive locking detected ]
> [ 6967.970431] 2.6.33-rc2-git6 #27
> [ 6967.970439] ---------------------------------------------
> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
> [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>]
> sysfs_hash_and_remove+0x3d/0x4f
> [ 6967.970493]
> [ 6967.970497] but task is already holding lock:
> [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>]
> sysfs_get_active_two+0x16/0x36
> [...]
>
> Eric already provides a patch for this[1], but it still can't fix the
> problem. Based on his work and Peter's suggestion, I write this patch,
> hopefully we can fix the warning completely.
>
> This patch put sysfs s_active into two classes, one is for PM, the other
> is for the rest, so lockdep will distinguish them.
>
> 1. http://lkml.org/lkml/2010/1/10/282

What testing has this patch seen?

In particular does this work to actually clear up the pm case?

Eric

Eric W. Biederman

Feb 5, 2010, 4:10:02 AM
Amerigo Wang <amw...@redhat.com> writes:

> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> As reported by several people, it is something like:

The interesting case is that cpu hotplug is actually a problem. It isn't
useful to get complaints about the non-problematic code paths triggered
by pm. However, my earlier review spotted a real deadlock case, where
in one of the sysfs attributes we iterate over the list of online cpus,
and that appeared to be an attribute that removing a cpu would remove
from sysfs...

Eric

Cong Wang

Feb 5, 2010, 4:30:01 AM

Sorry, it seems that my machine doesn't support s2ram; I am still
trying to get it working...

I hope the reporters of this bug can help test this patch.

Thanks.

Eric W. Biederman

Feb 5, 2010, 4:40:01 AM
Xiaotian Feng <xtf...@gmail.com> writes:

The root cause is that our locking is crazy complicated. No lockdep
changes are going to fix that.

What we can do, and what the patch does, is teach lockdep to treat some
of the sysfs files as a different group (subclass) from other sysfs
files, which keeps us from overgeneralizing too much and gives us
a better signal-to-noise ratio.

As far as the block device problem goes, I can't easily say that
the block layer is correct. I expect it is, because changing
the scheduler is unlikely to delete block devices. If the block layer
has bugs, then adding another subclass as Amerigo suggests should simply
make lockdep warnings harder to trigger and more accurate, so that
sounds like a path worth walking.

In general I recommend that pieces of code that need to do a lot of
work in a sysfs attribute consider using a work queue or a kernel
thread, as that can be easier to analyze.

Eric
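
[Editorial illustration] Eric's recommendation to move heavy work out of a sysfs attribute handler can be pictured with a small userspace model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the "handler" flag merely stands in for holding the attribute's s_active reference, and nothing here uses the kernel workqueue API. The point it tries to show is that teardown queued from the handler runs only after the handler has returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WORK 4

typedef void (*toy_work_fn)(void *arg);

/* One deferred item: a callback plus its argument (hypothetical type). */
struct toy_work {
    toy_work_fn fn;
    void *arg;
};

/* A tiny single-threaded queue standing in for a work queue. */
struct toy_queue {
    struct toy_work items[TOY_MAX_WORK];
    size_t count;
};

struct toy_state {
    bool in_handler;          /* models "the attribute's s_active is held" */
    bool removed;             /* teardown has run */
    bool removed_in_handler;  /* teardown ran while the handler was active */
};

/* Record work to be run later instead of running it now. */
static void toy_queue_work(struct toy_queue *q, toy_work_fn fn, void *arg)
{
    assert(q->count < TOY_MAX_WORK);
    q->items[q->count].fn = fn;
    q->items[q->count].arg = arg;
    q->count++;
}

/* Run everything that was queued, then empty the queue. */
static void toy_run_queue(struct toy_queue *q)
{
    for (size_t i = 0; i < q->count; i++)
        q->items[i].fn(q->items[i].arg);
    q->count = 0;
}

/* The "heavy" part: removing other nodes. */
static void toy_teardown(void *arg)
{
    struct toy_state *s = arg;

    s->removed = true;
    if (s->in_handler)
        s->removed_in_handler = true;
}

/* The store handler only queues the teardown and returns. */
static void toy_store(struct toy_queue *q, struct toy_state *s)
{
    s->in_handler = true;
    toy_queue_work(q, toy_teardown, s);
    s->in_handler = false;
}
```

A caller would invoke toy_store() and later toy_run_queue(); in this model the teardown never observes in_handler set, which is the property that makes the deferred variant easier to analyze. The cost is that the removal is no longer complete when the write returns.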

Cong Wang

Feb 5, 2010, 4:40:02 AM
Eric W. Biederman wrote:
> Amerigo Wang <amw...@redhat.com> writes:
>
>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>> As reported by several people, it is something like:
>
> The interesting case is the cpu hotplug is actually a problem. It isn't
> useful to get a complaint about the non-problems code paths triggered
> by pm. However my earlier review spotted a real deadlock case. Where
> in one of the sysfs attributes we iterate over the list of online cpus
> and that appeared to an attribute that removing a cpu would remove
> from sysfs...
>

Are you referring to this one:
http://marc.info/?l=linux-kernel&m=126474021428905&w=2
?

Hmm, yeah, I missed that the lock it is holding is not s_active, so it
is not the case we are trying to fix here.

Thanks for pointing this out.

Cong Wang

Feb 5, 2010, 4:50:02 AM

Cc'ing Jens Axboe.

Eric W. Biederman

Feb 5, 2010, 5:10:02 AM
Cong Wang <amw...@redhat.com> writes:

> Eric W. Biederman wrote:
>> Amerigo Wang <amw...@redhat.com> writes:
>>
>>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>>> As reported by several people, it is something like:
>>
>> The interesting case is the cpu hotplug is actually a problem. It isn't
>> useful to get a complaint about the non-problems code paths triggered
>> by pm. However my earlier review spotted a real deadlock case. Where
>> in one of the sysfs attributes we iterate over the list of online cpus
>> and that appeared to an attribute that removing a cpu would remove
>> from sysfs...
>>
>
> You are referring this one:
> http://marc.info/?l=linux-kernel&m=126474021428905&w=2
> ?
>
> Hmm, yeah, I missed that the lock it is holding is not s_active, thus so
> not be the case we are trying to fix here.

Right, that error appears to be legitimate. I think that is actually a
different cpu hotplug problem than the one I spotted by audit.
Spotting weird problems like that is why it is worth it for me to go
to the trouble of adding lockdep annotations to sysfs, and why it is
worth debugging them.

Hopefully I will have enough energy in the next little while to help
cut down on the false positive rate.

Eric

Xiaotian Feng

Feb 5, 2010, 5:10:02 AM

PM case
store /sys/devices/system/cpu/cpu1/online
remove /sys/devices/system/cpu/cpu1/cache/

iosched case
store /sys/block/sdx/queue/scheduler
remove /sys/block/sdx/queue/iosched/

So it looks like this comes from the sysfs layer ...
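
[Editorial illustration] Both cases listed above have the same shape: a store handler runs while holding s_active on one attribute and then removes a different node, whose s_active belongs to the same lockdep class. The following userspace model is a simplified sketch of that idea, not lockdep itself: all toy_* names are hypothetical, the class names only mirror the enum proposed in the patch, and real lockdep also accounts for nesting annotations and read acquisitions that are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classes, mirroring the enum proposed in the patch. */
enum toy_class {
    TOY_NORMAL,
    TOY_PM_CONTROL,
    TOY_IOSCHED,
    TOY_NR_CLASSES
};

#define TOY_MAX_HELD 8

/* Per-task record of the lock classes currently held. */
struct toy_task {
    enum toy_class held[TOY_MAX_HELD];
    size_t depth;
};

/* True if the task already holds a lock of class cls, which is the
 * situation this model treats as "possible recursive locking". */
static bool toy_would_warn(const struct toy_task *t, enum toy_class cls)
{
    for (size_t i = 0; i < t->depth; i++)
        if (t->held[i] == cls)
            return true;
    return false;
}

/* Record that the task now holds a lock of class cls. */
static void toy_acquire(struct toy_task *t, enum toy_class cls)
{
    assert(t->depth < TOY_MAX_HELD);
    t->held[t->depth++] = cls;
}

/* Model "store on one attribute, then remove another node":
 * returns true if the removal would be reported. */
static bool toy_store_then_remove(enum toy_class store_cls,
                                  enum toy_class removed_cls)
{
    struct toy_task t = { .depth = 0 };

    toy_acquire(&t, store_cls);              /* held for the store handler */
    return toy_would_warn(&t, removed_cls);  /* removal takes this class */
}
```

With a single class, toy_store_then_remove(TOY_NORMAL, TOY_NORMAL) is reported; once the attribute being written sits in its own class, as in toy_store_then_remove(TOY_PM_CONTROL, TOY_NORMAL), the model stays silent. Note that this only changes what is reported. It says nothing about whether the underlying locking is actually safe, which is the concern raised earlier in the thread.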

Cong Wang

Feb 7, 2010, 10:20:02 PM

Right, and both locks are s_active, so I think they are the
same problem, but I haven't checked the iosched case carefully. ;)

Amerigo Wang

Feb 8, 2010, 5:00:01 AM
We recently hit a lockdep warning from sysfs during s2ram.

As several people have reported, it looks something like this:

[ 6967.926563] ACPI: Preparing to enter system sleep state S3
[ 6967.956156] Disabling non-boot CPUs ...
[ 6967.970401]
[ 6967.970408] =============================================
[ 6967.970419] [ INFO: possible recursive locking detected ]
[ 6967.970431] 2.6.33-rc2-git6 #27
[ 6967.970439] ---------------------------------------------
[ 6967.970450] pm-suspend/22147 is trying to acquire lock:
[ 6967.970460] (s_active){++++.+}, at: [<c10d2941>]
sysfs_hash_and_remove+0x3d/0x4f
[ 6967.970493]
[ 6967.970497] but task is already holding lock:
[ 6967.970506] (s_active){++++.+}, at: [<c10d4110>]
sysfs_get_active_two+0x16/0x36
[...]

Eric already provided a patch for this[1], but it does not fully fix the
problem. Based on his work and Peter's suggestion, I wrote this patch;
hopefully it fixes the warning completely.

This patch puts sysfs s_active into two classes, one for PM and the other
for everything else, so that lockdep can distinguish them.

Still, using a workqueue to do the cleanup work is another option,
as Eric pointed out. I am not sure whether it is better than this
approach; that depends on whether we want to eliminate all the similar
cases that hold the same class of locks, or just this one case. Please
comment.

I tested this patch, and it fixes the problem.

1. http://lkml.org/lkml/2010/1/10/282


Reported-by: Larry Finger <Larry....@lwfinger.net>
Reported-by: Miles Lane <miles...@gmail.com>
Reported-by: Heiko Carstens <heiko.c...@de.ibm.com>
Signed-off-by: WANG Cong <amw...@redhat.com>
Cc: Eric W. Biederman <ebie...@xmission.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Tejun Heo <t...@kernel.org>
Cc: Greg Kroah-Hartman <gre...@suse.de>

---
 fs/sysfs/dir.c        |    1 -
 fs/sysfs/file.c       |    7 +++++++
 fs/sysfs/sysfs.h      |   11 -----------
 include/linux/sysfs.h |    7 +++++++
 kernel/power/power.h  |   15 ++++++++-------
 5 files changed, 22 insertions(+), 19 deletions(-)

1.5.5.6

Amerigo Wang

Feb 8, 2010, 5:00:02 AM

Similar to the previous PM case, in iosched we hold an s_active
lock to store "scheduler", and meanwhile we want to remove the
"iosched/*" files.

This patch depends on the previous one. I tested it on my machine,
and it fixes the problem.

Reported-by: Hugh Dickins <hugh.d...@tiscali.co.uk>
Signed-off-by: WANG Cong <amw...@redhat.com>
Cc: Jens Axboe <jens....@oracle.com>

---
block/blk-sysfs.c | 120 +++++++++++++++----------------------------------
include/linux/sysfs.h | 1 +
2 files changed, 38 insertions(+), 83 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 8606c95..f863d4d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -6,6 +6,7 @@
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/blktrace_api.h>
+#include <linux/sysfs.h>

#include "blk.h"

@@ -254,105 +255,58 @@ static ssize_t queue_iostats_store(struct request_queue *q, const char *page,
return ret;
}

-static struct queue_sysfs_entry queue_requests_entry = {
- .attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
- .show = queue_requests_show,
- .store = queue_requests_store,
-};
-
-static struct queue_sysfs_entry queue_ra_entry = {
- .attr = {.name = "read_ahead_kb", .mode = S_IRUGO | S_IWUSR },
- .show = queue_ra_show,
- .store = queue_ra_store,
-};
+#define queue_sysfs_rw_attr(_name, _filename) \
+static struct queue_sysfs_entry _name##_entry = { \
+ .attr = { \
+ .name = _filename, \
+ .mode = S_IRUGO | S_IWUSR, \
+ .class = SYSFS_ATTR_IOSCHED, \
+ }, \
+ .show = _name##_show, \
+ .store = _name##_store, \
+}

-static struct queue_sysfs_entry queue_max_sectors_entry = {
- .attr = {.name = "max_sectors_kb", .mode = S_IRUGO | S_IWUSR },
- .show = queue_max_sectors_show,
- .store = queue_max_sectors_store,
-};
+#define queue_sysfs_ro_attr(_name, _filename) \
+static struct queue_sysfs_entry _name##_entry = { \
+ .attr = { \
+ .name = _filename, \
+ .mode = S_IRUGO, \
+ .class = SYSFS_ATTR_IOSCHED, \
+ }, \
+ .show = _name##_show, \
+}

-static struct queue_sysfs_entry queue_max_hw_sectors_entry = {
- .attr = {.name = "max_hw_sectors_kb", .mode = S_IRUGO },
- .show = queue_max_hw_sectors_show,
-};

-static struct queue_sysfs_entry queue_iosched_entry = {
- .attr = {.name = "scheduler", .mode = S_IRUGO | S_IWUSR },
- .show = elv_iosched_show,
- .store = elv_iosched_store,
-};
+queue_sysfs_rw_attr(queue_requests, "nr_requests");
+queue_sysfs_rw_attr(queue_ra, "read_ahead_kb");
+queue_sysfs_rw_attr(queue_max_sectors, "max_sectors_kb");
+queue_sysfs_ro_attr(queue_max_hw_sectors, "max_hw_sectors_kb");
+queue_sysfs_rw_attr(elv_iosched, "scheduler");
+queue_sysfs_ro_attr(queue_logical_block_size, "logical_block_size");

static struct queue_sysfs_entry queue_hw_sector_size_entry = {
.attr = {.name = "hw_sector_size", .mode = S_IRUGO },
.show = queue_logical_block_size_show,
};

-static struct queue_sysfs_entry queue_logical_block_size_entry = {
- .attr = {.name = "logical_block_size", .mode = S_IRUGO },
- .show = queue_logical_block_size_show,
-};
-
-static struct queue_sysfs_entry queue_physical_block_size_entry = {
- .attr = {.name = "physical_block_size", .mode = S_IRUGO },
- .show = queue_physical_block_size_show,
-};
+queue_sysfs_ro_attr(queue_physical_block_size, "physical_block_size");
+queue_sysfs_ro_attr(queue_io_min, "minimum_io_size");
+queue_sysfs_ro_attr(queue_io_opt, "optimal_io_size");
+queue_sysfs_ro_attr(queue_discard_granularity, "discard_granularity");
+queue_sysfs_ro_attr(queue_discard_max, "discard_max_bytes");
+queue_sysfs_ro_attr(queue_discard_zeroes_data, "discard_zeroes_data");

-static struct queue_sysfs_entry queue_io_min_entry = {
- .attr = {.name = "minimum_io_size", .mode = S_IRUGO },
- .show = queue_io_min_show,
-};
-
-static struct queue_sysfs_entry queue_io_opt_entry = {
- .attr = {.name = "optimal_io_size", .mode = S_IRUGO },
- .show = queue_io_opt_show,
-};
-
-static struct queue_sysfs_entry queue_discard_granularity_entry = {
- .attr = {.name = "discard_granularity", .mode = S_IRUGO },
- .show = queue_discard_granularity_show,
-};
-
-static struct queue_sysfs_entry queue_discard_max_entry = {
- .attr = {.name = "discard_max_bytes", .mode = S_IRUGO },
- .show = queue_discard_max_show,
-};
-
-static struct queue_sysfs_entry queue_discard_zeroes_data_entry = {
- .attr = {.name = "discard_zeroes_data", .mode = S_IRUGO },
- .show = queue_discard_zeroes_data_show,
-};
-
-static struct queue_sysfs_entry queue_nonrot_entry = {
- .attr = {.name = "rotational", .mode = S_IRUGO | S_IWUSR },
- .show = queue_nonrot_show,
- .store = queue_nonrot_store,
-};
-
-static struct queue_sysfs_entry queue_nomerges_entry = {
- .attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
- .show = queue_nomerges_show,
- .store = queue_nomerges_store,
-};
-
-static struct queue_sysfs_entry queue_rq_affinity_entry = {
- .attr = {.name = "rq_affinity", .mode = S_IRUGO | S_IWUSR },
- .show = queue_rq_affinity_show,
- .store = queue_rq_affinity_store,
-};
-
-static struct queue_sysfs_entry queue_iostats_entry = {
- .attr = {.name = "iostats", .mode = S_IRUGO | S_IWUSR },
- .show = queue_iostats_show,
- .store = queue_iostats_store,
-};
+queue_sysfs_rw_attr(queue_nonrot, "rotational");
+queue_sysfs_rw_attr(queue_nomerges, "nomerges");
+queue_sysfs_rw_attr(queue_rq_affinity, "rq_affinity");
+queue_sysfs_rw_attr(queue_iostats, "iostats");

static struct attribute *default_attrs[] = {
&queue_requests_entry.attr,
&queue_ra_entry.attr,
&queue_max_hw_sectors_entry.attr,
&queue_max_sectors_entry.attr,
- &queue_iosched_entry.attr,
+ &elv_iosched_entry.attr,
&queue_hw_sector_size_entry.attr,
&queue_logical_block_size_entry.attr,
&queue_physical_block_size_entry.attr,
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 2b91b74..3a91008 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -23,6 +23,7 @@ struct module;
enum sysfs_attr_lock_class {
SYSFS_ATTR_NORMAL,
SYSFS_ATTR_PM_CONTROL,
+ SYSFS_ATTR_IOSCHED,
SYSFS_NR_CLASSES,
};

Larry Finger

Feb 8, 2010, 4:00:02 PM
On 02/08/2010 03:52 AM, Amerigo Wang wrote:
> Similar to the previous PM case, in iosched, we hold an s_active
> lock to store "scheduler", meanwhile we want to remove "iosched/*"
> files.
>
> This patch depends on the previous one. I tested it on my machine,
> it fixes the problem.
>
> Reported-by: Hugh Dickins <hugh.d...@tiscali.co.uk>
> Signed-off-by: WANG Cong <amw...@redhat.com>
> Cc: Jens Axboe <jens....@oracle.com>

After applying the 2 patches to 2.6.33-rc7, I get the following:

ACPI: bus type pci registered
PCI: MMCONFIG for domain 0000 [bus 00-09] at [mem 0xe0000000-0xe09fffff] (base
0xe0000000)
PCI: MMCONFIG at [mem 0xe0000000-0xe09fffff] reserved in E820
PCI: Using configuration type 1 for base access
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 1, comm: swapper Not tainted 2.6.33-rc7-Linus-00010-g6339204-dirty #181
Call Trace:
[<ffffffff8107c6e6>] __lock_acquire+0xf86/0x1d30
[<ffffffff81078e7f>] ? lockdep_init_map+0x5f/0x5d0
[<ffffffff8107d52b>] lock_acquire+0x9b/0x120
[<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
[<ffffffff81167243>] sysfs_deactivate+0xc3/0x110
[<ffffffff81167a93>] ? sysfs_addrm_finish+0x43/0x70
[<ffffffff813124d3>] ? mutex_lock_nested+0x243/0x300
[<ffffffff81167a93>] sysfs_addrm_finish+0x43/0x70
[<ffffffff81167af6>] remove_dir+0x36/0x40
[<ffffffff81167b09>] sysfs_remove_subdir+0x9/0x10
[<ffffffff81168ff6>] sysfs_remove_group+0x66/0xf0
[<ffffffff81861555>] param_sysfs_init+0x102/0x277
[<ffffffff8124a5bd>] ? sysdev_create_file+0xd/0x10
[<ffffffff8130fe46>] ? register_cpu+0xa3/0xa5
[<ffffffff81861453>] ? param_sysfs_init+0x0/0x277
[<ffffffff810001d7>] do_one_initcall+0x37/0x190
[<ffffffff8184c6d0>] kernel_init+0x14f/0x1a5
[<ffffffff81003bd4>] kernel_thread_helper+0x4/0x10
[<ffffffff8131417c>] ? restore_args+0x0/0x30
[<ffffffff8184c581>] ? kernel_init+0x0/0x1a5
[<ffffffff81003bd0>] ? kernel_thread_helper+0x0/0x10

This dump does not occur with standard 2.6.33-rc7. As the above turns off the
locking correctness validator, I cannot really test to see what happens when
suspending.

Larry

Cong Wang

Feb 8, 2010, 10:00:01 PM

Ouch! I forgot to add the annotations to sysfs dirs...

Thanks much for the report, I will send an updated version soon!
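
[Editorial illustration] The failure Larry reported fits the diff posted earlier in the thread: the revised patch drops the sysfs_dirent_init_lockdep() call from sysfs_new_dirent() and assigns a class only in sysfs_add_file_mode(), so directory entries, such as the one removed via remove_dir in the trace, reach the lock-acquire path with no class set. The userspace model below is a sketch of that gap. All toy_* names are hypothetical, and toy_add_dir_annotated() is just one assumed way of closing the gap, not the updated patch the author promised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_type { TOY_DIR, TOY_FILE };

struct toy_dirent {
    enum toy_type type;
    const void *class_key;  /* NULL until a lock class is assigned */
};

/* Static objects whose addresses serve as class keys in this model. */
static char toy_file_key;
static char toy_dir_key;

/* Models dirent creation after the patch: no class is set here. */
static struct toy_dirent toy_new_dirent(enum toy_type type)
{
    struct toy_dirent sd = { .type = type, .class_key = NULL };

    return sd;
}

/* File path as posted: the class is assigned when the file is added. */
static struct toy_dirent toy_add_file(void)
{
    struct toy_dirent sd = toy_new_dirent(TOY_FILE);

    sd.class_key = &toy_file_key;
    return sd;
}

/* Directory path as posted: nothing assigns a class. */
static struct toy_dirent toy_add_dir_as_posted(void)
{
    return toy_new_dirent(TOY_DIR);
}

/* Assumed fix: directories get a class of their own at creation. */
static struct toy_dirent toy_add_dir_annotated(void)
{
    struct toy_dirent sd = toy_new_dirent(TOY_DIR);

    sd.class_key = &toy_dir_key;
    return sd;
}

/* True if the entry has a class and can be tracked by the model. */
static bool toy_key_registered(const struct toy_dirent *sd)
{
    assert(sd != NULL);
    return sd->class_key != NULL;
}
```

In this model toy_add_dir_as_posted() yields an entry that fails toy_key_registered(), which corresponds to the validator giving up at boot, while files and annotated directories pass.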
