
[PATCH v2] fs/dcache.c: avoid soft-lockup in dput()


Wei Fang

Jun 21, 2016, 11:10:06 PM
We triggered a soft lockup under a stress test that
opens/accesses/writes/closes one file concurrently on more than
five different CPUs:

WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
...
[<ffffffc0003986f8>] dput+0x100/0x298
[<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
[<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
[<ffffffc00038f780>] filename_lookup+0x38/0xf0
[<ffffffc000391180>] user_path_at_empty+0x78/0xd0
[<ffffffc0003911f4>] user_path_at+0x1c/0x28
[<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230

The ->d_lock trylock may fail many times because of concurrent
operations, so dput() may run for a long time.

Fix this by replacing cpu_relax() with cond_resched().
dput() used to be sleepable, so making it sleepable again
should be safe.

Cc: <sta...@vger.kernel.org>
Signed-off-by: Wei Fang <fang...@huawei.com>
---
Changes v1->v2:
- add might_sleep() to annotate that dput() can sleep

fs/dcache.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index d5ecc6e..074fc1c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)

 failed:
 	spin_unlock(&dentry->d_lock);
-	cpu_relax();
+	cond_resched();
 	return dentry; /* try again with same dentry */
 }

@@ -752,6 +752,8 @@ void dput(struct dentry *dentry)
 		return;

 repeat:
+	might_sleep();
+
 	rcu_read_lock();
 	if (likely(fast_dput(dentry))) {
 		rcu_read_unlock();
--
1.7.1
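
The retry pattern this patch touches can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct object`, `try_put()` and `put_with_yield()` are hypothetical names, a C11 `atomic_flag` stands in for ->d_lock, and `sched_yield()` is only a rough analogue of cond_resched(), which reschedules only when the scheduler has requested it.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a dentry: a try-lock flag plus a refcount. */
struct object {
	atomic_flag lock;
	int refcount;
};

/*
 * Make one attempt to drop a reference.
 * Returns 0 on success, -1 if the lock was contended and the caller
 * should try again.
 */
static int try_put(struct object *obj)
{
	/* test_and_set returns the previous value: true means "already held". */
	if (atomic_flag_test_and_set_explicit(&obj->lock, memory_order_acquire))
		return -1;

	assert(obj->refcount > 0);
	obj->refcount--;

	atomic_flag_clear_explicit(&obj->lock, memory_order_release);
	return 0;
}

/*
 * Retry until the reference has been dropped, giving up the CPU between
 * attempts instead of spinning. Returns the number of retries needed.
 */
static unsigned long put_with_yield(struct object *obj)
{
	unsigned long retries = 0;

	while (try_put(obj) != 0) {
		sched_yield();	/* assumption: user-space analogue of cond_resched() */
		retries++;
	}
	return retries;
}
```

With a busy-wait hint in place of the yield, a thread that keeps losing the trylock race would occupy its CPU for as long as the contention lasts, which is the situation the soft-lockup warning above reports.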

Boqun Feng

Jun 22, 2016, 2:50:05 AM
Hi Wei Fang,
Would it be better to put the cond_resched() in the caller (i.e. dput()),
right before "goto repeat"? That is obviously a loop, which makes
the purpose of the cond_resched() clearer.

Regards,
Boqun

Wei Fang

Jul 5, 2016, 10:40:06 PM
Hi, Boqun,

>> diff --git a/fs/dcache.c b/fs/dcache.c
>> index d5ecc6e..074fc1c 100644
>> --- a/fs/dcache.c
>> +++ b/fs/dcache.c
>> @@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)
>>
>>  failed:
>>  	spin_unlock(&dentry->d_lock);
>> -	cpu_relax();
>> +	cond_resched();
>
> Would it be better to put the cond_resched() in the caller (i.e. dput()),
> right before "goto repeat"? That is obviously a loop, which makes
> the purpose of the cond_resched() clearer.

Agreed, that's more reasonable. I'll send v3 soon.

Thanks,
Wei

Wei Fang

Jul 5, 2016, 11:30:06 PM
We triggered a soft lockup under a stress test that
opens/accesses/writes/closes one file concurrently on more than
five different CPUs:

WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
...
[<ffffffc0003986f8>] dput+0x100/0x298
[<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
[<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
[<ffffffc00038f780>] filename_lookup+0x38/0xf0
[<ffffffc000391180>] user_path_at_empty+0x78/0xd0
[<ffffffc0003911f4>] user_path_at+0x1c/0x28
[<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230

The ->d_lock trylock may fail many times because of concurrent
operations, so dput() may run for a long time.

Fix this by replacing cpu_relax() with cond_resched().
dput() used to be sleepable, so making it sleepable again
should be safe.

Cc: <sta...@vger.kernel.org>
Signed-off-by: Wei Fang <fang...@huawei.com>
---
Changes v1->v2:
- add might_sleep() to annotate that dput() can sleep
Changes v2->v3:
- put cond_resched() in dput()

fs/dcache.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index ad4a542..77f6687 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -589,7 +589,6 @@ static struct dentry *dentry_kill(struct dentry *dentry)

 failed:
 	spin_unlock(&dentry->d_lock);
-	cpu_relax();
 	return dentry; /* try again with same dentry */
 }

@@ -763,6 +762,8 @@ void dput(struct dentry *dentry)
 		return;

 repeat:
+	might_sleep();
+
 	rcu_read_lock();
 	if (likely(fast_dput(dentry))) {
 		rcu_read_unlock();
@@ -796,8 +797,10 @@ repeat:

 kill_it:
 	dentry = dentry_kill(dentry);
-	if (dentry)
+	if (dentry) {
+		cond_resched();
 		goto repeat;
+	}
 }
 EXPORT_SYMBOL(dput);

--
1.7.1

Vaishali Thakkar

Sep 16, 2016, 4:00:05 AM


On Wednesday 22 June 2016 08:31 AM, Wei Fang wrote:
> We triggered a soft lockup under a stress test that
> opens/accesses/writes/closes one file concurrently on more than
> five different CPUs:
>
> WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
> ...
> [<ffffffc0003986f8>] dput+0x100/0x298
> [<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
> [<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
> [<ffffffc00038f780>] filename_lookup+0x38/0xf0
> [<ffffffc000391180>] user_path_at_empty+0x78/0xd0
> [<ffffffc0003911f4>] user_path_at+0x1c/0x28
> [<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230
>
> The ->d_lock trylock may fail many times because of concurrent
> operations, so dput() may run for a long time.
>
> Fix this by replacing cpu_relax() with cond_resched().
> dput() used to be sleepable, so making it sleepable again
> should be safe.

Hi,

Just a question regarding this change. Since dput() is
sleepable after this change, is it still safe to use it under
the spinlock in the function d_prune_aliases()?

Thanks
Vaishali

Al Viro

Sep 16, 2016, 8:20:04 AM
On Fri, Sep 16, 2016 at 01:19:19PM +0530, Vaishali Thakkar wrote:

> Hi,
>
> Just a question regarding this change. Since dput() is
> sleepable after this change, is it still safe to use it under
> the spinlock in the function d_prune_aliases()?

It has always been sleepable and it wouldn't have been safe to use
under spinlocks. Which d_prune_aliases() does not do - __dentry_kill()
is called with dentry, its parent and its inode (if present) all locked and
it drops all those locks before returning.
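
The rule in the reply above (a sleepable function must not be entered with a spinlock held, and a helper entered with locks held may release them before it returns) can be sketched in user space. This is only an illustration under stated assumptions: every name below is hypothetical, the "lock" is a plain per-thread counter, and `check_might_sleep()` merely counts violations, whereas the kernel's might_sleep() prints a warning on kernels built with CONFIG_DEBUG_ATOMIC_SLEEP.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread count of spin-style locks currently held. */
static _Thread_local int locks_held;

/* Number of times a sleepable function was entered with a lock held. */
static int violations;

static void fake_spin_lock(void)
{
	locks_held++;
}

static void fake_spin_unlock(void)
{
	assert(locks_held > 0);
	locks_held--;
}

/* Stand-in for might_sleep(): record a violation instead of warning. */
static bool check_might_sleep(void)
{
	if (locks_held > 0) {
		violations++;
		return false;
	}
	return true;
}

/* Stand-in for a sleepable release function such as dput(). */
static void put_object(void)
{
	check_might_sleep();
	/* ... drop the reference; a real implementation may block here ... */
}

/*
 * Stand-in for a teardown helper that is entered with the lock held and
 * releases it before returning, so the caller is no longer in an atomic
 * section afterwards.
 */
static void kill_locked(void)
{
	/* ... teardown work that needs the lock ... */
	fake_spin_unlock();
}
```

In this sketch, calling `put_object()` after `kill_locked()` records no violation because the lock has already been released, while calling it between `fake_spin_lock()` and `fake_spin_unlock()` does.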

Vaishali Thakkar

Sep 16, 2016, 9:00:06 AM
Ah, I see. Alright. Thanks for the clarification.


--
Vaishali