Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Bug#898060: Stall NFS clients: RPC request reserved 84 but used 272

913 views
Skip to first unread message

Harald Dunkel

unread,
Jun 6, 2018, 6:40:02 AM6/6/18
to
There were tons of messages, then the network went down.
kernel.log:

Jun 5 17:33:39 nasl006a kernel: [ 4624.059640] RPC request reserved 84 but used 268
Jun 5 17:33:39 nasl006a kernel: [ 4624.060169] RPC request reserved 84 but used 268
Jun 5 17:33:39 nasl006a kernel: [ 4624.060702] RPC request reserved 84 but used 268
Jun 5 17:33:39 nasl006a kernel: [ 4624.061229] RPC request reserved 84 but used 268
Jun 5 17:33:39 nasl006a kernel: [ 4661.387897] INFO: rcu_sched self-detected stall on CPU
Jun 5 17:33:39 nasl006a kernel: [ 4661.387903] 0-...: (1264 ticks this GP) idle=c9d/140000000000001/0 softirq=252509/252509 fqs=1264
Jun 5 17:33:39 nasl006a kernel: [ 4661.387907] (t=21032 jiffies g=419791 c=419790 q=374419)
Jun 5 17:33:39 nasl006a kernel: [ 4661.387909] Task dump for CPU 0:
Jun 5 17:33:39 nasl006a kernel: [ 4661.387913] nfsd R running task 0 3542 2 0x00000008
Jun 5 17:33:39 nasl006a kernel: [ 4661.387919] ffffffff91b18e00 ffffffff90ea5c7b 0000000000000000 ffffffff91b18e00
Jun 5 17:33:39 nasl006a kernel: [ 4661.387923] ffffffff90f7e95b ffff8ab52f4196c0 ffffffff91a4fc80 0000000000000000
Jun 5 17:33:39 nasl006a kernel: [ 4661.387927] ffffffff91b18e00 00000000ffffffff ffffffff90ee152a 0000000000cf0a41
Jun 5 17:33:39 nasl006a kernel: [ 4661.387928] Call Trace:
Jun 5 17:33:39 nasl006a kernel: [ 4661.387939] <IRQ>
Jun 5 17:33:39 nasl006a kernel: [ 4661.387939] [<ffffffff90ea5c7b>] ? sched_show_task+0xcb/0x130
Jun 5 17:33:39 nasl006a kernel: [ 4661.387945] [<ffffffff90f7e95b>] ? rcu_dump_cpu_stacks+0x92/0xb2
Jun 5 17:33:39 nasl006a kernel: [ 4661.387950] [<ffffffff90ee152a>] ? rcu_check_callbacks+0x75a/0x8b0
Jun 5 17:33:39 nasl006a kernel: [ 4661.387955] [<ffffffff90eefbd8>] ? update_wall_time+0x498/0x7b0
Jun 5 17:33:39 nasl006a kernel: [ 4661.387960] [<ffffffff90ef7950>] ? tick_sched_do_timer+0x30/0x30
Jun 5 17:33:39 nasl006a kernel: [ 4661.387963] [<ffffffff90ee80e8>] ? update_process_times+0x28/0x50
Jun 5 17:33:39 nasl006a kernel: [ 4661.387967] [<ffffffff90ef7360>] ? tick_sched_handle.isra.12+0x20/0x50
Jun 5 17:33:39 nasl006a kernel: [ 4661.387970] [<ffffffff90ef7988>] ? tick_sched_timer+0x38/0x70
Jun 5 17:33:39 nasl006a kernel: [ 4661.387973] [<ffffffff90ee8bbe>] ? __hrtimer_run_queues+0xde/0x250
Jun 5 17:33:39 nasl006a kernel: [ 4661.387976] [<ffffffff90ee929c>] ? hrtimer_interrupt+0x9c/0x1a0
Jun 5 17:33:39 nasl006a kernel: [ 4661.387982] [<ffffffff9141597f>] ? smp_apic_timer_interrupt+0x3f/0x50
Jun 5 17:33:39 nasl006a kernel: [ 4661.387986] [<ffffffff91414276>] ? apic_timer_interrupt+0x96/0xa0
Jun 5 17:33:39 nasl006a kernel: [ 4661.387993] <EOI>
Jun 5 17:33:39 nasl006a kernel: [ 4661.387993] [<ffffffff90ecfce6>] ? console_unlock+0x246/0x610
Jun 5 17:33:39 nasl006a kernel: [ 4661.387998] [<ffffffff90ed03c6>] ? vprintk_emit+0x316/0x4d0
Jun 5 17:33:39 nasl006a kernel: [ 4661.388004] [<ffffffff90f7e703>] ? printk+0x5a/0x76
Jun 5 17:33:39 nasl006a kernel: [ 4661.388037] [<ffffffffc0a3280e>] ? svc_xprt_release+0x14e/0x160 [sunrpc]
Jun 5 17:33:39 nasl006a kernel: [ 4661.388061] [<ffffffffc0a34301>] ? svc_send+0x101/0x130 [sunrpc]
Jun 5 17:33:39 nasl006a kernel: [ 4661.388074] [<ffffffffc0b1ac40>] ? nfsd_destroy+0x60/0x60 [nfsd]
Jun 5 17:33:39 nasl006a kernel: [ 4661.388096] [<ffffffffc0a24492>] ? svc_process+0x102/0x1a0 [sunrpc]
Jun 5 17:33:39 nasl006a kernel: [ 4661.388107] [<ffffffffc0b1ad29>] ? nfsd+0xe9/0x150 [nfsd]
Jun 5 17:33:39 nasl006a kernel: [ 4661.388112] [<ffffffff90e984c9>] ? kthread+0xd9/0xf0
Jun 5 17:33:39 nasl006a kernel: [ 4661.388116] [<ffffffff90e983f0>] ? kthread_park+0x60/0x60
Jun 5 17:33:39 nasl006a kernel: [ 4661.388119] [<ffffffff914125f7>] ? ret_from_fork+0x57/0x70
Jun 5 17:33:39 nasl006a kernel: [ 4724.437911] INFO: rcu_sched self-detected stall on CPU
Jun 5 17:33:39 nasl006a kernel: [ 4724.437919] 0-...: (2405 ticks this GP) idle=c9d/140000000000001/0 softirq=252509/252509 fqs=2404
Jun 5 17:33:39 nasl006a kernel: [ 4724.437924] (t=36795 jiffies g=419791 c=419790 q=381632)
Jun 5 17:33:39 nasl006a kernel: [ 4724.437926] Task dump for CPU 0:
Jun 5 17:33:39 nasl006a kernel: [ 4724.437932] nfsd R running task 0 3542 2 0x00000008
Jun 5 17:33:39 nasl006a kernel: [ 4724.437940] ffffffff91b18e00 ffffffff90ea5c7b 0000000000000000 ffffffff91b18e00
Jun 5 17:33:39 nasl006a kernel: [ 4724.437945] ffffffff90f7e95b ffff8ab52f4196c0 ffffffff91a4fc80 0000000000000000
Jun 5 17:33:39 nasl006a kernel: [ 4724.437950] ffffffff91b18e00 00000000ffffffff ffffffff90ee152a 0000000000cf0a41
Jun 5 17:33:39 nasl006a kernel: [ 4724.437951] Call Trace:
Jun 5 17:33:39 nasl006a kernel: [ 4724.437965] <IRQ>
Jun 5 17:33:39 nasl006a kernel: [ 4724.437965] [<ffffffff90ea5c7b>] ? sched_show_task+0xcb/0x130
Jun 5 17:33:39 nasl006a kernel: [ 4724.437973] [<ffffffff90f7e95b>] ? rcu_dump_cpu_stacks+0x92/0xb2
Jun 5 17:33:39 nasl006a kernel: [ 4724.437980] [<ffffffff90ee152a>] ? rcu_check_callbacks+0x75a/0x8b0
Jun 5 17:33:39 nasl006a kernel: [ 4724.437987] [<ffffffff90eefbd8>] ? update_wall_time+0x498/0x7b0
Jun 5 17:33:39 nasl006a kernel: [ 4724.437994] [<ffffffff90ef7950>] ? tick_sched_do_timer+0x30/0x30
Jun 5 17:33:39 nasl006a kernel: [ 4724.437997] [<ffffffff90ee80e8>] ? update_process_times+0x28/0x50
Jun 5 17:33:39 nasl006a kernel: [ 4724.438002] [<ffffffff90ef7360>] ? tick_sched_handle.isra.12+0x20/0x50
Jun 5 17:33:39 nasl006a kernel: [ 4724.438007] [<ffffffff90ef7988>] ? tick_sched_timer+0x38/0x70
Jun 5 17:33:39 nasl006a kernel: [ 4724.438012] [<ffffffff90ee8bbe>] ? __hrtimer_run_queues+0xde/0x250
Jun 5 17:33:40 nasl006a kernel: [ 4763.986103] RPC request reserved 84 but used 268
Jun 5 17:33:40 nasl006a kernel: [ 4763.988831] nfsd: peername failed (err 107)!
Jun 5 17:33:40 nasl006a kernel: [ 4763.989521] nfsd: peername failed (err 107)!
Jun 5 17:33:40 nasl006a kernel: [ 4763.989859] RPC request reserved 120 but used 268
Jun 5 17:33:40 nasl006a kernel: [ 4763.989863] nfsd: peername failed (err 107)!
Jun 5 17:33:40 nasl006a kernel: [ 4763.990180] nfsd: peername failed (err 107)!
Jun 5 17:33:40 nasl006a kernel: [ 4763.990552] nfsd: peername failed (err 107)!
:
:


Using kernel 4.16.5 (linux-image-4.16.0-0.bpo.1-amd64 version 4.16.5-1~bpo9+1 the
"RPC request" messages didn't show up yet.

Regards
Harri

Kevin Labecot

unread,
Jul 2, 2018, 7:50:02 AM7/2/18
to
On Tue, 5 Jun 2018 17:47:14 +0200 Harald Dunkel <harald...@aixigo.de> wrote:
> metoo
>
> I have "rpc request reserved 84 but used 268", but I am not picky.


I've exactly the same problem and it comes back each X days.
It flooded my server of this line and filled all the disk.
I had to restart the nfs server to temporary solve this issue but it will be back in few days.

[709533.129830] RPC request reserved 84 but used 268
[709533.130319] RPC request reserved 84 but used 268
[709533.130642] RPC request reserved 84 but used 268
[709533.131384] RPC request reserved 84 but used 268
[709533.131730] RPC request reserved 84 but used 268
[709533.132136] RPC request reserved 84 but used 268

# cat /etc/debian_version
9.4

(over LXC container)

>
>
> Regards
> Harri
>
>

Regards,
Kevin

Christof Koehler

unread,
Aug 7, 2018, 6:30:02 AM8/7/18
to
Hello everybody,

for your information I filed now, problem happes for me with ubuntu
18.04 (kernel 4.15)
https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/1785788

Best Regards

Christof

Christof Koehler

unread,
Aug 21, 2018, 8:20:02 AM8/21/18
to
Hello everybody,

in the thread on the ubuntu tracker someone else pointed out that this
was also observed on fedora 26, see comment 14 in
https://bugzilla.redhat.com/show_bug.cgi?id=1552037

Apparently it was suggested there (comment 14) that forcing NFS vers=4.0
for the mount appears to fix this as far as it can be reliably
reproduced. Note that debian 9 and ubuntu 18.04 apparently default to
vers=4.2.

I will try that next.

Best Regards

Christof

MAG4 Piemonte

unread,
Nov 14, 2018, 8:10:03 AM11/14/18
to
Hi, we also confirm the bug with kernel 4.9.0-8-amd64 #1 SMP Debian
4.9.110-3+deb9u6 (2018-10-08) x86_64 GNU/Linux.
Regards!

Guido

Salvatore Bonaccorso

unread,
Feb 4, 2019, 7:50:02 AM2/4/19
to
Hi,

These are possibly related to
https://bugzilla.kernel.org/show_bug.cgi?id=202435 mentioning in
https://bugzilla.kernel.org/show_bug.cgi?id=202435#c3 that upstream
commits 53da6a53e1d4 "nfsd4: catch some false session retries" (may
also need preceding 085def3ade52 "nfsd4: fix cached replies to solo
SEQUENCE compounds") likely fix the issue.

Salvatore

MAG4 Piemonte

unread,
Oct 14, 2019, 7:40:02 AM10/14/19
to
Hi, two months ago we upgrade to Debian 10 and since then the issue vanished
...
Regards!

Guido
0 new messages