
High vmstat, filesystem unresponsive then hang 6.1 Stable


Francisco Reyes

Jul 1, 2006, 9:49:18 PM
to FreeBSD Stable List
I believe this may be related to the NFS issues mentioned recently, but
hopefully I have captured enough info to help others troubleshoot.
I got the first part of the output of some ps commands, and when I was about
to save full listings of the same ps commands to files, the machine hung.

The machine is running 6.1-STABLE from around June 25 (plus or minus 1 day).


iostat 5 (not much of a load)
     tty       da0             cpu
 tin tout  KB/t tps  MB/s us ni sy in id
   0   31 17.71 125  2.17 20  0  5  1 74
   0   26  8.57  23  0.19  0  0  1  0 99
   0    9 33.73  10  0.34  0  0  0  0 99
   0   21  8.42  18  0.15  0  0  1  1 99
   0    9 15.92  58  0.90  0  0  0  0 99
   0    9 15.18   7  0.10  0  0  0  0 99
   0   53 12.93   9  0.11  0  0  1  0 99
   0   31  5.17  58  0.29  0  0  1  1 99

vmstat 5 (very high 'b' column)
 procs       memory             page          disk     faults       cpu
 r   b w     avm    fre  flt re pi po   fr  sr da0   in   sy   cs us sy id
 0 248 2 1410436 110728 1519  2  0  0 1644 264   0 4481 8862 9168 20  6 74
 0 248 0 1410436 110796    0  0  0  0   13   0   4  700   40 1426  0  1 99
 0 248 0 1410436 110764    1  0  0  0   39   0  14 1253  722 2615  0  1 99
 0 248 0 1410436 110720    1  0  0  0   10   0   5  407  396  899  0  1 99
 0 248 0 1410436 110704    1  0  0  0   60   0  21 2822  360 5695  0  2 98
 0 248 0 1410436 110684    1  0  0  0   10   0   7  538  434 1166  0  1 99
 0 248 0 1410436 110668    0  0  0  0   75   0  51  576  163 1026  0  0 99
 0 248 0 1410436 110696    0  0  0  0   23   0  31 1171  190 2271  0  1 99

vmstat 5
 procs       memory             page          disk     faults       cpu
 r   b w     avm    fre  flt re pi po   fr  sr da0   in   sy   cs us sy id
 0 250 1 1399688 152000 1517  2  0  0 1643 264   0 4479 8853 9163 20  6 74
 0 250 0 1399688 151968    2  0  0  0   25   0  28 1395  966 2852  0  2 98
 0 250 0 1399692 151892    1  0  0  0   12   0   6  446  540  986  0  0 99
 0 250 2 1399692 151604    1  0  0  0   50   0  37  803  675 1611  0  1 99
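
The stuck 'b' column is the key symptom in the vmstat output above. As a minimal sketch (not part of the original report), the shell function below filters saved vmstat output and prints only the samples whose blocked-process count reaches a threshold. The name vmstat_blocked and the default threshold of 10 are hypothetical choices, and the function assumes the column layout shown above, with 'b' in the second field and idle CPU in the last.

```shell
# Print vmstat samples whose blocked-process count ("b", field 2)
# is at or above a threshold. Reads vmstat output on stdin.
# The function name and the default threshold (10) are illustrative only.
vmstat_blocked() {
    awk -v t="${1:-10}" '
        # Data rows have numeric first and second fields; the two
        # header rows ("procs ..." and "r b w ...") do not, so they are skipped.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $2 + 0 >= t + 0 {
            # Report the blocked count next to idle CPU (last field).
            printf "b=%d id=%d\n", $2, $NF
        }
    '
}
```

For example, `vmstat_blocked 100 < vmstat.log` would print one line for every sample with 100 or more blocked processes (vmstat.log is an assumed file name). Many blocked processes alongside a nearly idle CPU, as in the samples above, suggests processes waiting on a lock or on I/O rather than a loaded machine.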


I don't recall which ps command produced this:
411 1 0 ufs ?? Ds 0:04.81 /usr/sbin/mountd -r
37675 650 0 ufs ?? D 0:00.46 /usr/bin/perl /data/backaway/mailarchive/client/bin/smtpproxy 127.0.0.1:10026 127.0.0.1:10025 (perl5.8.7)
37919 650 0 ufs ?? D 0:00.46 /usr/bin/perl /data/backaway/mailarchive/client/bin/smtpproxy 127.0.0.1:10026 127.0.0.1:10025 (perl5.8.7)
39306 650 0 ufs ?? D 0:00.39 /usr/bin/perl /data/backaway/mailarchive/client/bin/smtpproxy 127.0.0.1:10026 127.0.0.1:10025 (perl5.8.7)
40214 38649 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d nbi...@dialonewolfedale.com
40220 32943 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d jr...@zoofriends.org
40223 33257 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d logmon...@ewarna.com
40226 32942 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d nbi...@dialonewolfedale.com
40228 33199 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d enqu...@markaw.com
40231 38599 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d sne...@starlo.com
40233 32896 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d pci...@microimage.cc
40236 33224 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d twilk...@briorealty.com
40238 32876 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d ginn...@reidrealestate.com
40240 32976 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d cla...@classicrealtyinc.com
40242 35580 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d patric...@cifo.org
40246 35593 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d friz...@thestranger.com
40248 32923 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d ear...@gkgcpa.com
40252 35596 4100 ufs ?? Ds 0:00.01 /usr/local/bin/maildrop -d seem...@reidrealestate.com
40253 29833 4100 ufs ?? Ds 0:00.01 /usr/local/bin/maildrop -d seem...@reidrealestate.com

ps ax -O ppid,flags,mwchan | awk '$6 ~ /^D/ || $6 == "STAT"'
PID PPID F MWCHAN TT STAT TIME COMMAND
2 0 204 - ?? DL 0:17.68 [g_event]
3 0 204 - ?? DL 9:14.85 [g_up]
4 0 204 - ?? DL 10:50.81 [g_down]
5 0 204 - ?? DL 0:02.93 [thread taskq]
6 0 204 - ?? DL 0:00.00 [acpi_task0]
7 0 204 - ?? DL 0:00.00 [acpi_task1]
8 0 204 - ?? DL 0:00.00 [acpi_task2]
9 0 204 - ?? DL 0:00.00 [kqueue taskq]
15 0 204 - ?? DL 8:47.55 [yarrow]
27 0 204 - ?? DL 0:01.72 [fdc0]
28 0 204 psleep ?? DL 0:43.74 [pagedaemon]
29 0 204 psleep ?? DL 0:00.00 [vmdaemon]
30 0 20c pgzero ?? DL 7:35.27 [pagezero]
31 0 204 psleep ?? DL 0:57.11 [bufdaemon]
32 0 204 syncer ?? DL 8:46.07 [syncer]
33 0 204 vlruwt ?? DL 0:28.29 [vnlru]
34 0 204 sdflus ?? DL 2:35.54 [softdepflush]
35 0 204 - ?? DL 1:01.20 [schedcpu]
411 1 0 ufs ?? Ds 0:04.81 /usr/sbin/mountd -r
39306 650 0 ufs ?? D 0:00.39 /usr/bin/perl /data/backaway/mailarchive/client/bin/smtpproxy 127.0.0.1:10026 127.0.0.1:10025 (perl5.8.7)
40214 38649 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d nbi...@dialonewolfedale.com
40220 32943 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d jr...@zoofriends.org
40223 33257 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d logmon...@ewarna.com
40226 32942 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d nbi...@dialonewolfedale.com
40228 33199 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d enqu...@markaw.com
40231 38599 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d sne...@starlo.com
40233 32896 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d pci...@microimage.cc
40236 33224 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d twilk...@briorealty.com
40238 32876 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d ginn...@reidrealestate.com
40240 32976 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d cla...@classicrealtyinc.com
40242 35580 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d patric...@cifo.org
40246 35593 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d friz...@thestranger.com
40248 32923 4100 ufs ?? Ds 0:00.00 /usr/local/bin/maildrop -d ear...@gkgcpa.com
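
To see at a glance which wait channel the stuck processes share, the output of the ps command above can be tallied by MWCHAN. This is a rough sketch rather than anything from the original report: blocked_by_wchan is a hypothetical name, and it assumes the eight-column layout shown above (MWCHAN in field 4, STAT in field 6). Kernel threads that normally sit in state D are counted as well.

```shell
# Tally processes in uninterruptible sleep (STAT starting with "D")
# by wait channel. Expects `ps ax -O ppid,flags,mwchan` output on stdin.
# The function name is illustrative only.
blocked_by_wchan() {
    awk '
        # Field 6 is STAT, field 4 is MWCHAN in this ps layout.
        # The header row has "STAT" in field 6, so it never matches.
        $6 ~ /^D/ { count[$4]++ }
        END { for (w in count) printf "%s %d\n", w, count[w] }
    ' | sort -k2,2nr -k1,1   # largest count first, ties ordered by name
}
```

Usage: `ps ax -O ppid,flags,mwchan | blocked_by_wchan`. On the listing above, the userland processes would all land under the "ufs" channel.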


ps axlww
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
0 0 0 0 12 0 0 0 - WLs ?? 0:00.00 [swapper]
0 1 0 0 8 0 744 268 wait ILs ?? 0:00.01 /sbin/init --
0 2 0 0 -8 0 0 8 - DL ?? 0:17.68 [g_event]
0 3 0 0 -8 0 0 8 - DL ?? 9:14.93 [g_up]
0 4 0 0 -8 0 0 8 - DL ?? 10:50.90 [g_down]
0 5 0 0 8 0 0 8 - DL ?? 0:02.93 [thread taskq]
0 6 0 0 8 0 0 8 - DL ?? 0:00.00 [acpi_task0]
0 7 0 0 8 0 0 8 - DL ?? 0:00.00 [acpi_task1]
0 8 0 0 8 0 0 8 - DL ?? 0:00.00 [acpi_task2]
0 9 0 0 8 0 0 8 - DL ?? 0:00.00 [kqueue taskq]
0 10 0 153 171 0 0 8 - RL ?? 3939:36.70 [idle: cpu1]
0 11 0 148 171 0 0 8 - RL ?? 4416:30.08 [idle: cpu0]
0 12 0 2 -44 0 0 8 - WL ?? 50:41.76 [swi1: net]
0 13 0 0 -32 0 0 8 - WL ?? 8:56.20 [swi4: clock sio]
0 14 0 0 -36 0 0 8 - WL ?? 0:00.00 [swi3: vm]
0 15 0 0 96 0 0 8 - DL ?? 8:47.68 [yarrow]
0 16 0 0 -24 0 0 8 - WL ?? 0:00.01 [swi6: task queue]
0 17 0 0 -24 0 0 8 - WL ?? 0:00.00 [swi6: +]
0 18 0 0 -28 0 0 8 - WL ?? 6:34.50 [swi5: +]
0 19 0 0 -40 0 0 8 - WL ?? 6:42.62 [swi2: cambio]
0 20 0 0 -52 0 0 8 - WL ?? 0:00.00 [irq9: acpi0]
0 21 0 0 -64 0 0 8 - WL ?? 0:00.00 [irq14: ata0]
0 22 0 0 -64 0 0 8 - WL ?? 0:00.00 [irq15: ata1]
0 23 0 0 -68 0 0 8 - WL ?? 8:13.27 [irq26: bge0]
0 24 0 0 -68 0 0 8 - WL ?? 50:29.26 [irq27: bge1]
0 25 0 0 -60 0 0 8 - WL ?? 0:00.01 [irq1: atkbd0]
0 26 0 0 -48 0 0 8 - WL ?? 0:00.00 [swi0: sio]
0 27 0 0 -8 0 0 8 - DL ?? 0:01.72 [fdc0]
0 28 0 0 -16 0 0 8 psleep DL ?? 0:43.74 [pagedaemon]
0 29 0 0 20 0 0 8 psleep DL ?? 0:00.00 [vmdaemon]
0 30 0 0 171 0 0 8 pgzero DL ?? 7:35.27 [pagezero]
0 31 0 0 -16 0 0 8 psleep DL ?? 0:57.11 [bufdaemon]
0 32 0 0 20 0 0 8 syncer DL ?? 8:46.35 [syncer]
0 33 0 0 -4 0 0 8 vlruwt DL ?? 0:28.29 [vnlru]
0 34 0 0 -16 0 0 8 sdflus DL ?? 2:35.54 [softdepflush]
0 35 0 0 -40 0 0 8 - DL ?? 1:01.29 [schedcpu]
0 116 1 255 20 0 1220 648 pause Is ?? 0:00.00 adjkerntz -i
0 295 1 0 4 0 516 276 select Is ?? 0:05.71 /sbin/devd
0 337 1 0 96 0 1344 908 select Ss ?? 5:54.01 /usr/sbin/syslogd -s
0 354 1 0 96 0 1412 1032 select Ss ?? 0:07.06 /usr/sbin/rpcbind
0 411 1 0 -4 0 1536 1128 ufs Ds ?? 0:04.81 /usr/sbin/mountd -r
0 413 1 0 4 0 1364 956 accept Is ?? 0:00.02 nfsd: master (nfsd)
0 414 413 4 4 0 1240 716 - S ?? 101:39.74 nfsd: server (nfsd)
0 415 413 0 4 0 1240 716 - S ?? 24:34.31 nfsd: server (nfsd)
0 416 413 0 4 0 1240 716 - S ?? 9:23.71 nfsd: server (nfsd)
0 417 413 0 4 0 1240 716 - S ?? 4:21.56 nfsd: server (nfsd)
0 419 413 0 4 0 1240 716 - I ?? 2:24.04 nfsd: server (nfsd)
0 420 413 0 4 0 1240 716 - I ?? 0:01.46 nfsd: server (nfsd)

Any insights would be greatly appreciated.
We are likely to try downgrading to 5.5-STABLE. 6.x has been nothing but
problems for us with regard to NFS, both on the client and on the server.
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

User Freebsd

Jul 2, 2006, 12:27:49 PM
to Francisco Reyes, FreeBSD Stable List

This is the same issue that I've been hitting, and it requires the
serial console / DDB setup described in the debugging deadlocks web page
that I pointed you at.

So far (knock on wood), since adding all of the debugging to one of my
servers, none of mine have done it. But the more people who experience this
and get the debugging in place to provide proper kernel traces, the
better.

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scr...@hub.org MSN . scr...@hub.org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664
