[Lustre-discuss] High Load and high system CPU for mds


huangql

Feb 28, 2010, 9:31:01 PM
to lustre-discuss
Hi,
 
We have a problem where the MDS shows a very high load and system CPU of up to 60% when running a chown command on a client. Strangely, the load and system CPU did not come back down to normal levels once they went high, and we could not do anything on the clients or the OSSes. You can see the situation in the top output below:
[root@mainmds ~]# top
top - 10:19:02 up  1:03,  3 users,  load average: 28.73, 27.10, 23.88
Tasks: 515 total,  44 running, 471 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us, 84.1%sy,  0.0%ni, 15.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us, 72.5%sy,  0.0%ni, 27.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us, 83.5%sy,  0.0%ni, 16.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us, 78.4%sy,  0.0%ni, 21.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us, 82.9%sy,  0.0%ni, 17.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us, 69.2%sy,  0.0%ni, 30.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us, 79.6%sy,  0.0%ni, 20.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us, 77.2%sy,  0.0%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us, 58.9%sy,  0.0%ni, 41.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us, 84.4%sy,  0.0%ni, 15.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us, 97.6%sy,  0.0%ni,  2.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us, 81.4%sy,  0.0%ni, 18.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us, 85.0%sy,  0.0%ni, 15.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us, 88.0%sy,  0.0%ni, 12.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us, 36.3%sy,  0.0%ni, 63.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  24682716k total,  2985412k used, 21697304k free,   268360k buffers
Swap: 24579440k total,        0k used, 24579440k free,   368904k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                   
 5449 root      16   0     0    0    0 R 100.2  0.0  52:46.12 ptlrpcd                                                  
 5434 root      16   0     0    0    0 R 89.0  0.0  34:15.77 socknal_sd07                                              
 5432 root      16   0     0    0    0 R 88.3  0.0  32:43.12 socknal_sd05                                              
 5430 root      16   0     0    0    0 R 79.1  0.0  30:37.78 socknal_sd03                                              
 5436 root      16   0     0    0    0 R 61.2  0.0  29:08.47 socknal_sd09                                              
 5440 root      16   0     0    0    0 S 59.5  0.0  33:31.32 socknal_sd13                                              
 5433 root      16   0     0    0    0 R 49.0  0.0  23:20.61 socknal_sd06                                              
 5431 root      15   0     0    0    0 R 45.0  0.0  26:04.43 socknal_sd04                                              
 5427 root      15   0     0    0    0 S 44.7  0.0  23:31.11 socknal_sd00                                              
 5435 root      15   0     0    0    0 S 44.3  0.0  24:50.30 socknal_sd08                                              
 5439 root      15   0     0    0    0 R 43.7  0.0  24:23.79 socknal_sd12                                              
 5437 root      15   0     0    0    0 R 39.7  0.0  27:11.58 socknal_sd10                                              
 5438 root      16   0     0    0    0 S 37.4  0.0  40:50.69 socknal_sd11                                              
 5441 root      15   0     0    0    0 S 35.4  0.0  26:35.59 socknal_sd14      
 
From the top output you can see the ptlrpcd process at 100% CPU, which is not normal for this system; it looks as if ptlrpcd has become stuck. So for now we have to reboot the MDS to work around the problem. We don't understand this behaviour. Has anyone else seen this problem, or does anyone have an idea about it? I would appreciate any help.
In addition, we run Lustre 1.8.1.1 on the MDS and OSSes, and Lustre 1.6.5 on the clients.
 
Thanks in advance.
 
Cheers
Qiulan Huang
--------------------------------------------------------------   
Computing Center IHEP         Office: Computing Center,123 
19B Yuquan Road                 Tel: (+86) 10 88236012-607
P.O. Box 918-7                    Fax: (+86) 10 8823 6839
Beijing 100049,China             Email: hua...@ihep.ac.cn
--------------------------------------------------------------   
2010-03-01


Oleg Drokin

Mar 1, 2010, 2:35:18 PM
to huangql, Maxim Patlasov, lustre-discuss discuss
Hello!

On Feb 28, 2010, at 9:31 PM, huangql wrote:
> We have a problem where the MDS shows a very high load and system CPU of up to 60% when running a chown command on a client. Strangely, the load and system CPU did not come back down to normal levels once they went high, and we could not do anything on the clients or the OSSes. You can see the situation in the top output below:

How many files did that chown command affect (was it a chown -R on some huge directory tree)?
Essentially chown (setattr) works in two steps: first it changes the MDS attributes, then it queues an async RPC for
every file object to update the attributes on the OSTs. If many files are updated this way, a lot of such
messages get queued, and they are all sent at once with no rate limiting.
This is consistent with what you are seeing here: ptlrpcd is busy sending/receiving RPCs (ptlrpcd is the Lustre
thread that handles async RPC sending/completion) and the individual socklnd threads are also busy processing
network transfers (also, I think the code in LNET is not tuned to process huge numbers of outstanding RPCs,
which leads to additional CPU overhead in that case).
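
(Not advice from the thread, just an illustrative sketch: one way to avoid queueing all of those setattr RPCs at once is to batch the ownership change yourself instead of issuing a single chown -R. The path, owner, batch size and pause below are placeholder assumptions.)

# Hypothetical throttled alternative to "chown -R" on a huge Lustre tree.
# /lustre/bigtree, newuser:newgroup, the batch of 1000 entries and the 2 s
# pause are only illustrative values, not taken from this thread.
find /lustre/bigtree -print0 \
  | xargs -0 -n 1000 sh -c 'chown newuser:newgroup -- "$@"; sleep 2' _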

So on the surface it looks like everything performs as expected, though certainly lustre might have
behaved better.
How long did you wait with this high cpu utilization before deciding to reboot and how many files
were affected by the chown?

Bye,
Oleg
_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Oleg Drokin

Mar 2, 2010, 12:18:39 AM
to huangql, lustre-discuss discuss
Hello!

I see. Sounds like a bug then.
We do not test interoperability more than one version back, because that is all we guarantee to work.
Still, could you please file a bug in our Bugzilla for this, since it is the newer MDS that
exhibits the problem in response to client input?

Thanks.

Bye,
Oleg
On Mar 2, 2010, at 12:12 AM, huangql wrote:

>
> Hi, Oleg
>
> Thank you for your timely reply. We waited through the high CPU utilization for a whole night before deciding to reboot, since the load and system utilization did not decrease at all. As to the second question, we did run chown -R on a huge directory tree of about 5.9 TB containing roughly ten thousand files.
> However, when we tried the same thing from a 1.8.1.1 client, the system CPU on the MDS also went up to about 60%, but it dropped back to a normal level after the chown command finished, and the command finished in the expected time. Based on that, we think it is a conflict between the 1.6.6 client and the 1.8.1.1 server. Have you ever tried this? Right now we are running both 1.6.5 and 1.8.1.1 clients because we have servers at both versions; we will upgrade all Lustre clients to 1.8.1.1 once we have evacuated the 1.6.5 servers.

>
>
>
>
> Cheers
> Qiulan Huang
> --------------------------------------------------------------
> Computing Center IHEP         Office: Computing Center,123
> 19B Yuquan Road               Tel: (+86) 10 88236012-604
> P.O. Box 918-7                Fax: (+86) 10 8823 6839
> Beijing 100049,China          Email: hua...@ihep.ac.cn
> --------------------------------------------------------------
>

> 2010-03-02
> huangql
> From: Oleg Drokin
> Sent: 2010-03-02 03:31:34
> To: huangql
> Cc: lustre-discuss discuss; Maxim Patlasov
> Subject: Re: [Lustre-discuss] High Load and high system CPU for mds


Isaac Huang

Mar 2, 2010, 1:09:07 AM
to Oleg Drokin, Maxim Patlasov, lustre-discuss discuss
On Mon, Mar 01, 2010 at 02:35:18PM -0500, Oleg Drokin wrote:
> Hello!
>
> On Feb 28, 2010, at 9:31 PM, huangql wrote:
> > We have a problem where the MDS shows a very high load and system CPU of up to 60% when running a chown command on a client. Strangely, the load and system CPU did not come back down to normal levels once they went high, and we could not do anything on the clients or the OSSes. You can see the situation in the top output below:
>
> How many files did that chown command affect (was it a chown -R on some huge directory tree)?
> Essentially chown (setattr) works in two steps: first it changes the MDS attributes, then it queues an async RPC for
> every file object to update the attributes on the OSTs. If many files are updated this way, a lot of such
> messages get queued, and they are all sent at once with no rate limiting.
> This is consistent with what you are seeing here: ptlrpcd is busy sending/receiving RPCs (ptlrpcd is the Lustre
> thread that handles async RPC sending/completion) and the individual socklnd threads are also busy processing

For small messages, the socklnd can't do zero-copy sends, so there's
an additional cost for copying the small messages into socket send
buffers, which adds to CPU usage.
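
(Side note, not from the thread: to see which zero-copy and other tunables the socklnd on a given node actually exposes, one can list its module parameters. This is a generic sketch and assumes socklnd is loaded as the ksocklnd module.)

# List socklnd tunables on the running node (names and defaults vary by
# Lustre version, so inspect rather than assume a particular parameter).
modinfo -p ksocklnd
grep -H . /sys/module/ksocklnd/parameters/* 2>/dev/null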

A show_cpu/show_processes dump from SysRq should tell what those processes are busy with.
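
(A minimal sketch of how to capture those SysRq dumps from a root shell on the MDS; which keys are available depends on the kernel, see Documentation/sysrq.txt for the running version.)

# Output goes to the kernel log (dmesg / /var/log/messages).
echo 1 > /proc/sys/kernel/sysrq    # make sure the SysRq interface is enabled
echo t > /proc/sysrq-trigger       # show-task-states: stack trace of every task
echo l > /proc/sysrq-trigger       # show-backtrace-all-active-cpus, where supported
dmesg | tail -n 200                # inspect the captured traces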

> network transfers (also, I think the code in LNET is not tuned to process huge numbers of outstanding RPCs,
> which leads to additional CPU overhead in that case).

Yes:
https://bugzilla.lustre.org/show_bug.cgi?id=21619

Isaac
