How to analyze a high IOWait problem

music4

Feb 22, 2005, 2:22:56 AM

Greetings,

Our server (Sun Netra 1280, Solaris 2.8) always has more than 20% iowait. It
looks really bad. I am trying to find a way to identify what causes the high
iowait.

Any ideas would be greatly appreciated. Thanks in advance!

Evan

Darren Dunham

Feb 22, 2005, 2:58:05 AM

music4 <mus...@163.net> wrote:
> Greetings,

> Our server (Sun Netra 1280, Solaris 2.8) always has more than 20% iowait. It
> looks really bad.

I/O wait is not always a problem. Why do you think it is bad in this
case?

> I am trying to find a way to identify what causes the high iowait.

Whenever your CPU has some idle time, and at least one thread has an
outstanding I/O call, you'll accumulate I/O wait time.

If you have very little CPU needs but a lot of I/O needs (think of a
database serving lots of simple queries), then you'd probably see lots
of iowait time.
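
As a rough illustration of the rule above, here is a toy single-CPU model in Python. It is only a sketch: the function names and the tick-based sampling are assumptions made for the example, not how the Solaris kernel actually does its accounting.

```python
from collections import Counter

def classify_tick(cpu_running: bool, outstanding_io: int) -> str:
    """Classify one clock tick of a single CPU (toy model)."""
    if cpu_running:
        return "busy"   # user or kernel time
    if outstanding_io > 0:
        return "wait"   # idle, but at least one thread has I/O in flight
    return "idle"       # idle, nothing pending

def summarize(ticks):
    """ticks: iterable of (cpu_running, outstanding_io) pairs."""
    counts = Counter(classify_tick(running, pending) for running, pending in ticks)
    total = sum(counts.values())
    return {state: 100.0 * counts[state] / total
            for state in ("busy", "wait", "idle")}

# Database-like workload: 1 tick of CPU work, then 4 ticks waiting on disk.
db_like = [(True, 0)] + [(False, 1)] * 4
print(summarize(db_like * 20))
```

In this model the workload reports mostly wait time even though nothing is wrong with it; the CPU simply has little to do between I/O requests.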

--
Darren Dunham ddu...@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >

Alan Hargreaves - Product Technical Support (APAC)

Feb 22, 2005, 8:20:47 PM

to music4

Why is it that folks *always* assume that IOWait is bad?

I wrote a doc on this a bit over a year ago, have a read of
http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1

Because of all of the misunderstanding associated with IOwait, it is
defined to be 0 in Solaris 10.

The really important thing to take away from that document is that
IOWait is a subset of idle. You only get IOwait time if there is nothing
else ready to run from the dispatch queues.

alan.
--
Alan Hargreaves - http://blogs.sun.com/tpenta
Kernel/VOSJEC/Performance Engineer
Product Technical Support (APAC)
Sun Microsystems

Richard Pettit

Feb 22, 2005, 10:03:30 PM

Alan Hargreaves - Product Technical Support (APAC) wrote:
> Why is it that folks *always* assume that IOWait is bad?
>
> I wrote a doc on this a bit over a year ago, have a read of
> http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
>
>
> Because of all of the misunderstanding associated with IOwait, it is
> defined to be 0 in Solaris 10.
>
> The really important thing to take away from that document is that
> IOWait is a subset of idle. You only get IOwait time if there is nothing
> else ready to run from the dispatch queues.
>
> alan.

That's the pinnacle of wrong answers. A hard-coded zero, I mean.

Hey, scan rate doesn't mean crap either, even though I read more
articles in this group from folks that say *any* scan rate is bad.
Should I assume Solaris 11 will have that hard-coded to zero too?

A loose cough doesn't mean you have pneumonia. But that doesn't mean
you should ignore it either.

Rich

ba...@smaalders.net

Feb 22, 2005, 11:43:44 PM

Iowait per cpu doesn't really map to any real-world information...
IOwait for threads would be more meaningful, but isn't collected
right now.

If I run NCPU cpu-bound threads on a machine, I will never see
any IOwait regardless of how many thousands of other threads that
block waiting for IO... and if I write a program that spawns NCPU
threads on an otherwise idle machine and those threads do nothing
but random reads from a dvd, I'll see 100% iowait on every cpu.

Because this statistic has virtually no meaning on an MP, iowait
per cpu is no longer reported.
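
The two scenarios above can be sketched with a toy multiprocessor model in Python. The function names and the rule "every idle CPU counts as wait when any thread is blocked on I/O" are simplifying assumptions for illustration, not a description of the kernel's code.

```python
def cpu_states(ncpu: int, cpu_bound_threads: int, io_blocked_threads: int) -> list:
    """Return the state each CPU would report in this toy model."""
    busy = min(ncpu, cpu_bound_threads)
    # An idle CPU is charged to "wait" whenever any thread is blocked on I/O.
    idle_state = "wait" if io_blocked_threads > 0 else "idle"
    return ["busy"] * busy + [idle_state] * (ncpu - busy)

def iowait_percent(states) -> float:
    """Percentage of CPUs reporting wait."""
    return 100.0 * states.count("wait") / len(states)

NCPU = 4
# NCPU CPU-bound threads plus thousands of threads blocked on I/O.
print(iowait_percent(cpu_states(NCPU, NCPU, 5000)))
# Only NCPU threads, all doing random reads from a slow device.
print(iowait_percent(cpu_states(NCPU, 0, NCPU)))
```

The first case reports no iowait despite heavy I/O blocking, and the second reports full iowait from a handful of threads, which is why the number says little about how much I/O is really pending.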

- Bart

Michael Bian

Feb 22, 2005, 11:50:21 PM

Evan,

Have you tried iostat or sar to see which disk is handling a large number of
I/O operations? "iostat -P" can even report on each disk partition. You may
then be able to work out which application is responsible, based on what you
know about the applications on your server.

Use "serv" (service time; in fact it is response time) in their reports as an
indicator of disk I/O performance. If it stays above 30 (ms), you might
consider distributing the I/O across different disks, using disk mirroring,
and so on.
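
The 30 ms rule of thumb could be checked mechanically once service times have been collected. The following Python sketch assumes the samples have already been gathered into a dictionary; the device names, the numbers, and the `min_fraction` cutoff for "keeps above" are all made up for illustration.

```python
def sustained_slow_devices(samples, threshold_ms=30.0, min_fraction=0.8):
    """Return devices whose service time is above threshold_ms most of the time.

    samples: dict mapping device name -> list of service times in ms.
    min_fraction: share of samples that must exceed the threshold (assumed value).
    """
    slow = []
    for device, times in samples.items():
        if not times:
            continue  # no data for this device
        over = sum(1 for t in times if t > threshold_ms)
        if over / len(times) >= min_fraction:
            slow.append(device)
    return sorted(slow)

# Hypothetical service times (ms) taken at regular intervals.
samples = {
    "c0t0d0": [4.1, 5.0, 3.8, 6.2, 4.4],
    "c1t1d0": [41.0, 38.5, 52.3, 29.0, 47.7],
    "c1t2d0": [8.0, 95.0, 7.5, 9.1, 6.6],
}
print(sustained_slow_devices(samples))
```

A single slow sample, as on the third device, is not flagged; only a device that stays slow across most intervals is.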

Regards,
Michael

"music4" <mus...@163.net> wrote in message
news:cveml7$c...@netnews.proxy.lucent.com...

music4

Feb 23, 2005, 12:34:46 AM

>
> Why is it that folks *always* assume that IOWait is bad?
>
> I wrote a doc on this a bit over a year ago, have a read of
> http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
>
>
> Because of all of the misunderstanding associated with IOwait, it is
> defined to be 0 in Solaris 10.
>
> The really important thing to take away from that document is that
> IOWait is a subset of idle. You only get IOwait time if there is nothing
> else ready to run from the dispatch queues.
>
> alan.
> --
> Alan Hargreaves - http://blogs.sun.com/tpenta
> Kernel/VOSJEC/Performance Engineer
> Product Technical Support (APAC)
> Sun Microsystems

Alan,

Thanks for the article. Now I understand what iowait means. But when a CPU
is occupied by a thread that is waiting for I/O, can the CPU be used by other
threads?

If not, then although the CPU is idle (doing nothing but waiting), I would
count that wait as busy. High iowait would mean that a lot of CPU time is idle
but cannot be used to process other threads. That is the reason I feel high
iowait is bad, and why I want to analyze why iowait is high and try to reduce
it, to make more CPU time available for other threads.

Please correct me if I am wrong.

Evan

Dan Koren

Feb 23, 2005, 1:48:13 AM

"music4" <mus...@163.net> wrote in message

news:cvh4mf$r...@netnews.proxy.lucent.com...

Hello ?!?

Iowait means precisely that the CPU is available.

Is English your mother tongue? If not, you should
focus your efforts towards learning the language
so you can read manuals proficiently ;-)

dk

music4

Feb 23, 2005, 2:22:31 AM

"Dan Koren" <dank...@yahoo.com> wrote in message
news:421c2748$1...@news.meer.net...

>
> Hello ?!?
>
> Iowait means precisely that the CPU is available.
>
> Is English your mother tongue? If not, you should
> focus your efforts towards learning the language
> so you can read manuals proficiently ;-)
>
>
> dk
>
>

I am not a native English speaker, and I need to improve my English. But
have you read Alan's article about how iowait is calculated? My
understanding was based on Alan's article rather than on the word "IOwait".
According to Alan's article, there are four values in the CPU statistics:
idle, kernel, user and wait. When a thread is waiting for I/O, the wait
counter is increased. But if the CPU can be occupied by another thread, the
kernel or user counter will be increased as well. Is that true?

Casper H.S. Dik

Feb 23, 2005, 3:45:30 AM

Richard Pettit <richard...@gmail.com> writes:

>That's the pinnacle of wrong answers. A hard-coded zero, I mean.

The placeholder value is left there because so many tools depend
on looking at it. "0" is about as meaningful as the current value.

If you want to know about I/O, use iostat. I/O wait is really
only a measure of the relative time needed to process the data
versus the time needed to get it off disk. A characteristic
of the workload.

Also, when you start a CPU bound job, your I/O wait suddenly
drops to zero; yet the jobs which want the I/O are still
waiting just as much. That doesn't strike me as useful.
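
A small Python sketch of both points, assuming a single CPU and one thread that alternates between computing and waiting on disk. The function name and the `competing_cpu_job` flag are hypothetical, and the model ignores scheduling detail.

```python
def iowait_fraction(cpu_ms: float, io_ms: float, competing_cpu_job: bool = False) -> float:
    """Fraction of time a single CPU would report as I/O wait (toy model).

    cpu_ms: time to process one piece of data.
    io_ms:  time to fetch that piece of data from disk.
    """
    if competing_cpu_job:
        # The CPU is never idle, so no time is counted as wait,
        # even though each request still spends io_ms on disk.
        return 0.0
    return io_ms / (cpu_ms + io_ms)

# Same disk, different workloads: the ratio follows the workload.
for cpu_ms, io_ms in [(1.0, 9.0), (5.0, 5.0), (9.0, 1.0)]:
    print(cpu_ms, io_ms, iowait_fraction(cpu_ms, io_ms))

# Start a CPU-bound job next to the I/O-heavy workload.
print(iowait_fraction(1.0, 9.0, competing_cpu_job=True))
```

The disk takes just as long per request in the last case; only the reported number changes.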

>Hey, scan rate doesn't mean crap either, even though I read more
>articles in this group from folks that say *any* scan rate is bad.
>Should I assume Solaris 11 will have that hard-coded to zero too?

No, a scanrate is a meaningful indicator of the system being low
on memory. (Having just seen a 2-way Opteron system stressed with
a scan rate of 5 million, I wouldn't say it's quite meaningful)

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

richard...@gmail.com

Feb 23, 2005, 10:59:27 AM

Casper H.S. Dik wrote:
> The placeholder value is left there because so many tools depend
> on looking at it. "0" is about as meaningful as the current value.

Whatever. It's a no-win. Both values are useless.

> If you want to know about I/O, use iostat. I/O wait is really
> only a measure of the relative time needed to process the data
> versus the time needed to get it off disk. A characteristic
> of the workload.

If I'd like to know about I/O, I'll use my own tools that sort
processes by which one is generating the most I/O. Then I'll know what
process is creating the problem. iostat paints with broad strokes and
is used as a next alternative. I would, however, like to see better
per-process I/O metrics.
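
For illustration only, the ranking step of such a tool might look like the Python sketch below. It is not the tool referred to above: the record format and the sample data are hypothetical, and how the per-process I/O records would be collected is outside the sketch.

```python
from collections import defaultdict

def top_io_processes(records, n=3):
    """Rank processes by total bytes transferred.

    records: iterable of (pid, command, bytes_transferred) tuples.
    Returns up to n ((pid, command), total_bytes) pairs, largest first.
    """
    totals = defaultdict(int)
    for pid, command, nbytes in records:
        totals[(pid, command)] += nbytes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]

# Hypothetical per-request I/O records.
records = [
    (101, "oracle", 8192), (202, "tar", 65536), (101, "oracle", 16384),
    (303, "syslogd", 512), (202, "tar", 131072),
]
for (pid, command), total in top_io_processes(records):
    print(pid, command, total)
```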

> Also, when you start a CPU bound job, your I/O wait suddenly
> drops to zero; yet the jobs which want the I/O are still
> waiting just as much. That doesn't strike me as useful.

If you're that CPU bound, you have a bigger issue than the I/O wait and
you should be looking for CPU bound processes and what their problem is
anyway.

> No, a scanrate is a meaningful indicator of the system being low
> on memory. (Having just seen a 2-way Opteron system stressed with
> a scan rate of 5 million, I wouldn't say it's quite meaningful)
>
> Casper

Scan rate is useful in the same way that I/O wait is. In the absence of
better metrics, it must suffice. A per-process average page residency
time would be better. Then, the volatility of the working set of each
process can be examined. Any current measurement for APRT, at the
process or system level, is ad hoc and cannot be taken seriously.

Many think if they see a blip in the sr column of vmstat that they have
a memory shortage. First, the VM system scratching an itch does not
qualify as a shortfall. Second, the CPU power, bus bandwidth and disk
speeds of modern computers allow for a scan rate much higher and for
longer bursts than many will give berth for. A continuous high rate
(and that's a relative measure) is indicative of memory contention. An
extremely high spike over a short period, if it's an aberration, can be
noted, but not acted on. Such high spikes occurring frequently with the
VM system settling back down to quiescence can be disruptive and should
be treated with more physical memory.
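
The distinctions drawn above (a blip, an isolated spike, frequent spikes, a continuous high rate) can be sketched as a classifier over a series of sr samples. This Python sketch is an assumption-laden illustration: the `high` threshold, the 80% cutoff for "continuous" and the burst count for "frequent" are arbitrary values, and "high" is a relative measure that depends on the machine.

```python
def classify_scan_rate(samples, high=1000, sustained_fraction=0.8, frequent_spikes=3):
    """Classify a series of 'sr' samples, one per interval (toy thresholds)."""
    if not samples:
        return "no data"
    above = [s > high for s in samples]
    if sum(above) / len(samples) >= sustained_fraction:
        return "sustained"         # continuous high rate: memory contention
    # A burst starts wherever a high sample follows a low one.
    bursts = sum(1 for i, is_high in enumerate(above)
                 if is_high and (i == 0 or not above[i - 1]))
    if bursts >= frequent_spikes:
        return "frequent spikes"   # disruptive: consider more physical memory
    if bursts > 0:
        return "isolated spike"    # note it, but do not act on it
    return "quiet"                 # small blips are not a shortage

print(classify_scan_rate([0, 0, 12, 0, 0, 0, 0, 0]))
print(classify_scan_rate([0, 0, 50000, 0, 0, 0, 0, 0]))
print(classify_scan_rate([0, 9000, 0, 0, 8000, 0, 7000, 0]))
print(classify_scan_rate([4000, 5000, 4500, 6000, 5200, 4800, 5100, 4900]))
```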

Casper H.S. Dik

Feb 23, 2005, 4:31:31 PM

richard...@gmail.com writes:

>If I'd like to know about I/O, I'll use my own tools that sort
>processes by which one is generating the most I/O. Then I'll know what
>process is creating the problem. iostat paints with broad strokes and
>is used as a next alternative. I would, however, like to see better
>per-process I/O metrics.

So use dtrace; it allows you to do exactly that in S10.