Oracle IO tuning tips/practices on Windows

NetComrade

unread,

Feb 1, 2006, 6:18:14 PM2/1/06

to

Can anyone point to a useful resource on tuning IO for Oracle on
Windows?

I've never dealt with Oracle on Windows before, but someone asked to
help them out with IO performance problems (they can get the dell
powervault SCSI disk array to push data to about 80Megs/sec at the
(same) time when Oracle is only pushing about 20M/sec)

I know #'s are not that easy to compare, but I don't know where else
to look into besides db_multiblock_read_count.

E.g., one of their queries has the following stats:

call count cpu elapsed disk query current rows
---- ------ ----- ---------- ---------- ---------- ---- ----------
Parse 1 0.03 0.02 0 0 0 0
Execute 1 1.71 1.75 0 0 0 0
Fetch 13 6.35 150.80 13001 45741 0 300
---- ------ ----- ---------- ---------- ---------- ---- ----------
total 15 8.10 152.59 13001 45741 0 300

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
-------------------------------- Waited ---------- ------------
db file scattered read 121 1.83 4.10
db file sequential read 11137 1.85 129.46
SQL*Net message to client 13 0.00 0.00
SQL*Net more data to client 36 0.00 0.00
SQL*Net message from client 13 0.00 0.03

seems ike it's spending too much time on IO.

Thanks for any tips/links in advance

.......
We run Oracle 9.2.0.6 on RH4 AMD
remove NSPAM to email

Noons

unread,

Feb 1, 2006, 7:22:27 PM2/1/06

to

NetComrade wrote:
> Can anyone point to a useful resource on tuning IO for Oracle on
> Windows?

Nial Litchfield was puting together a compilation of resources
for tuning Oracle on Windows. Perhaps you can swing around
his website and ask?
http://www.orawin.info/services/index.php

> I've never dealt with Oracle on Windows before, but someone asked to
> help them out with IO performance problems (they can get the dell
> powervault SCSI disk array to push data to about 80Megs/sec at the
> (same) time when Oracle is only pushing about 20M/sec)

Check that the Oracle datafiles are in dedicated disk partition(s)
that does not have any other Windows-related files in it.
Check that these partition(s) was/were created and formatted with a
sector
size that matches the database block size (8K?).
Make sure it's not a case of "all database files in one drive".
Make sure no one has put a Windows paging file in the same disk drive
as the database. Make sure the db server is not being used as well
as a backup domain controller, file server, print server or worst of
all, a PDC. Microsoft recommends against any of these being used
as database servers for SQL Server and it's also valid for Oracle.

>From the figures you quoted, it looks like there is a lot of
physical IO going on for only 300 rows returned.
Some SQL tuning to reduce the amount of IO needed might be
the order of the day? Look into indexing as well: could be excessive
range scanning in which case more selective indexing might
be appropriate.

Joel Garry

unread,

Feb 1, 2006, 7:26:31 PM2/1/06

to

>seems ike it's spending too much time on IO.

Is it? Remember, db file sequential read is random access (great
semantics, eh? Well, it could be getting blocks one at a time in an
FTS...). Is the 80M/sec figure random access?
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/instance_tune.htm#15959

This is the sort of thing voodoo tuners love, because often the problem
can be morphed by simply enlarging the SGA or using the various buffer
pools without determining actual cause. Then you get cpu bound and
throw more hardware at it!

So, what is your cpu_multiblock_count, and why do you think changing it
would help? See
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/2f1e21ecdaca8331/25807a324350e834?lnk=st&q=insubject%3AMulti-Block+insubject%3Aread+insubject%3Acount&rnum=1&hl=en#25807a324350e834

jg
--
@home.com is bogus. Q -How do you get your technology work done, when
your IT department is more hindrance than help? A- You don't. You fire
them and outsource their jobs to India.
http://ask.slashdot.org/askslashdot/06/02/01/2117240.shtml

bdb...@gmail.com

unread,

Feb 1, 2006, 8:12:30 PM2/1/06

to

If I read the figures correctly, your average wait for single block io
is 11.6 ms.
That is horrid.
I would think that direct attached scsi raid with relatively new
hardware should have an average response time under 5 ms - and if a SAN
with a decent size cache is used - around 1 to 2 ms.

But you're starting in the wrong place.
Tune the SQL first.
45741 blocks being accessed is likely too high, but as the SQL is not
displayed here along with information regarding rows returned, row
source stats, version for the db server software on this box that
doesn't match your sig - we're just guessing.

As far as IO throughput, if you're only fetching 1 block at a time,
you're not going to see 80 MB/sec. You didn't list how many
controllers, controller channels and hard drives are in this
configuration. Say that a drive supports 500 IOPs and you have 4 drives
as RAID 10. The max throughput one would likely see would be 16.3
MB/sec - cache aside.

There is a paper on the hotsos site called "aligning blocks with stripe
sizes" or something to that effect.
You might also be interested in James Morle's "Scaling 8i" that is
downloadable from his website - check out the Oaktable site for links -
http://www.oaktable.net
Juan Loiza's paper "How to start living and stop defragmenting" is a
classic (introduces the SAME concept).

please list the following:

db server software version and patchset
db_block_size
number of SCSI RAID controllers and channels
number of hard drives
RAID configuration (e.g. 8 disk RAID 10)
stripe size (e.g. 256 KB stripe)
Read-ahead settings for the RAID volume
filesystem block size

An 8 drive RAID 10 volume of stripe size 256 KB would be a good start.
with an 8192 byte db block size, a db_file_multiblock_read_count = 32
would read a full stripe. That doesn't help much with single block io,
unless the blocks are prefetched and accessed before they are aged out.

If this is strictly an OLTP app you might want to consider a smaller
stripe size, such as 128 KB. Make sure that the RAID stripe size is a
multiple of the db_block_size.

What filesystem block size did you choose for the filesystem(s) that
support your datafiles? I hope that it matches your db_block_size.

hth

-bdbafh

Noons

unread,

Feb 2, 2006, 1:39:25 AM2/2/06

to

bdb...@gmail.com wrote:

> What filesystem block size did you choose for the filesystem(s) that
> support your datafiles? I hope that it matches your db_block_size.
>

He's only got one file system: NTFS. Don't forget, it's Windoze.
Block size is immaterial, but sector size isn't: if it doesn't approach
the
db block size, then he's immediately got an IO performance problem
in his hands. Ideally, it should be the same as the db block size.

yon...@yahoo.com

unread,

Feb 2, 2006, 10:21:42 AM2/2/06

to

Noons wrote:
>
> Check that these partition(s) was/were created and formatted with a
> sector size that matches the database block size (8K?).

I think you meant allocation unit or cluster size, not sector size.
It's the number you provide for format /a: switch. A sector size can
only be changed by the hard drive maker. I'm not sure if bigger sector
size hard drives are commercially available.

Yong Huang

NetComrade

unread,

Feb 2, 2006, 2:10:51 PM2/2/06

to

On 1 Feb 2006 16:22:27 -0800, "Noons" <wizo...@yahoo.com.au> wrote:

>NetComrade wrote:
>> Can anyone point to a useful resource on tuning IO for Oracle on
>> Windows?
>
>Nial Litchfield was puting together a compilation of resources
>for tuning Oracle on Windows. Perhaps you can swing around
>his website and ask?
>http://www.orawin.info/services/index.php

Nial needs to tune her blog software a bit :)
Thanks for the link

>
>
>> I've never dealt with Oracle on Windows before, but someone asked to
>> help them out with IO performance problems (they can get the dell
>> powervault SCSI disk array to push data to about 80Megs/sec at the
>> (same) time when Oracle is only pushing about 20M/sec)
>
>
>Check that the Oracle datafiles are in dedicated disk partition(s)
>that does not have any other Windows-related files in it.

The db is on the array, OS on local disks

>Check that these partition(s) was/were created and formatted with a
>sector
>size that matches the database block size (8K?).

That's why i needed some Ora/Win resource, I have no idea how to check
those things on Windows.

>Make sure it's not a case of "all database files in one drive".

It's SAME.. RAID 10 across 12 disks (6disk stripe)

>Make sure no one has put a Windows paging file in the same disk drive
>as the database. Make sure the db server is not being used as well
>as a backup domain controller, file server, print server or worst of
>all, a PDC. Microsoft recommends against any of these being used
>as database servers for SQL Server and it's also valid for Oracle.

Nah, this is a new machine and a new array bought just for the purpose
to offload some reporting off production machines.

>>From the figures you quoted, it looks like there is a lot of
>physical IO going on for only 300 rows returned.
>Some SQL tuning to reduce the amount of IO needed might be
>the order of the day? Look into indexing as well: could be excessive
>range scanning in which case more selective indexing might
>be appropriate.

SQL tuning is not an option :) (I wouldn't touch the SQL I saw)
I know SQL tuning might be able to help, but this is a reporting
database, and IO is obviously a problem.

NetComrade

unread,

Feb 2, 2006, 2:11:25 PM2/2/06

to

On 2 Feb 2006 07:21:42 -0800, yon...@yahoo.com wrote:

>I think you meant allocation unit or cluster size, not sector size.
>It's the number you provide for format /a: switch. A sector size can
>only be changed by the hard drive maker. I'm not sure if bigger sector
>size hard drives are commercially available.

Any idea how to get to that info?

thanks.

NetComrade

unread,

Feb 2, 2006, 2:43:15 PM2/2/06

to

On 1 Feb 2006 17:12:30 -0800, bdb...@gmail.com wrote:

>If I read the figures correctly, your average wait for single block io
>is 11.6 ms.
>That is horrid.
>I would think that direct attached scsi raid with relatively new
>hardware should have an average response time under 5 ms - and if a SAN
>with a decent size cache is used - around 1 to 2 ms.

I concur.

>
>But you're starting in the wrong place.
>Tune the SQL first.
>45741 blocks being accessed is likely too high, but as the SQL is not
>displayed here along with information regarding rows returned, row
>source stats, version for the db server software on this box that
>doesn't match your sig - we're just guessing.

The SQL gets various summaries... There is no doubt in my mind that
the SQL or the app could be tuned significantly, the plan is over 60
lines long and has more than a dozen 'nested loops' and a # of outer
joins. I don't really want to try to 'tune' it, as I am not going to
be too involved in this app (this is just a favor to someone, but I
know nothing about oracle on windows). I know this is a wrong approach
to 'tune' things, but I'd like to help to get the IO perform
adequately.

>As far as IO throughput, if you're only fetching 1 block at a time,
>you're not going to see 80 MB/sec. You didn't list how many
>controllers, controller channels and hard drives are in this
>configuration. Say that a drive supports 500 IOPs and you have 4 drives
>as RAID 10. The max throughput one would likely see would be 16.3
>MB/sec - cache aside.

The array is Dell PowerVault V220. There is a 128M cache.. more
details below

>There is a paper on the hotsos site called "aligning blocks with stripe
>sizes" or something to that effect.

I believe we already tried that (see below)

>You might also be interested in James Morle's "Scaling 8i" that is
>downloadable from his website - check out the Oaktable site for links -
>http://www.oaktable.net
>Juan Loiza's paper "How to start living and stop defragmenting" is a
>classic (introduces the SAME concept).

Brand new array, nothing to defrag

>please list the following:
>
>db server software version and patchset

9.2.0.7

>db_block_size
8K and 16K

>number of SCSI RAID controllers and channels

1 and 1 (I think)
>number of hard drives
12

>RAID configuration (e.g. 8 disk RAID 10)

12 disk RAID10 (6disk stripe)

>stripe size (e.g. 256 KB stripe)

64K

>Read-ahead settings for the RAID volume

adaptive read ahead and direct I/o for write

>filesystem block size
yet unknown..

>An 8 drive RAID 10 volume of stripe size 256 KB would be a good start.
>with an 8192 byte db block size, a db_file_multiblock_read_count = 32
>would read a full stripe. That doesn't help much with single block io,
>unless the blocks are prefetched and accessed before they are aged out.

Their db_file_multiblock_read_count is set to 8. Which is fine for an
8K block size and 64K stripe unit size. I don't really know how the
parameter works with you also have 16K tablespaces.. I advised playing
with the parameter (e.g. set it to 8x6disks=48), but it didn't seem to
help much.

Their SGA size is set like this:
db_16k_cache_size 1367343104
db_cache_size 209715200

The majority of stuff is in the 16K tablespaces I guess
(You can see I don't know that much about the system, I've just seen
some trace files and statspack reports)

>If this is strictly an OLTP app you might want to consider a smaller
>stripe size, such as 128 KB. Make sure that the RAID stripe size is a
>multiple of the db_block_size.

This is a 'DSS'.

A little more background: the 'customer' has been trying to offload
reporting off the main 2 databases. They're using some 3rd party tool
to combine the 2 databases into 1 'reporting' database. What they have
found though, is that their SAN (where the production databases are
sitting) way outperforms the little 'reporting' SCSI array they got.
As you and others have noted, this relatively new hardware should be
performing much better.

DA Morgan

unread,

Feb 2, 2006, 2:50:18 PM2/2/06

to

NetComrade wrote:
> Can anyone point to a useful resource on tuning IO for Oracle on
> Windows?

With apologies but I can't resist:

format c:

Install Linux.
--
Daniel A. Morgan
http://www.psoug.org
damo...@x.washington.edu
(replace x with u to respond)

Joel Garry

unread,

Feb 2, 2006, 5:23:13 PM2/2/06

to

NetComrade wrote:

> Their SGA size is set like this:
> db_16k_cache_size 1367343104
> db_cache_size 209715200

http://groups.google.com/group/comp.databases.oracle.server/msg/f05a24f01c03d312?dmode=source&hl=en
See especially #4 and #6. I would add, if you were to put everything
in the same blocksize, you might be able to make a lot better use of
multiple buffer pools to shuffle around physical i/o's and SGA froth.

I'm also wondering if you are pounding on some hot blocks, resulting in
those seq waits.

jg
--
@home.com is bogus.

http://www.eweek.com/article2/0,1895,1918198,00.asp

NetComrade

unread,

Feb 2, 2006, 6:12:53 PM2/2/06

to

On Thu, 02 Feb 2006 11:50:18 -0800, DA Morgan <damo...@psoug.org>
wrote:

>NetComrade wrote:
>> Can anyone point to a useful resource on tuning IO for Oracle on
>> Windows?
>
>With apologies but I can't resist:
>
>format c:
>
>Install Linux.

That's what I've been suggesting for years ;) It doesnt matter which
OS you run your Oracle on, so might as well run it on something that
works.

bdb...@gmail.com

unread,

Feb 2, 2006, 6:18:32 PM2/2/06

to

Have you considered provisioning the RAID volumes in groups of 4 disks
as RAID 10, and creating multiple datafiles per tablespace, such that
you would have 3 mount points, one per RAID volume (aka LUN).

LUN 0 ID 0,1,2,3
LUN 1 ID 4,5,8,9
LUN 2 ID 10,11,12,13

Create one partition per LUN, for each filesystem blocksize.
If all tablespaces have an 8 KB blocksize, one partition per LUN would
do, formatted with an 8192 byte blocksize. As you have 2 block sizes,
create 2 partitions per LUN, create a second partition per LUN with a
16384 blocksize.

Now you have a setup where the datafile blocks, filesystem and RAID
stripes align.

You have increased overhead in terms of having to create sets of
datafiles for each tablespace, so its not as simple as your SAME setup.

What you would gain, is that for single block reads, a read request
would only have to involve a single drive. This should provide higher
concurrency and lower average response time than your existing
configuration.

Currently, with a 64 KB stripe size (over 6 disks) and an 8 KB block
size, half of the blocks per stripe are divided over multiple hard
drives (1,4,5,8 are not). By not aligning the database file blocks and
the physical hard drives twice as many physical reads are required half
the time to access a single block.

That is a significant penalty.

hth.

-bdbafh

yon...@yahoo.com

unread,

Feb 2, 2006, 10:43:09 PM2/2/06

to

If you're on Windows XP or 2003, you have fsutil:

C:\>fsutil fsinfo ntfsinfo c:
...
Bytes Per Sector : 512
Bytes Per Cluster : 4096
...

On all versions, you use chkdsk:

C:\>chkdsk
The type of the file system is NTFS.
...
4096 bytes in each allocation unit.
...

Chkdsk doesn't tell you sector size but that's almost never different
from 512 bytes.

I'm sure there're non-Microsoft tools that can give this info.

Yong Huang

Noons

unread,

Feb 3, 2006, 12:33:47 AM2/3/06

to

yon...@yahoo.com wrote:
> I think you meant allocation unit or cluster size, not sector size.
> It's the number you provide for format /a: switch. A sector size can
> only be changed by the hard drive maker. I'm not sure if bigger sector
> size hard drives are commercially available.
>

Yes.

Noons

unread,

Feb 3, 2006, 12:57:00 AM2/3/06

to

NetComrade wrote:
> >
> >Nial Litchfield was puting together a compilation of resources
> >for tuning Oracle on Windows. Perhaps you can swing around
> >his website and ask?
> >http://www.orawin.info/services/index.php
>
> Nial needs to tune her blog software a bit :)
> Thanks for the link

"his". Not "her".

> >> I've never dealt with Oracle on Windows before, but someone asked to
> >> help them out with IO performance problems (they can get the dell
> >> powervault SCSI disk array to push data to about 80Megs/sec at the
> >> (same) time when Oracle is only pushing about 20M/sec)

How do they get the scsi disk pushing 80Mb/sec?

Another thing, do some checks with Windows performance monitor:
you want to see what size queues there are in this disk device,
there is a specific monitor for that.
You might also consider a defrag of those disks (or checking it) :
there is just way too many waits on "sequential reads". And my
guess is the allocation unit/cluster size might be off mark. Having
two block sizes though means ideally you need two partitions,
each with its own optimal block size. I don't know how dbfmbr
will react with two block sizes but I know that "sequential reads"
are not affected by its setting. Unless there is something very
atypical going on.

> The db is on the array, OS on local disks

Still, check that no one has created a paging file
in the array.

> That's why i needed some Ora/Win resource, I have no idea how to check
> those things on Windows.

Yong mentioned chkdsk, go with that.

> It's SAME.. RAID 10 across 12 disks (6disk stripe)

Given that you're getting mostly sequential read waits - which
are direct reads with indexing- it is probably a good idea to
set the stripe size pretty small. It's only an advantage with
large values if you're doing lots of FTS. Otherwise,
you are basically doing IO you might not need to do.

> We run Oracle 9.2.0.6 on RH4 AMD

and so you should!...

Niall Litchfield

unread,

Feb 3, 2006, 7:32:38 AM2/3/06

to

NetComrade wrote:
> On 1 Feb 2006 16:22:27 -0800, "Noons" <wizo...@yahoo.com.au> wrote:
>
> >NetComrade wrote:
> >> Can anyone point to a useful resource on tuning IO for Oracle on
> >> Windows?
> >
> >Nial Litchfield was puting together a compilation of resources
> >for tuning Oracle on Windows. Perhaps you can swing around
> >his website and ask?
> >http://www.orawin.info/services/index.php
>
> Nial needs to tune her blog software a bit :)
> Thanks for the link

And indeed get some more Windows specific content :). If there was a
specific issue that you have drop me a line privately and I'll have a
look at it. The software and hosting provider I'm using is very new to
me.

> >> I've never dealt with Oracle on Windows before, but someone asked to
> >> help them out with IO performance problems (they can get the dell
> >> powervault SCSI disk array to push data to about 80Megs/sec at the
> >> (same) time when Oracle is only pushing about 20M/sec)

How are they measuring the Oracle specific IO throughput?

> >Check that the Oracle datafiles are in dedicated disk partition(s)
> >that does not have any other Windows-related files in it.
>
> The db is on the array, OS on local disks
>
> >Check that these partition(s) was/were created and formatted with a
> >sector
> >size that matches the database block size (8K?).
>
> That's why i needed some Ora/Win resource, I have no idea how to check
> those things on Windows.

fsutil from the Windows Server Resource kit (this is this laptop)

C:\Program Files\Windows Resource Kits\Tools>fsutil fsinfo ntfsinfo c:
NTFS Volume Serial Number : 0xee80613680610705
Version : 3.1
Number Sectors : 0x0000000004a813ff
Total Clusters : 0x000000000095027f
Free Clusters : 0x00000000001ec232
Total Reserved : 0x0000000000000000

Bytes Per Sector : 512
Bytes Per Cluster : 4096

Bytes Per FileRecord Segment : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length : 0x000000000cf9c000
Mft Start Lcn : 0x00000000000c0000
Mft2 Start Lcn : 0x0000000000000010
Mft Zone Start : 0x00000000008dfe20
Mft Zone End : 0x00000000008fd4c0

Niall Litchfield
Oracle DBA
http://www.orawin.info/services/

Niall Litchfield

unread,

Feb 3, 2006, 7:45:07 AM2/3/06

to

NetComrade wrote:
> That's what I've been suggesting for years ;) It doesnt matter which
> OS you run your Oracle on, so might as well run it on something that
> works.

LOL

Niall

Jack

unread,

Feb 6, 2006, 3:40:39 AM2/6/06

to

Answers embedded:
"NetComrade" <netcomr...@bookexchange.net> wrote in message
news:nom4u1hajl4hc9579...@4ax.com...

> On 1 Feb 2006 17:12:30 -0800, bdb...@gmail.com wrote:
>
> The array is Dell PowerVault V220. There is a 128M cache.. more
> details below

It would be better with 256M or even more

>>db_block_size
> 8K and 16K
>>number of SCSI RAID controllers and channels
> 1 and 1 (I think)
>>number of hard drives
> 12
>>RAID configuration (e.g. 8 disk RAID 10)
> 12 disk RAID10 (6disk stripe)
>>stripe size (e.g. 256 KB stripe)
> 64K

Better would be 256KB or even 1MB
(Your drives are doing seek operations most of the time!)

>>An 8 drive RAID 10 volume of stripe size 256 KB would be a good start.
>>with an 8192 byte db block size, a db_file_multiblock_read_count = 32
>>would read a full stripe. That doesn't help much with single block io,
>>unless the blocks are prefetched and accessed before they are aged out.
>
> Their db_file_multiblock_read_count is set to 8. Which is fine for an

But 32 would be better after larger (256MK) stripe set

Jack