
ORACLE on Linux - IO bottleneck


Wyvern

Feb 8, 2006, 12:57:02 PM
Hello,

we have Red Hat Linux AS3 (update 1) running on an 8-processor IA64 Itanium 2
machine with 16 GB of RAM. Oracle 9i (an OLTP database) runs on raw devices
with an Oracle block size of 8K, two QLogic Fibre Channel adapters (QLA2340)
and an EMC Symmetrix disk array. We also have asynchronous I/O enabled in Oracle.
The kernel version is "2.4.21-9.EL #1 SMP".
All tablespaces are created with an 8K block size.

Some oracle params:
---------------------------------------------------------------------------------------------------------------
db_block_buffers integer 0
db_block_checking boolean FALSE
db_block_checksum boolean TRUE
db_block_size integer 8192
db_cache_advice string ON
db_cache_size big integer 1090519040
db_create_file_dest string
db_create_online_log_dest_1 string
db_create_online_log_dest_2 string
db_create_online_log_dest_3 string
db_create_online_log_dest_4 string
db_create_online_log_dest_5 string
db_domain string
db_file_multiblock_read_count integer 8
db_file_name_convert string
db_files integer 512
db_keep_cache_size big integer 0
dblink_encrypt_login boolean FALSE
db_recycle_cache_size big integer 0
dbwr_io_slaves integer 0
db_writer_processes integer 1
db_16k_cache_size big integer 0
db_2k_cache_size big integer 0
db_32k_cache_size big integer 0
db_4k_cache_size big integer 0
db_8k_cache_size big integer 0
filesystemio_options string ASYNCH
---------------------------------------------------------------------------------------------------------------

Look at this:
--------------------------------------------------------------------------------------------------------------

# sar (a short copy-pasted fragment; the whole day looks similar)
00:00:00 CPU %user %nice %system %iowait %idle

.......
10:05:00 all 29,54 0,00 6,14 27,96 36,36

10:10:00 all 46,80 0,00 5,82 15,32 32,06

10:15:00 all 30,88 0,00 2,96 17,90 48,25

10:20:00 all 32,21 0,00 7,69 19,01 41,09

10:25:00 all 37,14 0,00 6,27 38,62 17,97

10:30:00 all 33,94 0,00 7,20 29,62 29,24

10:35:00 all 52,47 0,00 10,31 28,08 9,15

10:40:00 all 55,75 0,00 5,87 13,78 24,60

10:45:00 all 26,06 0,00 7,99 12,23 53,72

--------------------------------------------------------------------------------------------------------------

# iostat 2 /dev/sdc1 (there are many disks but this one carries most of the load)


avg-cpu: %user %nice %sys %iowait %idle
13,02 0,00 0,40 1,85 84,73


Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdc1 1534,25 2673,31 174,09 5344 348


avg-cpu: %user %nice %sys %iowait %idle
13,20 0,00 1,52 11,42 73,86


Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdc1 8482,14 15847,74 125,06 31680 250
--------------------------------------------------------------------------------------------------------------

# iostat -x 2 /dev/sdc1


Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
sdc1 6833,34 114,06 7902,86 95,05 14743,20 209,10 7371,60
104,55 1,87 34,12 4,26 0,12 93,15


avg-cpu: %user %nice %sys %iowait %idle
15,23 0,00 2,11 11,32 71,34


Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
sdc1 7149,99 25,01 8151,98 35,52 15295,47 134,07 7647,73
67,03 1,88 32,95 4,03 0,11 92,75
--------------------------------------------------------------------------------------------------------------

I've only shown /dev/sdc statistics because almost ALL the raw devices
are on that device.

Well, now the questions (of course! ;-) ):

1. There is generally poor performance across the whole system: at the OS level,
at the Oracle level and at the application level. Can we assume there
is an I/O bottleneck?
2. Are the "wrqm/s" / "rrqm/s" values correct?

We have different databases with similar characteristics running on similar
hardware but with ext3, and we get much better performance.
We also have raw devices under AIX (4.3 and 5.2) and the performance there is
PERFECT.

I've been reading various manuals and documentation about raw devices
and direct I/O on Linux (from Oracle and Red Hat). Everything makes me
think that the Oracle block size (8K) should not drastically affect
performance, because we use direct I/O.

Well, we've been analysing various statistics from the storage system
(EMC Symmetrix), and the number of I/Os the /dev/sdc device is doing is
near the hardware limit (about 8000), with an average I/O size of roughly
2K. I don't know why this happens when all the I/O against this device
is done by Oracle and Oracle has an 8K db_block_size.

Any ideas?

Maybe this message is off-topic, but I don't think it is; sorry if it is.

Some help, please, please, please...

Thanks in advance.

Fabrizio Magni

Feb 8, 2006, 2:31:50 PM
Wyvern wrote:
>
> # iostat -x 2 /dev/sdc1
>
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdc1 6833,34 114,06 7902,86 95,05 14743,20 209,10 7371,60
> 104,55 1,87 34,12 4,26 0,12 93,15
>
>
> cpu-med: %user %nice %sys %iowait %idle
> 15,23 0,00 2,11 11,32 71,34
>
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdc1 7149,99 25,01 8151,98 35,52 15295,47 134,07 7647,73
> 67,03 1,88 32,95 4,03 0,11 92,75
> --------------------------------------------------------------------------------------------------------------
>

From what I read here, your system is issuing I/O in 512-byte units
(rsec/s is exactly twice rkB/s, and likewise for the writes).
Physically the writes end up at about 2K each (wkB/s divided by w/s), while the
reads are under 1K (7647,73 / 8151,98).
The service time is low (0,11), probably because of the really short reads,
but the result is inefficient I/O.

If the I/O is sequential you would benefit from an I/O scheduler...

> I've only shown /dev/sdc statistics because almost ALL the raw devices
> are on that device.
>
> Well, now the questions (of course! ;-) ):
>
> 1. There is generally poor performance across the whole system: at the OS level,
> at the Oracle level and at the application level. Can we assume there
> is an I/O bottleneck?

Possible but not sure.
You are using your disks poorly, sorry, :(

> 2. Are the "wrqm/s" / "rrqm/s" values correct?
>

No, they are low. You are not merging the I/O properly (maybe because
your reads are not sequential).


>
> I've been reading different manuals and documentation about Raw Devices
>
> and Direct I/O in Linux (from Oracle an Redhat). Everything makes me
> think that oracle
> blocksize (8K) should not affect performance drastically because we use
> DirectIO.
>

Direct I/O is not always the solution...

> Well, we´ve been analizing diferent statistics from Storage system
> (ECM - Symmetrix) and the number of IOs the /dev/sdc device is doing is
>
> near hardware limit (about 8000) and the medium IO size of all of them
> is
> more or less 2k. I don´t know why this happen when all the IO against
> this
> device is done by oracle and oracle have an 8K db_block_size.
>

The storage statistics are right for writes. Check if the reads are done
in about 1k.

> Any ideas?
>

Just to be sure: check the readahead of your device.

( The /sys pseudo filesystem contains the information. Even hdparm can
give you the number).
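
(A minimal example of what that check looks like - the device name is taken from
this thread; on a 2.4 kernel there is no /sys, so blockdev or hdparm are the
practical options:)

----------------------------------------------------------------
# blockdev --getra /dev/sdc      (readahead, in 512-byte sectors)
# hdparm -a /dev/sdc             (the same value as hdparm reports it)
----------------------------------------------------------------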

--
Fabrizio Magni

fabrizi...@mycontinent.com

replace mycontinent with europe

Noons

Feb 8, 2006, 9:05:51 PM
Wyvern wrote:
> Hello,
>
> we've RedHat Linux AS3 (upd. 1) running in a 8 proccessor IA64 Itanium
> II machine with RAM-16Gb. Oracle 9i (OLTP database) under RawDevices
> with
> an oracle blocksize of 8k and 2 QLogic Fibre Channel Adapters (QLA2340)
> with
> Symmetrix EMC disk array. We also have ASync I/O activated in oracle.
> Kernel version is "2.4.21-9.EL #1 SMP"

Hmmmm.....
You have a 64-bit processor system, but the version of
RH you're running seems to be the 32-bit version, and it is also
four update levels below what Oracle recommends as
a minimum for 9i (update 5, whereas you're on update 1).

I'd first look into getting the correct version of Linux: 64-bit and
definitely higher than RHAS3 upd.1.
And then a 64-bit version of Oracle for Linux.
Call Oracle support and get them to tell you the exact versions
of everything you should be running: I've found that's faster
than poring through the thousands of conflicting, out-of-date
and out-of-context pieces of information on as many sites.

Only after that would I start looking into IO problems.


> We've different DDBB with similar characteristics running under similar
> hardware but with EXT3 and we've got much better performance.
> We also have RAW DEVICES under AIX (4.3 and 5.2) and the performance is

Have you patched up Oracle and RH with all the required
fixes to run aio in 9i and RH3? There's a bunch of them
for both. The 64-bit thing is another clanger in that
equation.

If I were you, I'd be looking at running at the very least RHAS4
for 64-bit with Oracle 9i for 64-bit. RHAS4 seems to require a
lot less patches to get going with Oracle and aio.

Having said that: we're running RHAS3-upd.5, 9.2.0.6, all patched up,
and just direct IO. 8K block size. Without a saturated IO controller,
I'm getting more than 500 direct IO/sec operations in a test program.
But when the controller saturates, I get a lot less than that. It's a good
indicator of when we need to start splitting things across
controllers/hyperchannels.

Mladen Gogala

Feb 9, 2006, 12:27:04 AM
On Wed, 08 Feb 2006 18:05:51 -0800, Noons wrote:

> Having said that: we're running RHAS3-upd.5, 9.2.0.6, all patched up,
> and just direct IO. 8K block size. Without a saturated IO controller,
> I'm getting more than 500 direct IO/sec operations in a test program.
> But when the controller saturates, I get a lot less than that. It's a
> good
> indicator of when we need to start splitting things across
> controllers/hyperchannels.

Bear in mind that at some point all those interrupt requests will start
saturating your system bus. PC buses, even the server versions, are
nowhere near the capacity of the true midrange SMP servers like HP 9000
series or IBM P960 series machines. Modern disk controllers do
massive amounts of DMA communication with memory, using the same system
bus that CPU boards use to synchronize caches, that network controllers
use to notify the CPU of network interrupts, and that all peripheral
devices use to communicate with the CPU and RAM. My experience tells me that
no matter how good a PC server you buy, you will never get more than 2500
I/O operations per second out of it. On a heavily used OLTP database, that
amounts to 200-300 concurrent users, with up to 50 active at a time.
When you get there, you are simply in need of a more powerful box.
Linux itself makes things hard to measure. One critical thing that Linux
doesn't do for you is measure I/O requests per process. The only
monitoring utility that can do that is atop, with its home page at
http://www.atcomputing.nl/Tools/atop; it offers kernel patches that
need to be installed in order for I/O accounting to work.
Second, Linux doesn't show you the time spent on the interrupt stack.
You cannot see whether your motherboard is loaded to capacity or not,
because you cannot see how much of the system time is actually spent
servicing interrupts. On HP-UX, GlancePlus does show you that. Linux is
not an OS that I'd recommend for heavy-duty serious processing. Linux
kernel 2.6 is an unmitigated disaster and I would urge anybody to think
twice before entrusting critical OLTP systems which must provide good
response time to Linux.

--
http://www.mgogala.com

Wyvern

Feb 9, 2006, 2:35:14 AM
Hello,

first of all, thanks for your answer...

> From what I read here: your system is asking for 512byte (wsec/wkB and
> rsec/rkB) read and writes.
> Physically the writes are done in 2k (w/wkB) while the read in less of
> 1k (7647,73/8151,98).
> The service time is low (0,11) probably due to the really short reads
> but it result in an inefficient operation.
>
> If the I/O is sequential you would benefit of an I/O scheduler...

I didn't know what an I/O scheduler was. I've been looking into it, and
what I've seen is that kernel 2.6 implements one but kernel 2.4
doesn't; is that right?

I haven't fully understood what the I/O scheduler does, but I'll read
more about it...

> > 1. There is generally poor performance across the whole system: at the OS level,
> > at the Oracle level and at the application level. Can we assume there
> > is an I/O bottleneck?
>
> Possible but not sure.
> You are using your disks poorly, sorry, :(

Yes, OF COURSE, that's the one thing I'm completely clear on, and it
is my main problem...

> > 2. Are the "wrqm/s" / "rrqm/s" values correct?
> >
>
> No, they are low. You are not merging the I/O properly (maybe because
> your reads are not sequential).

OK. Reads should not be sequential in an OLTP database, am I wrong? So
I suppose the reads are not sequential, as you say...

> > I've been reading different manuals and documentation about Raw Devices
> >
> > and Direct I/O in Linux (from Oracle an Redhat). Everything makes me
> > think that oracle
> > blocksize (8K) should not affect performance drastically because we use
> > DirectIO.
> >
>
> Direct I/O is not always the solution...

Noted! Thanks...

> The storage statistics are right for writes. Check if the reads are done
> in about 1k.

Sure enough, the reads are at about 1K.

> Just to be sure: check the readahead of your device.
>
> ( The /sys pseudo filesystem contains the information. Even hdparm can
> give you the number).

hdparm says the readahead on that device (and on all the others) is 120. I
didn't know what this parameter meant, but I've been reading about it, and
what I've understood is that for non-sequential reads the readahead should
be smaller than for sequential reads of big files; is that right?

Thanks for everything, Fabrizio.

Wyvern

Feb 9, 2006, 2:55:09 AM
Hello,

We have many versions of Red Hat across different installations. We want to
standardize all of them, but on RHAS3 rather than by upgrading to RHAS4. About
the update, you're right, we should go to update 5.

On the other hand, we've got the 64-bit build of Oracle 9i, release
9.2.0.5, on every 64-bit Linux machine, with better performance and
without problems, also running on RHAS3, so I'm not sure the OS version is
the main problem here rather than the way this system is configured and "tuned".

In any case, thanks for your answer, and we'll consider updating
RHAS3 to update 5.

Thanks again.

Fabrizio Magni

Feb 9, 2006, 3:30:28 AM
Wyvern wrote:
>
> I didn´t know what was an IO scheduler. I´ve been looking for and
> what I´ve seen is that kernel 2.6 implements one but kernel 2.4
> doesn´t, isn´t it?
>

In the "old" times of 2.4 kernel it was called I/O elevator.


>
> OK. Reads should not be secuential in a OLTP database, I´m wrong? So,
> I suppose reads are not secuential as you say ...
>

That can be true at the RDBMS level, but the I/O at the OS level is different.
Oracle asks for 8K blocks, so to the OS (on raw devices) each one is 16
contiguous 512-byte reads or writes that could be merged.
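
To put numbers on that (my own back-of-the-envelope from your iostat -x output,
so treat it as a rough check rather than gospel): if those 512-byte requests
were being merged back into full 8K blocks, avgrq-sz would sit around 16
sectors. Instead you see avgrq-sz = 1,88 and, per read,
15295,47 rsec/s / 8151,98 r/s = roughly 1,9 sectors, i.e. under 1K, so
essentially nothing is being merged on the way down.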

Personally I would try the same workload on a filesystem (on this
configuration with and without direct I/O).


> hdparm says readahead in that device (and in all other) is 120. I
> didn´t know what this parameter meant but I´ve been reading about and
> what I´ve understood is that for non-secuential reads readahead should
> be smaller than in a secuentially big file reading, isn´t it?
>

On most OLTP systems it can be disabled, reducing I/O wait (and maybe service time).
Of course, test before implementing it!
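
(Just as a sketch of what I mean, not something to run blindly on your
production box - the device name is taken from your post:)

----------------------------------------------------------------
# blockdev --setra 0 /dev/sdc     (disable readahead on the device)
# blockdev --getra /dev/sdc       (verify; 0 means no readahead)
----------------------------------------------------------------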

Regards

Mladen Gogala

Feb 9, 2006, 8:19:57 AM
On Wed, 08 Feb 2006 23:35:14 -0800, Wyvern wrote:

> I´ve not understood very well the IO Scheduler function but I´ll read
> more about it ...

There isn't much to understand: the process scheduler decides which process to
execute first; the I/O scheduler decides which one, among several
simultaneous I/O requests to the same device, should be fulfilled first.
The need for scheduling arises only if you have multiple simultaneous
requests for the same device most of the time, in which case you have a
problem with or without the scheduler. In addition to that, most of the
smart SCSI controllers have their own schedulers. The effects of the I/O
scheduler are much overrated.

--
http://www.mgogala.com

Fabrizio Magni

Feb 9, 2006, 9:14:20 AM


Hi Mladen,
actually the I/O scheduler(s) work even if the I/O requests are not
simultaneous.
Typical is the merging of I/O for sequential reads or writes.

The advantages can be tremendous.

Here is an example (same hardware, same device):

time dd if=/dev/zero of=/dev/raw/raw3 bs=8k count=10000
10000+0 records in
10000+0 records out

real 0m41.063s
user 0m0.016s
sys 0m0.412s

time dd if=/dev/zero of=/u02/foo bs=8k count=10000
10000+0 records in
10000+0 records out

real 0m2.324s
user 0m0.005s
sys 0m0.536s

The second dd has its I/O merged.

Consider this controversial, but the I/O scheduler eliminates the need to
have the Oracle block size "aligned" to the file system block size (even without
direct I/O).
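
(If you want to repeat the comparison on your own box, one way - assuming a dd
new enough to understand oflag=direct, and using a scratch file and a device
name that are only illustrative - is to run the same write with and without the
page cache while watching iostat in a second terminal:)

----------------------------------------------------------------
time dd if=/dev/zero of=/u02/foo bs=8k count=10000 oflag=direct
time dd if=/dev/zero of=/u02/foo bs=8k count=10000

# iostat -x 2 /dev/sdc1      (in another terminal; compare avgrq-sz in the two runs)
----------------------------------------------------------------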

Regards

Wyvern

Feb 9, 2006, 10:52:30 AM

> Hi Mladen,
> actually the I/O scheduler(s) works even if the I/O request are not
> simultaneous.
> typical is the merging of I/O for sequential reads or writes.
>
> The advantages can be tremendous.
>
> Here is an example (same hardware, same device):
>

......

> The second dd has its I/O merged.
>
> Consider this controversial but the I/O scheduler eliminate the need to
> have oracle block size "aligned" to file system block size (even without
> direct I/O).
>

What you say, and the example you've shown, is interesting. This morning,
looking at different I/O parameters at the device level with the "blockdev"
command, I saw that we have a block size of 1K on the device and, as I
told you, a readahead of 120.
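
(For reference, this is the sort of thing I was looking at - take it as a
sketch, since the exact flags depend on the blockdev version installed:)

----------------------------------------------------------------
# blockdev --getbsz /dev/sdc     (soft block size)
# blockdev --getra /dev/sdc      (readahead, in 512-byte sectors)
# blockdev --getss /dev/sdc      (hardware sector size)
----------------------------------------------------------------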

As you said in your previous post, without an I/O scheduler we may have
a problem if we have an Oracle 8K block size and a 1K block size on the
device (sdc, sdd, etc.). Is that right? Before reading your post, I
thought that using direct I/O meant we didn't have to worry about the
device's block size, but... now I'm in doubt...

Could it be another point to consider in this case?

What do you recommend: "aligning" both block sizes (we have a single
Oracle block size, 8K), or using an I/O scheduler (the I/O elevator, as you
said earlier, in kernel 2.4)?

Fabrizio Magni

Feb 9, 2006, 11:15:57 AM
Wyvern wrote:
>
> As you said in your previous post, without an I/O scheduler we may have
> a problem if we have an Oracle 8K block size and a 1K block size on the
> device (sdc, sdd, etc.). Is that right? Before reading your post, I
> thought that using direct I/O meant we didn't have to worry about the
> device's block size, but... now I'm in doubt...
>

Sorry,
I think I confused you with my previous post.
A raw device is a character device, and as far as I know the I/O elevator
(so, in a 2.4 kernel) cannot operate on such devices (it works on block
devices). Things could be different if RH has backported something from
kernel 2.6... and that I cannot say for sure.

In my example I showed the same dd, on the same hardware and device.
The difference: in the first case the I/O was not merged while in the
second one it was.

Below you can see the iostat related to the previous post.

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util

ida/c0d1 0.00 0.00 0.00 246.46 0.00 3943.43 0.00 1971.72
16.00 0.99 4.02 4.02 99.19

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util

ida/c0d1 0.00 9277.00 0.00 328.00 0.00 76776.00 0.00
38388.00 234.07 142.87 383.93 3.05 100.10


The kernel is 2.6, so the I/O on raw can be bigger than 512 bytes (as you
can see, it is done in 8K requests).
The first case shows direct I/O (synchronous).
The second one buffered I/O.

The I/O scheduler clearly merges the writes (second line), resulting in
328 writes per second (average: about 117K each), and this causes the big
difference in performance.


>
> What do you recommend: "aligning" both block sizes (we have unique
> oracle block size - 8K) or using an IO scheduler (IO elevator as you
> said previously in kernel 2.4) ??
>


Personally, in your case I'd try a filesystem (you already stated you
don't wish to upgrade to a 2.6 kernel).

What do you mean when you say that your device has a 1K block size?!?
Aren't you using it "raw"?

Eder San Millan

Feb 9, 2006, 2:00:36 PM
Hi, I'm Wyvern, but I've switched to my real name because now
I'm writing from home...

>Sorry,
>I think I confused you with the previous post.

Maybe the problem is my poor knowledge of what we are talking about...
;-)

>A raw device is a character device and as far as I know the I/O elevator
> (so in a 2.4 kernel) cannot operate on such devices (it works on block
>device).

CLEAR!

>The kernel is 2.6 so the I/O on raw can be bigger than 512 byte (as you
>can see it is done in 8k requests).

Can I take it you are saying that in 2.4 kernels, raw devices only
allow 512-byte I/Os?

>The first case show direct I/O (syncronous).
>The second one a buffered I/O.

Does direct I/O (raw devices) imply synchronous I/O (at the Oracle level)?

>Personally I'd try a filesystem, in your case (you already stated you
>don't wish to upgrade to 2.6 kernel).

So, could we say that raw devices on 2.4 kernels are not recommended, or
is that assuming too much? We will upgrade to a 2.6 kernel, but that
brings many problems that will take a lot of time to solve, and this
other problem is happening on a production system and... well... you
know...

>What do you mean saying that your device had 1k block size?!?
>Isn't you using it "raw"?

Look (maybe I'm saying something silly):
----------------------------------------------------------------
# blockdev --getbsz /dev/sdc
1024
----------------------------------------------------------------
sdc is the device the storage system presents to me; this "disk" is part
of a VG that we divide into many LVs, to which we bind the raw devices.
Isn't this value (bsz=1024) important?

Fabrizio Magni

Feb 9, 2006, 4:57:58 PM
Eder San Millan wrote:
>
> I can suppose you are saying that in 2.4 kernels, RAWDEVICES only
> allows 512 byte IOs ??
>

low level? Yep.

>> The first case show direct I/O (syncronous).
>> The second one a buffered I/O.
>
> Direct IO (rawdevices) implies syncronous IO ( at Oracle level ) ?
>

No. I used dd, and that workload was synchronous.
Oracle can exploit the kernel async I/O capabilities of Linux (search for libaio).
The difference is in the syscalls used.
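
(If you want to confirm whether Oracle is really using kernel async I/O on
your box, the check I'd try - an assumption on my part, so verify it against
your platform notes - is to look at the aio slabs:)

----------------------------------------------------------------
# grep kio /proc/slabinfo
(non-zero, growing kioctx/kiocb counts are the usual sign that
 kernel async I/O is actually being exercised)
----------------------------------------------------------------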

>
> So, could we say that RawDevices in 2.4 kernels are not recomended or
> this is supposing too much. We will upgrade to 2.6 kernel but this
> suppose us many problems that must solve with a lot of time and this
> other problem is being done in a productive system and....well...you
> know ...
>

****Test on a filesystem before going to 2.6.****
Check whether there are performance differences.
I used raw devices on top of LVM on a 9i RAC for a couple of years with
good performance (and that was an oooold SLES7).

>> What do you mean saying that your device had 1k block size?!?
>> Isn't you using it "raw"?
>
> Look (maybe I'm saying a blinge):
> ----------------------------------------------------------------
> # blockdev --getbsz /dev/sdc
> 1024

Personally I don't know --getbsz. I cannot find it in the man pages
on the web.
Tomorrow, from the office, I'm going to check the sources of blockdev.
I see there is even a --setbsz. Have you tried changing it?


> ----------------------------------------------------------------
> SDC is the device that Storage system present me, this "disk" is part
> of a VG that we divide in many LVs where we link the rawdevices
> This value (bsz=1024) isn't important?
>

Yes, it makes a lot of difference (sorry, I didn't ask whether you were
using LVM).
On LVM1, lvcreate can specify the readahead value and the "contiguity"
(plus tons of other parameters).

I believe it is now time for benchmarking. :)

Maybe tomorrow I can give you more information.

Noons

Feb 9, 2006, 6:51:53 PM
Fabrizio Magni wrote:

> A raw device is a character device and as far as I know the I/O elevator
> (so in a 2.4 kernel) cannot operate on such devices (it works on block
> device). Things can be different if RH has backported something from
> kernel 2.6... and that's I cannot say for sure.

No, they have not. At least as of RHAS3upd.5.

Noons

Feb 9, 2006, 7:07:29 PM
Wyvern wrote:

> We have many versions of Redhat in different installations. We want to
> standardize all of them but not upgrading to RHAS4 but to RHAS3. About
> the update, you´re right, we should go to update 5.
>
> On the other hand, we´ve got Oracle 9i 64bit compilation on release
> 9.2.0.5 in every 64bit linux machine with better performance and
> without problems running also in RHAS3 so I´m not sure SO version is
> the main problem in this case but the way it´s configured and "tuned".

That's why I found it a bit strange when you mentioned
the environment before. AFAIK, RH Linux release 3 requires a
special version of the OS to handle a 64-bit CPU. Otherwise, all you're
doing is using the 32-bit instructions of the IA64.

In any case, upd.1 seems to me totally out of character with
the Oracle version you're running and the hardware. First thing
to look into, in my book.
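
(A quick sanity check, for what it's worth - these are standard commands,
nothing RH-specific as far as I know:)

----------------------------------------------------------------
# uname -m                         (should report ia64 for a 64-bit kernel)
# file $ORACLE_HOME/bin/oracle     (should report a 64-bit ELF executable)
----------------------------------------------------------------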

As for block sizes: use dumpe2fs (as root) on the block device under the
file system to determine the actual block size being
used by the file system.
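
(Something like this, as a sketch - the device name is just the one from this
thread:)

----------------------------------------------------------------
# dumpe2fs -h /dev/sdc1 | grep -i 'block size'
----------------------------------------------------------------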

If you are using raw devices, then Oracle takes care of the IO
and uses whatever the db block size is - except for redo logs.
If you're using file systems, then the file system block size becomes
relevant: dumpe2fs will tell you exactly what it is. You want it to
be as close as possible to the db block size (8K), within the
constraints of the file system you have chosen. If ext3, then that
will be 4K - max block size for that. NEVER use a file system
block size larger than the db block size. Not your case.


Note that if you are using direct IO on the database, then many
of the "smarts" of the Linux file system IO do not apply. You
may experience a drop in performance if you have a particularly
nasty IO configuration. By using direct IO, you are accepting the
responsibility of managing IO from the database side.


This requires some careful tuning of database io parameters and
definitely a good spread across the IO hardware. For example,
when using direct IO don't even dream of "all db IO through one
controller" configurations! You will definitely have a problem.
Where you don't have absolute control over the hardware config,
then it's probably better to stay with OS buffered io rather than
direct.

Eder San Millan

Feb 10, 2006, 4:50:29 AM
Hello,

> > So, could we say that RawDevices in 2.4 kernels are not recomended or
> > this is supposing too much. We will upgrade to 2.6 kernel but this
> > suppose us many problems that must solve with a lot of time and this
> > other problem is being done in a productive system and....well...you
> > know ...
>
> ****Test on filesystem before going with 2.6.****
> Check if there are performance differences.
> I used rawdevices on top of LVM on a RAC 9i for a couple of years with
> good performance (and it was an oooold SLES7).

Yes, we'll change to a filesystem, but I didn't want to do that except as a
last resort... let's say... errrr... I see it as the "easiest"
solution... and I wanted to find the real problem with raw devices...
maybe I'm a bit obstinate.

> Personally I don't know the --getbsz. I cannot find it in the man pages
> on the web.
> Tomorrow, from office, I'm goign to check the sources of blockdev.
> I see there is even setbsz. Have yuo tried changing it?

No, I haven't! "getbsz" tells you the block size of the device; the
"blockdev" command gives the same value as "dumpe2fs" does (thanks,
Noons).

> Yes, it makes a lot of different (sorry, I didn't ask you if you were
> using LVM).
> On LVM1, lvcreate can specify the readahead value and the "contiguity"
> (plus tons of other parameters).
>
> I believe now it is time for benchmarking. :)
>
> Maybe tomorrow I can give you more information.

Ohh, I didn't want you to waste any time... in any case, thanks a lot
...

Fabrizio Magni

Feb 10, 2006, 6:11:24 AM
Eder San Millan wrote:
>
> Yes, we´ll change to FS but I didn´t want to do it but as last
> solution.... let´s say ..... errrr.... I see it as the "easiest"
> solution .... and I wanted to find the real problem with raws ....
> maybe I´m a bit obstinate
>

I'd do the same.

>
> No!, I haven´t!. "getbsz" says you the blocksize of the device, the
> same value with "blockdev" command as with "duimpe2fs" one (thanks
> Noons)
>

I checked: it is the soft block size.
Testing it gives me the file system block size (generally 4K) on devices
containing a filesystem, 512 for raw, and 1024 for NTFS and LVM.

>
> Ohh, I didn´t want you to waste any time ... in any case, tanks a lot

> ....
>

I find the topic interesting... and it helps me as well... rechecking
what I really know about I/O subsystems.
Good way to learn.

Regards

Mladen Gogala

Feb 10, 2006, 7:45:16 AM
On Thu, 09 Feb 2006 15:14:20 +0100, Fabrizio Magni wrote:

> The second dd has its I/O merged.
>
> Consider this controversial but the I/O scheduler eliminate the need to
> have oracle block size "aligned" to file system block size (even without
> direct I/O).
>
> Regards

How exactly did you test? Did you use elvtune (RH3) or did you put an
appropriate scheduler in grub.conf (RH4)? This is a very special case of
sequential I/O, which doesn't tell you much. Usually, the I/O patterns
resulting from database use are a bit different.

--
http://www.mgogala.com

Eder San Millan

Feb 10, 2006, 8:14:33 AM
>> On the other hand, we´ve got Oracle 9i 64bit compilation on release
>> 9.2.0.5 in every 64bit linux machine with better performance and
>> without problems running also in RHAS3 so I´m not sure SO version is
>> the main problem in this case but the way it´s configured and "tuned".

>That's why I found it a bit strange when you mentioned
>the environment before. AFAIK, RH Linux release 3 requires a
>special version of OS to handle 64-bit CPU. Otherwise, all you're
>doing is using the 32-bit instructions of the IA64.

I haven't found information (on the Red Hat site or the Intel site) about what
you say, but it worries me because it would be yet another problem. Could
you tell me where I should look, or do you know how to test whether we
are using 32-bit or 64-bit instructions?

>As for block sizes: use dumpe2fs on the raw device of the
>file system (as root) to determine the actual block size being
>used by the file system.

The result is the same as the one from the "blockdev" command...

>If you are using raw devices, then Oracle takes care of the IO
>and uses whatever the db block size is - except for redo logs.
>If you're using file systems, then the file system block size becomes
>relevant: the dumpe2fs will tell you exactly what it is. You want it
>to be as close as possible to the db block size (8K), within the
>constraints of the file system you have chosen. If ext3, then that
>will be 4K - max block size for that. NEVER use a file system
>block size larger than the db block size. Not your case.

Well, we'll consider this if we decide to migrate to a filesystem. Thanks.

>Note that if you are using direct IO on the database, then many
>of the "smarts" of the Linux file system IO do not apply. You
>may experience a drop in performance if you have a particularly
>nasty IO configuration. By using direct IO, you are accepting the
>responsibility of managing IO from the database side.
>This requires some careful tuning of database io parameters and
>definitely a good spread across the IO hardware. For example,
>when using direct IO don't even dream of "all db IO through one
>controller" configurations! You will definitely have a problem.
>Where you don't have absolute control over the hardware config,
>then it's probably better to stay with OS buffered io rather than
>direct.

Yes, I think this is EXACTLY what is happening to us; with ext3 it is easier
to get acceptable I/O performance thanks to the "smarts" you mention.
This scenario is the first one we have tried with Linux and raw devices
(direct I/O), so... it was a big mistake to assume that raw devices would
give the same performance on Linux as on AIX...

Thanks a lot for your answer...

Eder San Millan

Feb 10, 2006, 10:02:50 AM
> > Yes, we´ll change to FS but I didn´t want to do it but as last
> > solution.... let´s say ..... errrr.... I see it as the "easiest"
> > solution .... and I wanted to find the real problem with raws ....
> > maybe I´m a bit obstinate
> >
>
> I'd do the same.

Next week, if we get the chance (I don't administer this system, I'm
only trying to help my colleagues), we'll break the storage mirror and
start running benchmarks on the backup machine with the same data as the
production system. Every test we can, before migrating to a filesystem...
I'll tell you the results...

Greetings

Fabrizio Magni

Feb 10, 2006, 12:51:50 PM
Mladen Gogala wrote:
> On Thu, 09 Feb 2006 15:14:20 +0100, Fabrizio Magni wrote:
>
>> The second dd has its I/O merged.
>>
>> Consider this controversial but the I/O scheduler eliminate the need to
>> have oracle block size "aligned" to file system block size (even without
>> direct I/O).
>>
>
> How exactly did you test? Did you use elvtune (RH3) or did you put an
> appropriate scheduler in grub.conf (RH4)? This is a very special case of
> a sequential I/O which doesn't tell you much. Usually, I/O patterns
> resulting from the database use are a bit different.
>

The test was performed on SLES9 with the cfq scheduler.
The result would be much the same with the other three (or with the
older elevator).

A dd doesn't represent a typical database workload, but it shows what
an I/O scheduler can do.
This component relaxes the rule of thumb "db block size = file system
block size" suggested by many.

For example: with a db block size of 8K and a filesystem block size of 4K,
every Oracle request isn't translated into two 4K I/Os but into a single 8K
operation.
I/O merging is only one of the functions of the scheduler.

If you work with ETL and data warehouses you can see significant
performance improvements.

For OLTP, the deadline scheduler could be a good solution.
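
(On a 2.6 kernel, checking and switching the scheduler is straightforward. A
sketch only - whether the per-device sysfs switch is available depends on the
exact kernel version; otherwise the elevator= boot parameter does the same job
globally:)

----------------------------------------------------------------
# cat /sys/block/sdc/queue/scheduler      (the one in [brackets] is active)
# echo deadline > /sys/block/sdc/queue/scheduler
----------------------------------------------------------------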

Noons

Feb 11, 2006, 12:45:56 AM
Mladen Gogala wrote:

> How exactly did you test? Did you use elvtune (RH3) or did you put an
> appropriate scheduler in grub.conf (RH4)? This is a very special case of
> a sequential I/O which doesn't tell you much. Usually, I/O patterns
> resulting from the database use are a bit different.

FWIW, and assuming RHAS3upd.3 and upd.5 and ext3 fs:
I've recently done some extensive IO testing as part of a
data centre upgrade.
Using a home-brew blockread program (O_DIRECT or not
depending on switch, 8K reads with buffers aligned on 8K
boundaries, variable seed random reads), I've found that playing
with elvtune gains nothing. With a dd-based exerciser (sequential
reads) I can see a difference but not enough to warrant too
much fussing about. Only in multi-process situations.
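
(For anyone who wants to reproduce the idea without a home-brew program:
something along these lines makes a crude random-read exerciser. It assumes a
dd new enough to understand iflag=direct, and the device name and offset range
are purely illustrative - this is not the program I used.)

----------------------------------------------------------------
DEV=/dev/sdc1
for i in `seq 1 1000`; do
  # one 8K O_DIRECT read at a pseudo-random 8K-aligned offset
  dd if=$DEV of=/dev/null bs=8k count=1 iflag=direct \
     skip=$(($RANDOM * $RANDOM % 1000000)) 2>/dev/null
done
----------------------------------------------------------------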

I think in slower hardware IO configurations than hyperchannel
SAN, it might be more relevant. Or maybe the 2.6 kernel io
scheduler would change this picture somewhat.

If anyone knows of detailed benchmark data that led to the
io elevators being developed in kernel 2.4, I'd love to have a look
at them. So far, I've been unable to create any test case where
they would be of a definite advantage.

Unfortunately, Oracle's ORION exerciser is useless for this
kind of testing: last time I looked it concentrated on raw
devices only. Ie, about 5% of the Linux dbserver population...
Probably appropriate for those considering ASM, OCFS and
such but useless for everyone else.

Fabrizio Magni

Feb 11, 2006, 4:39:53 AM

I'm not sure, since I don't use Red Hat, but they seem to have introduced
the rawvary patch as of AS 2.1, allowing writes on raw devices of more than 512 bytes.

Fabrizio Magni

Feb 11, 2006, 4:42:07 AM
Noons wrote:
>
> If anyone knows of detailed benchmark data that lead to the
> io elevators being developed in kernel 2.4, I'd love to have a look
> at them. So far, I've been unable to create any test case where
> they would be of a definite advantage.
>

Hi Noons,
personally I have never played with elvtune, so I'm really interested
in your results.
Have you published anything?

I tested the four 2.6 I/O schedulers (and their parameters) and the
differences in performance are quite significant.

Benchmarks on the use of these schedulers are easy to find.

I'm posting what was written on the linux kernel developer mailing list:
http://www.cs.rice.edu/~ssiyer/r/antsched/shines.html
http://www.cs.rice.edu/~ssiyer/r/antsched/antio.html
and linux symposium (with a little bit of theory):
http://www.linuxsymposium.org/proceedings/reprints/Reprint-Axboe-OLS2004.pdf
http://www.linuxsymposium.org/proceedings/reprints/Reprint-Pratt-OLS2004.pdf

It can be considered "dated", but in my opinion a very good read on
the Linux block layer and on what the elevator can achieve is "Improving
Linux Block I/O for Enterprise Workloads".

Noons

Feb 11, 2006, 6:10:36 AM
Fabrizio Magni wrote:
> >> A raw device is a character device and as far as I know the I/O elevator
> >> (so in a 2.4 kernel) cannot operate on such devices (it works on block
> >> device). Things can be different if RH has backported something from
> >> kernel 2.6... and that's I cannot say for sure.
> >
> > No, they have not. At least as of RHAS3upd.5.
> >
>
> I'm not sure since I don't use redhat but they seem to have introduced
> the rawvary patch since AS 2.1 allowing writes on raw devices > of 512 byte.
>

IIRC, upd5. But that's been around for a while, since
even before 2.4 kernel came out. IIRC, the new max
on upd5 is 131K.

Noons

Feb 11, 2006, 6:24:21 AM
Fabrizio Magni wrote:
> personally I have never played with the elvtune so I'm really interested
> in your result.
> Have you published something?

Alas, no time to publish anything. Got to get this thing
running fast or else company can't charge clients!
We get paid by the click: we lose click summaries, we
can't charge, my bonus goes awol! :(

I'll see if I can put some text together in another month
or so, the US guys are taking over the rest of the ops
and I should be able to concentrate a bit more on
doco for all the db stuff of late. Will check if AUSOUG
is interested in it.

And I finally got a test box to figure out aio with cooked io
in RHAS3 and 9i! There's gotta be a way of making
it work good. And 10gr2 and RHAS4 to test as well:
it's gonna be a good year!


> I tested the four 2.6 I/O schedulers (and their parameters) and the
> difference in performance are quite relevant.
> Benchmarks on the use of these schedulers are easy to spot out.
> I'm posting what was written on the linux kernel developer mailing list:
> http://www.cs.rice.edu/~ssiyer/r/antsched/shines.html
> http://www.cs.rice.edu/~ssiyer/r/antsched/antio.html
> and linux symposium (with a little bit of theory):
> http://www.linuxsymposium.org/proceedings/reprints/Reprint-Axboe-OLS2004.pdf
> http://www.linuxsymposium.org/proceedings/reprints/Reprint-Pratt-OLS2004.pdf


Thanks, good stuff. Yeah, looks like a major
improvement. The only thing that worries me a bit
is it seems to be totally oriented to native controllers
or jbod architectures. We use mainly SANs with
hyperchannel switches, which means I'll have to test
this all over again!...

Frank van Bortel

Feb 11, 2006, 6:47:41 AM

Thanks for those links; I have nothing to add to this thread, but
follow it with all interest.
Looking forward to Nuno's posting.

--
Regards,
Frank van Bortel

Top-posting is one way to shut me up...

Noons

Feb 12, 2006, 4:45:52 AM
Mladen Gogala wrote:
> Bear in mind that at some point all those interrupt requests will start
> saturating your system bus. PC buses, even the server versions, are
> nowhere near the capacity of the true midrange SMP servers like HP 9000
> series or IBM P960 series machines. Modern disk controllers will do
> massive amounts of DMA communication with memory, using the same system
> bus that CPU boards use to synchronize caches, that network controllers
> use to notify CPU of the network interrupts and that all peripheral
> devices use to communicate with CPU and RAM. My experience tells me that
> no matter how good PC server you buy, you will never get more then 2500
> I/O operations per second out of it. On a heavily used OLTP database, that
> amounts to 200-300 concurrent users, with up to 50 active at a time.

That's a superb point; I almost lost it in all the replies. One has to
keep in perspective that we're not talking about "super server"
technology here. A PC blade is a PC architecture, not an
SMP server on steroids, ht or dual core notwithstanding.

Linux+PC blades have a purpose and a market sweet spot. They can
be made to perform at levels unheard of only a few years ago.
But it's only too easy to spend more $$$ making them behave like a
midrange SMP database server than it would cost to actually BUY such a server!

Like everything, it's all about balance. PC-based servers with
Linux can be configured at astoundingly cheap prices. That
doesn't mean they can perform with databases at the same levels
as much more sophisticated (read: expensive) architectures. For
some situations the PC-based solution is perfect. Others require
more elaborate solutions. It's all about price-performance,
bang-for-buck and all that jazz.


The 2500 IO/sec is IME as well a good yardstick, assuming
nothing else is sapping the bandwidth. Network controllers,
memory-to-memory copies and all such can reduce this significantly.

That's real direct IO, not cache accesses! This is not the
same as rates of IO. Our PC blades drive SAN boxes at over
100MB/sec. That's however with IO request scatter-gather on
full table scans, dbfmbrc, SAN read-ahead and all such
streaming optimisations.

Discrete random IO operations are a totally different animal!

2500/sec is around 20MB/s, assuming 8K db blocks.
Anyone getting as much as that in direct random IO,
mixed access in a PC blade architecture, can count
themselves very lucky indeed.

Ours sometimes go as high as 40MB/s, or 5000 random IO/sec.
But that's in favourable conditions, wind coming from behind,
moon in the right quarter and all such! Any requirement for
higher than that and we definitely enter into diminishing
returns territory.

One thing I've been able to determine so far: number of
controllers and access paths do definitely make an almost
linear difference. There is a reason why you see those
hundreds of controllers and thousands of disks in the hardware
descriptions of OLTP workload benchmarks!


> Second, Linux doesn't show you the time spent on the interrupt stack.
> You cannot see whether your motherboard is loaded to the capacity or not,
> because you cannot see how much of the system time is actually spent
> servicing interrupts.


Yup, very much so. It's very hard (on current levels of Linux)
without some specialised driver/monitor software to determine
exactly where problems are and how to address them.
All we can do is devise tests, compare configs and finally
extrapolate from the results what is really going on and what
is the best course of action.

Thanks for the heads up, Mladen. Very interesting to see
others going through the same experiences I am.

Fabrizio Magni

Feb 13, 2006, 3:05:00 AM
Frank van Bortel wrote:
>
> Thanks for those links; I have nothing to add to this thread, but
> follow it with all interest.
> Looking forward to Nunos posting
>

Hi Frank,
don't blame me if I throw in this link as well.
It is incomplete... but I still hope for some feedback.

http://www.gesinet.it/oracle/blocksize.html

Noons

Feb 13, 2006, 7:23:35 AM
Fabrizio Magni wrote:
> Hi Frank,
> don't blame me if I throw this link as well.
> It is incomplete... but still I hope for a feedback.
>
> http://www.gesinet.it/oracle/blocksize.html
>


Feedback positive here. The only restriction
I'm aware of as still valid is having a linux file
system block size *larger* than the database
block size.

Smaller (as in the submultiple you mention) is not
such a problem nowadays. It used to be, before the 2.4
kernel io elevators - or more precisely, the 2.3 kernel:
the first time the elevator code showed up.

But since then: yes, there is a difference, but not enough
to warrant losing sleep over. Or at the very least I have
yet to hit a combination where it becomes a true
problem.

Having said that: don't go around slapping a 1K fs block
size on 16K db block sizes: the io elevator code is far
from perfect and it's only in the 2.6 kernel that cfq more
or less avoids such problems.

Note that the above is SPECIFIC to Linux. I have had
a few problems in the past with some flavours of Unix
and specific file systems other than ext3. My guess is
those articles were at some stage extrapolations of
information gathered from Unix, then tested in 2.1
kernel.

Since 2.4 kernel (RHAS3, Suse8?), not as much of an
issue.

Frank van Bortel

Feb 13, 2006, 1:48:27 PM
Fabrizio Magni wrote:
>
> Hi Frank,
> don't blame me if I throw this link as well.
> It is incomplete... but still I hope for a feedback.
>
> http://www.gesinet.it/oracle/blocksize.html
>

Don't blame you, have seen it before.

Too technical for me; I'd like some how-to on setting
these things up, and checking which is best for Oracle
in particular, and databases in general.

Fabrizio Magni

Feb 13, 2006, 2:22:13 PM
Frank van Bortel wrote:
>
> Too technical for me; I'd like some how-to on setting
> these things up, and checking which is best for Oracle
> in particular, and databases in general.

I should always remind myself: "Think simple!"

Maybe I can prepare something more digestible...

Thanks, Frank.

Fabrizio Magni

Feb 13, 2006, 2:23:51 PM
Noons wrote:
>
>
> Feedback positive here. The only restriction
> I'm aware of as still valid is having a linux file
> system block size *larger* than the database
> block size.
>

Thank you, Nuno.
I really appreciate the feedback: more since it is positive! ;)

Mladen Gogala

Feb 13, 2006, 5:43:43 PM
On Mon, 13 Feb 2006 09:05:00 +0100, Fabrizio Magni wrote:

> Hi Frank,
> don't blame me if I throw this link as well.
> It is incomplete... but still I hope for a feedback.

Definitely a good article. I was surprised by the large drop
in the maximum response time when going from 4K --> 8K:

TEST4K:

TPS: 24,82
kBPS: 50.409
Total executions: 5584
Total Rows: 67179
Total kBytes: 11338,879
Average response time: 0.034
Maximum response time: 0.464

TEST8K:

TPS: 24,72
kBPS: 50.090
Total executions: 5564
Total Rows: 66809
Total kBytes: 11272,059
Average response time: 0.036
Maximum response time: 0.246

Throughput was larger with the 4K block. Strange.
--
http://www.mgogala.com

Noons

Feb 13, 2006, 6:06:34 PM
Mladen Gogala wrote:

> Definitely a good article. I was surprised about the large drop
> in the max response time when going from 4K --> 8K:
>

Me too. Probably some measurement noise?


>
> Throughput was larger with the 4K block. Strange.
> --


Not really. The file system is 4K, the db block is 4K:
you get optimal IO for the conditions. With a larger
db block size than fs block size you get a slight drop
in throughput. This is more visible and consistent
in kernel 2.4 with the io elevators: they are nowhere near as
efficient as the cfq scheduler in 2.6.

Fabrizio Magni

Feb 14, 2006, 3:31:46 AM
Mladen Gogala wrote:
>
> Definitely a good article. I was surprised about the large drop
> in the max response time when going from 4K --> 8K:
>
> TPS: 24,82
> kBPS: 50.409
> Total executions: 5584
> Total Rows: 67179
> Total kBytes: 11338,879
> Average response time: 0.034
> Maximum response time: 0.464
>
> TEST8K:
>
> TPS: 24,72
> kBPS: 50.090
> Total executions: 5564
> Total Rows: 66809
> Total kBytes: 11272,059
> Average response time: 0.036
> Maximum response time: 0.246
>
> Throughput was larger with the 4K block. Strange.

Hi Mladen,
yes, that value is strange.
I can only speculate, but since the average response time is practically
the same in the two tests, the only reasons I can think of at the moment are:
- some spurious values in the first test,
- a strangely shaped distribution of response times in the 4K test.

Like Nuno, I lean towards the former.
The latter would lead to too strange a conclusion.

Frank van Bortel

Feb 14, 2006, 2:32:13 PM
Fabrizio Magni wrote:
> Maybe I can prepare something more digestible...
>

I'd really appreciate that!

Noons

Feb 15, 2006, 1:10:38 AM
Frank van Bortel wrote:
>
> Thanks for those links; I have nothing to add to this thread, but
> follow it with all interest.
> Looking forward to Nunos posting
>

Well, not a full posting. But I finally got elvtune to do
anything other than noise-level changes!

Conditions are:

a long batch db load job, mostly with very large indexed
range scans. Index keys are very large (url text +
search terms + ip addresses, timestamps, domains
and so on). Rows are also very large: typically we store
between 5-10 rows per 8K db block, sometimes much
less depending on how big the strings are that we
get from the search engines. 130 million rows, no
partitioning. 9.2.0.6, RHAS3upd.3. Oracle has been
patched up with the "direct io on nfs" and "aio" patches.
ext3, 4K block size and filesystemio_options=directIO
in init.ora. O_DIRECT has been confirmed active with
strace.
iostat indicated a large number of read requests
per second on the devices holding the indexes for this
table. Number of reads was consistent with KB/s read
speed. We were getting large queues for the devices
(>300) and a large "usage" percentage (200%).
Service time in the order of several tens of ms.

elvtune showed r=2048, w=8192, b=6. Pretty standard
for default Linux.

My reasoning was: I know we are doing a lot of
random, indexed io. Therefore, I will not be taking
much advantage of io request elevator tuning as the
chance of consecutive addresses is very remote. Other
than the implicit two-fs-blocks-per-db-block (2*4K=8K).

So I went radical:

/sbin/elvtune -r 24 -w 24 -b 8 <device name>

The motivation for the 24 came from some actual benchmark
figures I got from a forum where Andrew and Andrea,
two of the Linux folks involved in the elvtune coding, were arguing
their reasons. 24 seemed to be a sweet spot for random access
with ext3.

It was in our case. The process handling insert-updates
in this table dropped from 2 hours exec time to 1 hour.
MB/s increased from about 5MB/s on each of the devices
where the index is stored (3 of them, so 15MB/s aggregate)
to 15MB/s (45MB/s aggregate). iostat queue lengths dropped
into the tens and service times came down to single-figure ms.
I didn't measure the device where the table is kept as the
speed there has never been a problem. Other than the obvious
high physical io because of large rows.

Full table scans happening at other times in this table
didn't suffer at all. dbfmbrc = 8 and is well within
the 24 of r/w in elvtune, therefore it still benefits
from streaming and the SAN read-ahead which kicks in
after < 8 consecutive reads.

Good enough improvement for me. Lessons: don't assume
the defaults are also the best; measure, reason, change,
measure again. Repeat until satisfied, then STOP!
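
(One practical footnote, my own habit rather than anything from the docs: check
what you have before and after, and remember elvtune settings don't survive a
reboot, so re-apply them from a boot script if you keep them - the device name
below is just an example.)

----------------------------------------------------------------
/sbin/elvtune /dev/sdc                       (print current settings)
echo '/sbin/elvtune -r 24 -w 24 -b 8 /dev/sdc' >> /etc/rc.d/rc.local
----------------------------------------------------------------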

time for a beer

Fabrizio Magni

Feb 15, 2006, 12:13:41 PM
Noons wrote:
>
> Good enough improvement for me. Lessons: dont' assume
> defaults are also the best, measure, reason, change,
> measure again. Repeat until satisfied, then STOP!
>

Nuno,
I sincerely believe that you should publish your results (and testing
methodology).

Best regards

Eder San Millan

Feb 15, 2006, 2:27:34 PM

Noons wrote:

GREAT!!! Thanks, Noons. I think this test will help me with the one
I'd like to do...

Thanks again, very clear!

Noons

Feb 16, 2006, 4:14:19 AM
Fabrizio Magni wrote:
> I sincerely believe that you should publish your results (and testing
> methodology).
>

Not yet. I want to pan this out a bit more.
Still getting some inconsistent results in
some of the boxes. I want to have a bit more
sample data before inflicting it on others!
:)


(just been told our main db will triple in size
in the next 6 months. Exactly what I needed,
now that I got the growth/perf on this one
sorted out!...)

Joel Garry

Feb 17, 2006, 6:47:49 PM
Noons wrote:
>measure again. Repeat until satisfied, then STOP!
...

>(just been told our main db will triple in size
>in the next 6 months. Exactly what I needed,
>now that I got the growth/perf on this one
>sorted out!...)

Just replace STOP with NOP :-)

jg
--
@home.com is bogus.
http://www.auntvisgarden.com/auras/index.html

Noons

Feb 19, 2006, 5:59:22 AM
Joel Garry wrote:
> >(just been told our main db will triple in size
> >in the next 6 months. Exactly what I needed,
> >now that I got the growth/perf on this one
> >sorted out!...)
>
> Just replace STOP with NOP :-)
>

Stop is what I'm getting when I double the load!
Aiming at three san boxes at the moment
but I also need 5 more fc controllers...
This is gonna be fun!
