My OpenServer 5.0.6 system seems to be suffering random go-slows, and
the data I am getting out of sar strongly suggests that it is a disk
system bottleneck and that I should increase my NBUF to cope better with
it... My big issue is that I really have absolutely no idea how to
increase the NBUF figure, nor do I know how much to increase it by so
as not to run the system low on memory.
Can someone let me know...? Or point me to instructions?
Just in case it makes a difference, the system itself is an Intel mobo
with dual P3 Xeon processors and 1GB of physical memory. I believe
it has been configured to have 3GB of swap. The disk system is a SCSI
RAID 5 + hot spare setup, with four 10000rpm 36GB SCSI drives...
Thanks
googleboy.
There are many ways to see how much memory is free. To see current free
memory, run `sar -r 1 1` and `swap -l`:
$ sar -r 1 1
SCO_SV deeptht 3.2v5.0.6 PentIII    06/25/2003
00:47:20 freemem freeswp availrmem availsmem (-r)
00:47:21  530044 6307456    621592   1315542
$ swap -l
path       dev    swaplo  blocks    free
/dev/swap  1,105       0 6307456 6307456
Unfortunately, the units here are of two sizes. The swap numbers (last
two columns of `swap -l`, and the "freeswp" column of `sar -r`) are in
1/2K units. The three "mem" numbers are in 4K units.
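To keep the two unit sizes straight, here is a minimal sketch of the conversions. The function names are made up for illustration, and it assumes a shell with `$(( ))` arithmetic (ksh rather than the old Bourne `sh`):

```shell
# Convert the raw numbers from `sar -r` and `swap -l` into kilobytes.
# Hypothetical helper names; the unit sizes are the ones described above.

# "mem" columns (freemem, availrmem, availsmem) are in 4K units.
pages_to_kb() {
    echo $(( $1 * 4 ))
}

# Swap columns (blocks, free, freeswp) are in 1/2K units.
blocks_to_kb() {
    echo $(( $1 / 2 ))
}

# Sample values taken from the output above.
echo "freemem: $(pages_to_kb 530044)K"
echo "freeswp: $(blocks_to_kb 6307456)K"
```

With the sample values this prints 2120176K for freemem and 3153728K for freeswp.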
The `swap -l` output shows that 3153728K (slightly over 3GB) of swap
space is free, and that _all_ swap space is free ("blocks" and "free"
are identical). That part is important: it shows that the system has
never needed to swap since it's been up. In other words, it has more
memory than it needs to handle the load.
"freemem" shows how much RAM is currently free. It's in 4K units, so
2120176K (slightly over 2GB) is currently free.
If you turn on periodic `sar` sampling (see sar_enable(ADM)), you can
get a long-term picture of how much memory your system is using.
Now, on to your specific question. You don't want to increase NBUF so
much that the system actually runs out of memory and has to start
swapping. So you look at memory usage periodically (by hand, or by
`sar` sampling). Then you can increase NBUF to consume most of the
memory that consistently stays free.
Note, however, that NBUF cannot be larger than 450000 (450MB) on
OpenServer. If you find that you consistently have 800MB free, you
still can't use all of it for buffer cache.
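As a rough sketch of that arithmetic (not an official sizing rule): take the lowest freemem you have seen, leave some headroom, add the rest to the current NBUF, and clamp at the 450000 ceiling. The function name, the headroom figure and the "current NBUF" of 50000 are all assumptions for illustration, and it assumes one NBUF buffer is 1K, as the 450000 (450MB) figure implies. Needs a shell with `$(( ))` arithmetic, such as ksh.

```shell
NBUF_MAX=450000   # OpenServer ceiling mentioned above, in 1K buffers

# suggest_nbuf CURRENT_NBUF MIN_FREEMEM_PAGES HEADROOM_KB
# Hypothetical helper: prints a candidate NBUF value.
suggest_nbuf() {
    current=$1
    free_kb=$(( $2 * 4 ))        # freemem is reported in 4K units
    headroom_kb=$3               # memory to leave untouched (assumed figure)
    candidate=$(( current + free_kb - headroom_kb ))
    # Never suggest less than the current setting.
    if [ "$candidate" -lt "$current" ]; then candidate=$current; fi
    # Never suggest more than the OpenServer ceiling.
    if [ "$candidate" -gt "$NBUF_MAX" ]; then candidate=$NBUF_MAX; fi
    echo "$candidate"
}

# 800MB consistently free (204800 pages), keep 100MB back: hits the cap.
suggest_nbuf 50000 204800 102400
# 200MB consistently free (51200 pages), keep 100MB back.
suggest_nbuf 50000 51200 102400
```

The first call shows the point made above: even with 800MB free, the suggestion stops at 450000.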
A commercial product, "SarCheck", analyzes `sar` output and makes tuning
recommendations. www.sarcheck.com. I believe you can download a demo
version to "try before you buy".
>Bela<
Increase NBUF to its maximum value:
idtune -m NBUF 450000
Then relink the kernel and reboot.
Andrey Bondar, SysAdmin,
T.I.P.A.S. Ltd., Lithuania
On a different angle, if your RAID controller supports RAID 1+0, then
reconfigure your hard drives. You will have to unload/reload with a
supertar.
I have seen RAID 5 cause a bottleneck, normally with a database and
numerous small writes. RAID 1+0 has better performance in the same
test. With a three-drive RAID 5 and one spare you can sustain a single
hard drive failure without loss of data (performance will be hurt even
more during the rebuild), but a second failure before the rebuild onto
the spare completes will cause a total volume loss. With RAID 1+0 you
can have a single drive failure with no loss of data (performance
degradation will occur when the failed drive is replaced, but it's less
noticeable than with RAID 5), and it is possible that a second hard
drive can fail without the volume going down. A third failure before
any drive replacement will always kill it. So RAID 1+0 is no less
secure than RAID 5, has better performance, and you already own the
four drives.
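For what it's worth, the usable space works out the same either way with the drives described in the original post (four 36GB drives). A quick sketch of the arithmetic, with made-up function names and assuming a shell with `$(( ))` arithmetic:

```shell
# Usable capacity in GB. Hypothetical helpers for illustration only.

# RAID 5: one drive's worth of space goes to parity.
# Args: drives in the array (not counting the spare), size per drive.
raid5_usable() {
    echo $(( ($1 - 1) * $2 ))
}

# RAID 1+0: half the drives hold mirror copies.
# Args: total drives, size per drive.
raid10_usable() {
    echo $(( $1 / 2 * $2 ))
}

echo "RAID 5 (3 drives + 1 hot spare): $(raid5_usable 3 36)GB"
echo "RAID 1+0 (4 drives): $(raid10_usable 4 36)GB"
```

Both layouts come out at 72GB, so the switch costs no capacity, only the unload/reload time.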
Mike
--
Michael Brown
The Kingsway Group