
What's the difference between block size, fragment size and Number of Bytes Per I-node?


fet...@my-dejanews.com

Nov 26, 1998
Hi all,

What exactly is the difference between block size and fragment size?
(I am on an RS/6000 running AIX 4.2.1.)

I need a file system with an 8K block size to improve database performance.

How can I create such a file system?
I have read the man pages for crfs, mkfs and so on, and the only thing I have
found is the fragment size, which cannot be larger than 4096 bytes.

So, can I get something "equivalent" by raising NBPI to 8K?

Is there a Unix guru who can tell me whether I have misunderstood something?


Thanks a lot!


Jean-Marc Maitrot

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

big...@my-dejanews.com

Nov 26, 1998
I just saw this other posting...

A block is just a fixed number of bytes (it is normally 512 or 1024) that the
system uses to transfer data (that is, a transfer by block).

A fragment size should be dependent on how big/small the data will be that you
are storing.

that is, if you have:

a fragment size of 4096 bytes and file size of 1500 bytes then you are wasting
2596 bytes (using 1 fragment):

<----------- Fragment Size 4096 bytes ----->
- - - - - - - - - - - - - - - - - - - - - -
|              +                          |
| <---file---> + <---- unused space ----->|
|              +                          |
- - - - - - - - - - - - - - - - - - - - - -

a fragment size of 1024 bytes and file size of 1500 bytes then you are only
wasting 548 bytes (using 2 fragments, rather than 1):

<-Frgmnt->|<-Frgmnt->|<-Frgmnt->|<-Frgmnt->|
- - - - - - - - - - - - - - - - - - - - - -
|              +                          |
| <---file---> +<--->                     |
|              +                          |
- - - - - - - - - - - - - - - - - - - - - -

The fragment size will become meaningless as the file size gets to be very
large.
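
The arithmetic in the two diagrams can be checked with a small sketch. It assumes a simplified model in which a file always occupies a whole number of fragments; the function name `allocation` is made up for illustration and is not an AIX interface.

```python
def allocation(file_size: int, fragment_size: int) -> tuple:
    """Return (fragments_used, bytes_wasted) for a single file.

    Simplified model: the file occupies a whole number of fragments.
    """
    fragments = -(-file_size // fragment_size)  # ceiling division
    wasted = fragments * fragment_size - file_size
    return fragments, wasted


if __name__ == "__main__":
    for frag in (4096, 1024, 512):
        used, wasted = allocation(1500, frag)
        print(f"fragment={frag:4d}  fragments={used}  wasted={wasted}")
```

Because the waste is always less than one fragment, its share of the file shrinks as the file grows, which is the sense in which fragment size stops mattering for very large files.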

You could take a look at the agsize and the NBPI values (increasing both).
As the agsize increases, so does the maximum NBPI value you can choose. This
means that you will have fewer inodes in total, and more bytes per inode.

For what you are doing this may improve your performance, but don't quote me on
that!

good luck,

duff


In article <73j7vc$kl8$1...@nnrp1.dejanews.com>,

Super Dave Mac

Nov 27, 1998
Block size is device specific and depends on what type of device is being
discussed. For AIX disk drives using the standard SCSI device and disk
drivers, that would be 512 bytes. Standard block I/O requests to these
devices are made in 4K "pages": when a disk read occurs, for example, eight
512-byte blocks are read into a buffer, and the requested data is then
copied into the user I/O buffer by the device driver involved.

Fragment size is the "minimum" amount of allocatable space in a JFS
filesystem. For example, if you have a frag size of 4096 and you need to
store 4097 bytes in a file, that file will be allocated 2 full fragments, or
8192 bytes. Just as Duff pointed out, this can waste disk space. You can
adjust the fragment size down to 512 bytes for maximum granularity of
allocation. In that case, the same 4097 bytes would only require 9
fragments of 512 bytes each (4608 bytes), instead of the two 4K fragments.
Fragment size can only be specified at filesystem creation time; you cannot
change it once the filesystem has been created.

By the same token, the Number of Bytes Per Inode (NBPI) simply specifies the
number of bytes assigned to a single inode. When you create a filesystem,
one inode is created for every NBPI bytes of space in the filesystem. The
smaller the NBPI value, the more inodes are created (and vice versa), and
therefore the more space is needed to store inode mapping data. This means
more filesystem space is used for metadata (i.e. greater overhead). Every
file has at least one inode, so if you set the inode size to 8K instead of
4K, you are allocating 8K to the file "automatically". Usually, the only
time it is necessary to adjust the NBPI value is when you have lots of really
small files, a few very large files, or very large filesystems that cannot be
mapped with smaller NBPI values. The NBPI value has little to do with I/O
performance. This value can only be specified at filesystem creation time
and cannot be adjusted thereafter.
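
The "one inode per NBPI bytes" rule can be sketched roughly as follows. The 128-byte on-disk inode size is an assumption made purely for illustration, the function name `inode_budget` is hypothetical, and the sketch ignores whatever limits the filesystem places on valid NBPI values.

```python
INODE_BYTES = 128  # assumption for illustration: on-disk size of one inode


def inode_budget(fs_bytes: int, nbpi: int) -> tuple:
    """Return (inode_count, bytes_reserved_for_inodes) for a filesystem."""
    inodes = fs_bytes // nbpi  # one inode per NBPI bytes of filesystem space
    return inodes, inodes * INODE_BYTES


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for nbpi in (512, 4096, 8192, 16384):
        inodes, overhead = inode_budget(one_gib, nbpi)
        print(f"NBPI={nbpi:5d}  inodes={inodes:8d}  inode space={overhead // 1024} KiB")
```

Doubling NBPI halves both the number of inodes and the space set aside for them; in this model it changes nothing about how many bytes a given file occupies.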

The block size of the device/device driver cannot be adjusted by the user,
and doesn't really matter in a JFS filesystem since I/O is in 4K pages
anyway. The fragment size and NBPI can be adjusted, but have almost nothing
to do with I/O performance, primarily just space allocation.

I suspect what you MAY be referring to is stripe size. If you are using a
database which stores data in striped JFS filesystems, then the stripe
size can be set accordingly. This usually matches the record size, and
ideally would be set to 4096 or a multiple thereof, because all JFS I/O
goes through the VMM anyway, which works in 4K pages. With an 8K
stripe size, for example, and assuming you have a database "record" size of
8K and a filesystem striped across 8 drives, an I/O to 8 sequential
records would be spread across the 8 drives and could occur in parallel (to
the extent your system can be "parallel"), even concurrently in an SMP
environment. Under the covers, two 4K pages would be queued for I/O to each
separate disk, and each controller/disk could be reading/writing its sixteen
512-byte blocks separately (depending on the number of processors,
controllers, and the type of device driver involved, it could be concurrent
as well). This could certainly enhance performance. Once again, a stripe
size cannot be adjusted after the filesystem has been created; you would
have to recreate it to alter the stripe size.
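
The 8K-stripe example can be sketched as below. It assumes a plain round-robin stripe layout, which is an assumption for illustration and not a description of how the AIX LVM actually places stripes; `stripe_location` is a made-up helper name.

```python
PAGE_BYTES = 4096   # VMM page size mentioned above
BLOCK_BYTES = 512   # device block size mentioned above


def stripe_location(offset: int, stripe_size: int, drives: int) -> tuple:
    """Map a byte offset to (drive_index, stripe_number_on_that_drive).

    Assumes stripes are dealt out to the drives round-robin.
    """
    stripe_index = offset // stripe_size
    return stripe_index % drives, stripe_index // drives


if __name__ == "__main__":
    stripe, drives, record = 8192, 8, 8192
    for n in range(8):  # eight sequential 8K records
        drive, row = stripe_location(n * record, stripe, drives)
        print(f"record {n} -> drive {drive}, stripe {row}")
    print("pages per stripe:", stripe // PAGE_BYTES)
    print("512-byte blocks per stripe:", stripe // BLOCK_BYTES)
```

With record size equal to stripe size, each of the eight sequential records lands on a different drive, which is what allows the requests to proceed in parallel.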

You may want to check with whoever told you to adjust the "block" size, and
find out exactly what they are requesting.

--
My opinions are completely my own. As for my heart and soul,
I'm not so sure....but 30% my ass definately belongs to the IRS.
fet...@my-dejanews.com wrote in message
<73j7vc$kl8$1...@nnrp1.dejanews.com>...

Scott L. Fields

Nov 27, 1998
The JFS filesystem is very closely tied to the operations of the underlying
VMM, which uses 4K pages. Ultimately, from that perspective, this is the
block size of the filesystem. It is not tunable.

I would suggest you peruse the Performance and Tuning Guide for other
issues.


Peter van Gemert

Dec 2, 1998

Super Dave Mac <mcc...@airmail.net> wrote in article
<454E993376CAC079.2BC510AF...@library-proxy.airnews.net>...
> ...


> By the same token, the Number of Bytes Per Inode (NBPI) simply specifies the
> number of bytes assigned to a single inode. When you create a filesystem,
> there is one inode created for every NBPI of space in the filesystem. The
> smaller the NBPI value, the more inodes are created and vice versa. Also,
> the smaller the NBPI value, the more inodes are created therefore more space
> is needed to store inode mapping data. This means more filesystem space is
> used for metadata (e.g. greater overhead). Every file has at least one
> inode, so if you set the inode size to 8K vice 4K, you are allocating 8K to
> the file "automatically".

Does this mean that a file of 4096 bytes stored in a fs with NBPI=8192 is
stored in two datablocks?? I don't think so. The NBPI value only defines
the number of inodes that are created in a filesystem. So a 4096b file will
only occupy one datablock.

> Usually, the only time it is necessary to adjust
> the NBPI value is when you have lots of really small files, few very large
> files, or very large filesystems that cannot be mapped with smaller NBPI
> values.

One reason to change the NBPI value is when you turn on fragmentation in
your filesystem. In a fs with fragmentation turned on, you could save more
files, so you need more inodes.

> The NBPI value has little to do with I/O performance. This value
> can only be specified at filesystem creation time and cannot be adjusted
> thereafter.


Scott L. Fields

Dec 2, 1998

Peter van Gemert wrote in message <01be1e4c$ab386460$ac2ceed4@allet>...

>
>One reason to change the NBPI value is when you turn on fragmentation in
>your filesystem. In a fs with fragmentation turned on, you could save more
>files, so you need more inodes.


I have no idea what you are talking about here. There is no option to turn
on FRAGMENTATION.

The only kind of fragmentation that may give you trouble is free space
fragmentation.

If you use a standard filesystem (not large file enabled) with a fragment
size of 4K, then you will never be hit by this (actually, you need to be
running AIX 4.2 or higher).

Peter van Gemert

Dec 4, 1998

Scott L. Fields <slfi...@metronet.com> wrote in article
<744q5e$go$1...@news.metronet.com>...


>
> Peter van Gemert wrote in message <01be1e4c$ab386460$ac2ceed4@allet>...
> >
> >One reason to change the NBPI value is when you turn on fragmentation in
> >your filesystem. In a fs with fragmentation turned on, you could save more
> >files, so you need more inodes.
>
>
> I have no idea what you are talking about here. There is no option to turn
> on FRAGMENTATION.
>

Suppose I have a file system of, let's say, 16384 bytes. In a normal fs with a
fragment size of 4096, the maximum number of files I can store is 4, so I
need 4 inodes. NBPI is then also 4096. In a fs of the same size but with
a fragment size of 1024, I can store 16 files. If I really want to create
these 16 files (of 1024 bytes each), I also need 16 inodes, so NBPI
should then be 1024 too.


Joerg Bruehe

Dec 7, 1998
Hi Jean-Marc !

fet...@my-dejanews.com schrieb:

> Hy all,
>
> in fact, what's the difference between block size and fragment size ??
> (I am on RS6000 AIX 4.2.1)
>
> I need to have file system with 8k block size to improve perf. of database.

For a decent DBMS, you get the best performance if you do not use
the file system at all and let your DBMS work on the "raw disks" directly.

There are several reasons for this: Using files, you ...
- ... have caching done by the DBMS and by AIX - waste of RAM.
- ... need CPU cycles for the intermediate step of FS buffers.
- ... get the indirection done by the FS, lose contiguity.
- ... have to go via "indirect" blocks, waste of disk space and CPU.
- ...
- ... are forced to do a DB-specific backup, which prevents you from ever
restoring DB files in an inconsistent state (we just had a customer
doing that, thus destroying his current DB log information).

The gain may be small, but it still exists.

> ((...))


>
> Thanks lot !
>
> Jean-Marc Maitrot
>

Regards,
Joerg Bruehe

--
Joerg Bruehe, SQL Datenbanksysteme GmbH, Berlin, Germany
(speaking only for himself)
mailto: jo...@sql.de

Joerg Bruehe

Dec 7, 1998
Hi Peter !

Peter van Gemert schrieb:

> ((...))


> >
> If i have a file system of let's say 16384 bytes. In a normal fs with a
> fragment size of 4096 the maximum number of files i can store is 4, so I
> need 4 inodes. NBPI then is also 4096. In a fs with the same size but with
> a fragment size of 1024 I can store 16 files. If i really want to make
> these 16 files (of 1024bytes in size), I need also 16 inodes, so then NBPI
> should be 1024 too.

NBPI just sets your expectation of "mean file size".
If you have a file system with 1000 blocks, and your average file size is
100 blocks, you will not have more than 10 files, hence you need 10 i-nodes
(one i-node per file).
The NBPI parameter allows you to create fewer i-nodes (needing less space)
in a file system with (on average) large files,
and more i-nodes in a FS with small files.

Remember that you cannot create any more files if
_either_ you have used all i-nodes
_or_ you have used all space.
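
The "either i-nodes or space" limit can be put into a small sketch using the 16384-byte example quoted above. It is a deliberately naive model: it ignores the space that metadata itself consumes, and `max_files` is a hypothetical helper, not an AIX tool.

```python
def max_files(fs_bytes: int, fragment_size: int, nbpi: int, file_size: int) -> int:
    """Upper bound on the number of equally sized files a filesystem can hold.

    Naive model: metadata overhead is ignored and every file occupies a
    whole number of fragments (at least one).
    """
    inodes = fs_bytes // nbpi                                 # limit 1: i-nodes
    frags_per_file = max(1, -(-file_size // fragment_size))   # ceiling division
    files_by_space = (fs_bytes // fragment_size) // frags_per_file  # limit 2: space
    return min(inodes, files_by_space)


if __name__ == "__main__":
    print(max_files(16384, 4096, 4096, 4096))  # space and i-nodes both allow 4
    print(max_files(16384, 1024, 4096, 1024))  # space allows 16, i-nodes only 4
    print(max_files(16384, 1024, 1024, 1024))  # both allow 16
```

The second call shows the case being argued about: a smaller fragment size makes room for more small files, but only a matching NBPI provides enough i-nodes to create them.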

Regards,
Joerg
