JBASE - JSTAT -rvf


udayangi

Mar 23, 2009, 11:32:27 PM
to jBASE
Hi,
I am trying to figure out why jstat -rvf does not show any free frames
although there are many free frames in the file. I also see certain
frames holding a large number of records, which increases the frame
size. Why doesn't it use the free frames that are already available?
I even tried creating a new file and copying the records, but it didn't help.

Below is a sample of the jstat -rfv output.

149470019704516.700001.COB.MMI.20081201.01.NEW 259
149470020005080.030004.COB.FX.20081201.01.NEW 280
149470020005075.110004.COB.FX.20081201.01.NEW 285
149470020005069.100020.COB.FX.20081201.01.NEW 286
149470020005060.030004.COB.FX.20081201.01.NEW 288
149470020005068.030004.COB.FX.20081201.01.NEW 287
149470020005091.110004.COB.FX.20081201.01.NEW 290
149470020005098.020020.COB.FX.20081201.01.NEW 289
149470020005084.030004.COB.FX.20081201.01.NEW 289
149470020005076.030004.COB.FX.20081201.01.NEW 291
149470020005070.020020.COB.FX.20081201.01.NEW 291
149470020005094.190004.COB.FX.20081201.01.NEW 287
149470020005064.030004.COB.FX.20081201.01.NEW 284
149470020005059.110004.COB.FX.20081201.01.NEW 285
149470020005067.110004.COB.FX.20081201.01.NEW 291
149470020005088.030004.COB.FX.20081201.01.NEW 290
149470020005078.190004.COB.FX.20081201.01.NEW 286
149470020005085.270004.COB.FX.20081201.01.NEW 290
149470020005083.110004.COB.FX.20081201.01.NEW 290
4566 4096 0
4567 4096 524
SCTRSC0833600256.20081201.07.NEW 252
FT0833600777.20081201.01.NEW 272
4568 4096 0
4569 4096 0
4570 4096 0
4571 4096 0
4572 4096 0
4573 4096 0
4574 4096 0

Record Count = 36416 , Record Bytes = 9401036
Bytes/Record = 258 , Bytes/Group = 2054
Primary file space: Total Frames = 6484 , Total Bytes = 9401036
Secondary file space: Total Frames = 0 , Total Bytes = 0
0 free frames, each 4096 bytes , first frame offset 0
jsh globdev ~ -->


As shown above, frames 4568 to 4574 are free, but frame 4565 has a large
number of records in it.

Could you please help me with this?

udayangi

pat

Mar 24, 2009, 4:12:16 PM
to jBASE
I assume the 'free frames' you refer to are Groups 4566 and 4568
through 4574.

These Groups are 'unused' simply because there are no item ids,
currently within this file, that 'Hash' into these particular Groups.

Try creating a file with Hash Method 5 ( assuming you are using jBASE
4.1 or above )

eg :

CREATE-FILE mynewhashfile TYPE=J4 HASHMETHOD=5 1 4574

And then copy your existing file / records to the ( new )
'mynewhashfile', and observe the item distribution shown by :

jstat -rv mynewhashfile

Pat.

Jim Idle

Mar 24, 2009, 4:28:47 PM
to jB...@googlegroups.com
udayangi wrote:
> Hi,
> I am trying to figure out why jstat -rvf does not show any free frames
> although there are many free frames in the file.
It shows the group population, not the free space blocks. By definition,
free space blocks are free and so have no record data in them. I think
you are misunderstanding how the file works.

> Also I see certain
> frames having a large no of records thereby increasing the frame
> size. Why doesn't it use the free frames which are already available.
>
They are not free frames, they are just empty hash buckets, which means
that the hashing algorithm you are using did not hash any of the record
keys you used into that bucket. This is a HASHed file and not a linear
hash. Empty buckets are not the same as the free space chain, which
stores overflow frames that were used by records that have since been
deleted.
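
To illustrate the point about empty buckets, here is a minimal sketch in Python (not jBASE). It assumes a fixed number of groups and uses zlib.crc32 as a stand-in hash; that is not the jBASE hashing algorithm, and the keys are made up. Each key is placed by hash modulo the number of groups, so a group nothing hashes to simply stays empty, no matter how full its neighbours are.

```python
import zlib


def group_of(key: str, modulo: int) -> int:
    """Map a record key to a group (bucket) number in 0..modulo-1.

    zlib.crc32 is a stand-in here; it is NOT the jBASE hashing algorithm.
    """
    return zlib.crc32(key.encode("utf-8")) % modulo


def populate(keys, modulo):
    """Return {group: [keys]} with every group present, even if empty."""
    groups = {g: [] for g in range(modulo)}
    for key in keys:
        groups[group_of(key, modulo)].append(key)
    return groups


# Hypothetical keys, loosely shaped like the ones in the original post.
keys = [f"FT08336{n:05d}.20081201.01.NEW" for n in range(20)]
groups = populate(keys, modulo=16)

for g, members in groups.items():
    print(g, len(members))

# Groups that no key hashed into: empty buckets, not "free frames".
print("empty groups:", [g for g, members in groups.items() if not members])
```

A key can only ever live in the group its hash selects, which is why a crowded group cannot spill into an empty one next to it.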

> I even tried creating a new file and copy records but it doesn't help.
>
>
>
...

> 4566 4096 0
> 4567 4096 524
> SCTRSC0833600256.20081201.07.NEW 252
> FT0833600777.20081201.01.NEW 272
> 4568 4096 0
> 4569 4096 0
> 4570 4096 0
> 4571 4096 0
> 4572 4096 0
> 4573 4096 0
> 4574 4096 0
>
> Record Count = 36416 , Record Bytes = 9401036
> Bytes/Record = 258 , Bytes/Group = 2054
> Primary file space: Total Frames = 6484 , Total Bytes = 9401036
> Secondary file space: Total Frames = 0 , Total Bytes = 0
> 0 free frames, each 4096 bytes , first frame offset 0
>
So, here you can see that there are no free space blocks: everything
allocated to the file is in use. The file is also about twice as big as it
needs to be. Of the roughly 4000 bytes available in each bucket's primary
allocation, on average you are only filling half, even though some buckets
are empty and some are very full. The issue, such as it is, is that the
record keys are stupid choices for standard MV hashing algorithms, which
are also generally pretty weak for historic reasons.
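
The "half full" figure can be checked from the jstat summary quoted above. This is a small Python sketch of that arithmetic; the number of groups is not printed by jstat here, so the value below is only inferred from Record Bytes divided by Bytes/Group and should be treated as an assumption.

```python
# Figures copied from the jstat summary in the original post.
FRAME_SIZE = 4096          # "each 4096 bytes"
record_count = 36416       # "Record Count"
record_bytes = 9401036     # "Record Bytes"
bytes_per_group = 2054     # "Bytes/Group"

# Average record size, matching the "Bytes/Record = 258" line.
bytes_per_record = record_bytes // record_count

# Inferred, not reported by jstat: how many groups the averages imply.
implied_groups = round(record_bytes / bytes_per_group)

# Average fill of one primary frame.
fill_ratio = bytes_per_group / FRAME_SIZE

print(bytes_per_record)        # 258
print(implied_groups)          # 4577
print(f"{fill_ratio:.0%}")     # 50%
```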

> jsh globdev ~ -->
>
>
> See above 4568 to 4574 frames are free but 4565 frame has a large no
> of records in it.
>
> Could help me on this pls.
>
There may or may not be any issue with this file performance wise, but
you should probably do the following (and really, everyone should do
this anyway):

1) Don't use the default hashing algorithm, use the polynomial
distribution algorithm written by someone who must have been very
handsome and clever. This is changed by creating the file with the
HASHMETHOD=n option, where n is 1, 2, 3 or 4. You should really use
HASHMETHOD=2
2) Use jrf to resize the file and change the hashing algorithm;
3) Change the environment so that HASHMETHOD=2 is the default;
4) Read the articles about CREATE-FILE in the jBASE knowledgebase;
5) Read the posting guidelines for the group and supply the operating
system, version of jBASE and so on ;-)

So, in your login script (.profile or whatever)

export JEDI_PREFILEOP="HASHMETHOD=2"

You can also use this command at the shell prompt. Then resize the file:

jrf MYFILENAME

jstat should now tell you that the bytes per group is closer to 4000 and
because the hashing algorithm is much better, you will find it has a
more even distribution. But with only 36K smallish records, it probably
won't make much difference performance wise.

Jim

Jim Idle

Mar 24, 2009, 4:30:11 PM
to jB...@googlegroups.com
pat wrote:
> I assume the 'free frames' you refer to are Groups 4566, 4568, 4569,
> thru 4574 etc.
>
> These Groups are 'unused' simply because there are no item ids,
> currently within this file, that 'Hash' into these particular Groups.
>
> Try creating a file with Hash Method 5 ( assuming you are using jBASE
> 4.1 or above )
>
I have tried HASHMETHOD=5 on jBASE 5, and HASHMETHOD=2 is still way better.

Jim

pat

Mar 24, 2009, 9:38:09 PM
to jBASE
And a scientific test, using the 19 item ids that Hashed into Group
4565 in the Original post, shows that Hash method 2 ( two )
demonstrates exactly what the original poster referred to.

Whereas Hash method 5 gives an even spread of items throughout the
Group for these 'real life' item ids

[ Test conducted on both jBASE 4.1.5 and jBASE 5.0 - Results are
obviously identical on both versions of jBASE ]


The tables below show how many items hash into each Group, using
different Modulos, under each of Hash Method 2 and Hash Method 5

The same 19 items are used in both tests. They are the 19 'real life'
item ids shown in the original post, all of which hashed into Group 4565
there.


Hash Method 2

 Group  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Modulo
     1 19  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  [ obviously ]
     2  0 19  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  [ Still Bad ! ]
     3  7  6  6  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     4  0  0  0 19  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  [ Bad ;-( ]
     5  5  6  2  2  4  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     6  0  6  0  7  0  6  -  -  -  -  -  -  -  -  -  -  -  -  -
     7  3  1  2  5  3  3  2  -  -  -  -  -  -  -  -  -  -  -  -
     8  0  0  0  0  0  0  0 19  -  -  -  -  -  -  -  -  -  -  -  [ Oh Dear ;-( ]
     9  2  1  3  3  2  1  2  3  2  -  -  -  -  -  -  -  -  -  -
    10  0  6  0  2  0  5  0  2  0  4  -  -  -  -  -  -  -  -  -  [ Poor ]
    11  2  1  2  0  2  3  3  1  2  2  1  -  -  -  -  -  -  -  -
    12  0  0  0  7  0  0  0  6  0  0  0  6  -  -  -  -  -  -  -  [ Not Brilliant ]
    13  2  1  1  1  2  0  3  1  1  2  0  2  3  -  -  -  -  -  -
    14  0  1  0  5  0  3  0  3  0  2  0  3  0  2  -  -  -  -  -
    15  2  1  0  1  2  1  2  1  1  1  2  3  1  0  1  -  -  -  -
    16  0  0  0  0  0  0  0 19  0  0  0  0  0  0  0  0  -  -  -  [ Not Good ]
    17  1  1  0  1  1  2  1  2  2  2  2  1  0  1  1  0  1  -  -
    18  0  1  0  3  0  1  0  3  0  2  0  3  0  2  0  2  0  2  -
    19  0  1  0  2  0  1  0  3  1  1  1  2  0  1  1  1  1  3  0


For the same 19 items, Hash Method 5 gives a far superior, and even,
spread of the items throughout the Groups.

Hash Method 5

 Group  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Modulo
     1 19  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  [ obviously ]
     2  6 13  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     3  3  7  9  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     4  3  6  3  7  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     5  5  4  4  1  5  -  -  -  -  -  -  -  -  -  -  -  -  -  -
     6  2  4  1  1  3  8  -  -  -  -  -  -  -  -  -  -  -  -  -
     7  3  2  5  3  1  3  2  -  -  -  -  -  -  -  -  -  -  -  -
     8  2  2  2  3  1  4  1  4  -  -  -  -  -  -  -  -  -  -  -
     9  1  2  3  1  2  3  1  3  3  -  -  -  -  -  -  -  -  -  -
    10  1  2  2  0  0  4  2  2  1  5  -  -  -  -  -  -  -  -  -
    11  0  0  4  3  1  2  1  0  3  3  2  -  -  -  -  -  -  -  -
    12  1  0  0  1  1  6  1  4  1  0  2  2  -  -  -  -  -  -  -  [ Not Brilliant ;-( ]
    13  3  0  0  1  2  3  1  3  2  1  1  0  2  -  -  -  -  -  -
    14  0  1  2  3  0  2  2  3  1  3  0  1  1  0  -  -  -  -  -
    15  0  2  2  0  1  3  1  1  0  1  2  1  1  1  3  -  -  -  -
    16  1  2  0  2  0  1  0  2  1  0  2  1  1  3  1  2  -  -  -
    17  0  1  2  2  0  1  0  1  1  0  1  1  3  2  2  2  0  -  -
    18  0  1  0  0  0  3  1  1  1  1  1  3  1  2  0  0  2  2  -
    19  0  1  2  1  4  1  0  0  0  2  1  1  1  0  0  2  2  0  1
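
A comparison of this kind can be reproduced with a short script. The sketch below is in Python rather than jBASE, and both hash functions are stand-ins (zlib.crc32 and a deliberately weak byte sum). Neither is jBASE's HASHMETHOD=2 or HASHMETHOD=5, so the counts it prints will not match the tables above; it only shows how such a modulo-by-group count is built. The 19 item ids are copied from the original post.

```python
import zlib

# The 19 item ids listed under Group 4565 in the original post.
ids = """
149470019704516.700001.COB.MMI.20081201.01.NEW
149470020005080.030004.COB.FX.20081201.01.NEW
149470020005075.110004.COB.FX.20081201.01.NEW
149470020005069.100020.COB.FX.20081201.01.NEW
149470020005060.030004.COB.FX.20081201.01.NEW
149470020005068.030004.COB.FX.20081201.01.NEW
149470020005091.110004.COB.FX.20081201.01.NEW
149470020005098.020020.COB.FX.20081201.01.NEW
149470020005084.030004.COB.FX.20081201.01.NEW
149470020005076.030004.COB.FX.20081201.01.NEW
149470020005070.020020.COB.FX.20081201.01.NEW
149470020005094.190004.COB.FX.20081201.01.NEW
149470020005064.030004.COB.FX.20081201.01.NEW
149470020005059.110004.COB.FX.20081201.01.NEW
149470020005067.110004.COB.FX.20081201.01.NEW
149470020005088.030004.COB.FX.20081201.01.NEW
149470020005078.190004.COB.FX.20081201.01.NEW
149470020005085.270004.COB.FX.20081201.01.NEW
149470020005083.110004.COB.FX.20081201.01.NEW
""".split()


def crc_hash(key: str) -> int:
    """Stand-in 'good' hash (CRC-32). Not a jBASE hash method."""
    return zlib.crc32(key.encode("utf-8"))


def byte_sum_hash(key: str) -> int:
    """Stand-in 'weak' hash: sum of the key's byte values."""
    return sum(key.encode("utf-8"))


def group_counts(keys, modulo, hash_fn):
    """Count how many keys land in each of `modulo` groups."""
    counts = [0] * modulo
    for key in keys:
        counts[hash_fn(key) % modulo] += 1
    return counts


for modulo in range(1, 20):
    print(f"{modulo:>2}  crc: {group_counts(ids, modulo, crc_hash)}")
    print(f"    sum: {group_counts(ids, modulo, byte_sum_hash)}")
```

Each row sums to 19, as in the tables, so a row with one large value and the rest zero means that every id collapsed into a single group for that modulo.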

Note: Opinions / recommendations expressed in this Group are not
necessarily those of Temenos or jBASE International.

udayangi

Mar 25, 2009, 5:27:14 AM
to jBASE
Hi Jim and Pat,

Thanks for your clarifications, I got my doubts cleared. I tried
creating a new file using both HASH2 and HASH5. In both cases the
record distribution looks far better than with HASH3. Anyway, I have yet
to do more testing to find out the performance impact.

Udayangi

Jim Idle

Mar 25, 2009, 12:13:21 PM
to jB...@googlegroups.com
pat wrote:
> And a scientific test, using the 19 item ids that Hashed into Group
> 4565 in the Original post, shows that Hash method 2 ( two )
> demonstrates exactly what the original poster referred to.
>
This isn't scientific at all. 19 item ids in a file that big gives a
statistical spread that is pretty useless mate :-(

Try it with a realistic number relative to the modulo and you will find
that HASHMETHOD=2 is better than 5.

> Whereas Hash method 5 gives an even spread of items throughout the
> Group for these 'real life' item ids
>

But you need to have 37,000 of them, in a real file size, not 19. The
perturbation is not optimized for non-real-life modulos :-) Try the
program I sent you with 1 million, for instance.
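
The sample-size point can be illustrated without any particular hash method. The Python sketch below assigns keys to groups uniformly at random, which stands in for an ideal hash, and reports how uneven the group counts are relative to their mean. The modulo of 4577 is an assumption, inferred from Record Bytes / Bytes/Group in the jstat summary; the seed and function name are made up for the example.

```python
import random
from statistics import mean, pstdev


def relative_spread(num_keys: int, modulo: int, seed: int = 0) -> float:
    """Std deviation of group counts divided by their mean.

    Keys are placed uniformly at random, i.e. an idealised hash.
    Lower values mean a more even distribution.
    """
    rng = random.Random(seed)
    counts = [0] * modulo
    for _ in range(num_keys):
        counts[rng.randrange(modulo)] += 1
    return pstdev(counts) / mean(counts)


# 19 keys over 19 groups: looks lumpy even with an ideal hash.
tiny = relative_spread(19, 19)

# Roughly the original file: ~37,000 keys over an assumed 4577 groups.
real = relative_spread(37_000, 4577)

# A larger test, as suggested: 1 million keys over the same modulo.
big = relative_spread(1_000_000, 4577)

print(f"19 keys / 19 groups:        {tiny:.2f}")
print(f"37,000 keys / 4577 groups:  {real:.2f}")
print(f"1,000,000 keys / 4577:      {big:.2f}")
```

With so few keys per group, random noise dominates, so a 19-key test says little about which hash method spreads a full-size file better.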

Jim

Jim Idle

Mar 25, 2009, 12:17:06 PM
to jB...@googlegroups.com
Jim Idle wrote:
Missed a few things...

pat wrote:
> And a scientific test, using the 19 item ids that Hashed into Group
> 4565 in the Original post, shows that Hash method 2 ( two )
> demonstrates exactly what the original poster referred to.
>
This isn't scientific at all. 19 item ids in a file that big [or in a
modulo of 19] gives a statistical spread that is pretty useless mate :-(

Try it with a realistic number relative to the [original]