Performance issue on files with INDEX

404 views
Skip to first unread message

[Xze]

unread,
Dec 29, 2009, 10:14:36 AM12/29/09
to jb...@googlegroups.com
 
Dear all,
 
We have noticed a huge performace impact on the system when we are running SELECT on INDEX fields for FBNK.ACCOUNT file.
 
After every SELECT the kernel was allocating additional space in RAM.
When both RAM and paging space were 99% used, the kernel killed almost all jbase sessions.
 
 
LIST-INDEX FBNK.ACCOUNT

INDEX definitions for file FBNK.ACCOUNT at 16:54:38  29 DEC 2009                                             PAGE    1
INDEX NAME    LOCALE NAME    SORT KEYS.    LOOKUP....    INDEX DEFINITION...................
ACCOUNT.OF    en_US          AR                          BY-AR 11
FICER
CATEGORY      en_US          AR                          BY-AR 2
CURRENCY      en_US          AL                          BY-AL 8
CUSTOMER      en_US          AR                          BY-AR 1
SYN.CODE      en_US          AR                          BY-AR ITYPE(\LOCAL.REF<1,7>\)
 5 Records Listed
 
VERIFY-DISTRIB FBNK.ACCOUNT
Partitioning Algorithm is USER Subroutine 'AC.11'
        User subroutine OK.
Part file 'FBNK.ACCOUNT.01', part number 1 - OK
Part file 'FBNK.ACCOUNT.02', part number 2 - OK
Part file 'FBNK.ACCOUNT.03', part number 3 - OK
Part file 'FBNK.ACCOUNT.04', part number 4 - OK
Part file 'FBNK.ACCOUNT.05', part number 5 - OK
Part file 'FBNK.ACCOUNT.06', part number 6 - OK
Part file 'FBNK.ACCOUNT.07', part number 7 - OK
Part file 'FBNK.ACCOUNT.08', part number 8 - OK
Part file 'FBNK.ACCOUNT.09', part number 9 - OK
Part file 'FBNK.ACCOUNT.10', part number 10 - OK
 
   SUBROUTINE AC.11(DUMMY,INID,RESULT)
    DEFC JLibBCONV_IB(VAR)
    IF NUM(INID[1]) THEN
        PARTNO = INID[1]
        PARTNO++
    END ELSE
        PARTNO = 11
    END
    RESULT = JLibBCONV_IB(PARTNO)
    RETURN
END
 
 
In order to to replicate the issue we have opened 10 jbase sessions and each performed 50 SELECTs on FBNK.ACCOUNT INDEX fileds.
 
At the end, the mw42 -m output is as follows:
 
Port       User     Pid      Files Perf  Del  Read Write Open  MemF  MemU   Cpu  Prog
   1     uatusr  279002      6 (5)    1    1    11     2   11     0 9.39M  0.00  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
   2     uatusr  389366      7 (6)    1    1  2567     1   10     0 2.06M  0.93  2 I mw42 -m (mw42.b,232)
   4     uatusr  225790      7 (6)  197    1  3300    99 1433     0  433M 18.83  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
   5     uatusr  229884      7 (6)  197    1  3291    99 1433     0  466M 19.98  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  12     uatusr  340238      7 (6)  193    1  3214    97 1404     0  519M 22.05  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  14     uatusr  274452      7 (6)  201    1  3391   101 1462     0  315M 13.91  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  19     uatusr  413718      7 (6)  197    1  3321    99 1433     0  422M 16.51  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  20     uatusr  110934      7 (6)  193    1  3218    97 1404     0  506M 22.17  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  21     uatusr  352482      7 (6)  197    1  3302    99 1433     0  365M 15.73  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  22     uatusr  364998    33 (19)  192    1  3232    97 1402     0  411M  0.00  3 SELECT FBNK.ACCOUNT WITH CURRENCY EQ
  27     uatusr  938076      7 (6)  197    1  3279    99 1433     0  553M 24.51  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  30     uatusr  139574      6 (5)    3    1   886     7   22     0 2.06M  0.32  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
  35     uatusr  635036      7 (6)  193    1  3252    97 1404     0  314M 14.75  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
* 36     uatusr  287038      6 (5)    0    1   902     0    6     0 9.19M  0.29  1 mw42 -m (mw42.b,764)
  38     uatusr  373142    58 (50)   22    4   626    16  154     0 12.7M  0.18  1 I EX (S.COMMUNICATION,254)
 
 The same test on identical area, but WITHOUT INDEX:
 
  14      uatpf  139592      6 (5)    1    1    14     3   13     0 9.39M  0.00  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
  19      uatpf  315638      7 (6)  181    1 76.4M    91 1317     0 37.8M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  20      uatpf  209012      7 (6)  189    1 79.8M    95 1375     0 29.7M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  21      uatpf  176290      7 (6)  181    1 76.4M    91 1317     0 33.7M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  22      uatpf  262642      7 (6)  189    1 79.8M    95 1375     0 36.2M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  27      uatpf  405766      7 (6)  181    1 76.4M    91 1317     0 31.7M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  30      uatpf  139372      7 (6)  177    1 74.7M    89 1288     0 31.7M   13m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  35      uatpf  225538      7 (6)  185    1 78.1M    93 1346     0 38.2M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  36      uatpf  188786      7 (6)  181    1 76.4M    91 1317     0 37.6M   14m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  39      uatpf  450742      7 (6)  181    1 76.4M    91 1317     0 37.6M   13m  1 I /opt/jbase5/bin/jsh -s jsh - (Comm
  40      uatpf   94562    33 (19)  180    1 75.3M    91 1315     0 37.6M   13m  3 SELECT FBNK.ACCOUNT WITH SYN.CODE EQ
* 41      uatpf  151944      6 (5)    0    1 51209     0    6     0 9.39M 13.43  1 mw42 -m (mw42.b,764)
 
 ( The full mw42 -m output is attached )
 
 In ther first case every next SELECT was taking additional memory and was not releasing it untill the session is closed.
 
We are using:
OS - AIX 5.3.9.0
jB  - Major 5.0 , Minor 20 , Patch 0364 (Change 85159)
 
        jdiag - jBASE diagnostic '$Revision: 1.15 $'
System Information
==================
System                      : AIX jbsec 3.5 00CED1BC4C00
OS Release                  : 5.3.9.0
UNIX User                   : uatusr (uid 213, euid 213)
Tty name                    : /dev/pts/1
Time                        : Tue Dec 29 16:38:20 2009
Environment
===========
JBCPORTNO                   : Not Set
JBCRELEASEDIR               : '/opt/jbase5'
JBCGLOBALDIR                : '/opt/jbase5'
WARNING: JBCDATADIR is not set, Default '/opt/jbase5/jbase_data'
WARNING: JBCDATADIR is subdirectory of JBCGLOBALDIR
HOME                        : '/eoy/eoy/bnk.run'
JEDIFILEPATH                : '/eoy/eoy/bnk.run'
JEDIFILENAME_MD             : 'VOC'
JEDIFILENAME_SYSTEM         : '/opt/jbase5/src/SYSTEM'
SYSTEM File is (DICT)       : '/opt/jbase5/src/SYSTEM]D'
RELEASE Information         : Major 5.0 , Minor 20 , Patch 0364 (Change 85159)
Spooler dir (JBCSPOOLERDIR) : '/var/spool/jbase'
JBCEMULATE                  : 'prime'
WARNING: Cannot access Executable path '/eoy/eoy/bnk.run/globuspatchbin', error 2
Object path (JBCOBJECTLIST) : '/eoy/eoy/bnk.run/globuspatchlib:/eoy/eoy/bnk.run/lib:/eoy/eoy/bnk.run/globuslib:/eoy/eoy/bnk.run/fixlib'
WARNING: Cannot access Object path '/eoy/eoy/bnk.run/globuspatchlib', error 2
WARNING: Cannot access Object path '/eoy/eoy/bnk.run/fixlib', error 2
jBASE Compiler Run-time     : '/opt/jbase5/config/system.properties'
Program dir (JBCDEV_BIN)    : '/eoy/eoy/bnk.run/bin'
Subroutine dir (JBCDEV_LIB) : '/eoy/eoy/bnk.run/lib'
Max open files              : 65534


Can anybody please explain what is triggering this and how to fix the issue?

mw42.zip

Jim Idle

unread,
Dec 29, 2009, 2:34:24 PM12/29/09
to jb...@googlegroups.com

Well, you are not doing yourself any favors here to be honest. First, why are you calling CONV_IB on the part number? It is only going to get converted to string anyway and as you just did ++ on it, it will already be an integer value anyway. Don’t do that, it is just extra work that should not be necessary.

 

Secondly, is there no way to reduce the number of accounts in this file? Are they all live and cannot be archived in some way? This could be possible of course.

 

However, you will get more mileage from splitting up your query into multiple selects I think. Also, is your application really going to perform 50 simultaneous selects on this file? Are you sure that you need to process this list in sorted order? It is much better to process in the natural select order unless the algorithm relies on sorted order.

 

Next, though I feel there is a lot of work you can do on your code, you need to change the memory allocation algorithms for AIX. Basically, you don’t have enough RAM to do all the indexed sorts (though as an aside I am not sure why using the indexes should require so much memory over not using them – you may need to ask TEMENOS about that), all at once, so you are causing paging, running out of paging space and AIX has a strategy to kill processes so that it does not crash. You can search past posts:

 

http://jbase.markmail.org/search/?q=AIX%20malloc#query:AIX%20malloc%20from%3A%22Jim%20Idle%22+page:1+state:facets

 

For explanations, but basically in your login script add:

 

export MALLOCTYPE=buckets

 

You should also read this:

 

http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topic=/com.ibm.IBMDS.doc/tuning07.htm

 

 

Jim

 

--
Please read the posting guidelines at: http://groups.google.com/group/jBASE/web/Posting%20Guidelines
 
IMPORTANT: Type T24: at the start of the subject line for questions specific to Globus/T24
 
To post, send email to jB...@googlegroups.com
To unsubscribe, send email to jBASE-un...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/jBASE?hl=en

pat

unread,
Dec 29, 2009, 11:21:39 PM12/29/09
to jBASE
The 'CONV_IB(PartNo)' was previously recommended, to ensure the 'Part
Number' returned is an integer value, and to overcome potential
problems otherwise if the CALLing program and the Distributed
Algorithm specify different 'PRECISION's

The 'MALLOCTYPE' environment variable is no longer relevant on Aix
systems running jBASE 5.018 and above ( as in this case )

Pat.

On 29 Dec, 19:34, "Jim Idle" <j...@temporal-wave.com> wrote:
> Well, you are not doing yourself any favors here to be honest. First, why are you calling CONV_IB on the part number? It is only going to get converted to string anyway and as you just did ++ on it, it will already be an integer value anyway. Don't do that, it is just extra work that should not be necessary.
>
> Secondly, is there no way to reduce the number of accounts in this file? Are they all live and cannot be archived in some way? This could be possible of course.
>
> However, you will get more mileage from splitting up your query into multiple selects I think. Also, is your application really going to perform 50 simultaneous selects on this file? Are you sure that you need to process this list in sorted order? It is much better to process in the natural select order unless the algorithm relies on sorted order.
>

> Next, though I feel there is a lot of work you can do on your code, you need to change the memory allocation algorithms for AIX. Basically, you don't have enough RAM to do all the indexed sorts (though as an aside I am not sure why using the indexes should require so much memory over not using them - you may need to ask TEMENOS about that), all at once, so you are causing paging, running out of paging space and AIX has a strategy to kill processes so that it does not crash. You can search past posts:
>
> http://jbase.markmail.org/search/?q=AIX%20malloc#query:AIX%20malloc%2...


>
> For explanations, but basically in your login script add:
>
> export MALLOCTYPE=buckets
>
> You should also read this:
>

> http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topi...

Jim Idle

unread,
Dec 30, 2009, 12:12:13 AM12/30/09
to jb...@googlegroups.com
OK - though I am not sure that you need to use CONV_IB to do that. Yes I think I looked too far back in the past for the buckets thing. Isn't there a new algorithm that can be set though that allocates from the top address space and not the bottom? I know there is because I have recommended it in the past and it stops the memory fragmentation. Perhaps that is documented in the link I sent. However, the indexes should not be so inefficient memory wise I think - that probably needs looking in to. However storing everything in the one file is really something that should be looked at in the application anyway.

Jim

[Xze]

unread,
Dec 30, 2009, 2:14:22 AM12/30/09
to jb...@googlegroups.com
 
My appologize, in my first post forgot to mention that:
 
jsh uatusr ~ -->echo $MALLOCTYPE
buckets
jsh uatusr ~ -->
 
 
Now, i changed it as follows:
 
jsh uatusr ~ -->echo $MALLOCTYPE
disclaim
jsh uatusr ~ -->
 
but again the performance-wise is unchanged
 
BASE 5.0 Monitor (2.5) - AIX
[133C08:58:04  30 DEC 2009
Port       User     Pid      Files Perf  Del  Read Write Open  MemF  MemU   Cpu  Prog
   1     uatusr  491750    37 (31)   20    4   363     6  122     0 5.25M  0.22  1 I EX (S.COMMUNICATION,254)
   2     uatusr  418194      6 (5)    2    1    15     4   15     0 2.00M  0.10  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.b,111)
   4     cobusr  692460      6 (5)    1    1   507     1   16     0 2.06M  0.12  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
*  5     uatusr  139486      6 (5)    0    1  1271     0    6     0 2.00M  0.47  1 mw42 -m (mw42.b,764)
  12     uatusr  401724      7 (6)  197    1  3282    99 1433     0  466M 31.15  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  14     uatusr  377144    33 (19)  200    1  3368   101 1460     0  428M  0.00  3 SELECT FBNK.ACCOUNT WITH SYN.CODE EQ 2259 1425 1411 (select.b,8)
  19     uatusr   94614      7 (6)  201    1  3375   101 1462     0  361M 24.98  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  20     uatusr  372772      7 (6)  201    1  3393   101 1462     0  339M 22.84  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  21     uatusr  356532      7 (6)  197    1  3291    99 1433     0  350M 24.28  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  22     uatusr  229726      7 (6)  197    1  3313    99 1433     0  418M 28.16  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  27     uatusr  114854      7 (6)  197    1  3336    99 1433     0  249M 17.89  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  30     uatusr  512366      7 (6)  193    1  3240    97 1404     0  327M 22.56  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  35     uatusr  102632      7 (6)  193    1  3207    97 1404     0  448M 31.09  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)
  36     uatusr  635040      7 (6)  193    1  3227    97 1404     0  465M 30.88  1 I /opt/jbase5/bin/jsh -s jsh - (CommandNext.b,103)

(he full mw42 -m output is attached)
 
The SELECTs are simple
 
     SELECT FBNK.ACCOUNT WITH SYN.CODE EQ 2259 1425 1411
     SELECT FBNK.ACCOUNT WITH ACCOUNT.OFFICER EQ 1452 1526 1825
     SELECT FBNK.ACCOUNT WITH CATEGORY EQ 1360 1550 1559
     SELECT FBNK.ACCOUNT WITH CUSTOMER EQ 332984 335629 225632 45620
     SELECT FBNK.ACCOUNT WITH CURRENCY EQ GBP MDL
 
As for the ACCOUNT file, it's the bank decision to keep it all there and moreover the kernel does not 'devour' that much memory when FBNK.ACCOUNT does not have INDEX
 
Now, why is it happening ONLY when INDEXes are created?
 
[ Xze ]
mw42.index.disclaim.txt

Paa Kwesi Barnes

unread,
Dec 30, 2009, 3:23:03 AM12/30/09
to jb...@googlegroups.com
Hello,
I would like to find out if anyone in this group has a document which lists
all standard reports in T24 (Globus and eMerge) and shows the various
functions of each of the reports.

 Paa Kwesi Barnes,
Tel:        +233 264 250 992
                +233 244 250 992
                +233 204 250 992


[Xze]

unread,
Dec 30, 2009, 3:25:58 AM12/30/09
to jb...@googlegroups.com
Dear Jim,
 
However, you will get more mileage from splitting up your query into multiple selects I think.  
1. The selects are simple, please see the above post
Also, is your application really going to perform 50 simultaneous selects on this file?
2. During my test the SELECTs were NOT run simultaneously, just one-by-one as total of 50. Also we have an average of 750 active users during online, i assume 50 concurrent queries on ACCOUNT is very likely to happen

What i'm concerned about is the resources: if every next SELECT allocates additional RAM (which is de-allocated only when sign off/logoff) then online is extremly slow and it'll take 1-2 hours to make it unoperable

VK

unread,
Dec 30, 2009, 4:49:53 AM12/30/09
to jBASE
Hi,

revise your SELECTs to find out the way to perform the task without
index if possible.

For example, you don't need to index on CUSTOMER field since there is
a "concat" file in T24:

LIST FBNK.CUSTOMER.ACCOUNT

@ID....... @ID....... CUSTOMER.CODE ACCOUNT.NUMBER..

100197 100197 100197 35025
100362 100362 100362 15156
555555 555555 555555 28298
100378 100378 100378 15172

etc so you can find all accounts belonging to particular customer
quite easily.

For other fields there might be concats as well (contact Temenos
helpdesk for this). Otherwise you can create your owns using
EB.ALTERNATE.KEY application.

I wouldn't recommend jbase indexes... Pat wouldn't like this but...
don't use them.
This is my IMHO and please don't ask me to go deeper into that
subject :(

About archiving - in T24 accounts are moved to so-called "history"
file after they are closed. All opened accounts are in one table.

Happy New Year to everyone!

VK

> ...
>
> read more »

Jim Idle

unread,
Dec 30, 2009, 1:47:56 PM12/30/09
to jb...@googlegroups.com

There are a few possibilities here:

 

1)      A straight bug is causing queries that use indexes not to free the memory allocated to the indexes. Because the query is executed in process the memory will not be freed until the process ends. You could check this by profiling a program that executes the same query twice with an INPUT statement after each one. See how much memory it uses after the first query, then hit return and see if the memory usage essentially doubles (or significantly increases at least as other memory may be freed);

2)      You are being fooled by the reporting in AIX in that it will show the memory allocated to the process even though it has been correctly free()’d by the query. Basically the allocation routines mark memory as being free for reuse but don’t reduce the allocations to the process; these are reclaimed by the system if it gets low on memory. A good operating system will only have memory ‘free’ for essential kernel activities, everything else it will just use for something as much as possible, which is what you want.

However, you claim that AIX is killing processes, which means it cannot reclaim enough memory, either because it is fragmented or really is not given back. With the information available I cannot tell if this is because you are just reaching the limits of the system as configured when you run so many SSELECT at once (again are you SURE you need to use SSELECT – it is rare that you do other than for listings), or that because memory is retained by the SSELECT with indexes, that it is just not getting back to the system until the process ends. This is something that you will need TEMENOS help with I think;

3)      As a variant of 2, the allocation patterns used in the query engine is not efficient and the use of indexes exacerbates this. This would need some analysis by TEMENOS to tune it properly.

 

If the application is indeed using lots of SELECT/SSELECT in general operation then it should really be reworked (though only TEMENOS can do this of course). Even though jBASE 4.1/5.x queries are pretty efficient, there is always a much better algorithmic way to do things than relying on the SELECT engine. Because you have so many partitions you are also likely causing way more disk IO than you should be. Generally you will want to distribute items according to access patterns and not just some arbitrary number, otherwise you select a record from partition 6 then partition 2 then partition 9 etc and the whole read-ahead system is defeated (and probably the cache) because of tons of random IO.

 

So you need to determine if this allocated memory really is allocated or just remains assigned to the process until the system really needs it. This is a very complicated area to be honest. There are probably kernel level tunings that affect the behavior too, it is more than I could advise you on via this forum. I think that you really need to get TEMENOS in to look at the system properly and need one of the guys that knows AIX well and has worked on the SELECT engine.

Jim Idle

unread,
Dec 30, 2009, 2:06:43 PM12/30/09
to jb...@googlegroups.com
Indexes are of enormous benefit in queries but it seems there are still some issues to either fix or to define operating parameters for. I wouldn't say don't use them but I think that you have to do work on tuning the environment and the queries to operate well with them and that some guidelines for this should be coming from TEMENOS. Whether is a memory leak in this particular case is difficult to determine from the information available though. Please be careful about making broad statements such as "don't use them" as such statements are virtually (perhaps literally) always too broad reaching :-)

Jim

Mike Preece

unread,
Jan 4, 2010, 5:57:21 AM1/4/10
to jBASE

Taking a more simplistic view... it is often counter-productive to
create and use indexes. This is true in cases where a very high
proportion of the data on which the index is based is similar. For
example, suppose the CURRENCY is almost always the same. The index for
CURRENCY will be enormous with consequential performance degradation.
In this case it would be better to do a straight SELECT without
selection criteria and have a simple piece of code after the READNEXT
to ignore items with a CURRENCY other than that in which you're
interested.

Can you tell us the total number of records in the file and the number
of variations for each of the indexed attributes?

> 433M18.83  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>    5     uatusr  229884      7 (6)  197    1  3291    99 1433     0
> 466M 19.98
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   12     uatusr  340238      7 (6)  193    1  3214    97 1404     0

> 519M22.05  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   14     uatusr  274452      7 (6)  201    1  3391   101 1462     0

> 315M13.91  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   19     uatusr  413718      7 (6)  197    1  3321    99 1433     0
> 422M 16.51
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   20     uatusr  110934      7 (6)  193    1  3218    97 1404     0

> 506M22.17  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   21     uatusr  352482      7 (6)  197    1  3302    99 1433     0
> 365M 15.73
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   22     uatusr  364998    33 (19)  192    1  3232    97 1402     0  411M
> 0.00  3 SELECT FBNK.ACCOUNT WITH CURRENCY EQ
>   27     uatusr  938076      7 (6)  197    1  3279    99 1433     0

> 553M24.51  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   30     uatusr  139574      6 (5)    3    1   886     7   22     0 2.06M
> 0.32  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
>   35     uatusr  635036      7 (6)  193    1  3252    97 1404     0

> 314M14.75  1 I /opt/jbase5/bin/jsh -s jsh - (Comm

> '/eoy/eoy/bnk.run/globuspatchlib:/eoy/eoy/bnk.run/lib:/eoy/eoy/bnk.run/glob­uslib:/eoy/eoy/bnk.run/fixlib'


> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/globuspatchlib', error
> 2
> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/fixlib', error 2
> jBASE Compiler Run-time     : '/opt/jbase5/config/system.properties'
> Program dir (JBCDEV_BIN)    : '/eoy/eoy/bnk.run/bin'
> Subroutine dir (JBCDEV_LIB) : '/eoy/eoy/bnk.run/lib'
> Max open files              : 65534
>
> Can anybody please explain what is triggering this and how to fix the issue?
>

>  mw42.zip
> 100KViewDownload

pat

unread,
Jan 5, 2010, 11:29:39 AM1/5/10
to jBASE
Au Contraire

Using an Index with a limited number of differing values will reduce
the SELECT time when Selecting a specific value
Say for example in the extreme case of ( only ) four different
'CURRENCIES' within the items in your file, eg :

1 million items with a 'CURRENCY' of 'GBP'
1 million items with a 'CURRENCY' of 'USD'
1 million items with a 'CURRENCY' of 'YEN'
and
1 thousand items with a 'CURRENCY' of 'FF'

SELECTing the million item ids for items with a 'CURRENCY' of 'USD'
will be instantaneous via the index, compared with processing the
'3,001,000' items looking for a 'CURRENCY' of 'USD'

The downside is Creating / Updating an Index where the majority of
values for the indexed fielsd are identical

However in jBASE 4.1 and above, the Creation / Update of an index on
such non unique values is nowhere near as painful / time consuming as
previously in jBASE 4.0 and prior releases

And is also improved by creating the index on such fields using the '-
w' option

Pat.

Bruce Willmore

unread,
Jan 4, 2010, 2:21:17 PM1/4/10
to jBASE
Hi everyone:

Thought I'd chime in on this subject, since it is a pet peeve of mine.

Experience has taught me that any file that is updated by multiple
concurrent processes is a poor candidate for an index, since the
update to the index can only occur sequentially. That means that if
you have 10 processes all performing updates to FBNK.ACCOUNT at the
same time, then they have to queue waiting to update FBNK.ACCOUNT]I.

Also, the number of nodes that are being indexed can be a factor. I
see, for instance, that ACCOUNT.OFFICER is indexed in this file. I
assume that number of values that ACCOUNT.OFFICER can take is
relatively small. If that assumption is correct, then the time it
takes to update that index grows as the number of records indexed
under that node grows.

And lastly, the number of indexes on a file also impacts the amount of
time required to perform updates. I do not know what an "optimal"
number is, but I suspect that 5 is too high, although I could be
wrong.

So, if you have a file with a large number of records that is being
updated by multiple concurrent processes, then I would suggest
removing those indexes entirely. If the file is not exceesively large,
then extracting your dataset via a jBC program would be much more
efficient than trying to use jQL to do the legwork for you. If the
file is a large one, then there are other alternatives, which I won't
go into here, as that discussion could get lengthy, but I would be
happy to share ideas on the subject.

> 433M18.83  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>    5     uatusr  229884      7 (6)  197    1  3291    99 1433     0
> 466M 19.98
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   12     uatusr  340238      7 (6)  193    1  3214    97 1404     0

> 519M22.05  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   14     uatusr  274452      7 (6)  201    1  3391   101 1462     0

> 315M13.91  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   19     uatusr  413718      7 (6)  197    1  3321    99 1433     0
> 422M 16.51
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   20     uatusr  110934      7 (6)  193    1  3218    97 1404     0

> 506M22.17  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   21     uatusr  352482      7 (6)  197    1  3302    99 1433     0
> 365M 15.73
> 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
>   22     uatusr  364998    33 (19)  192    1  3232    97 1402     0  411M
> 0.00  3 SELECT FBNK.ACCOUNT WITH CURRENCY EQ
>   27     uatusr  938076      7 (6)  197    1  3279    99 1433     0

> 553M24.51  1 I /opt/jbase5/bin/jsh -s jsh - (Comm


>   30     uatusr  139574      6 (5)    3    1   886     7   22     0 2.06M
> 0.32  1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
>   35     uatusr  635036      7 (6)  193    1  3252    97 1404     0

> 314M14.75  1 I /opt/jbase5/bin/jsh -s jsh - (Comm

> '/eoy/eoy/bnk.run/globuspatchlib:/eoy/eoy/bnk.run/lib:/eoy/eoy/bnk.run/glob­uslib:/eoy/eoy/bnk.run/fixlib'


> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/globuspatchlib', error
> 2
> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/fixlib', error 2
> jBASE Compiler Run-time     : '/opt/jbase5/config/system.properties'
> Program dir (JBCDEV_BIN)    : '/eoy/eoy/bnk.run/bin'
> Subroutine dir (JBCDEV_LIB) : '/eoy/eoy/bnk.run/lib'
> Max open files              : 65534
>
> Can anybody please explain what is triggering this and how to fix the issue?
>

>  mw42.zip
> 100KViewDownload

Bruce Willmore

unread,
Jan 5, 2010, 12:08:08 PM1/5/10
to jBASE
One thing to consider here, though, is the impact if multiple
processes are performing updates to the same file. They will all queue
waiting for the index file to be updated.

> ...
>
> read more »- Hide quoted text -
>
> - Show quoted text -

Jim Idle

unread,
Jan 5, 2010, 2:28:44 PM1/5/10
to jb...@googlegroups.com
Yes, this is all true. Also remember that the OP was suggesting that the memory allocated to processing the indexes was not being released after the query ends and that is the thing that needs to be verified or shown to be an error in measuring the memory usage.

If you have indexes with two values say 'Y' and 'N' and the number of 'Y's is relatively small and these are the only ones you will select, then only index the records with a 'Y' as you won't use the 'N' anyway. Look for other such optimizations.

It was the case in 4.0 and before that updating an index with a many-to-one relationship was awkward to do. However all you need to do is include the item ID in the index such that you have a unique node for each data point. As Pat points out, things have moved on with indexes since 4.0, so take his advice here on creating the indexes etc.

I cannot emphasize too much though that before looking for optimizations at this level, look to your application. A big oversight is processing records in an order that you think is a good idea but actually means nothing at all to the results of processing. For instance, if selecting on currency, do you actually need to process all the records for USD, then all the GBP and so on. If you are processing all the records anyway, then just keep accumulators for data points. Perhaps you need to perform one operation in sorted order but not anything else. Sometimes doing multiple operations at once will be a good idea, sometimes you might repeat the traversal in natural database order and find that it is faster that way. When you have your algorithms together, THEN look to tricks and tuning to improve performance.

(That last is of course not overly relevant to the original thread about whether the memory is being leaked or not, but generally relevant to getting things to work well).

Jim

> -----Original Message-----
> From: jb...@googlegroups.com [mailto:jb...@googlegroups.com] On Behalf
> Of pat

Jim Idle

unread,
Jan 5, 2010, 2:59:53 PM1/5/10
to jb...@googlegroups.com
Don't forget though that you can reduce the impact of an index having just a few values by appending the item ID to the index and adjusting the query accordingly. Also, see Pat's comments on indexes in later versions of jBASE.

Jim

> -----Original Message-----
> From: jb...@googlegroups.com [mailto:jb...@googlegroups.com] On Behalf
> Of Bruce Willmore
> Sent: Monday, January 04, 2010 11:21 AM
> To: jBASE
> Subject: [SPAM] Re: Performance issue on files with INDEX
>

Mike Preece

unread,
Jan 5, 2010, 5:16:23 PM1/5/10
to jBASE

On Jan 5, 4:29 pm, pat <pat...@gmail.com> wrote:
> Au Contraire
>
> Using an Index with a limited number of differing values will reduce
> the SELECT time when Selecting a specific value
> Say for example in the extreme case of ( only ) four different
> 'CURRENCIES' within the items in your file, eg :
>
>  1 million items with a 'CURRENCY' of 'GBP'
>  1 million items with a 'CURRENCY' of 'USD'
>  1 million items with a 'CURRENCY' of 'YEN'
> and
>  1 thousand items with a 'CURRENCY' of 'FF'
>
> SELECTing the million item ids for items with a 'CURRENCY' of 'USD'
> will be instantaneous via the index, compared with processing the
> '3,001,000' items looking for a 'CURRENCY' of 'USD'
>

I await with interest the actual figures from the OP. I wonder whether
your suggested figures for an "extreme case" reflects real life.
Perhaps the figures for one currency will almost always represent way
more than 30% of data.

> ...
>
> read more »

Jim Idle

unread,
Jan 6, 2010, 1:28:53 PM1/6/10
to jb...@googlegroups.com

> -----Original Message-----
> From: jb...@googlegroups.com [mailto:jb...@googlegroups.com] On Behalf
> Of Mike Preece
> Sent: Tuesday, January 05, 2010 2:16 PM
> To: jBASE
> Subject: [SPAM] Re: Performance issue on files with INDEX
>
>
>

> On Jan 5, 4:29 pm, pat <pat...@gmail.com> wrote:
> > Au Contraire
> >
> > Using an Index with a limited number of differing values will reduce
> > the SELECT time when Selecting a specific value Say for example in
> the
> > extreme case of ( only ) four different 'CURRENCIES' within the items
> > in your file, eg :
> >
> >  1 million items with a 'CURRENCY' of 'GBP'
> >  1 million items with a 'CURRENCY' of 'USD'
> >  1 million items with a 'CURRENCY' of 'YEN'
> > and
> >  1 thousand items with a 'CURRENCY' of 'FF'
> >
> > SELECTing the million item ids for items with a 'CURRENCY' of 'USD'
> > will be instantaneous via the index, compared with processing the
> > '3,001,000' items looking for a 'CURRENCY' of 'USD'
> >
>
> I await with interest the actual figures from the OP. I wonder whether
> your suggested figures for an "extreme case" reflects real life.
> Perhaps the figures for one currency will almost always represent way
> more than 30% of data.
>

Yes, but if you append the item ID to the indexed data and modify the query to account for it, you won't have any problems with building it anyway. Though it is really better to program to the indexes in jBC (OPENINDEX etc), where you then have complete control over how you traverse. Using SELECT is really just a 'cop-out' from doing it programmatically (I think you already said that too).

Jim


Reply all
Reply to author
Forward
0 new messages