Well, you are not doing yourself any favors here, to be honest. First, why are you calling CONV_IB on the part number? It is only going to be converted to a string anyway, and since you just did ++ on it, it is already an integer value. Don't do that; it is just extra work that should not be necessary.
Secondly, is there no way to reduce the number of accounts in this file? Are they all live and cannot be archived in some way? This could be possible of course.
However, you will get more mileage from splitting up your query into multiple selects I think. Also, is your application really going to perform 50 simultaneous selects on this file? Are you sure that you need to process this list in sorted order? It is much better to process in the natural select order unless the algorithm relies on sorted order.
Next, though I feel there is a lot of work you can do on your code, you need to change the memory allocation algorithm for AIX. Basically, you don't have enough RAM to do all the indexed sorts at once (as an aside, I am not sure why using the indexes should require so much more memory than not using them; you may need to ask TEMENOS about that), so you are causing paging and running out of paging space, and AIX has a strategy of killing processes so that it does not crash. You can search past posts:
http://jbase.markmail.org/search/?q=AIX%20malloc#query:AIX%20malloc%2...
for explanations, but basically in your login script add:
export MALLOCTYPE=buckets
You should also read this:
http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topi...
Jim
--
Please read the posting guidelines at: http://groups.google.com/group/jBASE/web/Posting%20Guidelines
IMPORTANT: Type T24: at the start of the subject line for questions specific to
Globus/T24
To post, send email to jB...@googlegroups.com
To unsubscribe, send email to jBASE-un...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/jBASE?hl=en
The 'MALLOCTYPE' environment variable is no longer relevant on AIX
systems running jBASE 5.018 and above (as in this case).
Pat.
On 29 Dec, 19:34, "Jim Idle" <j...@temporal-wave.com> wrote:
> [...] You can search past posts:
>
> http://jbase.markmail.org/search/?q=AIX%20malloc#query:AIX%20malloc%2...
>
> For explanations, but basically in your login script add:
>
> export MALLOCTYPE=buckets
>
> You should also read this:
>
> http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topi...
Paa Kwesi Barnes,
Tel: +233 264 250 992
+233 244 250 992
+233 204 250 992
> However, you will get more mileage from splitting up your query into multiple selects I think.
> Also, is your application really going to perform 50 simultaneous selects on this file?
What I'm concerned about is the resources: if every subsequent SELECT allocates additional RAM (which is released only at sign-off/logoff), then online becomes extremely slow and it takes 1-2 hours for the system to become inoperable.
Revise your SELECTs to find a way to perform the task without an
index if possible.
For example, you don't need to index on CUSTOMER field since there is
a "concat" file in T24:
LIST FBNK.CUSTOMER.ACCOUNT
@ID.......  @ID.......  CUSTOMER.CODE  ACCOUNT.NUMBER..
100197      100197      100197         35025
100362      100362      100362         15156
555555      555555      555555         28298
100378      100378      100378         15172
etc so you can find all accounts belonging to particular customer
quite easily.
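To illustrate the idea of a concat file (a plain cross-reference from customer to accounts, maintained alongside the data rather than as an index), here is a rough sketch. It is in Python rather than jBC, purely to show the data structure; the field name CUSTOMER and account 35033 are made up for the example, and this says nothing about how T24 actually maintains FBNK.CUSTOMER.ACCOUNT.

```python
from collections import defaultdict

# Hypothetical sample data: account id -> account record.
# Account 35033 is invented so that one customer has two accounts.
accounts = {
    "35025": {"CUSTOMER": "100197"},
    "15156": {"CUSTOMER": "100362"},
    "28298": {"CUSTOMER": "555555"},
    "15172": {"CUSTOMER": "100378"},
    "35033": {"CUSTOMER": "100197"},
}


def build_concat(accounts):
    """Build a customer -> [account ids] cross-reference (a 'concat' file)."""
    concat = defaultdict(list)
    for account_id, record in accounts.items():
        concat[record["CUSTOMER"]].append(account_id)
    return concat


concat = build_concat(accounts)

# Finding a customer's accounts is one keyed read, with no SELECT involved.
print(concat["100197"])
```

The point is only that a keyed read on the cross-reference replaces a SELECT on the main file.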
For other fields there might be concats as well (contact the Temenos
helpdesk about this). Otherwise you can create your own using the
EB.ALTERNATE.KEY application.
I wouldn't recommend jbase indexes... Pat wouldn't like this but...
don't use them.
This is my IMHO and please don't ask me to go deeper into that
subject :(
About archiving - in T24, accounts are moved to a so-called "history"
file after they are closed. All open accounts are in one table.
Happy New Year to everyone!
VK
There are a few possibilities here:
1) A straight bug is causing queries that use indexes not to free the memory allocated to the indexes. Because the query is executed in process the memory will not be freed until the process ends. You could check this by profiling a program that executes the same query twice with an INPUT statement after each one. See how much memory it uses after the first query, then hit return and see if the memory usage essentially doubles (or significantly increases at least as other memory may be freed);
2) You are being fooled by the reporting in AIX, in that it will show the memory allocated to the process even though it has been correctly free()'d by the query. Basically the allocation routines mark memory as being free for reuse but don't reduce the allocations to the process; these are reclaimed by the system if it gets low on memory. A good operating system will only have memory 'free' for essential kernel activities; everything else it will just use for something as much as possible, which is what you want. However, you claim that AIX is killing processes, which means it cannot reclaim enough memory, either because it is fragmented or really is not given back. With the information available I cannot tell if this is because you are just reaching the limits of the system as configured when you run so many SSELECT at once (again, are you SURE you need to use SSELECT? It is rare that you do, other than for listings), or because memory is retained by the SSELECT with indexes and is just not getting back to the system until the process ends. This is something that you will need TEMENOS help with I think;
3) As a variant of 2, the allocation patterns used in the query engine are not efficient and the use of indexes exacerbates this. This would need some analysis by TEMENOS to tune it properly.
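The check described in 1) (run the same query twice and see whether memory roughly doubles) can be sketched in miniature like this. It is a Python analogue only, not jBC: leaky_query and clean_query are stand-ins I made up, and tracemalloc measures the allocator inside the process, so it sidesteps the OS reporting problem described in 2). On the real system you would watch the process size from outside while the program waits at each INPUT.

```python
import tracemalloc

_retained = []  # stands in for memory a buggy query never releases


def leaky_query():
    """Hypothetical query that keeps its working memory after returning."""
    _retained.append(bytearray(1_000_000))


def clean_query():
    """Hypothetical query whose working memory is released on return."""
    scratch = bytearray(1_000_000)
    return len(scratch)


def memory_after_two_runs(query):
    """Run `query` twice; return traced bytes in use after each run."""
    tracemalloc.start()
    query()
    first, _peak = tracemalloc.get_traced_memory()
    query()
    second, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return first, second


if __name__ == "__main__":
    for query in (leaky_query, clean_query):
        first, second = memory_after_two_runs(query)
        print(query.__name__, "after run 1:", first, "after run 2:", second)
```

If the figure after the second run is about double the first, memory is being retained between queries; if it stays flat, the query is cleaning up after itself.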
If the application is indeed using lots of SELECT/SSELECT in general operation then it should really be reworked (though only TEMENOS can do this of course). Even though jBASE 4.1/5.x queries are pretty efficient, there is always a much better algorithmic way to do things than relying on the SELECT engine. Because you have so many partitions you are also likely causing way more disk IO than you should be. Generally you will want to distribute items according to access patterns and not just some arbitrary number, otherwise you select a record from partition 6 then partition 2 then partition 9 etc and the whole read-ahead system is defeated (and probably the cache) because of tons of random IO.
So you need to determine if this allocated memory really is allocated or just remains assigned to the process until the system really needs it. This is a very complicated area to be honest. There are probably kernel-level tunings that affect the behavior too, but that is more than I could advise you on via this forum. I think that you really need to get TEMENOS in to look at the system properly, and you need one of the guys who knows AIX well and has worked on the SELECT engine.
Jim
Can you tell us the total number of records in the file and the number
of variations for each of the indexed attributes?
> 433M 18.83 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 5 uatusr 229884 7 (6) 197 1 3291 99 1433 0 466M 19.98 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 12 uatusr 340238 7 (6) 193 1 3214 97 1404 0 519M 22.05 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 14 uatusr 274452 7 (6) 201 1 3391 101 1462 0 315M 13.91 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 19 uatusr 413718 7 (6) 197 1 3321 99 1433 0 422M 16.51 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 20 uatusr 110934 7 (6) 193 1 3218 97 1404 0 506M 22.17 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 21 uatusr 352482 7 (6) 197 1 3302 99 1433 0 365M 15.73 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 22 uatusr 364998 33 (19) 192 1 3232 97 1402 0 411M 0.00 3 SELECT FBNK.ACCOUNT WITH CURRENCY EQ
> 27 uatusr 938076 7 (6) 197 1 3279 99 1433 0 553M 24.51 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> 30 uatusr 139574 6 (5) 3 1 886 7 22 0 2.06M 0.32 1 E /opt/jbase5/bin/jsh -s jsh - (jsh.
> 35 uatusr 635036 7 (6) 193 1 3252 97 1404 0 314M 14.75 1 I /opt/jbase5/bin/jsh -s jsh - (Comm
> '/eoy/eoy/bnk.run/globuspatchlib:/eoy/eoy/bnk.run/lib:/eoy/eoy/bnk.run/globuslib:/eoy/eoy/bnk.run/fixlib'
> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/globuspatchlib', error 2
> WARNING: Cannot access Object path '/eoy/eoy/bnk.run/fixlib', error 2
> jBASE Compiler Run-time : '/opt/jbase5/config/system.properties'
> Program dir (JBCDEV_BIN) : '/eoy/eoy/bnk.run/bin'
> Subroutine dir (JBCDEV_LIB) : '/eoy/eoy/bnk.run/lib'
> Max open files : 65534
>
> Can anybody please explain what is triggering this and how to fix the issue?
>
> mw42.zip (100K)
Using an Index with a limited number of differing values will reduce
the SELECT time when Selecting a specific value
Say for example in the extreme case of ( only ) four different
'CURRENCIES' within the items in your file, eg :
1 million items with a 'CURRENCY' of 'GBP'
1 million items with a 'CURRENCY' of 'USD'
1 million items with a 'CURRENCY' of 'YEN'
and
1 thousand items with a 'CURRENCY' of 'FF'
SELECTing the million item ids for items with a 'CURRENCY' of 'USD'
will be instantaneous via the index, compared with processing the
'3,001,000' items looking for a 'CURRENCY' of 'USD'
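To put numbers on that comparison, here is a small sketch. It is Python rather than jQL, with the counts scaled down by 1000 (so 3,001 items instead of 3,001,000), and a dictionary standing in for the index; none of this reflects how jBASE stores its indexes internally. It only counts how many items each approach has to look at.

```python
from collections import defaultdict


def build_file(counts):
    """Create (item_id, currency) pairs; counts maps currency -> item count."""
    items, next_id = [], 1
    for currency, count in counts.items():
        for _ in range(count):
            items.append((next_id, currency))
            next_id += 1
    return items


def select_by_scan(items, wanted):
    """Examine every item, as a SELECT without an index must."""
    ids, examined = [], 0
    for item_id, currency in items:
        examined += 1
        if currency == wanted:
            ids.append(item_id)
    return ids, examined


def build_index(items):
    """Map each distinct currency to the ids of the items holding it."""
    index = defaultdict(list)
    for item_id, currency in items:
        index[currency].append(item_id)
    return index


def select_by_index(index, wanted):
    """A single lookup on the indexed value returns the id list directly."""
    return index.get(wanted, []), 1


# Pat's figures divided by 1000.
items = build_file({"GBP": 1000, "USD": 1000, "YEN": 1000, "FF": 1})
index = build_index(items)

scan_ids, scan_examined = select_by_scan(items, "USD")
index_ids, index_examined = select_by_index(index, "USD")
print(len(scan_ids), "ids after examining", scan_examined, "items (scan)")
print(len(index_ids), "ids after examining", index_examined, "node (index)")
```

Both routes return the same ids; the difference is entirely in how much has to be read to get them.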
The downside is creating / updating an index where the majority of
values for the indexed fields are identical.
However, in jBASE 4.1 and above, the creation / update of an index on
such non-unique values is nowhere near as painful / time-consuming as
it was in jBASE 4.0 and prior releases.
It is also improved by creating the index on such fields using the
'-w' option.
Pat.
Thought I'd chime in on this subject, since it is a pet peeve of mine.
Experience has taught me that any file that is updated by multiple
concurrent processes is a poor candidate for an index, since the
update to the index can only occur sequentially. That means that if
you have 10 processes all performing updates to FBNK.ACCOUNT at the
same time, then they have to queue waiting to update FBNK.ACCOUNT]I.
Also, the number of nodes that are being indexed can be a factor. I
see, for instance, that ACCOUNT.OFFICER is indexed in this file. I
assume that the number of values that ACCOUNT.OFFICER can take is
relatively small. If that assumption is correct, then the time it
takes to update that index grows as the number of records indexed
under that node grows.
And lastly, the number of indexes on a file also impacts the amount of
time required to perform updates. I do not know what an "optimal"
number is, but I suspect that 5 is too high, although I could be
wrong.
So, if you have a file with a large number of records that is being
updated by multiple concurrent processes, then I would suggest
removing those indexes entirely. If the file is not excessively large,
then extracting your dataset via a jBC program would be much more
efficient than trying to use jQL to do the legwork for you. If the
file is a large one, then there are other alternatives, which I won't
go into here, as that discussion could get lengthy, but I would be
happy to share ideas on the subject.
If you have indexes with two values say 'Y' and 'N' and the number of 'Y's is relatively small and these are the only ones you will select, then only index the records with a 'Y' as you won't use the 'N' anyway. Look for other such optimizations.
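As a rough picture of indexing only the 'Y' records, here is a sketch in Python (not jBC). The PartialIndex class, the FLAG field and the record ids are all invented for the example; how you would express this in a real jBASE index definition is a separate question.

```python
class PartialIndex:
    """Tracks only the ids of records whose field equals the wanted value."""

    def __init__(self, field, wanted):
        self.field = field
        self.wanted = wanted
        self.ids = set()

    def on_write(self, record_id, record):
        """Call on every write: add matching records, drop everything else."""
        if record.get(self.field) == self.wanted:
            self.ids.add(record_id)
        else:
            self.ids.discard(record_id)


# Hypothetical data: most records carry 'N', a few carry 'Y'.
index = PartialIndex("FLAG", "Y")
for record_id, flag in [("A1", "N"), ("A2", "Y"), ("A3", "N"), ("A4", "Y"), ("A5", "N")]:
    index.on_write(record_id, {"FLAG": flag})

print(sorted(index.ids))   # only the 'Y' records are held

# A record that flips from 'Y' to 'N' simply leaves the index.
index.on_write("A2", {"FLAG": "N"})
print(sorted(index.ids))
```

The 'N' records never cost anything in the index, which is where the saving comes from when 'Y' is the rare case.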
It was the case in 4.0 and before that updating an index with a many-to-one relationship was awkward to do. However all you need to do is include the item ID in the index such that you have a unique node for each data point. As Pat points out, things have moved on with indexes since 4.0, so take his advice here on creating the indexes etc.
I cannot emphasize too much, though, that before looking for optimizations at this level, you should look to your application. A big oversight is processing records in an order that you think is a good idea but that actually means nothing at all to the results of processing. For instance, if selecting on currency, do you actually need to process all the records for USD, then all the GBP and so on? If you are processing all the records anyway, then just keep accumulators for data points. Perhaps you need to perform one operation in sorted order but not anything else. Sometimes doing multiple operations at once will be a good idea; sometimes you might repeat the traversal in natural database order and find that it is faster that way. When you have your algorithms together, THEN look to tricks and tuning to improve performance.
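The accumulator point can be shown with a few lines. This is a Python illustration rather than jBC, and the BALANCE field and the sample figures are invented; the only claim is that per-currency totals come out the same whether or not the records are sorted first.

```python
from collections import defaultdict


def totals_by_currency(records):
    """One pass in whatever order the records arrive; one accumulator each."""
    totals = defaultdict(int)
    for record in records:
        totals[record["CURRENCY"]] += record["BALANCE"]
    return dict(totals)


# Hypothetical records in "natural" (unsorted) order, balances in minor units.
records = [
    {"CURRENCY": "USD", "BALANCE": 10000},
    {"CURRENCY": "GBP", "BALANCE": 5000},
    {"CURRENCY": "USD", "BALANCE": 2500},
    {"CURRENCY": "YEN", "BALANCE": 100000},
]

natural = totals_by_currency(records)
sorted_first = totals_by_currency(sorted(records, key=lambda r: r["CURRENCY"]))

print(natural)
print(natural == sorted_first)   # the sort bought nothing for this result
```

If the result does not depend on the order, the sort (and any index built only to support it) is pure overhead.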
(That last is of course not overly relevant to the original thread about whether the memory is being leaked or not, but generally relevant to getting things to work well).
Jim
> -----Original Message-----
> From: jb...@googlegroups.com [mailto:jb...@googlegroups.com] On Behalf
> Of Bruce Willmore
> Sent: Monday, January 04, 2010 11:21 AM
> To: jBASE
> Subject: [SPAM] Re: Performance issue on files with INDEX
>
On Jan 5, 4:29 pm, pat <pat...@gmail.com> wrote:
> Au Contraire
>
> Using an Index with a limited number of differing values will reduce
> the SELECT time when Selecting a specific value
> Say for example in the extreme case of ( only ) four different
> 'CURRENCIES' within the items in your file, eg :
>
> 1 million items with a 'CURRENCY' of 'GBP'
> 1 million items with a 'CURRENCY' of 'USD'
> 1 million items with a 'CURRENCY' of 'YEN'
> and
> 1 thousand items with a 'CURRENCY' of 'FF'
>
> SELECTing the million item ids for items with a 'CURRENCY' of 'USD'
> will be instantaneous via the index, compared with processing the
> '3,001,000' items looking for a 'CURRENCY' of 'USD'
>
I await with interest the actual figures from the OP. I wonder whether
your suggested figures for an "extreme case" reflect real life.
Perhaps the figures for one currency will almost always represent way
more than 30% of the data.
> -----Original Message-----
> From: jb...@googlegroups.com [mailto:jb...@googlegroups.com] On Behalf
> Of Mike Preece
> Sent: Tuesday, January 05, 2010 2:16 PM
> To: jBASE
> Subject: [SPAM] Re: Performance issue on files with INDEX
>
> I await with interest the actual figures from the OP. I wonder whether
> your suggested figures for an "extreme case" reflects real life.
> Perhaps the figures for one currency will almost always represent way
> more than 30% of data.
>
Yes, but if you append the item ID to the indexed data and modify the query to account for it, you won't have any problems with building it anyway. Though it is really better to program to the indexes in jBC (OPENINDEX etc), where you then have complete control over how you traverse. Using SELECT is really just a 'cop-out' from doing it programmatically (I think you already said that too).
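Here is a sketch of what appending the item ID to the indexed data looks like, and how the lookup has to change to account for it. It is Python, not jBC, and it does not use OPENINDEX or any jBASE call: the UniqueKeyIndex class, the "*" separator and the ids are all made up, and it assumes the separator never appears in an indexed value.

```python
import bisect

SEP = "*"  # hypothetical separator between indexed value and item id


class UniqueKeyIndex:
    """Sorted list of 'value*item_id' keys, so every node is unique."""

    def __init__(self):
        self.keys = []

    def add(self, value, item_id):
        """Insert the composite key, keeping the list sorted."""
        bisect.insort(self.keys, f"{value}{SEP}{item_id}")

    def ids_for(self, value):
        """Range lookup: every key starting with 'value*' belongs to value."""
        prefix = value + SEP
        position = bisect.bisect_left(self.keys, prefix)
        ids = []
        while position < len(self.keys) and self.keys[position].startswith(prefix):
            ids.append(self.keys[position][len(prefix):])
            position += 1
        return ids


index = UniqueKeyIndex()
for currency, item_id in [("USD", "1001"), ("GBP", "1002"), ("USD", "1003")]:
    index.add(currency, item_id)

print(index.keys)            # no two nodes are the same
print(index.ids_for("USD"))  # the query becomes a prefix/range lookup
```

An exact match on 'USD' no longer finds anything, so the query has to become a "starts with" or range lookup on the value part.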
Jim