I have noticed some strange behavior of CREATE-INDEX command.
My database has around 500,000 records.
case 1 :
If a create an index on a field, that has one or two distinct values
the indexing process takes forever (i canceled it after 12 hours of
indexing).
case 2 :
if I create index on a field, that has several thousands of distinct
values - it takes 1.5 - 2 hours.
case 3 :
if I create index on a field that is unique (I mean all records has
different values in this field) - the whole process takes several
minutes.
So, here is obvious an algorithm problem.
I think that this building routine of jBase uses dynamic arrays and
once they grow up in size - it's more hard to work with them.
In case 1 the first several thousands of records was indexed quick, but
after that the speed reduces dramatically - if more records are
processed, the speed is more slower, and so on.
At the end each 1000 records were indexed in more 20 mins - which is
NON-SENS.
Does anybody know if in jBase 4.1 this problem is solved ?
Our version is 4.0.6.
Regards,
Alex.
But my DB has (or will have) several millions of records with a field
that has 50-100 distinct values (its a department code).
In this case indexing is also too slow. But in need this DB to be
indexed by this field to speed up the selection.
What should I do in this case ?
Alex.
as I remember in Oracle I didn't have such problem.
Regardless of the distinct values, the indexing process took same time.
It depeds just of the DB size.
May be I'm wrong, it's not me who was the admin of the Oracle DB, but
this is what I know.
alex.
If you are doing a binary search/sort against 1 million unique index
keys, you effectively cut your sample in half with each compare. At
most with this scenario, you have to do ~ 20 compares.
If you are doing a binary search/sort against 1 million "True/False"
index keys, for arguments sake, 50% true, 50% false, you can cut the
number of index keys to compare in half with your first compare, but
then, you'll have to test 500,000 times afterwards.
Regardless of the DBMS/OS, the second scenario is going to take a bit
longer...
804 Las Cimas Parkway, Suite 200
Austin, TX 78746
Voice: (512)278-5359
-----Original Message-----
From: jB...@googlegroups.com [mailto:jB...@googlegroups.com] On Behalf
Of foxy
Sent: Tuesday, June 14, 2005 10:05 AM
To: jB...@googlegroups.com
Subject: Re: CREATE-INDEX issue
Charlie,
alex.
Notice: This transmission is for the sole use of the intended recipient(s) and may contain information that is confidential and/or privileged. If you are not the intended recipient, please delete this transmission and any attachments and notify the sender by return email immediately. Any unauthorized review, use, disclosure or distribution is prohibited.
scenario can be another - all depends of the algorithm of building
index.
For example, in your case with 1,000,000 records with true/false values
: why should I compare all the 500,000 values ? If I found the first
equal to mine - that's it - I put it in the same leaf. (if binary
search (greater or less) is used).
As I sad - all depends on the algorithm.
Alex.
:) I think this is impossble to ask TEMENOS to do something.
Did You succeeded ? ;)
> Well, there is one other choice and that it is to intercept the SELECT
> statement and analyze it yourself, but this is non-trivial.
Any idea how to intercept SELECT statement ?
Alex.
Correct. I have added BY 0 to the end and there were no effect in
SELECTs.
Hope in 4.1 will be OK.
Alex.
:))) Ha-ha-ha. That one was good ;)
> > Any idea how to intercept SELECT statement ?
>
> Yes. Description below is for 3.x/4.0 only - you won't need to know what
> happens in 4.1.
>
[...]
Thank You very much for info.
Alex.