
Indexing with MT

CV

Jul 15, 2021, 7:56:55 PM

Hi everyone

I would like to know how a simple program that builds indexes for database files can be compiled and run with MT (multithreading), so that a computer with multiple cores can run at least two or three indexing operations in parallel over different databases (I mean DBF+FPT tables with CDX indexes, as in DBFCDX).

That is, something like this:
---
wDataBases := {'DBF1', 'DBF2', 'DBF3'}
wIndexes := {'CDX1', 'CDX2', 'CDX3' }
wIndexKey := {'FLD1', 'FLD2', 'FLD3'}

for i := 1 to len(wDataBases)
select 0
use (wDataBases[i]) new exclusive
index on (wIndexKey[i]) to (wIndexes[i])
next
---
This builds each index one after the other.

Now my wish list: building those same indexes in parallel, starting an individual thread for each database.

I don't know if this is possible, but if it is, it is worth a try.
Has anyone done this?
A bit of code showing the procedure would be welcome (along with the libraries needed).

Best regards,
Claudio Voskian

dlzc

Jul 16, 2021, 9:43:21 AM

Dear CV:

On Thursday, July 15, 2021 at 4:56:55 PM UTC-7, CV wrote:
...
> Then, my wish-list: building that same indexes in parallel, starting
> an individual thread for each database.
>
> I don't know if this is possible, but if it is, it is worth a try.
> Does someone made this?

Do you want one thread per index per database (which will be slower, since the disk will thrash more), or just one thread per database (easier)?

Either approach reads each database linearly (a potential time saving), but if the passes are not started at the "same time", the requested record may no longer be in memory. The sorting and writing of the index will then really start tearing things up if you have a drive with spinning platters.

David A. Smith

Ella Stern

Jul 16, 2021, 11:12:44 AM

Table indexing means using as much RAM as possible to build a tree structure, and then saving the tree to the hard disk.

When the index is too big to fit completely into the local RAM, the algorithm performs more cycles, and users get the impression that "the database has slowed down" (this applies both to indexing and to read/write operations).

IMHO indexing more than one table at once would force the programs to compete for the available RAM, and each would slow the others down.

CV

Jul 16, 2021, 5:37:03 PM

Ella, David

Thank you for your answers.

The computer acting as the server runs Windows Server and is a really fast one (12 or 16 cores); the disk is a solid-state drive and it has 32 GB of memory, so there are almost no hardware limits.
Bear in mind that the xHarbour application is a 32-bit one, which has used no more than 50 MB of RAM in the worst case so far.

The indexing process will run at night. The application is used 24 hours a day (with remote users at home, async data downloads, etc.), and I don't want to run out of time when running this automated process; that is the reason for my message.

It should be one thread per database. Each database is opened "exclusive" for indexing (I'm sure no other process will interfere) and its indexes are built one after the other, but with several such processes running in parallel, each recreating a set of indexes (many orders, all of them structural, i.e. a CDX with the same name as the DBF).

Regards,
©


Ella Stern

Jul 17, 2021, 4:40:42 AM

Suggestion: explain to the admin that
- the indexing process needs a dedicated physical machine (NOT a VM) with a minimal Windows OS image (32-bit version), a processor with large L1/L2 caches and a high clock speed, 3 GB RAM, no Internet access, no end-user access, and your executable, which does ONLY the indexing and nothing else
- before starting your executable, the necessary .DBF tables are copied onto that machine
- after the indexing completes successfully, the tables and indexes are picked up from that machine

Database server engines like Oracle and MS SQL run on dedicated servers with no connection to end users, and each version is tied to specific OS versions and hardware, because they have some multi-threading features that require advanced RAM and I/O management.

Python and NodeJS receive user requests on different threads, but all those threads use the RAM via time-sharing (one at a time).

As I've mentioned, in case of indexing the critical resource is the RAM.

HTH

CV

Jul 17, 2021, 9:50:57 AM

Ella

Thank you for your explanation.

There are no VMs on that server, plenty of RAM, fast enough disk access... almost no hardware limits.
The only limit is the time frame available to rebuild the indexes when that is needed.

How about a piece of code to do what I need to implement (or test)?
Is xHarbour able to do that without errors?

Regards
Claudio Voskian

Daniele

Jul 17, 2021, 4:29:53 PM

Try it yourself by adapting this pseudocode:

#ifdef __XHARBOUR__
   // xHarbour's StartThread() takes no attribute flag, so drop the first argument
   #xtranslate hb_threadStart( <nAttr>, <x,...> ) => StartThread( <x> )
#else
   // assumption: hbthread.ch is only needed (and only guaranteed to exist) on Harbour
   #include "hbthread.ch"
#endif

procedure main
   local start_time, elap_time
   ...
   // single-thread test
   start_time := seconds()
   ? "Start single thread: " + time()
   index1()
   index2()
   elap_time := seconds()
   ? "End: " + time() + " seconds: " + ltrim( str( elap_time - start_time ) )

   ? "Start multithread: " + time()

   ? "Start thread 1: " + time()
   hb_threadStart( HB_THREAD_INHERIT_PUBLIC, @index1() )
   ? "Start thread 2: " + time()
   hb_threadStart( HB_THREAD_INHERIT_PUBLIC, @index2() )

   // keep main alive until both threads have printed their end time
   wait ""
return

function index1()
   local start_time := seconds(), elap_time
   ferase( ... )      // delete the old index file first
   use ... exclusive
   index on ...
   use
   elap_time := seconds()
   ? "End thread 1: " + time() + " seconds: " + ltrim( str( elap_time - start_time ) )
return nil

function index2()
   ... the same
return nil

Let us know
Dan

CV

Jul 18, 2021, 9:39:27 PM

Hi everyone

After some testing and adapting the code to xHarbour, all I get is an error in a Windows message box:
"hb_xrealloc can't reallocate memory."

With this text written in the console:
Unrecoverable error 9009: Unrecoverable error 9011: hb_xrealloc can't reallocate memoryhb_xfree called with a NULL pointer Called from INDEX1(49)Called from INDEX2(0)
Called from ORDCREATE(0)Called from ORDCREATE(0)
Called from INDEX1(49)Called from INDEX2(76)

The code:
*
request dbfcdx
proc main()
? "Start thread 1:", seconds()
StartThread(@index1())
return
*
function index1()
local start_time:=seconds(),elap_time

ferase ('his_jude.cdx')
ferase ('his_jude1.cdx')

use his_jude exclusive new via 'dbfcdx'
index on STR(NRO_DEUDOR)+DTOS(FECHA) tag 'GJE_DEUDOR' to 'HIS_JUDE'
index on STR(NRO_DEUDOR)+EST_CARTA+DTOS(FECHA)+LEFT(TELEFONO,13) tag 'GJE_DEUETA' to 'HIS_JUDE' additive
index on FECHA tag 'GJE_FECHA' to 'HIS_JUDE1'
index on ARCHIVOCAM+str(NRO_DEUDOR)+left(TELEFONO,13) tag 'GJE_ARCHIV' to 'HIS_JUDE1' additive
index on ID_BML tag 'GJE_IDBML' to 'HIS_JUDE1' additive unique
use
elap_time := seconds()
? "End thread 1:"+time()+" seconds:"+str(elap_time-start_time)
return nil

And no index is created.

If it is this difficult to build a simple index, I can't use MT in any other process.
I don't want the headache, so my simple solution is to start a couple of external programs from inside the application to pack and build the indexes using shellexecute().

Regards
Claudio Voskian
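
The external-program approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it targets Harbour 3.x and its hb_processOpen() / hb_processValue() functions rather than the shellexecute() call mentioned in the post, it has not been checked against xHarbour, and "reindex.exe" is a hypothetical helper that packs and reindexes the table named on its command line.

```harbour
// Sketch only: one external helper process per table, all running at once.
// Assumes Harbour 3.x; "reindex.exe" and the table names are placeholders.
PROCEDURE Main()

   LOCAL aTables := { "DBF1", "DBF2", "DBF3" }
   LOCAL aProcs  := {}
   LOCAL cTable, hProc, aItem

   FOR EACH cTable IN aTables
      // Start the helper without waiting for it to finish
      hProc := hb_processOpen( "reindex.exe " + cTable )
      IF hProc == -1      // -1 (F_ERROR) means the helper could not be started
         ? "Could not start helper for", cTable
      ELSE
         AAdd( aProcs, { cTable, hProc } )
      ENDIF
   NEXT

   // hb_processValue() waits for the child by default and returns its exit code
   FOR EACH aItem IN aProcs
      ? aItem[ 1 ], "exit code:", hb_processValue( aItem[ 2 ] )
   NEXT

   RETURN
```

Because each helper is a separate process, a crash in one of them cannot corrupt the memory of the main application, at the cost of having to pass results back through exit codes or files.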


Daniele

Jul 19, 2021, 8:46:27 AM

On 19/07/2021 03:39, CV wrote:

> StartThread(@index1())
> return

So you start the thread and then exit. How can it work?

StartThread(@index1())
wait "Press a key"
return

Anyway, code like this is just for testing, eh.
Dan
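
For anything beyond a quick test, the main routine can wait for the workers explicitly instead of relying on a key press. A minimal sketch, assuming Harbour 3.x built with MT support (for example `hbmk2 -mt`) and its hb_threadStart() / hb_threadJoin() functions; on xHarbour the corresponding calls would presumably be StartThread() / JoinThread(), which this sketch does not cover. BuildIndex() is a hypothetical helper, and the table, index and key names are the placeholders from the first post.

```harbour
// Sketch only: one thread per table, each thread using its own work area.
// Assumes Harbour 3.x with MT enabled; all names below are placeholders.
REQUEST DBFCDX

PROCEDURE Main()

   LOCAL aTables  := { "DBF1", "DBF2", "DBF3" }
   LOCAL aIndexes := { "CDX1", "CDX2", "CDX3" }
   LOCAL aKeys    := { "FLD1", "FLD2", "FLD3" }
   LOCAL aThreads := {}
   LOCAL i

   FOR i := 1 TO Len( aTables )
      // Each worker receives its own table, index file and key expression
      AAdd( aThreads, hb_threadStart( @BuildIndex(), aTables[ i ], aIndexes[ i ], aKeys[ i ] ) )
   NEXT

   // Block here until every worker has finished, so Main() cannot exit early
   AEval( aThreads, {| pThread | hb_threadJoin( pThread ) } )

   RETURN

FUNCTION BuildIndex( cTable, cIndex, cKey )

   LOCAL nStart := Seconds()

   USE ( cTable ) NEW EXCLUSIVE VIA "DBFCDX"
   // Pass the key both as text and as a code block compiled from that text
   dbCreateIndex( cIndex, cKey, &( "{||" + cKey + "}" ) )
   dbCloseArea()

   ? cTable, "indexed in", Seconds() - nStart, "seconds"

   RETURN NIL
```

Passing the table, index and key as thread parameters keeps the workers independent of each other, so no PUBLIC variables need to be inherited.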

CV

Jul 19, 2021, 12:37:47 PM

Dan

That was a copy and paste with missing lines. I do have the wait "" before the end of the main routine, and there are 2 indexing functions for different databases (I copied just one for the example; the other is identical).

When I start the 2 threads, the error message occurs *sometimes*.
Other times it does nothing at all, and I have to close the application with the [X] control at the upper right.

Regards
Claudio Voskian

CV

Jul 21, 2021, 9:05:14 AM

Hi everyone, especially Dan

I don't know why, but the very same program that previously DIDN'T work now works properly.
I didn't change a line; I tried it again yesterday and ... it WORKS.
A mystery.

Thank you for the code!

Regards
Claudio Voskian

Daniele

Jul 22, 2021, 3:57:21 PM

On 21/07/2021 15:05, CV wrote:

>>
>> When I start the 2 threads *sometimes* the error message occurs.
>> Other times just does nothing at all, I have to close the application with [X] upper right control.
>>
>> Regards
>> Claudio Voskian
>
> Hi everyone, Dan specially
>
> I don't know why, but the very same program that previously DOESN'T work, now works properly.
> I didn't change a line, tried to test it yesterday and ... WORKS.
> A mistery.

Well, I did not believe the two threads indexing the same file would
have succeeded. I was thinking of 2 different files!
I learned something. :-)

>
> Thank you for the code!
>
> Regards
> Claudio Voskian
>
You are welcome.
Dan

CV

Jul 22, 2021, 6:40:54 PM

Dan
I ran two different threads on TWO different files and indexes.
Running the same process over the same file at the same time would be nonsense.

Regards
©