Scaling concurrent PHANTOM processes across multiple cores using String Database

Robert Herbin

Feb 28, 2026, 5:33:04 PM
to Pick and MultiValue Databases
I've got a CPU-intensive batch job that takes longer to complete than I'd like. In an effort to speed it up, I tested splitting the work in two: half the work is done in one phantom process, the other half in a second phantom process running concurrently. So far the results have been encouraging; the combined run finished in almost exactly half the wall-clock time of a single process. Each process writes results to unique keys in a shared file, so there are no record-locking conflicts (a single combined run produces in excess of 100,000 results).
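To make the no-conflict partitioning concrete, here is a rough sketch of the idea in Python rather than Pick BASIC (all names here are made up for illustration, not taken from the actual job): each phantom gets a contiguous, disjoint slice of the item-ID list, so no two writers can ever touch the same key.

```python
# Hypothetical sketch: split a list of item IDs into disjoint slices,
# one per phantom, so concurrent writers never collide on a key.
def partition(ids, n_workers):
    """Split ids into n_workers contiguous, disjoint slices."""
    size = -(-len(ids) // n_workers)  # ceiling division
    return [ids[i * size:(i + 1) * size] for i in range(n_workers)]

# Illustrative ID scheme; the real keys would come from the application.
ids = [f"RESULT.{i}" for i in range(100000)]
slices = partition(ids, 2)

# Every ID is covered exactly once, and the two slices never overlap.
assert sum(len(s) for s in slices) == len(ids)
assert set(slices[0]).isdisjoint(set(slices[1]))
```

Because the slices are disjoint by construction, record locking never comes into play between the workers; only the shared file's internal structures are contended.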

On a 2-core system I'm seeing close to a 2x speedup with 2 processes, and I theorize that this improvement would continue to scale if I had more cores. To further validate this, I ran a second test with three concurrent PHANTOM processes, each given the same size chunk of work, on the same 2-core system. Wall-clock time increased by about 67% compared to the 2-process run, which is close to the theoretical expectation of 50% slower when three such chunks share two cores (1.5 chunks' worth of CPU time per core instead of 1). This suggests the speedup is genuinely core-bound rather than due to some other factor, and that the optimal strategy is one process per core.
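A back-of-the-envelope way to state the scaling expectation (a toy model, nothing PHANTOM-specific, and it ignores scheduling and I/O overhead): for a perfectly parallel, CPU-bound workload, the best-case speedup of P processes on C cores over a single process is min(P, C), so phantoms beyond the core count buy nothing.

```python
# Toy model: best-case speedup of P CPU-bound processes on C cores,
# assuming the work divides evenly and there is no coordination overhead.
def best_case_speedup(procs: int, cores: int) -> float:
    # With procs <= cores, every process gets its own core.
    # With procs > cores, the cores are the bottleneck.
    return float(min(procs, cores))

print(best_case_speedup(2, 2))  # 2.0 -- matches the ~2x observed
print(best_case_speedup(3, 2))  # 2.0 -- a third phantom on 2 cores adds nothing
print(best_case_speedup(4, 8))  # 4.0 -- spare cores go unused
```

Real systems fall short of this bound once memory bandwidth, disk I/O, or group-level contention in the shared file enters the picture, which is exactly what the questions below are probing.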

A few questions:

Does this kind of parallelism continue to scale as you add more cores, or are there hidden bottlenecks I should know about?
Will concurrent writes to a shared file become a problem at higher process counts, even with no locking conflicts?
Any practical limits on the number of concurrent PHANTOMs I should be aware of?

Has anyone done something similar? I would love to hear from people with real-world experience before I spend (more) money on hardware.

Scott Ballinger

Feb 28, 2026, 7:29:16 PM
to Pick and MultiValue Databases
I used to manage a large 200-user D3 system. About 15 years ago we replaced the SCSI array drives with RAM disks (both RAID-10, about 1.6TB total data).

The file-save took all night, so we broke the main system into 6 accounts with Q-pointers to the files in the DATA2-6 accounts (some files left in the main account, Q-pointers in main to the DATA2 files, Q-pointers in main to the DATA3 files, etc., balanced so that all the accounts were about the same size). We then ran 1 file-save and 5 account-saves simultaneously, with the result that everything finished in 2 hours.
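The fan-out-and-wait pattern described above can be sketched in Python (the account names follow the post; the save function is a placeholder for however the real file-save/account-save commands get invoked, e.g. via a subprocess, which varies by platform):

```python
# Illustrative sketch: kick off one file-save and five account-saves
# concurrently, then wait for all six before declaring the backup done.
from concurrent.futures import ThreadPoolExecutor

def run_save(account: str) -> str:
    # Placeholder for launching the real D3 save command for this account;
    # here it just reports what it would have done.
    return f"saved {account}"

accounts = ["MAIN", "DATA2", "DATA3", "DATA4", "DATA5", "DATA6"]
with ThreadPoolExecutor(max_workers=len(accounts)) as pool:
    # pool.map preserves input order, so results line up with accounts.
    results = list(pool.map(run_save, accounts))
print(results)
```

The key point is the implicit join at the end of the `with` block: nothing reports completion until every save has finished, which is what lets six 20-minute saves replace one all-night run.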

At the time we were working with Doug Dumitru at EasyCo, and it was his claim that once you moved from rotating storage to RAM disks you were limited by CPU bandwidth, not the disk channel. That's why we landed on 6 simultaneous saves across 6 accounts: we were running a 6-core Xeon system at the time. We later changed to dual Xeon CPUs (+ hyperthreading) but didn't bother to increase the number of data accounts and account-saves, as it was plenty fast enough.
/Scott Ballinger

