Thanks to all three of you for the helpful responses.
In my specific use case, selection is simple — I process every record in a file, so I pre-divide all record keys into N subsets using round-robin logic and QSELECT each subset in the corresponding PHANTOM process. This is working well.
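For anyone following along, the round-robin split described above is straightforward; here's a minimal sketch in Python (the real logic would of course live in Pick BASIC, and the key/variable names here are just for illustration):

```python
def round_robin_split(keys, n):
    """Divide record keys into n subsets by round-robin assignment,
    one subset per PHANTOM worker process."""
    subsets = [[] for _ in range(n)]
    for i, key in enumerate(keys):
        subsets[i % n].append(key)
    return subsets

# e.g. split 10 record keys across 4 workers
parts = round_robin_split(["REC%d" % i for i in range(10)], 4)
# each worker then QSELECTs its own subset
```

Round-robin gives each worker roughly equal record counts, which works well when per-record cost is fairly uniform; if record processing times vary a lot, a shared work queue balances better.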
The application reads small records, performs heavy CPU-bound processing on each record independently (millions of nested iterations per record), then writes a small result record to a separate file. Thousands of writes total, millions of iterations. As of current testing, record selection and disk I/O are not bottlenecks — the processing is overwhelmingly CPU-bound.
Bob — your point about the system-wide PHANTOM process limit is noted and appreciated. Since I'm planning to match the number of PHANTOMs to the number of cores (likely 4 on my next test system), I don't anticipate hitting that limit, but it's good to know it exists.
My next step is testing on a 4-core system to see if the 2x speedup I observed on 2 cores continues to scale linearly.
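One thing worth keeping in mind when interpreting that test: even a small serial fraction (record selection, the final writes, PHANTOM startup) caps the speedup per Amdahl's law. A quick back-of-envelope calculation, purely illustrative:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Predicted overall speedup when only parallel_fraction of the
    total work actually scales across the given number of cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# If 95% of the work parallelizes, 4 cores give ~3.48x, not 4x.
print(amdahl_speedup(0.95, 2))
print(amdahl_speedup(0.95, 4))
```

Comparing the measured 2-core speedup against these predictions gives a rough estimate of the workload's serial fraction before investing in more cores.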
Depending on the hardware architecture you’re running on, you will eventually run into limits related to NUMA (Non-Uniform Memory Access). This is where the underlying system has to reach another processor’s memory over a slower interconnect. I won’t go into NUMA here, but there is plenty to read about it, especially if you are not pushing max I/O on your storage yet seem to see performance degrade as you add cores, threads, or memory.
Bruce Decker
--
You received this message because you are subscribed to
the "Pick and MultiValue Databases" group.
To post, email to: mvd...@googlegroups.com
To unsubscribe, email to: mvdbms+un...@googlegroups.com
For more options, visit http://groups.google.com/group/mvdbms
I guess what Bob and I are both saying is that I/O can be CPU-heavy, so
it's still very much worth investigating: finding the data is always
likely to be the most expensive part.