I have a number of different answers to your question:
- The sfm mode is designed to reduce RAM use by splitting the input data into smaller chunks that are processed one by one (see the section Split and Merge tools in the README). It is possible to write your own split/merge scripts that split the data into groups and process those groups in parallel on a cluster or on the same node, similar to what you are suggesting (a rough sketch follows after this list). We used to provide such an example script using GNU parallel, but we have dropped it because of the maintenance overhead. You can still find it on GitHub under the older releases (see for example version 3.0 under scripts).
- You may actually want to run the elPrep filter mode: just replace “sfm” in your command with “filter” (see the example after this list). That mode does not split up the data at all. Whether this will run on your server depends on the size of your input BAM, but 2TB seems large.
- It is normally not necessary to tell elPrep a specific number of threads to use. The Go runtime (our implementation language) normally does an optimal job of allocating and managing runtime threads on its own (see the note after this list).
- elPrep is best used to execute pipelines that consist of multiple steps. The example command only uses the haplotype caller, but if your pipeline consists of multiple steps, it is best to combine them in a single command-line invocation of elPrep (see the combined example after this list). elPrep internally merges and parallelizes the execution of multiple pipeline steps.
- There are multiple long-running phases in the haplotype caller algorithm, and some may use fewer threads than others. I would also look at the overall CPU usage after an entire run rather than a momentary snapshot (see the last example below).
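
Here is a rough sketch of the split/process/merge approach, assuming the split and merge subcommands described in the README and GNU parallel; the flags, file names, and filter options are placeholders, so please check `elprep split --help`, `elprep filter --help`, and `elprep merge --help` for your version:

```sh
#!/bin/sh
# Sketch only: split the input into chunks, run elprep filter on each
# chunk in parallel, then merge the results back into a single BAM.
mkdir -p splits processed

# Split the input BAM into per-chunk BAM files (flags are illustrative).
elprep split input.bam splits/ --output-prefix sample --output-type bam

# Run one elprep filter job per chunk; -j controls how many jobs run at
# once on this node (or wrap the command for your cluster scheduler).
ls splits/*.bam | parallel -j 4 \
  'elprep filter {} processed/{/} --mark-duplicates --sorting-order coordinate'

# Merge the processed chunks into the final output.
elprep merge processed/ output.bam
```

The old GNU parallel script under scripts in the 3.0 release follows this general pattern and is probably a better starting point than this sketch.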
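
To illustrate the sfm versus filter point: only the subcommand changes. The options shown here are placeholders based on your haplotype caller example and may not match your actual command:

```sh
# sfm mode: splits the data into chunks to reduce RAM use.
elprep sfm input.bam output.bam --haplotypecaller result.vcf.gz --reference ref.elfasta

# filter mode: processes everything in memory without splitting.
elprep filter input.bam output.bam --haplotypecaller result.vcf.gz --reference ref.elfasta
```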
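
On the thread count: the default is simply to not pass any thread option. If you do need to cap it on a shared node, there is an option for that, but treat the spelling here as an assumption and verify it against `elprep filter --help`:

```sh
# Normally: no thread option at all; the Go runtime manages the threads.
elprep filter input.bam output.bam --mark-duplicates

# Optional cap (verify the flag name for your elPrep version):
elprep filter input.bam output.bam --mark-duplicates --nr-of-threads 32
```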
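
For the combined pipeline point, here is a sketch of a single invocation that covers sorting, duplicate marking, BQSR, and variant calling. The option names follow the elPrep reference documentation as far as I recall them, and the reference, known-sites, and output files are placeholders:

```sh
# One elPrep invocation for the whole pipeline; elPrep merges and
# parallelizes these steps internally instead of running separate passes.
elprep sfm input.bam output.bam \
  --sorting-order coordinate \
  --mark-duplicates --mark-optical-duplicates metrics.txt \
  --bqsr recal.table --known-sites dbsnp.elsites \
  --reference ref.elfasta \
  --haplotypecaller result.vcf.gz
```

This is usually both faster and simpler than chaining separate sorting, duplicate marking, BQSR, and calling runs.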
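
To check the overall CPU usage of an entire run rather than a momentary snapshot, GNU time on Linux is a simple option (the elprep options here are again placeholders):

```sh
# GNU time prints total user and system CPU time plus "Percent of CPU
# this job got" for the complete run.
/usr/bin/time -v elprep sfm input.bam output.bam \
  --haplotypecaller result.vcf.gz --reference ref.elfasta
```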
Thanks,
Charlotte