Splitting data across multiple machines?

Tami Hildahl

Dec 1, 2022, 2:51:29 PM
to dDocent User Help Forum
Hello -

I have ~1400 samples spanning 60 populations. My best estimate is 32 days of run time if I put everything on one high-memory node running 20 threads. Is it unwise to split the data and run dDocent on several machines, then concatenate the VCF files later for filtering? I understand that the populations should be run together for FreeBayes, so if I were to keep each population together and split the 1400 samples by population across 4 machines, would there be an impact on the results?

Please advise.