Output format of nucleotide simulation

Zuxi Cui

Jun 13, 2022, 2:11:45 PM
to slim-discuss
Hi Ben,

I have a suggestion for a future SLiM update. I've been using SLiM to simulate genome-wide data for a while, and I've found that the ".vcf" output format is not an efficient one. The random-mating simulation itself does not take long, but writing the data as a VCF takes a lot of time. For example, the mating step took less than an hour, but writing 13,000 samples for a single chromosome (chr15) took more than 300 hours, because the VCF is not compressed at all (2376 GB). The PLINK2 format (binary pedigree) is much more efficient than VCF for those of us working with whole-genome data: after conversion, the PLINK2 files (pgen+pvar+psam) for the same dataset are less than 40 GB. Another popular genome-wide simulator, Hapgen2, also uses a more efficient file format (haps+legend). That haplotype format is less efficient than PLINK2, since it was developed around 2011, but it is still ~5x more efficient than VCF.
Could you consider adding other output formats for genome-wide data in the future? It would be very helpful and time-saving for us users.

Thanks,
Terry

Peter Ralph

Jun 13, 2022, 4:08:38 PM
to Zuxi Cui, slim-discuss
Hi, Zuxi! Good suggestions. You might look at the tree sequence output
option, which is very efficient and can easily be converted to genotype
matrices, etc.

Zuxi Cui

Jun 13, 2022, 4:16:35 PM
to Peter Ralph, slim-discuss
Thanks for the tip. Is there a chapter in the manual that I can follow? I'd prefer some sample code to start with.

Peter Ralph

Jun 13, 2022, 4:56:08 PM
to Zuxi Cui, slim-discuss
Well, Chapter 17 of the SLiM manual talks about tree sequence
recording in SLiM; then you might look at this tutorial for how to
efficiently analyze the results:
https://tskit.dev/tutorials/getting_started.html#saving-and-exporting-data

If you don't find it and have a suggestion for what would be a good
simple example of the sort of thing you want to do, let me know!
* peter
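
As a starting point for the kind of sample code asked about above, here is a minimal sketch in Python. It assumes the simulation was run with tree-sequence recording enabled (initializeTreeSeq() in the SLiM script) and saved with treeSeqOutput() to a file; the name "chr15.trees" and the output file names are hypothetical. It streams variants one site at a time with tskit and writes them in a haps+legend style layout, so the full genotype matrix is never held in memory. The legend header ("id position a0 a1") and the "site_<position>" IDs are assumptions about the layout, so check them against what your downstream tools expect. Also note that alleles in a SLiM-produced tree sequence are not necessarily nucleotide letters; the pyslim documentation covers converting them.

```python
import gzip


def write_haps_legend(variants, haps_out, legend_out):
    """Write biallelic sites in a haps+legend style layout.

    variants   -- iterable of (position, alleles, genotypes) tuples, where
                  genotypes holds one allele index per haplotype
    haps_out   -- text file object: one line per site, one 0/1 per haplotype
    legend_out -- text file object: one line per site with id, position, a0, a1
    Returns the number of sites written.
    """
    legend_out.write("id position a0 a1\n")  # assumed header layout
    n_written = 0
    for position, alleles, genotypes in variants:
        # Skip sites that are not strictly biallelic (or have missing data).
        if len(alleles) != 2 or None in alleles:
            continue
        legend_out.write(f"site_{position} {position} {alleles[0]} {alleles[1]}\n")
        haps_out.write(" ".join(str(int(g)) for g in genotypes) + "\n")
        n_written += 1
    return n_written


def variants_from_trees(path):
    """Yield (position, alleles, genotypes) for each site of a .trees file."""
    import tskit  # third-party; imported here so the writer above has no dependencies

    ts = tskit.load(path)
    for var in ts.variants():
        # Positions are written as stored in the tree sequence.
        yield int(var.site.position), var.alleles, var.genotypes


def export_trees(trees_path, haps_path, legend_path):
    """Convert a .trees file to gzip-compressed haps and legend files."""
    with gzip.open(haps_path, "wt") as haps, gzip.open(legend_path, "wt") as legend:
        return write_haps_legend(variants_from_trees(trees_path), haps, legend)


# Example call (hypothetical file names):
# export_trees("chr15.trees", "chr15.haps.gz", "chr15.legend.gz")
```

If a plain genotype matrix is all you need and it fits in memory, tskit's ts.genotype_matrix() returns it directly as a NumPy array, and ts.write_vcf() can write to a gzip-opened file handle if you still want a (compressed) VCF.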