One tree seq for multiple chromosomes

Tatiana Bellagio

Aug 25, 2023, 2:28:47 PM

to slim-discuss

Hi!

I'm reaching out to make sure my workflow is correct. My goal is to simulate a complex trait involving loci on multiple chromosomes using real genetic data from a population VCF file.

To avoid burdening SLiM with tracking neutral mutations, I first infer the tree sequence from the VCF file and then remove all neutral mutations. After the forward simulation in SLiM, I reintegrate these neutral mutations into my tree sequence.

However, there's a point that I'm still unsure about: is it acceptable to infer one tree sequence from a VCF file with multiple chromosomes? I've noticed strategies in msprime that employ an msprime.RateMap to improve the accuracy of tree inference. Should I also incorporate an msprime.RateMap in my tsinfer inference? Are shared nodes between my chromosomes a problem? Since I'm more focused on using the tree sequence as a data structure to efficiently store and overlay mutations than on the precision of the trees themselves, I'm inclined to believe that I don't need to worry about this aspect. Nevertheless, to be safe, I wanted to ask.
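As a minimal sketch of what putting several chromosomes on one coordinate axis involves: a tree sequence has a single genome coordinate, so each chromosome's positions have to be shifted by the total length of the chromosomes before it. The Python below is illustrative only. The function names and the chromosome lengths are assumptions made up for this example and are not part of tsinfer or tskit. It also assumes the usual 1-based VCF positions and converts them to 0-based coordinates.

```python
from itertools import accumulate


def chromosome_offsets(chrom_lengths):
    """Return {chromosome name: start offset} for chromosomes laid end to end.

    chrom_lengths maps name -> length in bp, in the order to concatenate.
    """
    names = list(chrom_lengths)
    ends = list(accumulate(chrom_lengths[name] for name in names))
    starts = [0] + ends[:-1]
    return dict(zip(names, starts))


def to_concatenated(chrom, vcf_pos, chrom_lengths, offsets):
    """Convert a 1-based VCF position to a 0-based concatenated coordinate."""
    if not 1 <= vcf_pos <= chrom_lengths[chrom]:
        raise ValueError(f"{chrom}:{vcf_pos} is outside the chromosome")
    return offsets[chrom] + vcf_pos - 1


# Hypothetical chromosome lengths, for illustration only.
lengths = {"chr1": 1000, "chr2": 500, "chr3": 750}
offsets = chromosome_offsets(lengths)
print(offsets)
print(to_concatenated("chr2", 1, lengths, offsets))
```

With these made-up lengths, the first base of chr2 lands at coordinate 1000, directly after the last base of chr1 at 999.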

When importing the .trees file into SLiM, I ensure the chromosomes are treated separately by using the trick of setting the recombination rate to 0.5 between them. During the simulation, I avoid simplification so that all nodes are retained (to accurately overlay mutations afterward).
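The recombination-rate trick can be sketched as follows. This Python helper only computes the two vectors (rates and region ends) that would then be passed to SLiM's initializeRecombinationRate(rates, ends). The helper name, the within-chromosome rate, and the chromosome lengths are assumptions for illustration. It also assumes the convention that the rate given for a position applies to the breakpoint just before that base, so the 0.5 rate is placed on the first base of each chromosome after the first; this is worth checking against the SLiM manual.

```python
def concatenated_recombination_map(chrom_lengths, within_rate=1e-8):
    """Return (rates, ends) for chromosomes laid end to end on one axis.

    ends holds the last 0-based position of each rate region. A one-base
    region with rate 0.5 at the start of every chromosome after the first
    makes adjacent chromosomes assort independently.
    """
    if not chrom_lengths or any(length < 2 for length in chrom_lengths):
        raise ValueError("need at least one chromosome, each at least 2 bp long")
    rates, ends = [], []
    start = 0
    for index, length in enumerate(chrom_lengths):
        last = start + length - 1
        if index > 0:
            # Free recombination between the previous chromosome and this one.
            rates.append(0.5)
            ends.append(start)
        rates.append(within_rate)
        ends.append(last)
        start = last + 1
    return rates, ends


# Three hypothetical chromosomes of 1000 bp each.
rates, ends = concatenated_recombination_map([1000, 1000, 1000])
print(rates)
print(ends)
```

The final entry of ends is the last position of the concatenated genome, which is what SLiM expects for the last region.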

If my assumptions are correct, this workflow seems viable after all! (:
Thank you!

Tati

Ben Haller

Aug 26, 2023, 3:23:46 PM

to Tatiana Bellagio, slim-discuss

Hi Tati! Since this looks like a question about tsinfer, you might try to find a channel where you can ask those folks directly. I'm not sure any of them hang out on this list, and I've never used tsinfer myself. :-> I'm not sure where they like to receive questions; perhaps on GitHub?

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

--
SLiM forward genetic simulation: http://messerlab.org/slim/
---
You received this message because you are subscribed to the Google Groups "slim-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/slim-discuss/5c4fb06a-e598-422c-b863-6a4ceec6472en%40googlegroups.com.

Peter Ralph

Aug 27, 2023, 2:14:05 PM

to Tatiana Bellagio, Ben Haller, slim-discuss

I'd post a discussion here:

https://github.com/tskit-dev/tsinfer/discussions


Yan Wong

Aug 28, 2023, 4:23:41 PM

to Tatiana Bellagio, Peter Ralph, Ben Haller, slim-discuss

Hi Tati - I do keep an eye on this mailing list, but if you post the question where Peter suggests, I can answer it there. Briefly: (a) the RateMap is only used when allowing mismatch, and (b) we normally infer separate tree sequences for each chromosome, but I haven’t thought much about how to turn these into a single tree sequence.

Cheers

Yan


Tatiana Bellagio

Aug 28, 2023, 4:47:22 PM

to slim-discuss

Perfect, just did!

Here is the discussion in case any future user is interested in the same topic:
https://github.com/tskit-dev/tsinfer/discussions/855
Also, I added an important detail there: this is somewhat specific to SLiM. The reason I am inferring one tree sequence instead of one per chromosome is simply that I cannot import multiple tree sequences into SLiM, each representing one chromosome (or at least my attempts have failed).

Thank you all!
