One tree seq for multiple chromosomes


Tatiana Bellagio

Aug 25, 2023, 2:28:47 PM
to slim-discuss
Hi!

I'm reaching out to make sure my workflow is correct. My goal is to simulate a complex trait involving loci on multiple chromosomes using real genetic data from a population VCF file.

To avoid burdening SLiM with tracking neutral mutations, I first infer the tree sequence from the VCF file and then remove all neutral mutations. After the forward simulation in SLiM, I reintegrate these neutral mutations into my tree sequence.

However, there's one point I'm still unsure about: is it acceptable to infer a single tree sequence from a VCF file containing multiple chromosomes? I've noticed that some msprime workflows use an msprime.RateMap to improve the accuracy of tree inference. Should I also incorporate an msprime.RateMap in my tsinfer inference? Are nodes shared between my chromosomes a problem? Since I'm mainly using the tree sequence as a data structure to effectively store and overlay mutations, rather than relying on the precision of the trees themselves, I'm inclined to believe that I don't need to worry about this aspect. Nevertheless, to be safe, I wanted to ask.
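To make the single-tree-sequence idea concrete: inferring one tree sequence from several chromosomes requires every site to sit on one shared coordinate axis. Below is a minimal sketch of that bookkeeping, assuming chromosomes are simply laid end to end. The chromosome names, lengths, and the helpers `chromosome_offsets` and `to_global_position` are hypothetical illustrations, not part of tsinfer, tskit, or SLiM, and positions are assumed to be 0-based (VCF positions are 1-based and would need converting first).

```python
from itertools import accumulate

# Hypothetical chromosome lengths in base pairs, in genome order (assumption).
chrom_lengths = {"chr1": 1000, "chr2": 800, "chr3": 500}

def chromosome_offsets(lengths):
    """Return the global start coordinate of each chromosome."""
    names = list(lengths)
    # Running total of lengths gives each chromosome's end; shift by one
    # chromosome to get the starts.
    starts = [0] + list(accumulate(lengths[n] for n in names))[:-1]
    return dict(zip(names, starts))

def to_global_position(chrom, pos, offsets):
    """Map a 0-based within-chromosome position onto the concatenated axis."""
    return offsets[chrom] + pos

offsets = chromosome_offsets(chrom_lengths)
```

With these made-up lengths, a site at position 5 on `chr2` would land at global position 1005. The same offsets can be reused later to map positions in the simulated output back to their chromosome.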

When importing the .trees file into SLiM, I make sure the chromosomes are treated separately by using the trick of setting the recombination rate to 0.5 between them. During the simulation, I avoid simplification so that all nodes are retained (to accurately overlay mutations afterward).
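For readers unfamiliar with the 0.5 trick, here is a minimal sketch of how the rate and end-position vectors might be built for chromosomes laid end to end. The helper `build_recombination_map`, the chromosome lengths, and the within-chromosome rate are all hypothetical; the output is intended to be passed to SLiM's `initializeRecombinationRate(rates, ends)`, and it is worth checking the result against the SLiM manual before relying on it.

```python
def build_recombination_map(chrom_lengths, within_rate):
    """Return (rates, ends) for chromosomes concatenated on one axis.

    Each chromosome gets `within_rate`; a one-base region with rate 0.5
    between consecutive chromosomes makes them assort independently.
    Assumes every chromosome is at least 2 bases long.
    """
    rates, ends = [], []
    start = 0  # global coordinate of the current chromosome's first base
    for i, length in enumerate(chrom_lengths):
        if i > 0:
            # Boundary between the last base of the previous chromosome
            # and the first base of this one: free recombination.
            rates.append(0.5)
            ends.append(start)
        rates.append(within_rate)
        ends.append(start + length - 1)
        start += length
    return rates, ends

# Two hypothetical chromosomes of 50 kb each (assumption).
rates, ends = build_recombination_map([50000, 50000], 1e-8)
```

For two 50 kb chromosomes this gives rates `[1e-8, 0.5, 1e-8]` with ends `[49999, 50000, 99999]`, which can be written in the SLiM script as `c(...)` vectors.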

If my assumptions are correct, this workflow seems viable after all! (: 
Thank you!

Tati

Ben Haller

Aug 26, 2023, 3:23:46 PM
to Tatiana Bellagio, slim-discuss
Hi Tati!  Since this looks like a question about tsinfer, you might try to find a channel where you can ask those folks directly.  I'm not sure any of them hang out on this list, and I've never used tsinfer myself.  :->  I'm not sure where they like to receive questions; perhaps on GitHub?

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University



Peter Ralph

Aug 27, 2023, 2:14:05 PM
to Tatiana Bellagio, Ben Haller, slim-discuss
I'd post a discussion here:


Yan Wong

Aug 28, 2023, 4:23:41 PM
to Tatiana Bellagio, Peter Ralph, Ben Haller, slim-discuss
Hi Tati - I do keep an eye on this mailing list, but if you post the question where Peter suggests, I can answer it there. Briefly: (a) the RateMap is only used when allowing mismatch, and (b) we normally infer a separate tree sequence for each chromosome, but I haven’t thought much about how to turn these into a single tree sequence.

Cheers

Yan

Tatiana Bellagio

Aug 28, 2023, 4:47:22 PM
to slim-discuss
Perfect, just did!
Here is the discussion in case any future user is interested in the same topic:
https://github.com/tskit-dev/tsinfer/discussions/855 
Also, I added an important detail there: this is somewhat specific to SLiM, because the only reason I am inferring one tree sequence instead of many (one per chromosome) is that I cannot import multiple tree sequences into SLiM, each representing one chromosome (or at least my attempts have failed).
Thank you all! 

