Get the frequencies of each segregating and fixed mutation from a .tree file? 


beaangelic...@gmail.com

Oct 5, 2023, 9:15:49 AM
to slim-discuss

Hi, 


This may be a very simple question, but I thought I'd ask because I cannot seem to find a way to do this myself:

Is there a way to get the mutation frequency and the isSegregating and isFixed tags for all of the mutations in a population from a .tree file? 

For context, I would like to compare the frequencies of segregating and fixed mutations with their respective selection coefficients. Ideally, I would also have liked to see the same thing for the mutations that were generated and then lost, but I assume that information is not saved by SLiM anywhere (?). The simulations have all finished, and currently the only sources of information I have are the .tree files saved during the simulations (every 1000 generations) and at the end. I ran my simulations (simple WF, selection coefficients drawn from a deleterious gamma distribution) with convertToSubstitution = F to preserve fixed mutations.

I can currently extract the selection coefficients with tskit using the recipe from 17.7 in the manual, but I cannot find a way to get the frequency of the mutations in the population or which mutations are segregating at the end of the simulation. I have looked in the metadata from pyslim, but I haven't seen it mentioned. Is this information available anywhere in a .tree file?

If not, I could possibly rerun the last 1000 generations for all models from an earlier .tree file and get more output from SLiM directly, but I'd love it if there was a simpler way! 


Best regards, 


Bea

Ben Haller

Oct 5, 2023, 9:41:00 AM
to beaangelic...@gmail.com, slim-discuss
Hi Bea!

I'm not sure that there are functions in tskit/pyslim to get this information directly, but with some Python wrangling it should be reasonably straightforward to obtain from a tree sequence.  Peter or someone will correct me if I'm mistaken:

- A mutation is fixed if the tree for its position has coalesced below the time when the mutation arose (i.e., every extant individual inherits from the node that contains the mutation); conversely, a mutation is still segregating if that is not true (i.e., some extant individuals inherit from the node that contains the mutation, some do not)

- The frequency of the mutation is simply the fraction of all extant genomes (nodes) that inherit from the tree branch that contains the mutation

- Information about mutations that were lost would also be in the tree sequence *if* it was never simplified.  Simplification would strip away branches that have no descendants, and so information about lost mutations would get stripped away.

I'm afraid I can't help with the Python code for doing these sorts of things, as I'm no good in Python.  Perhaps someone else will have code to share, though.  You'd basically just be walking up/down in the tree from the tree sequence for the given position (the mutation's position), looking at ancestry relationships.
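The walk described above can be sketched in plain Python on a toy tree. This is only an illustration under stated assumptions, not tskit or pyslim code: the tree is a hypothetical child-to-parent dictionary, the node IDs are made up, and the helper names (count_samples_below, classify_mutation) are invented for this example. It also assumes no later mutation at the same site changes the state further down the tree. With a real tree sequence, the per-mutation count would come from the local tree at the mutation's position (tskit's Tree.num_samples(u) reports the number of samples below a node), divided by the total number of sample nodes.

```python
# Toy stand-in for the local tree at one mutation's position.
# Hypothetical layout: child -> parent, with the root mapping to None.
parent = {
    0: 4, 1: 4,   # extant genomes 0 and 1 descend from node 4
    2: 5, 3: 5,   # extant genomes 2 and 3 descend from node 5
    4: 6, 5: 6,   # nodes 4 and 5 coalesce in the root, node 6
    7: 6,         # dead branch, present only if never simplified
    6: None,
}
samples = [0, 1, 2, 3]  # the extant genomes (sample nodes)


def count_samples_below(parent, samples, node):
    """Count samples that are `node` itself or inherit from it."""
    count = 0
    for s in samples:
        u = s
        # Walk from the sample towards the root, looking for `node`.
        while u is not None:
            if u == node:
                count += 1
                break
            u = parent.get(u)
    return count


def classify_mutation(parent, samples, mutation_node):
    """Return (frequency, status) for a mutation sitting above `mutation_node`."""
    n = count_samples_below(parent, samples, mutation_node)
    freq = n / len(samples)
    if n == 0:
        status = "lost"
    elif n == len(samples):
        status = "fixed"
    else:
        status = "segregating"
    return freq, status


for node in (4, 6, 7):
    print(node, classify_mutation(parent, samples, node))
```

Here a mutation on node 4 is carried by half of the extant genomes, one on the root is carried by all of them, and one on the dead branch (node 7) is carried by none, which is the "lost" case that only survives in an unsimplified tree sequence.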

Yes, SLiM does not save information about lost mutations anywhere, not because it would be difficult to do, but because it would be slow and would use a lot of memory; if you want to save out that information from a simulation, though, it wouldn't be particularly hard.  A simple solution would be to log out information on EVERY new mutation, in a mutation() callback, and then subtract out the information about fixed/segregating mutations, and what is left is the mutations that were lost.
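The "subtract out" step could look roughly like the following once the log exists. This is a sketch only: the log format (a CSV with id, selcoeff and tick columns), the example values, and the set of surviving IDs are all hypothetical placeholders, and the code that would write such a log from a mutation() callback in SLiM is not shown.

```python
import csv
import io

# Hypothetical log with one row per new mutation, as it might be
# written from a mutation() callback. Column names are assumptions.
all_mutations_csv = """id,selcoeff,tick
101,-0.012,5
102,-0.003,9
103,-0.050,14
104,-0.001,20
"""

# Hypothetical IDs of mutations still present at the end of the run
# (fixed or segregating), e.g. collected from the final output.
surviving_ids = {102, 104}


def lost_mutations(log_text, surviving):
    """Return logged mutations whose IDs are not among the survivors."""
    reader = csv.DictReader(io.StringIO(log_text))
    lost = []
    for row in reader:
        mut_id = int(row["id"])
        if mut_id not in surviving:
            lost.append({
                "id": mut_id,
                "selcoeff": float(row["selcoeff"]),
                "tick": int(row["tick"]),
            })
    return lost


lost = lost_mutations(all_mutations_csv, surviving_ids)
for m in lost:
    print(m)
```

Whatever remains after removing the surviving IDs is the set of mutations that arose and were later lost, together with the selection coefficients recorded when they arose.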

Happy modeling!

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

