Setting a starting allele frequency bias


Baron Koylass

Apr 30, 2024, 8:13:48 AM
to slim-discuss
Hi,
I am working on a script that should simulate over-dominance at a mutation, with the simulation running over the whole of chromosome 3R of Drosophila and focusing on the fruitless gene region. The script splits the chromosome into 3 parts: 1 for the fruitless gene, and 1 each for the flanking sequence on either side of this gene. I have set two mutations (m1 & m2) simply because I wish to play around with some parameters further on in my research.
The script loads in a tree sequence from an initial msprime coalescent simulation. This is just a neutral binary mutation simulation on N = 5000 to establish a burn-in period for genetic diversity.
The aim of the research is to then split this population into two (p1 = 2500, p2 = 2500), then into 10 "cage" populations of 500 flies each (p3 - p12). Allele frequency (AF) is then tracked at 9 different time points until generation 56, sampling 48 flies (96 genomes) from each "cage" at each time point.

Apologies for the long detail! My question is whether there is a way to set a starting allele frequency bias. I wish to be able to set the allele frequency to 90% for p1 within the fruitless gene region, and 10% for p2 within the same region. The reason is that, when tracking the AF over time, I am hoping to detect instances of balancing selection.
Sincere apologies if this is clearly explained within the SLiM documentation! I somehow cannot seem to find it anywhere =(

Attached is the current script being used (the syntax could be cleaned up once everything is running as intended so apologies!)

Many thanks
Baron Koylass. LIDo DTP PhD student. Queen Mary University of London. Fumagalli lab.
cage.slim

Peter Ralph

Apr 30, 2024, 8:55:41 AM
to Baron Koylass, slim-discuss
Hi, Baron - Thanks for the clear explanation; more details are usually better. But, to answer your question: what exactly do you mean by "set the allele frequency to 90% within the fruitless gene region"? Do you want to be sure to have a single mutation within that region, and then specify its frequency in p1 and p2? Since you're simulating the ancestral population with msprime, this sounds like you might want to sub-sample from the msprime simulation to achieve those frequencies. However, what do you need to be true of the msprime simulation? (e.g., at least one mutation in that region?)

--peter


Baron Koylass

Apr 30, 2024, 9:24:57 AM
to slim-discuss
Hi Peter Ralph,
Thank you for the prompt reply.

"Do you want to be sure to have a single mutation within that region, and then specify its frequency in p1 and p2?"

Yes, sorry, I wish to ensure there is a mutation within this region (for example, the fruitless gene is 131 kbp, so an example mutation at position 10,000) and set its AF to 90% in p1 & 10% in p2. Reading things back myself, it would indeed seem like I should do some sub-sampling from the msprime simulation, and make sure there is a way to do this within the ancestral simulation.
Apologies, as I think this was very poorly explained on my side in my initial message.

Many thanks
Baron.

Peter Ralph

Apr 30, 2024, 9:51:42 AM
to Baron Koylass, slim-discuss
Okay, that makes sense. I can think of various ways to do this:
  1. find a mutation in the msprime simulation, and subsample (this can be done in SLiM, on loading it in, as long as there is an appropriate mutation)
  2. or, decide what you need in the initial population (i.e., condition on there being a mutation in the region at a frequency between 10% and 90%), and re-simulate with msprime until you get that situation
Option (2) could be done by simulating ancestry once, then re-simulating mutations on the same ancestry until an appropriate mutation appears.
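
To make the subsampling idea in option (1) concrete, here is a minimal sketch in plain Python. It is not SLiM or msprime code: genomes are just integer IDs, the function name `subsample_to_frequency` and the toy 50% ancestral frequency are made up for illustration, and it works at the level of single genomes rather than diploid individuals (a real script would have to pick individuals, so the achieved frequencies would only be approximate).

```python
import random

def subsample_to_frequency(carriers, non_carriers, n_genomes, target_freq, rng):
    """Pick n_genomes genome IDs so that a fraction target_freq carry the mutation.

    Hypothetical helper: carriers / non_carriers are lists of genome IDs
    with and without the focal mutation.
    """
    n_carriers = round(n_genomes * target_freq)
    n_other = n_genomes - n_carriers
    if n_carriers > len(carriers) or n_other > len(non_carriers):
        raise ValueError("ancestral population cannot supply the requested frequency")
    # Sample without replacement from each class, then mix the two classes.
    chosen = rng.sample(carriers, n_carriers) + rng.sample(non_carriers, n_other)
    rng.shuffle(chosen)
    return chosen

rng = random.Random(1)

# Toy ancestral population (assumption): 10,000 genomes, i.e. N = 5000 diploids,
# with the focal mutation sitting at exactly 50%.
genomes = list(range(10_000))
carriers = genomes[:5_000]
non_carriers = genomes[5_000:]

# p1: 5,000 genomes (2,500 diploids) with the mutation at 90%.
p1 = subsample_to_frequency(carriers, non_carriers, 5_000, 0.9, rng)

# p2: drawn from whatever is left, with the mutation at 10%.
used = set(p1)
remaining_carriers = [g for g in carriers if g not in used]
remaining_non = [g for g in non_carriers if g not in used]
p2 = subsample_to_frequency(remaining_carriers, remaining_non, 5_000, 0.1, rng)
```

One thing the toy numbers show: if p1 and p2 together use up the entire ancestral population without replacement, then 90% in one half and 10% in the other only works when the ancestral frequency is exactly 50%. With smaller founding groups, or sampling with replacement, a wider range of ancestral frequencies would do.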

In general, a word of caution: it's not usually feasible, or maybe even desirable, to try to have the simulation match observed conditions very precisely (as an extreme example, imagine trying to produce a simulation in which the outcome of msprime had the same genotype sequences in the fruitless region of a sample of flies).

happy slimulating!
--peter



Baron Koylass

Apr 30, 2024, 11:53:29 AM
to slim-discuss
Hi Peter Ralph,

Great! I have actually done a combination of both ways you proposed. I have now run a number of my ancestral simulations in order to build up a number of mutations with this frequency, which I have compiled into a separate file for stats analysis (SFS & covariance of AF over time) & SLiM input. I have also focused on a particular locus within the fruitless gene.
I was finding things quite difficult as I am attempting to replicate empirical data produced by my partner lab. So I agree with your words of caution, in that many of the parameters may not (or, I should say, should not) replicate the evolve & re-sequence experiment 100%.
Thank you! This was extremely insightful, and thank you also for your patience and your time!

Happy developing =)
Many thanks
Baron.
