Modelling a selective sweep with tree-sequence recording in non-WF model

18 views
Skip to first unread message

Samuel Gornard

unread,
Dec 10, 2025, 12:33:13 PM (5 days ago) Dec 10
to slim-discuss
Hello everyone, 

I am trying to model a population of bird on an island undergoing population contraction. To do so, I use tree-sequence recording to avoid burn-in phase. After a small period of 100 years, I introduce a new beneficial mutation in half of the haplosomes of the individuals of age 0 (as I can't modify age > 0).
Afterward, I spatially explicitly sample the individuals that i want, so that I only have to recapitate and overlay mutations for the individuals I'm interested in.
Next, I load everything into python to output a vcf containing my 17 individuals. However, after simulation, I get absolutely no signal of the sweep (like higher FST around the mutation, or extended haplotypes). Is there something I am doing wrong? I can provide any other useful information if needed.

Thank you for your help!
Best,

Samuel

Here is the code used. I have removed some unnecessary chunks for clarity, but left a "removed" note where they should be. I can of course provide them if needed!

initialize() {
initializeSLiMModelType("nonWF");

initializeSLiMOptions(dimensionality="xy");
initializeSex();
initializeTreeSeq();

// mutations
initializeMutationType("m1", 0.5, "f", 0.0); // non-coding
initializeMutationType("m2", 0.5, "f", 0.1); // beneficial

initializeGenomicElementType("g1", m1, 1);  // exon
initializeGenomicElementType("g2", m1, 1);       // intron
initializeGenomicElementType("g3", m1, 1);       // NC

//Genomic structure
initializeChromosome(1);
//Removed realistic genetic structure
initializeMutationRate(0);
initializeRecombinationRate(3e-8);

initializeChromosome(2);
//Removed realistic genetic structure
initializeMutationRate(0);
initializeRecombinationRate(3e-8);

//Parameters
//Removed ecological parameters

//Interaction types
//Removed interactions
}

//Removed reproduction and modifiy child callbacks

1 early() {
sim.addSubpop("p1", pop);

p1.setSpatialBounds(c(0,0,700,800)); //Spatial bounds of the pop

map_image=Image("../mauritius2.png");
p1.defineSpatialMap("island","xy",1.0-map_image.floatK, valueRange=c(0.0,1.0),colors=c("cyan","forestgreen"));

p1.individuals.setSpatialPosition(p1.pointUniformWithMap(pop,"island"));
p1.individuals.age=sample(age_structure,pop,replace=T); //Realistic age structure for starter
}

//Introduction of the beneficial mutation
100 late() {
//Introduction of the sweep. I can only add new mutations to age 0 individuals
age0=p1.subsetIndividuals(minAge=0,maxAge=0);
nhaplo=asInteger(size(age0)*0.1);
target=sample(age0.haplosomesForChromosomes(2), nhaplo);
target.addNewDrawnMutation(m2, 10000000);

}

101 early(){
//Population and map reduction
}
162 early(){
//Population and map reduction
}
199 early(){
//Population and map reduction
}
265 early(){
//Population and map reduction
}
323 early(){
//Population and map reduction
}

//Death and fitness recalculation
//Happens last, after deforestation and moving
2: early() {
//Divide pop to kill a certain percentage
off=p1.subsetIndividuals(minAge=0,maxAge=0);
iv1=p1.subsetIndividuals(minAge=1,maxAge=1);
iv2=p1.subsetIndividuals(minAge=2,maxAge=2);

//Choose dying individuals
prob_die=c(max(0.0,min(1.0,rnorm(1,0.4,0.05))),
max(0.0,min(1.0,rnorm(1,0.05,0.05))),
max(0.0,min(1.0,rnorm(1,0.05,0.05))),
1.0);
off_dead=sample(off, asInteger(off.size()*prob_die[0]) );
iv1_dead=sample(iv1, asInteger(iv1.size()*prob_die[1]) );
iv2_dead=sample(iv2, asInteger(iv2.size()*prob_die[2]) );

sim.killIndividuals(c(off_dead,iv1_dead,iv2_dead,p1.subsetIndividuals(minAge=3)));

//Density-dep selection

i1.evaluate(p1);
inds = sim.subpopulations.individuals;
inds.fitnessScaling=1/(1+i1.totalOfNeighborStrengths(inds)); //Fitness divided proportionnaly to the number of neighbors
}


//Individual recording
128 late(){ // Year is 1800
//We sample in a small region
cand=p1.individuals[p1.individuals.x>300&p1.individuals.x<370&p1.individuals.y>(800-380)&p1.individuals.y<(800-319)];
//Check if there are individuals inside the area
if(size(cand)<1){
catn("Reached 0, stopping");
sim.simulationFinished();}

to_save=sample(cand, 1);
sim.treeSeqRememberIndividuals(to_save);
}

198 late(){ // Year is 1870
cand=p1.individuals[p1.individuals.x>300&p1.individuals.x<370&p1.individuals.y>(800-380)&p1.individuals.y<(800-319)];
//Check if there are individuals inside the area
if(size(cand)<7){
catn("Reached 0, stopping");
sim.simulationFinished();}

to_save2=sample(cand, 7);
sim.treeSeqRememberIndividuals(to_save2);
}

343 late(){ //We stop in 2015
to_save3=p1.sampleIndividuals(9);
sim.treeSeqRememberIndividuals(to_save3);
sim.killIndividuals(p1.individuals);
sim.treeSeqOutput(paste0("./"+folder+"/trees/"+out+".trees")); sim.simulationFinished();
}

Ben Haller

unread,
Dec 10, 2025, 1:00:56 PM (5 days ago) Dec 10
to slim-d...@googlegroups.com
Hi Samuel!

Well, it sounds like the sweep you're modeling is an extremely "soft" sweep; you introduce it into 50% of the haplosomes of new juveniles, and furthermore, the way you introduce it means that those haplosomes are not necessarily even closely related in terms of identity by descent (meaning they don't share the genetic patterns one would expect if they all possessed the same mutation).  Under conditions like that, I wouldn't expect to see any sweep signal.  Detecting soft sweeps is notoriously difficult even in the best of conditions; but the way you're modeling the sweep also isn't realistic at all, since the sweep mutation just magically appears out of nowhere in all those haplosomes at the same time.

If you want to model a more realistic soft sweep (which you might or might not be able to detect, since soft sweeps are hard to detect), you might model the burn-in period with the coalescent, select a mutation at a somewhat intermediate frequency, make that mutation be non-neutral, and then finish the simulation in SLiM.  The pyslim manual has an example of this type of approach: https://tskit.dev/pyslim/docs/latest/vignette_coalescent_diversity.html

If you want to model a hard sweep (which you should probably be able to detect, in at least some runs), introduce the sweep mutation in just a single haplosomes, not in 50% of the juvenile haplosomes.  You can make the simulation be conditional on the fixation of the sweep mutation, so that you ignore all the evolutionary trajectories where the sweep mutation is lost, if you want to.  Section 18.3 of the manual has an example of that.

If you're not familiar with the concept of soft versus hard sweeps, etc., perhaps chapter 9 of the SLiM manual would help, but really you'd want to consult a population-genetics textbook, where the topic ought to be covered quite thoroughly.

I hope that helps; happy modeling!

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University
--
SLiM forward genetic simulation: http://messerlab.org/slim/
Before posting, please read http://benhaller.com/slim/bugs_and_questions.html
---
You received this message because you are subscribed to the Google Groups "slim-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/slim-discuss/910e6f3f-f1c8-4733-a464-1257cdaaf236n%40googlegroups.com.

Peter Ralph

unread,
Dec 10, 2025, 5:30:49 PM (5 days ago) Dec 10
to slim-d...@googlegroups.com, Ben Haller
Follow-up: rather than "why can't I see this sweep with FST" - a very high-level question - it would probably be more effective to ask lower-level questions that get at the same thing. For instance "did the allele actually sweep" (a question you can ask in SLiM or in the VCF); and then "are alleles at different frequencies in the two populations" (which is the source of the signal for FST), for instance. The latter question would probably be answered "no", and in figuring out why not you might arrive at what Ben is suggesting.

From: 'Ben Haller' via slim-discuss <slim-d...@googlegroups.com>
Sent: Wednesday, December 10, 2025 10:00 AM
To: slim-d...@googlegroups.com <slim-d...@googlegroups.com>
Subject: Re: Modelling a selective sweep with tree-sequence recording in non-WF model
 
Hi Samuel! Well, it sounds like the sweep you're modeling is an extremely "soft" sweep; you introduce it into 50% of the haplosomes of new juveniles, and furthermore, the way you introduce it means that those haplosomes are not necessarily even
ZjQcmQRYFpfptBannerStart
This message originated outside the UO email ecosystem.
Use caution with links and attachments. Learn more about this email warning tag.
 
ZjQcmQRYFpfptBannerEnd

Samuel Gornard

unread,
3:33 AM (14 hours ago) 3:33 AM
to slim-discuss
Hello all,
Thank you for your advice and help! I will try to model harder sweeps then, and have a look at the coalescent burn-in method.

Have a nice day,
Best,

Samyel

Samuel Gornard

unread,
6:56 AM (11 hours ago) 6:56 AM
to slim-discuss
Hello all, 
I am back with some questions. The hard sweep is difficult for us to model because the mutation keeps being lost.
1) Is it possible to use my non-WF spatial model in the coalescent burn-in with msprime?
2) In this paper (https://doi.org/10.1007/s10592-023-01548-9) which we draw our maps from, they have modeled a single neutral burn-in that they input at the start of each run. Do you think we could use this procedure for our case? 

Thank you!

Samuel

Samuel Gornard

unread,
8:14 AM (9 hours ago) 8:14 AM
to slim-discuss
Alternatively, is it possible to model a reasonable amount of years in SLiM, like 1,000, then output tree-sequence, overlay mutations on this, and then use the output tree to load in the simulation with selection (using sim.readFromPopulationFile) ? 

Thanks!

Samuel
Le lundi 15 décembre 2025 à 09:33:48 UTC+1, Samuel Gornard a écrit :

Peter Ralph

unread,
12:21 PM (5 hours ago) 12:21 PM
to slim-discuss, Samuel Gornard
Briefly:

>1) Is it possible to use my non-WF spatial model in the coalescent burn-in with msprime?

No, this is fundamentally not possible, for deep reasons.

>2) In this paper (https://doi.org/10.1007/s10592-023-01548-9) which we draw our maps from, they have modeled a single neutral burn-in that they input at the start of each run. Do you think we could use this procedure for our case? 

I don't know what the question is, really? Of course you can use that procedure. I guess you're asking whether it's a good idea; but that answer depends on what your goals are, etcetera. The quick answer here is: yes, this is what I'd do. But I would also try starting that burn-in at different points in time, to check that what I choose for that time doesn't affect the things I care about. (It'll affect some things; the question is whether it affects what you care about.)

>3)  Alternatively, is it possible to model a reasonable amount of years in SLiM, like 1,000, then output tree-sequence, overlay mutations on this, and then use the output tree to load in the simulation with selection (using sim.readFromPopulationFile) ? 

Yes, this is possible. It'll take some coding in python and a pretty good understanding of how SLiM represents things in the tree sequence (in particular, to check that you've done things right). See, for instance, https://tskit.dev/pyslim/docs/latest/vignette_coalescent_diversity.html

--peter

ps. You have perhaps seen it already, but in case not, have a look at https://onlinelibrary.wiley.com/doi/full/10.1002/ece3.71098
Reply all
Reply to author
Forward
0 new messages