Estimating relatedness between parents and children in a haploid model

19 views
Skip to first unread message

Alexander Mackintosh

unread,
Jan 13, 2026, 11:02:44 AMJan 13
to slim-discuss
Hi,

I am interested in obtaining the genetic relatedness between a child and its two parents within a SLiM simulation. For a diploid model this will always be 0.5. However, I am simulating a clonal haploid population with rare sexual reproduction. Offspring that are produced by sex will have a recombinant genome and will share some proportion of the genome with each parent (not exactly 0.5), and I would like to record this information each generation. I saw that there is a relatedness function in SLiM, but my understanding is that this gives the pedigree relatedness and so would always return 0.5.

One way to do this would be to output treesequences without simplification, then match the SLiM ID of individuals produced by sex in the simulation to nodes in the treesequence. I anticipate that the file would be huge given 200,000 generations and a population size of 10,000, and I would prefer a solution that is not limited by this.

My current thinking is that if I can gain access to the sampled recombination points during sexual reproduction then perhaps I could calculate relatedness from that and record values during the simulation. Would this be possible? Or perhaps there is another solution that I am missing?

Any help would be really appreciated.

Best wishes,

Alex

Ben Haller

unread,
Jan 13, 2026, 12:56:12 PMJan 13
to slim-d...@googlegroups.com
Hi Alex!  Interesting idea!  Yes, I think you could implement this pretty easily like so:

initialize() {

initializeSLiMModelType("nonWF");

initializeSLiMOptions(keepPedigrees=T);

initializeMutationType("m1", 1.0, "f", 0.0);

initializeGenomicElementType("g1", m1, 1.0);

initializeChromosome(1, 1e9, type="H");

initializeGenomicElement(g1);

initializeMutationRate(1e-7);

initializeRecombinationRate(1e-8);

}

1 late() {

sim.addSubpop("p1", 100);

}

reproduction() {

if (runif(1) < 0.90)

{

// reproduce clonally, recording no breakpoints

subpop.addCloned(individual);

}

else

{

// reproduce sexually, recording breakpoints

mate = p1.sampleIndividuals(1, exclude=individual);

if (runif(1) <= 0.5)

{

initialStrand = individual.haploidGenome1;

secondStrand = mate.haploidGenome1;

parent1 = individual;

parent2 = mate;

}

else

{

initialStrand = mate.haploidGenome1;

secondStrand = individual.haploidGenome1;

parent1 = mate;

parent2 = individual;

}

breaks = sim.chromosome.drawBreakpoints(parent1, NULL);

offspring = subpop.addRecombinant(initialStrand, secondStrand, breaks,

NULL, NULL, NULL, parent1=parent1, parent2=parent2, randomizeStrands=F);

offspring.setValue("breaks", breaks);

}

}

2: early() {

// non-overlapping generations

adults = p1.subsetIndividuals(minAge=1);

sim.killIndividuals(adults);

}

2 late() {

for (ind in p1.individuals)

{

breaks = ind.getValue("breaks");

if (isNULL(breaks))

catn("individual " + ind.pedigreeID + " was cloned from parent " + ind.pedigreeParentIDs[0]);

else

catn("individual " + ind.pedigreeID + " had parents " + paste(ind.pedigreeParentIDs) + " with breakpoints {" + paste(breaks) + "}");

// ...could figure out ancestry proportions here, from that information...

}

}

The key element is using a nonWF model with addRecombinant() to make the sexual offspring, which allows you to use drawBreakpoints() to draw the breakpoints yourself so that you can record them. It's also important to use randomizeStrands=F so that you know which parent provided the initial copy strand; this way you know that the first parent of the individual, according to pedigree tracking, provided the initial copy strand and so you can trace the ancestry back to the correct parent. If you used randomizeStrands=T (which is the default) you wouldn't be able to trace back that way; you wouldn't know which parent provided the initial copy strand.

Using this information to go back one generation, getting ancestry fractions from the two parents, shouldn't be too hard. Going back any further than that seems like it would start to be quite a headache, but it is probably do-able with some elbow grease. :-O

If you get a working solution, even for just one generation back, it'd be great if you posted it here and/or provided it as a PR for the SLiM-Extras repository; I think it might be of interest to others. Happy modeling!

Cheers, -B. Benjamin C. Haller Messer Lab Cornell University

--
SLiM forward genetic simulation: http://messerlab.org/slim/
---
You received this message because you are subscribed to the Google Groups "slim-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/slim-discuss/d4d26a55-5a6d-4948-9f55-a15fa49515d2n%40googlegroups.com.

Alexander Mackintosh

unread,
Jan 15, 2026, 2:41:05 AMJan 15
to slim-discuss
Hi Ben,

Thank you for the quick response and your suggestion of how to do this. I had a go at implementing this, although this is my first time setting up a non-WF model and so am not super confident in the result. The code nonetheless runs and the results do make sense.

For some more context, my aim is to understand how rare recombination events in a (mostly) clonal haploid population affect fitness and mutation load. I simulate deleterious mutations with a fixed selection coefficient and then I record the deleterious mutation count of parents and offspring involved in reproduction via-recombination. I had set this up in a WF model previously by using modifyChild(), but then realised that the proportion of ancestry inherited from parent1 and parent2 is a major determinant of the deleterious mutation load in the resulting offspring. So now I record the deleterious mutation counts of 'trios' as well as the ancestry contribution of each parents.

My implementation is pasted below. I omit some of the initialization part for simplicity, as there I am reading in custom regions and a recombination map, which is quite a few lines and specific to my project.

Any feedback is welcome. I will be working on this in the coming days and plan to do some checks against an analytic model. I can let you know if I find any serious flaws / bugs.

Thanks again for the help!

Alex

initialize() {

        initializeSLiMModelType("nonWF");
        initializeSLiMOptions(keepPedigrees=T); //not obvious why this is needed (?)

        initializeTreeSeq(recordMutations=T, simplificationInterval=600);

        initializeChromosome(0, type="H");

        initializeMutationRate(3.75e-9);

        // m1 mutation type: fixed
        // dominance coeff will be ignored
        initializeMutationType("m1", 0.25, "f", -0.002);

        m1.convertToSubstitution = T;

        // g1 genomic element type: deleterious mutations

        initializeGenomicElementType("g1", m1, 1.0);

        ///
        /// here I omit some code which is just reading in CDS regions
        /// and a recombination map from a bed file
        /// there is nothing special about this part, 
        /// it is just many lines of non-relevant code
        ///

}


// create a population of N haploid individuals
// we would normally set the CloningRate here,
// but are instead doing that inside the reproduction code block

1 late() {
        sim.addSubpop("p1", pop_size);
}


// here we write vectors of the value we are interested in after 100k gens of burn-in
// the values are the deleterious mutation burden of parents and offspring (k)
// and the ancestry proportion contributed by each parent to an offspring,
// given recombination between haploids
// note that we only record this information when reproduction happens by recombination,
// otherwise we record the value as -1.0

late() {
        if (sim.cycle > 100000){
                if (sim.cycle % 100 == 0){
                        writeFile(paste0("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".p1_k"),
                                paste(sim.getValue("p1_k_vec")), append=T);
                        writeFile(paste0("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".p2_k"),
                                paste(sim.getValue("p2_k_vec")), append=T);
                        writeFile(paste0("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".child_k"),
                                paste(sim.getValue("child_k_vec")), append=T);
                        writeFile(paste0("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".p1_ancestry"),
                                paste(sim.getValue("p1_a_vec")), append=T);
                        writeFile(paste0("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".p2_ancestry"),
                                paste(sim.getValue("p2_a_vec")), append=T);
                }
        }
}


// we also record the fixation of deleterious mutations during the sim post burn-in,
// and write this to a file

late() {
        if (sim.cycle == gens - 100000){
                defineConstant("f0", length(sim.substitutions));
        }
        if (sim.cycle == gens){
                defineConstant("f1", length(sim.substitutions));
                defineConstant("subs", f1 - f0);
        }
}


gens late() {
        writeFile(paste0(c("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".mut_log")), paste(subs, sep="\n"));
        sim.treeSeqOutput(paste0(c("./SLiM.constant_DFE4.haploid_asexual.template.bb.", r_seed, ".fitness.trees")));
}


// the reproduction code block below is where most of the complexity is
// it is 'big bang' style reproduction mimicking a WF model
// reproduction happens mostly by cloning but occasionally by recombination
// if recombination happens then we record information about the k and ancestry of individuals involved

reproduction(p1) {

        // we will record values here
        p1_k_vec = c();
        p2_k_vec = c();
        child_k_vec = c();
        p1_a_vec = c();
        p2_a_vec = c();

        // sample parents given relative fitness
        // the aim here is for reproduction to be entirely WF-like
        parents = p1.individuals;
        s = 0.002; // selection coeff, which must match the value set above
        abs_fitnesses = (1-s)^parents.countOfMutationsOfType(m1);
        mean_fitness = mean(abs_fitnesses);
        rel_fitnesses = abs_fitnesses / mean_fitness;

        // Generate exactly pop_size offspring
        for (i in seqLen(pop_size)) {

                // Sample first parent proportional to relative fitness, no matter if cloning or recombining
                par1 = sample(parents, 1, weights=rel_fitnesses);

                // check if cloning
                if (runif(1) < 0.99) { // cloning rate

                        offspring = p1.addCloned(par1);
                        offspring.setValue("breaks", NULL);

                        p1val = -1.0; // we are not interested in changes in mutation burden through cloning
                        p2val = -1.0; // so we set all values to -1.0
                        childval = -1.0; // this is mainly for debugging purposes
                        p1_ancestry = -1.0;
                        p2_ancestry = -1.0;
                }

                else {
                        // sexual reproduction - which requires sampling a second parent
                        par2 = sample(parents, 1, weights=rel_fitnesses);

                        // sample if same individual (to ensure two distinct parents)
                        // this was suggested by Claude and is not really needed
                        while (par1 == par2)
                                par2 = sample(parents, 1, weights=rel_fitnesses);

                        // here we choose which individual corresponds to which strand
                        // is this really needed (?), probably not, but it does no harm
                        if (runif(1) <= 0.5) {
                                initialStrand = par1.haploidGenome1;
                                secondStrand = par2.haploidGenome1;
                                parent1 = par1;
                                parent2 = par2;
                        }
                        else {
                                initialStrand = par2.haploidGenome1;
                                secondStrand = par1.haploidGenome1;
                                parent1 = par2;
                                parent2 = par1;
                        }

                        // crossovers are drawn from parent1
                        breaks = sim.chromosome.drawBreakpoints(parent1, NULL);
                        offspring = p1.addRecombinant(initialStrand, secondStrand, breaks, NULL, NULL, NULL,
                                parent1=parent1, parent2=parent2, randomizeStrands=F);

                        // now we need to calculate the ancestry proportions donated by each parent
                        // this can be done using the breaks
                        genome_length = sim.chromosome.lastPosition + 1;
                        p1_ancestry = 0.0;
                        // this code is from Claude as I got stuck here
                        // it may be possible to do this without a loop, but this works
                        all_positions = c(0, breaks, genome_length);
                        for (j in seqLen(size(all_positions) - 1)) {
                                if (j % 2 == 0) {
                                        p1_ancestry = p1_ancestry + (all_positions[j+1] - all_positions[j]);
                                }
                        }
                        p1_ancestry = p1_ancestry / genome_length;

                        // we could of course calculate the p2_ancestry outside of SLiM
                        p2_ancestry = asFloat(1.0 - p1_ancestry);

                        // these values of k will be added to the vectors
                        p1val = asFloat(parent1.countOfMutationsOfType(m1));
                        p2val = asFloat(parent2.countOfMutationsOfType(m1));
                        childval = asFloat(offspring.countOfMutationsOfType(m1));

                }

                // append to local vectors
                p1_k_vec = c(p1_k_vec, p1val);
                p2_k_vec = c(p2_k_vec, p2val);
                child_k_vec = c(child_k_vec, childval);
                p1_a_vec = c(p1_a_vec, p1_ancestry);
                p2_a_vec = c(p2_a_vec, p2_ancestry);

        }

        // store vectors to sim once at the end
        sim.setValue("p1_k_vec", p1_k_vec);
        sim.setValue("p2_k_vec", p2_k_vec);
        sim.setValue("child_k_vec", child_k_vec);
        sim.setValue("p1_a_vec", p1_a_vec);
        sim.setValue("p2_a_vec", p2_a_vec);

        // this is crucial!
        self.active = 0;
}

// from generations 2 onwards we must remove all 'old' individuals in the population
// note that reproduction happens _before_ this early call,
// so we have double the pop size at this point
2: early() {

        // kill all parents to ensure non-overlapping generations
        adults = p1.subsetIndividuals(minAge=1);
        sim.killIndividuals(adults);

        // Ensure all offspring survive (no viability selection)
        p1.fitnessScaling = INF;  // is there a better way to do this?
}


Ben Haller

unread,
Jan 15, 2026, 9:51:25 AMJan 15
to slim-d...@googlegroups.com
Thanks Alex!  I don't have the time to look over this in detail right now; but thanks for posting it to the list, I think others might well find it to be useful.  :->  Happy modeling!


Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University


Reply all
Reply to author
Forward
0 new messages