Optimizing counting of homozygous loci

Alfredo Antonio Reis Marin

Dec 16, 2025, 1:10:39 PM
to slim-discuss
Dear all,

I am working on a simple simulation with 6 loci, each under a symmetric heterozygote advantage model. I am using an infinite alleles model (with mutationStackPolicy = "l") such that each locus always contains a single mutation. For the fitness calculation I am doing this:

late() {
  indv = p1.individuals;

  allele1 = indv.haploidGenome1.mutations;
  allele2 = indv.haploidGenome2.mutations;
  homozygous = allele1 == allele2;

  // TODO: optimize
  homCount = integer(N);
  for (i in seqLen(N)) {
    homCount[i] = sum(homozygous[(6 * i):(6 * i + 5)]);
  }

  indv.fitnessScaling = (1 - s) ^ homCount;
}

The logic is that I get two vectors, allele1 and allele2, containing the 6 alleles of every individual from haploidGenome1 and haploidGenome2, respectively. I then compare these vectors element by element to see whether each individual is homozygous at each of the six loci, and count the number of homozygous loci per individual.

My end goal is simply to count the number of homozygous loci by individual, but this for-loop is taking really long to run. Is there a way to vectorize this operation? I thought of using a matrix, but I didn't find a way to make a vectorized marginal sum, so it didn't remove the need for the loop.


Thanks in advance,

AM


Ben Haller

Dec 16, 2025, 2:05:03 PM
to slim-d...@googlegroups.com
Hi Alfredo!

One question I have is why you're not just letting SLiM compute these fitness values for you...?  SLiM can model overdominance (e.g., heterozygote advantage) fine; as the doc for initializeMutationType() says:

The dominanceCoeff parameter supplies the default dominance coefficient for the mutation type, for all traits; 0.0 produces no dominance, 1.0 complete dominance, and values greater than 1.0, overdominance.

Letting SLiM do the work for you should be faster than anything you can achieve in your script.  But if what you need is a vectorized marginal sum, is colSums() and/or rowSums() not what you want?  Same function names as in R; if you know R you can usually find the corresponding Eidos function pretty easily.  :->
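For reference, the loop in the original late() event could probably be replaced along these lines with matrix() and rowSums(). This is only a sketch, not tested against a running model: it assumes, as the original code does, that every haplosome carries exactly 6 mutations (one per locus, returned in position order), that s is a defined constant, and that the comparison vector therefore has length 6 * N, grouped by individual. The variable name homMatrix is made up for illustration.

```eidos
late() {
	indv = p1.individuals;

	// Assumption: exactly 6 mutations per haplosome, in position order,
	// so element (6 * i + j) belongs to individual i, locus j.
	allele1 = indv.haploidGenome1.mutations;
	allele2 = indv.haploidGenome2.mutations;

	// Convert the logical comparison to 0/1 before building the matrix.
	homozygous = asInteger(allele1 == allele2);

	// Fill by row: one row per individual, one column per locus.
	homMatrix = matrix(homozygous, ncol=6, byrow=T);

	// Marginal sum across loci gives the homozygous count per individual.
	homCount = rowSums(homMatrix);

	indv.fitnessScaling = (1 - s) ^ homCount;
}
```

Filling with byrow=T keeps each individual's six comparisons together in one row; filling column-wise with nrow=6 and using colSums() instead should give the same counts. If any haplosome ever carries fewer or more than 6 mutations, the rows will be misaligned, so a length check on the two mutation vectors would be a sensible safeguard.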

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Alfredo Antonio Reis Marin

Dec 16, 2025, 3:59:22 PM
to slim-discuss
That's right, I was adapting a more complex simulation we did in which using the basic settings didn't work, so I just forgot about them haha. In any case, the rowSums and colSums functions might be useful elsewhere. Thanks!