nonWF simulation

Bjarki Eldon

unread,

Mar 28, 2022, 10:46:29 AM3/28/22

to slim-discuss

Hi all,

I would like to try a very simple nonWF simulation using a highly skewed (heavy tailed) distribution

for the number of (potential) offspring ; when initialising I would like to specify a particular heavy-tailed offspring distribution, where there can be large families, but the population size is kept constant, so that after each round of producing offspring the total pool of offspring, which with high probability is larger than the population size (number of parents) in any given generation is cut down to match the number of parents.

my skript is below, I just wanted to check with you if you would see anything strange,

and in case you would find this useful - in particular the reproduction step in the skript below is the critical part

thanks in advance for any input

best regards

bjarki

// try a simple sim
// possibly useful functions
// function (float)pow(float x, float y)
// { return x^y;}
// probability of k juveniles, this is the probability mass function I use for producing

// number of offspring

// return( (pow( 1./k, a) - pow( 1./(k+1.), a))/( pow( .5, a) - pow( 1./(cutoff + 1.), a) ) );
// function (float)pxi( float a, float x, float m)
//    {
//    // take x from 2 to m
//    return (pow(1.0/x, a) - pow( 1.0/(x + 1.0), a))/( pow(0.5, a) - pow( 1.0/(m+1) , a));
// }
// OK, here begins the real code
initialize() {

// set non-Wright Fisher evolution
initializeSLiMModelType("nonWF");

defineConstant("N", 10000 );
defineConstant("K", 2*N);

//define population size - or number of parents

// in each generation

defineConstant("POPSIZE_", 2*N);

// set cutoff for numbers of juveniles -

// maximum number of (potential) offspring per family

defineConstant("CUTOFF_", 2*100*N);

// set alpha parameter for the offspring distribution
defineConstant("ALPHA_", 1.25);

// set overall mutation rate
initializeMutationRate(5.0e-7);

// m1 mutaton type negative
// 0.1: dominance coefficient 0.1
// g: selection coefficient random Gamma
// -0.03: mean of gamma
// 0.2: shape parameter
initializeMutationType("m1", 0.1, "g", -0.1, 0.2);
// initializeMutationType("m1", 0.5, "f", 0.0);

// m2 mutation type neutral
initializeMutationType("m2", 0.5, "f", 0.0);

// g1 genomic element; use m2 all muts
initializeGenomicElementType("g1", m2, 1.0);
// initializeGenomicElementType("g1", c(m1,m2), c(1, 1));

// uniform chrom of length 100 kb
// see chromosomesize.txt
initializeGenomicElement(g1, 0, 99999);

// uniform recombination along chrom
initializeRecombinationRate(1e-8);

// here I construct the offspring distribution, or

// the CDF (cumulative mass function) for sampling

// construct the probabilities for sampling juveniles
cdf = (1.0/(2.0:(CUTOFF_)))^ALPHA_ ;

cdf = cdf - ( (1.0/(3.0:(CUTOFF_ + 1.0)))^ALPHA_ );

cdf = cdf / sum(cdf);
cdf = c(0.0, 0.0, cdf);

// construct the cdf for sampling juveniles
for( i in 3:(CUTOFF_))
{ cdf[i] = cdf[i] + cdf[i-1]; }

// store the cdf - this CDF should be used in

// every generation for sampling number of offspring

defineConstant("CDF_", cdf);
// print(CDF_);
}

// this is the critical part - here I wanted to ask if this guarantees that

// I sample number of offspring from the CDF defined in the

// initialisation step

reproduction() {

parents = sample(p1.individuals, p1.individualCount);

for ( i in seq( 0, p1.individualCount -2, by=2))
{ parent1 = parents[i]; parent2 = parents[i+1];

// sample a random uniform and check where it lands in the CDF

u = runif(1); k = 2;
while( u > CDF_[k]){
// assert( CDF_[k] > 0)
// assert( CDF_[k] <= 1)
k = k + 1; }
// assert(k > 1);
for( j in seqLen(k)){
p1.addCrossed(parent1, parent2); }
}
// disable for the generation
self.active = 0;
}

// the rest I suppose is standard SLIM
1 early()
{
sim.addSubpop("p1", POPSIZE_);
}

early()
{
p1.fitnessScaling = K/p1.individualCount;
}

// run to given generation
500 late() {

// output a random sample of sequences
p1.outputMSSample(136);
}

// end

Ben Haller

unread,

Mar 30, 2022, 11:02:04 AM3/30/22

to slim-discuss

Hi! I'm not really sure what your script is doing, so I can't comment on whether or not it is correct. :-> If you aren't sure either, perhaps adding print()/catn() calls that print out what the code is doing would be helpful? Or more rigorously, logging a large number of draws and then checking that they conform to the distribution you want? In any case, this seems like more of a mathematical/algorithmic question than a SLiM question, no? Perhaps Peter will have a comment on your approach, he knows far more about this sort of thing than I do.

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Bjarki Eldon

unread,

Mar 30, 2022, 11:37:17 AM3/30/22

to Ben Haller, slim-discuss

Hi Ben

OK, what I want is a large variance in litter size, so following the example in Section 16.3 page 371 in the SLiM manual :

there the litter size is random and drawn from a Poisson, but instead of that I want to sample the litter size from a specific distribution

that I define, where litter size is a random number between 2 and the constant "CUTOFF_", and since this is a specific discrete probability

distribution I try to implement a sampling mechanism that generates a cumulative density function (CDF), and then use that to sample a litter size

by sampling a random uniform, it's a standard and efficient way of sampling from a discrete distribution with a finite number of values.

Then include a population size regulation as in the example in the manual, run this for some specified number of generations, and at the end

sample a random set of chromosomes.

I think the SLiM question would be: if one wants the litter size to be random, and according to a specific discrete distribution, how would one implement that in SLiM ?

best regards

bjarki

--
SLiM forward genetic simulation: http://messerlab.org/slim/
---
You received this message because you are subscribed to the Google Groups "slim-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to slim-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/slim-discuss/51aaa1e1-81ce-4f10-9bb1-5add80b8e3abn%40googlegroups.com.

Ben Haller

unread,

Mar 30, 2022, 12:29:24 PM3/30/22

to slim-discuss

Hi Bjarki,

> I think the SLiM question would be: if one wants the litter size to be random, and according to a specific discrete distribution, how would one implement that in SLiM ?

That is really not a SLiM question at all, but a question of mathematics and algorithmic design. The answer to "how would I do this in Eidos?" is "pretty much the same way you would do it in any programming language – whatever way that might be". I really can't comment on questions regarding algorithmic design that I know nothing about, sorry. As I wrote before, perhaps someone like Peter with more of a background in the mathematics of distributions, etc., will comment; but if you want to know whether your algorithm is correct, perhaps a different forum, such as a stackoverflow forum on mathematics or statistics, might be a better place to ask. I simply do not know the answer, because it is not a question about SLiM, in the end, even though the question arises for you in the context of writing a SLiM model.

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Peter Ralph

unread,

Mar 30, 2022, 5:13:50 PM3/30/22

to Ben Haller, slim-discuss

Hi, Bjarki! This looks good to me - I haven't checked the details of
your CDF, but this is a good way to do it. At first it seemed odd you
were using a just-run-once reprodcution() callback, since offspring
numbers are independent, but I see that this way everyone only
reproduces exactly once, so that makes sense. (I did wonder why you
are using 2N instead of n, as SLiM is diploid.)

Minor note for Ben: the while loop for finding the interval of the CDF
that the uniform falls in would be simpler (and, vectorizable) if
eidos had the `findInterval( )` function from R; then Bjarki could
draw a vector of everyone's offspring numbers just by doing
numOff = findInterval(runif(N), CDF_);

** peter

On Wed, Mar 30, 2022 at 5:29 PM 'Ben Haller' via slim-discuss

> To view this discussion on the web visit https://groups.google.com/d/msgid/slim-discuss/d4508566-60d4-4c1d-8493-f1377054671bn%40googlegroups.com.

Ben Haller

unread,

Mar 30, 2022, 6:07:38 PM3/30/22

to slim-discuss

Hi Peter. Interesting, thanks. I recall you suggested quite a while ago that I might add findInterval() to Eidos, just on general principles because it's a function you often find useful; but after a bit of discussion we agreed to wait until a use case arose where it would actually be useful for writing a SLiM model. I guess that use case has just arisen. :-> It can certainly be added, but I don't know when it would come out in a release. I'll open an issue for it.

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Ben Haller

unread,

Apr 11, 2022, 6:52:54 PM4/11/22

to slim-discuss

Just a quick followup on this matter of the findInterval() function. It has just been added to Eidos, but won't come out in a SLiM 3.x version; it will wait until SLiM 4 (which is now under development). If you're interested, you can see the relevant GitHub issue at https://github.com/MesserLab/SLiM/issues/303.

Cheers,
-B.

Benjamin C. Haller
Messer Lab
Cornell University

Reply all

Reply to author

Forward