Hi Darren!
Yes, some form of this is commonly done. You can save and load the
simulation state in both tree-sequence recording models and regular
(non-treeseq) models in SLiM; see, e.g., sections 9.2 and 18.3 of the
SLiM manual.
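For concreteness, here is a minimal sketch of the two save calls (the
tick, 2000 here, and the file paths are placeholders; the manual
sections above give full recipes):

   // in a tree-sequence recording model (initializeTreeSeq() called):
   2000 late() { sim.treeSeqOutput("burnin.trees"); }

   // in a non-treeseq model, the text-based equivalent:
   2000 late() { sim.outputFull("burnin.txt"); }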
If you really want multiple experiments to start from the exact same
initial conditions, then this is very simple: burn in once, save the
state, and then load that state and simulate onward for each experiment.
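In code, each experiment run would then begin with something like this
(assuming a burn-in saved as above; note that readFromPopulationFile()
resets the tick counter to the value saved in the file, so later events
should be scheduled accordingly):

   1 late() {
      sim.readFromPopulationFile("burnin.trees");   // or "burnin.txt"
      // reseed so each experiment run follows its own random
      // trajectory onward from the shared starting state (this
      // reseeding line is the pattern used in the manual's recipes)
      setSeed(rdunif(1, 0, asInteger(2^62) - 1));
   }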
More often one wants to start from a model state generated
stochastically by the same burn-in procedure, but not from the exact
same initial conditions (i.e., exactly which individuals carry which
mutations, at which frequencies, and so forth). In this case,
simply burning in once and reloading that state for each experiment
could be problematic; it could result in pseudoreplication issues
because the experiments are not really independent. For example,
suppose your burn-in is non-neutral, and at the end of the burn-in there
is a beneficial mutation captured mid-sweep. Every experiment run from
the burn-in would have the same sweep, in the same location etc., and
that would be likely to bias the results of the experiments to be more
similar to each other than would be expected from fully independent
simulation runs. This is obvious with a mutation captured mid-sweep,
but it could be a (probably smaller) issue for a neutral burn-in too,
depending upon what exactly you're doing: your research questions and
the analyses you run on your simulation results. So it's something
to think about. Depending on how much this worries you, you might need
to do completely independent burn-ins, or you might decide the
pseudoreplication issue is not a concern for you and so you can share a
single burn-in; or there are intermediate options like:
- do one shared burn-in for every ten experiment runs, reducing the
compute for your burn-ins by 10x while still having lots of independence
between your runs (and you could even analyze whether runs sharing a
burn-in produce results that are more similar than runs that don't,
which would tell you whether it matters)
- do a single long burn-in run, saving off new states periodically to
use as initial states for different experiments (see the sketch after
this list); if you decide you should burn in for at least 10N
generations, for example, where N is the population size, then you might
do a single burn-in run where you save off state at 10N, 11N, 12N, ...
xN, where x suffices to give each of your experiments a different
starting state. This sort of procedure wouldn't completely remove the
possibility of pseudoreplication issues, but it ought to mitigate them
considerably, while still cutting your compute time for burn-in by about
10x.
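A sketch of that periodic-save idea (SLiM 4 syntax; N would be defined
as a constant in your initialize() callback, and the file naming is just
a placeholder):

   // during one long burn-in run, snapshot every N ticks from 10N on
   late() {
      if (community.tick >= 10 * N & community.tick % N == 0)
         sim.outputFull("burnin_" + community.tick + ".txt");
   }

Each experiment run then loads a different snapshot with
readFromPopulationFile(), as shown earlier.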
Lots of ideas in this area, lots of people doing different things. I am
not aware of any comprehensive study of the pros and cons of different
approaches, or of how much of a problem the pseudoreplication issue is
for different test statistics and such. That'd probably be a worthwhile
thing for somebody to do, similar to the recent studies looking at the
pros and cons of model rescaling techniques!
Cheers,
-B.
Benjamin C. Haller
Messer Lab
Cornell University
Darren Li wrote on 11/10/25 4:58 PM: