Hi Kamolphat!
Please attach the profile that you took; given the scale of the
model, you shouldn't expect us to take a profile of it ourselves.
In general, please provide the actual information that you have,
rather than just summarizing it verbally. :->
I'm puzzled that you call out three aspects of the profile, but
those three aspects barely total more than 50%, so I'd like to see
where the other ~50% is spent. You also say that for first() events
it is "probably for the life-long monogamy part", but why
"probably"? The profile should show you the exact lines of code
that are hotspots within your events; this should not be a matter of
speculation.
Is the profile taken with a smaller version of the model, or at your
desired full scale of 2 million individuals? You say that the
scaling of the model was not linear, so if you're trying to get it
to run at full scale, a profile taken at small scale will not be
informative regarding where the performance problems are. As the
number of individuals increases, some aspects of a model will
typically scale linearly (not a big problem) whereas others will
scale quadratically or worse (a big problem). So you need to figure
out what part of the model is not scaling well, and fix that. Your
profile also ought to be taken at a point when mutational density
has more or less equilibrated; profiling a model at the very start
of a burn-in period is very misleading. Ideally, you profile across
a full run of the model so that your profile captures the totality
of what is happening, because as a model runs the location of the
hotspots in it can change drastically.
You say that "the majority" of the memory usage is the tree-sequence
recording tables, and it sounds like tree-sequence simplification
isn't a major part of your runtime (you didn't mention it), so an
obvious place to start to get the memory usage down would be to
increase the frequency of simplification; did you try that?
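An easy way to experiment with that is to trigger simplification
yourself at whatever interval you like; a minimal sketch (the
interval of 100 is just a placeholder, to be tuned against the
runtime cost):

    // somewhere in the tick cycle: simplify every 100 ticks
    late() {
        if (community.tick % 100 == 0)
            sim.treeSeqSimplify();
    }

Alternatively, initializeTreeSeq() has parameters governing how often
automatic simplification happens, which you could tighten.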
Your justification for using pseudo-chromosomes instead of separate
chromosomes doesn't really make sense to me. You might try using
separate chromosomes, which might make the tree-sequence recording
information smaller since it would then not be struggling to record
the free recombination between your 34 chromosomes. With
pseudo-chromosomes, SLiM has to record all of the recombination
events between the chromosomes, for every individual it generates,
which is probably a huge amount of information and would totally
destroy the correlations between adjacent trees in the tree sequence.
I don't know for sure, but that seems likely to be bad.
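(Just so we're talking about the same thing: by "pseudo-chromosomes"
I'm assuming you mean the usual trick of one long genome with
rate-0.5 recombination between the chromosome spans, something like
this sketch with just two 1 Mb chromosomes laid end to end:

    // free recombination (rate 0.5) at the single boundary position
    rates = c(1e-8, 0.5, 1e-8);
    ends = c(999999, 1000000, 1999999);
    initializeRecombinationRate(rates, ends);

With 34 chromosomes that is 33 such boundaries, each one producing a
recorded breakpoint in about half of all gametes, every tick.)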
You say that you rescaled the genome to be 1000x smaller with a
1000x higher mutation and recombination rate. That might not end up
being a win, since you will have the same number of mutations
segregating, just at a higher density. It might actually slow SLiM
down, since it will have to deal with a lot more mutation
stacking, and since its mechanisms for taking advantage of shared
haplotype structure might not work as well. So you might try just
not doing that. :-> My guess is that it probably makes little
difference either way, given that the number of mutations remains
the same and the density is not that high even after rescaling; but
it'd be interesting to check and see. The more typical model
scaling that people do is described in section 5.5 of the SLiM
manual, but it probably doesn't work well at all for a spatial
model, and would have to be done extremely carefully in that
context, so I am not recommending it, just noting it. People don't
generally rescale the genome length in the way you're doing,
precisely because it doesn't usually end up making much difference, I
think. :->
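Just to spell out the arithmetic behind "the number of mutations
remains the same": the expected number of new mutations per gamete is
(mutation rate) x (genome length), so e.g. 1e-8 per site x 1e9 sites
= 1e-5 per site x 1e6 sites = 10 new mutations per gamete either way
(those numbers are purely illustrative, not taken from your model).
All the rescaling changes is how densely packed those mutations are,
which is exactly what drives the stacking and haplotype-sharing
issues I mentioned.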
I also note that although you've rescaled the genome to be 1000x
smaller, you still have 1 million sites in your model, all of which
are deemed to be potential QTL sites. From a nucleotide-level
perspective (QTNs) that perhaps makes sense; but you could shift to
a gene-based perspective and model just, say, 1000 sites as QTLs, if
that fits your biology. With the downscaled version of the model
that you provided, after 1000 ticks there are only 400 mutations
segregating, but that version of the model appears to have a
population size of only about 2500, so if you want to scale up to 2
million, you're going to have a *lot* of QTNs in your model. Most
of those will probably be extremely rare, but all of them will need
to be tracked by SLiM, recorded in the tree-sequence tables, etc.
Is that realistic, and do you really need that many QTNs, and that
level of genetic detail? How about a smaller number of causal
sites, with perhaps correspondingly larger effect sizes, or
something like that? Maybe you need to be simulating what
you're simulating, I don't know; but this seems worth thinking
about.
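If you did go the gene-based route, the initialize() side of it would
look something like this; everything here (the rates, the DFE, the
count of 1000, the stand-in population) is made up just to show the
shape of it, and your phenotype/fitness machinery would sit on top:

    initialize() {
        initializeMutationRate(1e-7);                        // made-up rate
        initializeMutationType("m2", 0.5, "n", 0.0, 0.1);    // made-up QTL DFE
        m2.convertToSubstitution = F;
        initializeGenomicElementType("g2", m2, 1.0);
        
        // ~1000 single-site QTLs scattered across a 1 Mb genome,
        // instead of every site being a potential QTN
        for (pos in sort(sample(0:999999, 1000)))
            initializeGenomicElement(g2, pos, pos);
        
        initializeRecombinationRate(1e-8);                   // made-up rate
    }
    
    // QTL mutations are neutral for direct fitness here; your
    // trait-based fitness calculation would replace this
    mutationEffect(m2) { return 1.0; }
    
    1 early() { sim.addSubpop("p1", 500); }                  // stand-in population
    100 late() { }                                           // stand-in end point

That caps the number of causal sites at 1000 no matter how large the
population gets.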
You define two spatial maps in your 1 first() event, but you name
them both "world", so the second map will replace the first one, I
think. Seems like a bug.
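The fix is presumably just to give the two maps distinct names, and
then fetch each one by its own name wherever it is used; something
like this (the names and value matrices here are placeholders, not
from your script):

    // two maps, two distinct names
    p1.defineSpatialMap("world", "xy", worldValues, interpolate=T);
    p1.defineSpatialMap("elevation", "xy", elevationValues, interpolate=T);
    
    // later, query whichever map you need by name
    v = p1.spatialMapValue("elevation", individual.spatialPosition);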
In your first() event you're calling i2.evaluate() after each
individual chooses a mate. That is going to be lethal,
performance-wise. You need a better mate-choice algorithm.
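The usual pattern is to evaluate the interaction once, at the top of
the tick, and then just query it per individual; something like this
sketch (your life-long monogamy bookkeeping would go where the
comment is):

    first() {
        // one evaluation for the whole tick, not one per mate choice
        i2.evaluate(p1);
        
        for (ind in p1.individuals)
        {
            // draw one candidate mate, weighted by interaction strength
            mate = i2.drawByStrength(ind, 1);
            if (size(mate) == 0)
                next;
            // ... pair ind with mate here (monogamy bookkeeping) ...
        }
    }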
Overall, for a model with 2 million individuals, two traits with
pleiotropy, 34 chromosomes, a 1000 Mb genome (rescaled, but with the
same number of mutations, and with every site potentially being
causal), spatial, running for 3000 ticks, with tree-sequence
recording but without a recapitated burn-in... it's going to take a
while, and use a lot of memory. Such is life. :-> So while you
may be able to speed the model up significantly by working with the
profile and adjusting some aspects of it, in the end, resign
yourself to starting your runs and then going and working on
something else for a while. Keep in mind that biologists who do
fieldwork often have to spend years collecting data for a single
experiment – and a lot of that is manual labor, not just letting a
computer do the work! :-> Redesigning your mate-choice
algorithm might help a lot; think of a way to design it so that it
requires only a single evaluation of the interaction per tick. And
rethinking how you're handling the genetics might help; I'm not sure
exactly what to propose, but it seems like you've got way more
causal sites than you probably need. Beyond that, it's hard to
guess further; you need to provide a high-quality profile of the
model running at full scale.
Cheers,
-B.
Benjamin C. Haller
Messer Lab
Cornell University