Hi A!
Interesting algorithm, I like the idea. It seems like you need to treat
the tag==0 individuals separately in your reproduction callback,
though, right? There might not be the same number of tag==0 males and
tag==0 females, especially after you sample (but even before you sample,
actually), so that means your sorted vectors won't actually correspond
one-to-one with each other correctly, doesn't it? Perhaps I'm missing
something, but I don't see how your algorithm handles that problem.
Once you've separated out the tag==0 individuals for separate treatment,
the remainder – the mated pairs – should just have one male and one
female for any given tag value, so in that context your question doesn't
matter any more, right? It would only matter for the tag==0
individuals (of which there will often be more than one male and more
than one female); but since (it seems to me) you need to treat them
separately anyway, in a different way, I think your question doesn't
matter there either, perhaps?
But to answer your question: no, Eidos makes no guarantee about the
randomness or otherwise of a sorted vector, in the case where the sort
key is identical between more than one element. There is no guarantee
as to whether the sort is "stable" or not, in computer science terms;
but even if it isn't stable, the order of elements with identical sort
keys will not be random in any reliable way. (They certainly aren't
deliberately randomized with draws from a random number generator; they
might be scrambled in some manner by the underlying algorithm, but that
is not reliably random and would probably exhibit sort-order artifacts.)
A few notes on your code, while I'm here:
- You do "which(p1.individuals.sex == "F")" to select the females, but
seem to use that only to count the number of females (and likewise for
the males); and then when you want to draw females you call
p1.sampleIndividuals() to do it. It would probably be much more
efficient to get the vector of females initially with "females =
p1.subsetIndividuals(sex='F')", and then scramble their order with
"fparents = sample(females, length(females), replace='F')".
- But since the next thing you do is sort by tag, the scrambling is
unnecessary anyway, if you exclude the tag==0 individuals from the
process first. (Which I think you need to do anyway, as I discussed
above, because there won't be an equal number of male and female tag==0
individuals, as far as I can see.)
- The way that you call "p1.individuals.tag" over and over is quite
inefficient; every time you do that, Eidos has to fetch p1, then fetch
every individual from it (p1.individuals), then fetch every tag value
from those individuals (p1.individuals.tag). If you can fetch that once
and cache it in a local variable, it's much more efficient. Of course
this only matters if you need to increase the performance of your code.
- You lean on "which()" more than I'd recommend for efficiency. For
example, "length(which(p1.individuals.tag == t))" to get the number of
individuals with tag t is better done as "sum(p1.individuals.tag == t)";
and "p1.individuals[which(p1.individuals.tag == t)]" is better done as
"p1.subsetIndividuals(tag=t)". (In fact, the count can be obtained with
"size(p1.subsetIndividuals(tag=t))" too, which might be even
better/faster for your particular algorithm since then you've already
got the subset of the individuals that you need in hand, after getting
the size.)
- Beyond the comments above, the unmatched parents algorithm can perhaps
be optimized further, if it's a bottleneck (profile in SLiMgui to find
out). One simple thing is that you should pass "preserveOrder=F" to the
call to unique(); that makes it much faster, and you don't need the
order preserved for your algorithm. That converts it from an O(N^2)
algorithm to an O(N log N) algorithm. But even faster, for large N,
would probably be to fetch p1.individuals.tag once and then use
tabulate() to determine which tag values have a count of 1; that is, in
effect, a radix sort and so would be O(N), I think. For large N if
there are many unmatched parents then the lookup of the particular
parents that are unmatched might still make things slow; I think you
could speed that part of the algorithm up using match() to give you the
indices of the individuals that have the unmatched tag values, making
that part of the algorithm, also, O(N). So if you plan to scale this
model up to large N, an approach like that would perhaps be important.
You could also use match() instead of sortBy() to find the matching male
mate for each tag != 0 female, again converting from an O(N log N)
algorithm to an O(N) algorithm. But if performance is not a problem for
you so far, or if profiling doesn't show these algorithms to be
bottlenecks, then don't worry about it. :->
It'd be nice to add a recipe to the manual demonstrating how to model
life-long monogamy like this; this comes up as a question from time to
time. I'm feeling inspired to do that by the approach you've taken
here, which has given me some ideas as you can see :->. Would it be
OK with you if I adapted your code into a recipe and added it to the
manual, with a credit to you for the idea? :->
Good luck, and happy modeling!
Cheers,
-B.
Benjamin C. Haller
Messer Lab
Cornell University
Kamolphat Atsawawaranunt wrote on 5/8/25 12:28 AM: