Robert,
We always read about convergence, but most users misunderstand it. You want to sample from the equilibrium distribution, so the chain has to have converged before you sample; results will be crummy if convergence is only reached at the end of the run. Preferably you reach it during the burn-in.
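A toy illustration in plain Python (nothing to do with migrate itself; the chain and the numbers are made up) of why you want the approach to equilibrium to happen inside the burn-in:

import random

# A slowly mixing toy chain started far away from its equilibrium (mean ~0).
random.seed(1)
x, chain = 1000.0, []
for _ in range(2000):
    x = 0.99 * x + random.gauss(0.0, 1.0)
    chain.append(x)

# Averaging over everything drags the estimate toward the starting point;
# discarding the first half as burn-in gives a value near the true mean of 0.
burnin = 1000
print("mean over the whole run:     ", sum(chain) / len(chain))
print("mean after dropping burn-in: ", sum(chain[burnin:]) / len(chain[burnin:]))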
Uneven sample sizes may not converge quickly. "Non-convergence" is an unhelpful term and should not really be used, because we should distinguish between "not converging however long we run" and "not converged in x steps"; the first is difficult to prove, and I bet that almost all uneven sample sizes converge eventually. But think of an extreme thought experiment:
Two populations, one with 10000 samples and one with 2. Let's further assume that the migration rate is almost zero,
so that the two populations share only a single migration event. If we start with a random tree, changing one branch at a time, and have 19999+2 branches to choose from, then almost all tree changes will happen in the first population and only a few in the second ---> the population size estimate for pop 2 will be terrible. If you run this long enough it will work out fine, but in this case results would be much easier to get with a random 10 out of the 10000 plus the 2. Differences such as 20 and 10 should cause little problem in terms of convergence, but I guess that 100 and 10 will simply need longer runs.
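To put rough numbers on that thought experiment, here is the arithmetic in plain Python (just the figures from the example above, nothing specific to migrate):

# Fraction of single-branch tree updates that touch the tiny population
# when the branch to change is picked uniformly at random.
branches_pop1 = 19999   # branches tied to the 10000-sample population
branches_pop2 = 2       # branches tied to the 2-sample population
total = branches_pop1 + branches_pop2

frac_pop2 = branches_pop2 / total
print(f"fraction of updates hitting pop 2: {frac_pop2:.6f}")   # about 0.0001

# To accumulate, say, 10000 updates on pop 2's branches you need roughly
# this many updates overall:
print(f"total updates needed: {10000 / frac_pop2:,.0f}")        # about 100 million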
I guess that mutation=DATA will help.
For replication within a run use the
replication=YES:10
option {10 is an example -- any number > 1 is good}.
Replication DOES benefit from MPI. Replication has nothing to do with the short/long chains of the ML runs, where replication can be set either as complete replicates (like above) or as combining over the last (long) chain [I think that option is not as good as the full replicates].
I will expand my tutorial page on hardware issues (which I never finished) and give a few explanations.
In short: if you have a cluster with 100 nodes, a dataset with 10 loci, and you can use the full machine, I suggest
running 10 loci with 100 replicates. Use a long burn-in (such as 100000 steps or larger)
and sample relatively little (say 5000); the long burn-in is needed to achieve convergence, and then you sample from it.
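Roughly, that workload fills the machine like this (the sketch assumes one node acts as the MPI master and that a single locus-replicate combination is the unit of parallel work -- an assumption about the scheduling, not a description of migrate's internals):

import math

nodes = 100
loci = 10
replicates = 100

workers = nodes - 1            # one node assumed reserved for the master
units = loci * replicates      # 1000 independent locus-replicate runs
per_worker = math.ceil(units / workers)
print(f"{units} locus-replicate units on {workers} workers -> about {per_worker} each")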
[My largest trials with migrate were 10000 loci with 1 replicate, 1000 loci with 10 replicates, and 10 loci with 1000 replicates on our cluster; the largest number of nodes was 512 (runs on 1024 nodes failed, but I was not able to troubleshoot that because of problems with the hardware). I am currently tracing an issue with MPI and migrate on Windows that leads to crashes I cannot reproduce on a Mac. So for runs with many replicates and loci on large clusters, try a small (and short) example first to see whether migrate can finish, and then scale it up, because when it crashes on these machines it is very difficult to find out what is going on.]
Peter