I just pushed a change to master that changes the default behavior of simulation, so I wanted to make sure anyone using it gets warned.
While the hmm inference framework has always accounted both for different overall mutation rates at each position in each gene and for different rates to different bases at each of these positions (e.g. A->G vs A->T), the mutation model in the simulation previously only had the former. Now it has both. This involved updating to a much more recent version of bio++. Some more details here.
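To make the distinction concrete, here's a rough toy sketch in Python. It is not the actual partis or bio++ code: the function name mutate and the layout of position_rates and per_base_probs are made up for illustration. With per_base_probs=None every other base is equally likely once a position mutates (roughly the old behavior); with it set, the target base is drawn from per-position, per-base weights (roughly the new behavior).

```python
import random

BASES = "ACGT"

def mutate(seq, position_rates, per_base_probs=None, rng=None):
    """Toy model (hypothetical, not partis code).

    position_rates[i]      : probability that position i mutates at all
    per_base_probs[i][a][b]: relative weight of a->b at position i;
                             None means all three other bases are equally likely
    """
    rng = rng or random.Random()
    out = []
    for i, base in enumerate(seq):
        # first decide whether this position mutates (overall per-position rate)
        if rng.random() >= position_rates[i]:
            out.append(base)
            continue
        others = [b for b in BASES if b != base]
        if per_base_probs is None:
            # position-only model: uniform over the other three bases
            out.append(rng.choice(others))
        else:
            # per-base model: weight each target base separately
            weights = [per_base_probs[i][base][b] for b in others]
            out.append(rng.choices(others, weights=weights, k=1)[0])
    return "".join(out)

if __name__ == "__main__":
    rng = random.Random(1)
    rates = [0.5] * 8
    # assumed weights for illustration: A strongly prefers G over C or T
    probs = [{"A": {"C": 0.1, "G": 0.8, "T": 0.1}}] * 8
    print(mutate("AAAAAAAA", rates, rng=rng))
    print(mutate("AAAAAAAA", rates, per_base_probs=probs, rng=rng))
```

Which base a position mutates to decides which codon, and so which amino acid, you end up with, and that is the part the position-only model could not capture.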
This can make a big difference if you're looking at which amino acids positions get mutated to in the simulated sequences (it's the reason partis did poorly in metrics like polarity and gravy in the sumrep paper here), but otherwise it won't change much.
If you want the old behavior, just set --no-per-base-mutation. The old method is also, unsurprisingly, five or ten times faster.
I also switched to installing the extra dependencies for simulation in the docker file by default, so simulation should be a lot easier to run.