Hi Adrian,
Thanks for the question! May I ask whether you ran into a memory problem when running on the whole alignment, but got no error after splitting it into several smaller alignments?
If so, a trivial but always useful solution is to find a computer with more memory.
Alternatively, it would be good to use the approximate likelihood method (usedata = 3 to compute the gradient and Hessian, then usedata = 2 for the dating run). With this approach, you can split your data into a few partitions according to some criterion. Then run codeml separately on each partition to get its gradient and Hessian, merge the results from all partitions into a single in.BV file, and run the subsequent mcmctree analysis for dating. So the trick is to split the whole alignment into partitions, since running codeml on each "small" partition lowers the memory requirement.
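In case it helps, here is a minimal Python sketch of the merging step only. It assumes you have already run codeml on each partition and saved each partition's gradient/Hessian output to its own file. The function name merge_bv and the file names in the usage comment are hypothetical, so adapt them to your setup. As far as I know, the blocks in in.BV should appear in the same order as the partitions in your alignment file, so pass the files in that order.

```python
from pathlib import Path


def merge_bv(partition_files, out_path="in.BV"):
    """Concatenate per-partition gradient/Hessian files, in the given order.

    partition_files: paths to the per-partition codeml outputs (hypothetical
    names; use whatever files your own codeml runs produced).
    Returns the number of blocks written.
    """
    blocks = []
    for name in partition_files:
        # Drop leading/trailing blank lines so blocks are separated uniformly.
        text = Path(name).read_text().strip("\n")
        if not text.strip():
            raise ValueError(f"{name} is empty; did codeml finish for this partition?")
        blocks.append(text)
    # One blank line between blocks, and a final newline at the end of the file.
    Path(out_path).write_text("\n\n".join(blocks) + "\n")
    return len(blocks)


# Example usage (hypothetical paths, listed explicitly to keep partition order):
# merge_bv(["part1/rst2", "part2/rst2", "part3/rst2"], "in.BV")
```

Listing the files explicitly, instead of relying on a sorted glob, avoids part10 being placed before part2.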
For more info, you can have a look at the nice tutorial written by the development team.
Best,
sishuo