Hello Justin,
> 1. I have a 90% completeness matrix that has the following patterns:
> Alignment sites / patterns: 1878880 / 350889
> Gaps: 11.50 %
> Invariant sites: 89.28 %
>
> I am not sure why the invariant sites is so high, as when I ran this using RAxML v8 (which timed out
> on the cluster), it was ~ 10-15%.
IIRC, RAxML v8 does not report the proportion of invariant sites, so you might be comparing two
different values.
> Nonetheless, I am running the following code:
>
> raxml-ng --all --msa mafft-nexus-edge-trimmed-clean-most-90p.phylip --model GTR+G --prefix
> Stegonotus-90 --seed 2 --threads 2 --bs-metric fbp,tbe
Maybe consider using more threads/cores since your dataset is pretty large.
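If you are unsure how many threads make sense, here is a quick sketch. As far as I remember, '--parse' only checks the alignment and writes a recommended number of threads (plus estimated memory requirements) to the log, without starting a tree search. The "-parse" suffix on the prefix is just a placeholder of mine so the output does not clash with your running analysis.

```shell
# Parse the alignment only; look for the recommended thread count in the log output.
raxml-ng --parse --msa mafft-nexus-edge-trimmed-clean-most-90p.phylip --model GTR+G \
    --prefix Stegonotus-90-parse
```

You could then pass that number via --threads, provided your cluster node actually has that many cores.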
> My question for this is: currently it is running for 2 days and has gone through 95 bootstrap
> iterations. I assume it will time out at the 72 hour time limit. What is the best way to go about my
> analysis? Should I just check for convergence, and if it hasn't, run it again using the checkpoint?
Yes, that's what I would do.
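A minimal sketch of both steps, using the file names from your message. Resuming should just be a matter of re-running the identical command with the same prefix, since raxml-ng picks up the .ckp file on its own (adding --redo would instead start from scratch). For the convergence check, I am assuming the .bootstraps.TMP file holds plain Newick trees, one per line, which you should verify first. The copy name and the "-bsconv" prefix are placeholders.

```shell
# 1) Resume: same command, same prefix; the existing checkpoint is detected automatically.
raxml-ng --all --msa mafft-nexus-edge-trimmed-clean-most-90p.phylip --model GTR+G \
    --prefix Stegonotus-90 --seed 2 --threads 2 --bs-metric fbp,tbe

# 2) Convergence check on a copy of the replicates written so far
#    (assumption: the TMP file is a plain Newick tree list).
cp Alignment-90.raxml.bootstraps.TMP bootstraps-so-far.nwk
raxml-ng --bsconverge --bs-trees bootstraps-so-far.nwk --prefix Alignment-90-bsconv \
    --seed 2 --threads 2 --bs-cutoff 0.03
```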
> Is there a point at which I should just stop running it altogether?
You can also manually limit the number of bootstrap replicates, e.g. --bs-trees 100.
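For a rough idea of whether such a cap fits into your 72-hour limit, here is a back-of-the-envelope sketch. It assumes every replicate takes about the same time, and it counts the full 48 hours even though '--all' runs the ML tree search before bootstrapping, so the estimate should err on the low side. The variable names are only illustrative.

```shell
# Numbers from your message: 95 replicates after ~48 h, 72 h wall-time limit.
reps_done=95
elapsed_h=48
limit_h=72

# Linear extrapolation with integer arithmetic (rounds down).
max_reps=$(( reps_done * limit_h / elapsed_h ))
echo "replicates expected within ${limit_h} h: ${max_reps}"
# prints: replicates expected within 72 h: 142
```

By that estimate, --bs-trees 100 appended to your command above should finish within the time limit.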
> Additionally, the following
> files were created:
> Alignment-90.raxml.bootstraps.TMP
> Alignment-90.raxml.ckp
> Alignment-90.raxml.lastTree.TMP
> Alignment-90.raxml.log
> Alignment-90.raxml.mlTrees.TMP
> Alignment-90.raxml.rba
> Alignment-90.raxml.reduced.phy
> Alignment-90.raxml.startTree
>
> Where are the files that are for the Alignment-90.supportFBP and Alignment-90.supportTBE? Do those
> get created only when there is convergence?
Yes, those files are created when bootstrapping has converged or reached the maximum number of
replicates (see above).
> If there is a better way to go about my analyses, please let me know (i.e., running commands
> separately rather than the '--all' flag).
This is just a question of convenience vs. feasibility, i.e. if you can run '--all' within a
reasonable timeframe (which seems to be the case), then it is the best option.
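In case you do need to split things up later (e.g. to submit the pieces as separate cluster jobs), the three stages that '--all' chains together can be sketched roughly as follows. The prefixes "S90-search", "S90-bs" and "S90-support" are placeholders of mine, and the replicate count is just the example value from above.

```shell
MSA=mafft-nexus-edge-trimmed-clean-most-90p.phylip

# 1) ML tree search
raxml-ng --search --msa "$MSA" --model GTR+G --prefix S90-search --seed 2 --threads 2

# 2) Bootstrapping (can run as its own job)
raxml-ng --bootstrap --msa "$MSA" --model GTR+G --prefix S90-bs --seed 2 --threads 2 \
    --bs-trees 100

# 3) Map bootstrap support (FBP and TBE) onto the best ML tree
raxml-ng --support --tree S90-search.raxml.bestTree --bs-trees S90-bs.raxml.bootstraps \
    --prefix S90-support --threads 2 --bs-metric fbp,tbe
```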
> These datasets are not partitioned, so if I should change
> that, any recommendations on how to go about that would be appreciated. This is my first time
> running a genomic dataset in RAxML.
Maybe check other UCE-based phylogenetic studies to get an idea of whether partitioning is
considered useful for this type of data. Also, given the high proportion of invariant sites, you
might want to try the "GTR+G+I" model.
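Both options only change what you pass to --model, so a sketch could look like this. The locus names and coordinates in the partition file are made up purely to show the format, so you would have to replace them with your real UCE boundaries, and the prefixes are placeholders as well.

```shell
MSA=mafft-nexus-edge-trimmed-clean-most-90p.phylip

# Unpartitioned run with an additional proportion-of-invariant-sites parameter
raxml-ng --search --msa "$MSA" --model GTR+G+I --prefix S90-GTRGI --seed 2 --threads 2

# Partitioned run: one "model, name = range" line per locus (hypothetical boundaries!)
cat > uce-partitions.txt <<'EOF'
GTR+G, uce_1 = 1-500
GTR+G, uce_2 = 501-1120
EOF
raxml-ng --search --msa "$MSA" --model uce-partitions.txt --prefix S90-part --seed 2 --threads 2
```

If I remember correctly, the log of each run reports AIC/BIC scores at the end, which you could use to compare the alternatives.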
Best,
Alexey