Dear Alexey,
Two comments in addition to those of Minh and Rob:
> On 28 Jun 2020, at 20:43, Alexey Neverov <neva...@gmail.com> wrote:
>
> The problem is somewhere in the '-nt AUTO' option. I successfully processed this dataset when I used a fixed number of cores by specifying the '-nt 16' option. The results of the analysis are in the attachment.
With only 225 sites in the alignment, parallelisation will not work efficiently (that is, it will not save time) either.
Typically, batches of sites are distributed to different processors. This requires some extra time, e.g., for distributing the data and collecting the results. Hence, the batches have to be sufficiently large to save more time than this overhead costs.
We have seen people enforcing high numbers of threads (cores) and then ending up with longer running times than when using only 1 core :(
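As a rough illustration of this trade-off, here is a toy cost model in Python. It is only a sketch: the function name and the cost constants (cost_per_site, overhead_per_thread) are made up for illustration and are not measured from IQ-TREE.

```python
def estimated_runtime(n_sites, n_threads,
                      cost_per_site=1.0, overhead_per_thread=50.0):
    """Toy model: compute time shrinks with threads, overhead grows.

    Both cost constants are hypothetical, in arbitrary time units.
    """
    # Likelihood work is split evenly over the threads.
    compute = cost_per_site * n_sites / n_threads
    # Each extra thread adds a fixed cost for distributing data
    # and collecting results.
    overhead = overhead_per_thread * (n_threads - 1)
    return compute + overhead


if __name__ == "__main__":
    for n_sites in (225, 100_000):
        for n_threads in (1, 2, 4, 8, 16):
            t = estimated_runtime(n_sites, n_threads)
            print(f"sites={n_sites:>6} threads={n_threads:>2} time={t:.1f}")
```

Under these assumed constants, 16 threads are slower than 1 thread for 225 sites, while the long alignment still profits from 16 threads.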
> You mentioned that the number of sites in the alignment might not be sufficient to reliably estimate model parameters. My alignment contains 4350 sequences. How can I estimate the reliability of the obtained results?
> The log file contains the warning:
Just one comment to think about. Assume all your sites were informative for resolving parts of the tree. Furthermore, assume that each site could resolve one split (= branch) in the tree (something you typically cannot achieve with real data). Then your dataset would only be able to resolve 225 out of the 4347 branches in your final tree. All other branches would be resolved more or less randomly and might differ from run to run, because there is no information in the data.
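The arithmetic behind these numbers, as a small Python sketch. It assumes a fully resolved unrooted tree, which has n - 3 internal branches for n sequences, and the (optimistic) rule of one resolved branch per site; the function name is only illustrative.

```python
def internal_branches(n_taxa):
    """Internal branches (splits) of a fully resolved unrooted tree."""
    return n_taxa - 3


n_taxa = 4350   # sequences in the alignment
n_sites = 225   # alignment columns

branches = internal_branches(n_taxa)
# Optimistic upper bound: every site resolves exactly one branch.
fraction = n_sites / branches

print(f"{branches} internal branches, at most {fraction:.1%} resolvable")
```

So even in this best case only about 5% of the internal branches could be supported by the data.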
The only way to resolve a higher percentage of branches reliably is either to add more sequence data to each taxon, e.g. more genes, or to reduce the number of sequences. The first step here is typically to remove identical sequences. IQ-TREE already does this automatically and re-adds them at the end, because it is clear in what region of the tree they would end up; however, there is no way to determine their branching order in the tree.
And as stressed by the others: with such short sequences you should not use complex models. You do not have enough information in the data for this.
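To see how many sequences remain after removing identical ones, something like the following Python sketch could be used. It is not how IQ-TREE implements this step; the function name and the toy data are hypothetical, and it expects a dict mapping sequence names to aligned sequences.

```python
from collections import defaultdict


def collapse_identical(sequences):
    """Keep one representative per group of identical sequences.

    Returns (kept, duplicates): kept maps representative name to
    sequence, duplicates maps representative name to the names of
    the removed identical sequences.
    """
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq].append(name)
    # The first name seen in each group acts as the representative.
    kept = {names[0]: seq for seq, names in groups.items()}
    duplicates = {names[0]: names[1:]
                  for names in groups.values() if len(names) > 1}
    return kept, duplicates


if __name__ == "__main__":
    toy = {"A": "ACGT", "B": "ACGT", "C": "ACGA"}  # made-up example data
    kept, duplicates = collapse_identical(toy)
    print(len(toy), "sequences,", len(kept), "unique")
    print("removed as identical:", duplicates)
```

The removed sequences can later be attached next to their representative, but their order within that group remains arbitrary.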
Best regards,
Heiko
> <atp6.results.zip>
-----------------------------------------------------------------------------
Heiko Schmidt
Center for Integrative Bioinformatics Vienna (CIBIV)
University of Vienna / Max Perutz Labs
http://www.cibiv.at/
-----------------------------------------------------------------------------