Output trees with missing taxa in partitioned analyses

Diego F. Morales-Briones

May 23, 2019, 5:18:41 PM

I'm running analyses with a partition scheme (-q) and I noticed that in several cases the output trees have fewer taxa than the original alignment. I checked the log and this happens when some species fail the chi2 test. The number of taxa removed is equal to the number of sequences failing the test, but the missing sequences are not necessarily the same ones that failed the test. I thought that IQ-TREE does not remove those sequences, so I'm confused about why those sequences are missing.

The command that I used was for example:

iqtree -m TESTMERGE -s cluster9987_1rr.NT.aln-gb -nt 2 -bb 1000 -wbt -st DNA -q cluster9987_1rr.NT.aln-gb.partition

and the partition file looks like this:

DNA, codon1andcodon2 = 1-393\3, 2-393\3
DNA, codon3 = 3-393\3


Minh Bui

May 27, 2019, 8:39:43 PM
to IQ-TREE, Diego F. Morales-Briones
Hi Diego,

IQ-TREE internally removes identical sequences, but then adds them back to the final tree at the end, in random order but with the identical sequences clustered together. So it has nothing to do with missing sequences, and it’s weird that some taxa are removed from the final tree. Can you send me the input files for further inspection?


Minh Bui

Jun 2, 2019, 6:05:56 PM
to Diego F. Morales-Briones, IQ-TREE
Hi Diego,

Now I see the problem: it’s due to missing data. Your alignment has 11 sequences, but two of them (TecpeSFB_Tecticornia_pergranulata, Amhypgen_Amaranthus_hyponchondriacus) consist entirely of missing data within two input partitions and are thus removed, as they don’t contribute any phylogenetic signal. As a result, the final tree has only 9 taxa.

If you, however, insist on keeping these sequences, then use the option -keep_empty_seq. However, this is not recommended, as such sequences can “plug” into any position in the tree and distort support values.
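
To check up front which sequences are empty in which partition, something like the following Python sketch may help. It is not IQ-TREE's own code. It assumes the alignment is already loaded as a dict of name to aligned sequence, that partition lines follow the "DNA, name = ranges" layout shown earlier in this thread, and that the characters "-", "?", "N" and "n" count as missing data (that set is an assumption, adjust it to your data). The function names and the toy alignment are hypothetical.

```python
# Assumption: characters treated as missing data for DNA.
MISSING = set("-?Nn")


def parse_sites(spec):
    """Expand a range list such as r'1-393\3, 2-393\3' into sorted 0-based columns."""
    cols = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        step = 1
        if "\\" in part:
            # 'start-end\step' means every step-th site starting at 'start'.
            part, step_text = part.split("\\")
            step = int(step_text)
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
        else:
            start = end = int(part)
        # Partition files are 1-based and inclusive; Python ranges are 0-based.
        cols.update(range(start - 1, end, step))
    return sorted(cols)


def parse_partition_line(line):
    """Split a line like 'DNA, codon3 = 3-393\\3' into (name, columns)."""
    left, spec = line.split("=", 1)
    name = left.split(",", 1)[1].strip()
    return name, parse_sites(spec)


def empty_sequences(seqs, cols, missing=MISSING):
    """Return names of sequences that contain only missing characters in cols."""
    return [
        name
        for name, seq in seqs.items()
        if all(seq[c] in missing for c in cols if c < len(seq))
    ]


if __name__ == "__main__":
    # Hypothetical toy alignment, 6 sites long.
    toy = {"taxonA": "ACGTAC", "taxonB": "--G--C", "taxonC": "------"}
    for line in (r"DNA, codon1andcodon2 = 1-6\3, 2-6\3", r"DNA, codon3 = 3-6\3"):
        name, cols = parse_partition_line(line)
        print(name, empty_sequences(toy, cols))
```

Any sequence reported for a partition has no data there, so it is a candidate for being dropped unless -keep_empty_seq is given.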

Hope that helps,

On 1 Jun 2019, at 12:05 am, Diego F. Morales-Briones <dfmor...@gmail.com> wrote:

Hi Minh,

The input alignment has 11 taxa, so actually you ran into the same problem as me.

Thanks for your help,

On 31 May 2019, at 08:59, Minh Bui <minh...@univie.ac.at> wrote:

Hi Diego,

I just ran your data set, and the output .treefile has all 9 taxa… so nothing went wrong here, and I don’t know what happened in your case.


On 28 May 2019, at 11:31 am, Diego F. Morales-Briones <dfmor...@gmail.com> wrote:

Hi Minh,

Here are the files and the log. It seems that it has to do with the partition, because when running the same data without a partition the final tree has all taxa.

The command I used was 'iqtree -m TESTMERGE -s cluster9987_1rr.NT.aln-gb -nt 2 -bb 1000 -wbt -st DNA -q cluster9987_1rr.NT.aln-gb.partition'


