Dear IQ-TREE community,
I am an MSc student working on an insect phylogeny of about 200 species using 8 genes, including mitochondrial and nuclear protein-coding genes (PCGs) and rRNA. The concatenated alignment has been trimmed with Gblocks and is thus reduced to about 4 kbp. I am observing some degree of saturation using DAMBE and a lot of heterogeneity in base composition using AliGROOVE.
While testing different initial partitioning schemes I observed that they strongly influence both the topology and the branch lengths of the tree.
The initial partitioning schemes I have been using are the ones I have seen most commonly used:
(a) all genes separately, resulting in 8 initial partitions.
(b) all codon positions separately + rRNA, resulting in 4 initial partitions.
(c) each codon position of each PCG + each rRNA separately, resulting in 28 partitions.
(d) Moreover, I also tested the GHOST model with 4 classes.
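To make schemes (a) to (c) concrete, here is a minimal Python sketch of how such initial partitions could be written as a NEXUS sets block, which IQ-TREE accepts as a partition file. The gene names and coordinates below are hypothetical placeholders, not my real data, and the sketch assumes that each PCG starts at codon position 1 and that the trimming kept the reading frame intact.

```python
# Hypothetical gene layout: (name, start, end, is_protein_coding).
# Names and coordinates are placeholders for illustration only.
GENES = [
    ("geneA", 1, 600, True),
    ("geneB", 601, 1200, True),
    ("rrnA", 1201, 1700, False),
]


def scheme_by_gene(genes):
    """Scheme (a): one charset per gene."""
    return [(name, f"{start}-{end}") for name, start, end, _ in genes]


def scheme_by_codon_pooled(genes):
    """Scheme (b): codon positions pooled across PCGs, plus one rRNA block."""
    pools = {"pos1": [], "pos2": [], "pos3": [], "rRNA": []}
    for _, start, end, coding in genes:
        if coding:
            for pos in range(3):
                # "\3" means every third site, starting at start + pos.
                pools[f"pos{pos + 1}"].append(f"{start + pos}-{end}\\3")
        else:
            pools["rRNA"].append(f"{start}-{end}")
    return [(label, " ".join(r)) for label, r in pools.items() if r]


def scheme_by_codon_per_gene(genes):
    """Scheme (c): each codon position of each PCG, plus each rRNA gene."""
    charsets = []
    for name, start, end, coding in genes:
        if coding:
            for pos in range(3):
                charsets.append(
                    (f"{name}_pos{pos + 1}", f"{start + pos}-{end}\\3")
                )
        else:
            charsets.append((name, f"{start}-{end}"))
    return charsets


def to_nexus(charsets):
    """Format a list of (name, ranges) pairs as a NEXUS sets block."""
    lines = ["#nexus", "begin sets;"]
    lines += [f"    charset {name} = {ranges};" for name, ranges in charsets]
    lines.append("end;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_nexus(scheme_by_codon_per_gene(GENES)))
```

Each block would be saved to its own file and given to IQ-TREE as the partition file for the corresponding run.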
My question is: how can I decide which is the "correct" initial partitioning scheme (if such a thing exists)? So far I have compared the average nodal support (UFBoot) of each phylogeny and used likelihood mapping to assess how clearly each analysis resolves the quartets.
Does this make sense? Can I compare the likelihood values of the trees, and is there some other test I can use to settle on a single topology?
Thank you all in advance for the support,
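By "average nodal support" I mean roughly what the following Python sketch computes. It assumes the tree is a Newick string in which each internal node carries a single numeric support label (for example, UFBoot only), and the example tree is made up for illustration. Labels that combine two values, such as SH-aLRT/UFBoot, would need different parsing.

```python
import re


def mean_support(newick):
    """Return the mean of the numeric internal-node labels in a Newick string.

    Assumes one numeric support value directly after each closing
    parenthesis, e.g. ")95:0.05". Nodes without a label are skipped.
    """
    pattern = r"\)(\d+(?:\.\d+)?)(?=[:,);])"
    values = [float(v) for v in re.findall(pattern, newick)]
    if not values:
        raise ValueError("no support values found in tree")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up example tree with two supported internal nodes (95 and 80).
    example = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.3)80:0.05,E:0.4);"
    print(mean_support(example))
```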
Jacopo
PS: does it make sense to use concordance factors with only 8 genes?