GHOST questions


张东

May 23, 2019, 10:12:23 AM
to IQ-TREE
Dear Developers,

I am interested in the GHOST model, and I have several questions:

1. If ModelFinder doesn't choose GHOST as the best-fit model, can I still use GHOST directly to conduct the comparative topology analysis?
2. When I use H5 with my working dataset, there are five trees in the .treefile (corresponding to the five classes). I compared these trees and found that they all have identical topologies. Is that always the case, or are there datasets that may produce different topologies among these five trees (classes)?
3. We chose the GHOST model because it can account for rate variation across sites and lineages, and we are primarily interested in attenuating long-branch attraction (LBA) topological artefacts. We used iTOL to parse the tree file automatically, so we did not realize that it contains multiple trees, and we relied on the tree that iTOL parsed. Was this a methodologically sound approach, given that all five of our topologies were identical?

Best,

Dong

Stephen Crotty

May 25, 2019, 4:28:31 PM
to iqt...@googlegroups.com
Hi Dong,

1. Yes, you can of course choose to use the GHOST model if you wish, even if ModelFinder does not select it. ModelFinder uses BIC (by default) to choose the best-fit model. You may have reasons for not wanting to rely solely on BIC (or AIC or AICc) to choose your model, but you should be able to justify your decision.

2. All the trees have the same topology when we fit a GHOST model; only the branch lengths differ between classes (the ‘ST’ in GHOST stands for Single Topology).

3. I think I need more detail on exactly what you are asking here. I gather you fit a GHOST model to your dataset and obtained a .treefile with 5 trees (of identical topology). Did you then pass this .treefile to some other program, or did you pass it back to IQ-TREE to carry out tree topology tests? If the latter, then of course there is no point in doing this, as the topologies within the .treefile are identical by definition.
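
For anyone who wants to confirm this on their own output, here is a minimal Python sketch of one way to check that several Newick trees share a topology, by comparing their sets of bipartitions while ignoring branch lengths. The function names and the three example trees are hypothetical (they are not IQ-TREE output or part of any IQ-TREE API), and the parser assumes plain Newick without quoted labels or bracketed comments.

```python
import re

def split_trees(text):
    """Split a multi-tree Newick string into one string per tree."""
    return [t.strip() + ";" for t in text.split(";") if t.strip()]

def bipartitions(newick):
    """Return the non-trivial splits of one tree, ignoring branch lengths."""
    body = newick.strip().rstrip(";").strip()
    body = re.sub(r":[^,();]+", "", body)        # drop ":length" parts
    tokens = re.findall(r"[(),]|[^(),]+", body)
    stack = []        # one leaf set per currently open clade
    clades = []       # leaf sets of all closed internal nodes
    leaves = set()
    prev = None
    for tok in tokens:
        if not tok.strip():
            continue                              # skip stray whitespace
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            done = stack.pop()
            clades.append(frozenset(done))
            if stack:
                stack[-1] |= done                 # pass leaves up to the parent
        elif tok != "," and prev != ")":
            name = tok.strip()                    # a leaf label
            leaves.add(name)
            stack[-1].add(name)
        # a label directly after ")" is a node label or support value: ignored
        prev = tok
    ref = min(leaves)  # fixed reference leaf makes each split canonical
    splits = set()
    for clade in clades:
        side = clade if ref not in clade else frozenset(leaves - clade)
        if 1 < len(side) < len(leaves) - 1:       # keep non-trivial splits only
            splits.add(side)
    return splits

def all_same_topology(text):
    """True if every tree in the text has the same unrooted topology."""
    splits = [bipartitions(t) for t in split_trees(text)]
    return all(s == splits[0] for s in splits)

# Made-up example: same topology, different branch lengths and taxon order
class_trees = (
    "((A:0.1,B:0.2):0.05,C:0.3,(D:0.1,E:0.4):0.2);\n"
    "((B:0.5,A:0.6):0.01,(E:0.2,D:0.9):0.3,C:0.1);\n"
)
# A made-up tree with a different topology, for contrast
other_tree = "((A:0.1,C:0.2):0.05,B:0.3,(D:0.1,E:0.4):0.2);\n"

print(all_same_topology(class_trees))               # True
print(all_same_topology(class_trees + other_tree))  # False
```

To try it on real output, one would read the contents of the .treefile into a string and pass that to `all_same_topology`. A viewer that loads only one tree from such a file would then show the same topology as any of the others, differing only in branch lengths.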

I hope that helps a bit, thanks for your interest in using GHOST and IQ-TREE,

Stephen


Stephen Crotty, PhD
Centre for Integrative Bioinformatics Vienna (CIBIV)
Campus Vienna Biocenter 5, VBC 5, Ebene 1
A-1030 Vienna, Austria



--
You received this message because you are subscribed to the Google Groups "IQ-TREE" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iqtree+un...@googlegroups.com.
To post to this group, send email to iqt...@googlegroups.com.
Visit this group at https://groups.google.com/group/iqtree.
To view this discussion on the web visit https://groups.google.com/d/msgid/iqtree/1c7eabed-c4c4-4b94-9384-cc5a4959ccb7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

张东

May 26, 2019, 3:08:34 AM
to IQ-TREE
Dear Stephen,

We noticed that our *.contree file contains a single tree (topology-wise) with four different sets of branch lengths.
Now we are wondering how to interpret that result. Does it imply that all four trees (branch length sets) are equally probable? Or have we missed an additional analysis that would help us resolve this? Other methods produce only one set of branch lengths.
As regards the parsing problem that I mentioned, I used the iTOL web page (https://itol.embl.de) to parse the tree file, but iTOL did not report that there are multiple trees. Instead, it appears that it randomly parsed one of the trees without a warning. Now I am wondering whether this was a major methodological error on my side.

Best,

Dong 

Stephen Crotty

May 28, 2019, 5:35:20 AM
to iqt...@googlegroups.com
Hi Dong,

The four different sets of branch lengths arise because you used a GHOST model with four classes. So essentially you have one tree topology but four different sets of branch lengths. These are not equally probable: each class has its own weight, and the weights sum to one. You can find these weights in the .iqtree file (I think they are called "Heterotachy weights").
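
In symbols, this can be sketched as follows, assuming the usual mixture-model formulation. The notation is illustrative and is not the labelling used in the .iqtree file: m is the number of classes, w_j is the weight of class j, T is the single shared topology, \lambda_j is the set of branch lengths for class j, and D_i is the alignment column at site i.

```latex
\sum_{j=1}^{m} w_j = 1, \qquad w_j \ge 0,
\qquad
L_i = \sum_{j=1}^{m} w_j \, P\left(D_i \mid T, \lambda_j\right)
```

Under this reading, each site's likelihood is a weighted sum over the classes, so no single set of branch lengths is "the" tree and the weights indicate how much each class contributes.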

With regard to parsing the tree file, if you are only interested in the topology then it doesn’t matter, as all the topologies are the same.

It might help you to read up on the GHOST model; it is quite different from fitting a single model in which you only infer one set of branch lengths. Here is a link to the paper: https://www.biorxiv.org/content/10.1101/174789v2.

Cheers,

Stephen



张东

May 28, 2019, 11:01:30 PM
to IQ-TREE
Hi Stephen,

I get it, thank you very much for your help.

Sincerely,

Dong

