Subsetting based on multiple bootstrap support values in newick file

Will Chase

Jan 22, 2018, 11:22:03 AM
to ggtree
Hello, 

I've used IQ-TREE to build a large tree and used SH-aLRT and UFboot to compute node support values. The IQ-TREE output stores these values as node labels, with a slash separating the two values (SH-aLRT/UFboot). In the past, I have used something like "geom_point2(aes(subset=!isTip & label>0.7))" to mark well-supported nodes on my tree, but now I would like to mark nodes that have an SH-aLRT value of at least 70 and a UFboot value of at least 95. It is now common to combine multiple sources of node support using this notation, so I was surprised that I couldn't find an answer on Google. My tree file is attached; any help would be appreciated.



Problem:

Nodes are labeled X/Y. How can I use geom_point2 to mark the subset that has X>=70 & Y>=95?
MLtree_fixedmsa_nostarttree_rooted.tree

Yu, Guangchuang

Jan 22, 2018, 10:02:55 PM
to Will Chase, ggtree

No matter how much information is encoded in the label, you can use a computed variable derived from the labels.

> require(ggtree)
> x = read.tree("~/Downloads/MLtree_fixedmsa_nostarttree_rooted.tree")
> x

Phylogenetic tree with 608 tips and 607 internal nodes.

Tip labels:
    Fusarium-oxysporum2, Fusarium-mangiferae3, Fusarium-avenaceum2, Fusarium-poae3, Nectria-haematococca3, Colletotrichum-graminicola, ...
Node labels:
    , 97.2/98, 75.1/95, 85/97, 96/98, 81.9/93, ...

Rooted; includes branch lengths.
> ggtree(x) + geom_nodepoint(aes(subset = sub("/.*", "", label) > 70 | sub(".*/", "", label) > 95))

Inline image 1


--
G Yu, DK Smith, H Zhu, Y Guan, TTY Lam*. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data, Methods in Ecology and Evolution, 2017, 8(1):28-36. doi:10.1111/2041-210X.12628
 
Homepage: https://guangchuangyu.github.io/ggtree
---
You received this message because you are subscribed to the Google Groups "ggtree" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bioc-ggtree+unsubscribe@googlegroups.com.
To post to this group, send email to bioc-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bioc-ggtree/5f8ba234-7c09-4219-b586-285790269051%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
--~--~---------~--~----~------------~-------~--~----~
Guangchuang Yu PhD
Postdoc researcher
State Key Laboratory of Emerging Infectious Diseases
School of Public Health
The University of Hong Kong
Hong Kong SAR, China
-~----------~----~----~----~------~----~------~--~---

Will Chase

Jan 25, 2018, 2:32:40 PM
to ggtree
Thanks for the solution, but upon inspection of the tree, there appears to be something wrong with the code. It is labeling every node in the tree (and I know that many nodes do not meet the criteria). I thought it was because the node labels are stored as characters, but adding as.numeric() does not seem to help the problem. Any ideas?

The current code I'm using is ggtree(tree)+geom_nodepoint(aes(subset = as.numeric(sub("/.*", "", label))>70 & as.numeric(sub(".*/", "", label))>95))

Yu, Guangchuang

Jan 26, 2018, 12:01:48 AM
to Will Chase, ggtree

Oops, geom_point2 should work. geom_nodepoint was defined to set aes(subset = !isTip) itself, so the subset you passed was ignored.

In the release version >=1.10.4 and the GitHub version, geom_nodepoint now supports setting subset, and you don't need to explicitly pass & !isTip.

For now, you can use geom_point2(aes(subset = as.numeric(sub("/.*", "", label))>70 & as.numeric(sub(".*/", "", label))>95 & !isTip))

or install the updated version and use your previous code.
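
A minimal base-R sketch of the label parsing used above, for checking the filter outside of ggtree. The label vector is made up for illustration (it is not taken from the attached tree), and the names sh_alrt, ufboot, and well_supported are assumptions. It uses >= to match the thresholds stated in the original question, and it treats nodes without a label (such as the root) as not supported.

```r
# Hypothetical node labels in "SH-aLRT/UFboot" format; the empty string
# stands for a node that carries no support values (e.g. the root).
label <- c("", "97.2/98", "75.1/95", "85/97", "81.9/93", "9.5/100")

# Drop everything after the slash (first value) or before it (second value),
# then convert the remaining text to numbers. Empty labels become NA.
sh_alrt <- as.numeric(sub("/.*", "", label))
ufboot  <- as.numeric(sub(".*/", "", label))

# Keep nodes with SH-aLRT >= 70 and UFboot >= 95; missing values count as FALSE.
well_supported <- !is.na(sh_alrt) & !is.na(ufboot) & sh_alrt >= 70 & ufboot >= 95

print(data.frame(label, sh_alrt, ufboot, well_supported))
```

The same as.numeric(sub(...)) expressions are what go inside aes(subset = ...) in the geom_point2 call above. Without as.numeric(), R compares the values as character strings rather than as numbers, so the result may not match the numeric thresholds.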





Will Chase

Jan 26, 2018, 10:55:25 AM
to Yu, Guangchuang, ggtree
Ah, yep that did it. I've tested with geom_point2 and that solution works. Thank you for the help!
--
Will Chase
Research Assistant
Cosgrove lab
352 North Frear Building
The Pennsylvania State University

Mila Grinblat

Jul 9, 2020, 1:22:38 AM
to ggtree
Hi,
I have a similar question with what I think is a small difference: I have three values rather than two.
Each node has bootstrap/gCF/sCF values.
I want the point color to represent only the first value (bootstrap), and to write the second and third values as text (geom_nodelab).
Is this possible?
Thanks
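
One possible approach, sketched in base R only and not tested against ggtree here: split the three-part label into separate fields, keep the first as a number for coloring, and rebuild the other two as the text to display. The labels below are invented for illustration, and field(), bootstrap, and cf_text are hypothetical names.

```r
# Hypothetical node labels in "bootstrap/gCF/sCF" format; "" is an unlabeled node.
label <- c("", "100/85.3/62.1", "88/40/38.5", "65/12.7/33.9")

# Split every label at the slashes; an empty label yields a zero-length vector.
parts <- strsplit(label, "/", fixed = TRUE)

# Helper (hypothetical): return the i-th field of each label, or NA if absent.
field <- function(i) {
  vapply(parts,
         function(p) if (length(p) >= i) p[i] else NA_character_,
         character(1))
}

# First value as a number, e.g. for mapping to point color.
bootstrap <- as.numeric(field(1))

# Second and third values kept as text, e.g. for a node label.
cf_text <- ifelse(is.na(field(2)), "", paste(field(2), field(3), sep = "/"))

print(data.frame(label, bootstrap, cf_text))
```

In principle, expressions like these could be used inside aes() in the same way as the subset expressions earlier in this thread, for example to map color in geom_point2 and label in geom_nodelab, but that wiring is an assumption and would need checking against the installed ggtree version.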


