Changing tip labels


Ryan Kepler

Dec 8, 2016, 2:15:54 PM
to ggtree
All,

A common task when I make trees is changing tip labels from a shorthand code particular to a project to something more meaningful to a general audience, for example "RMK001" to "Metarhizium majus ARSEF 1914".  With the phylotools package, the sub.tip.label function handles this nicely.  Unfortunately, that doesn't work if you import the tree with "read.raxml" for ggtree, and there is no native ggtree way to accomplish this.  But I figured out a way to change tip labels for trees imported via "read.raxml".  I assume it will also work with the other import functions, but I didn't test those.

The example below should be reproducible with the sample data included in ggtree.  I build the name data frame by hand here, but you could easily import your own table.  The example also builds a second data frame comparing old and new names, to show things are behaving as intended.

I hope you find this useful, and if you see any ways to improve this further please comment.  If it's possible to incorporate this as a ggtree function somehow, that would make an awesome package even better.

~Ryan

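library(ggtree)

#get the raxml file from the example data
#(in newer releases, read.raxml and this example file may live in the treeio package instead)
raxml_file <- system.file("extdata/RAxML", "RAxML_bipartitionsBranchLabels.H3", package="ggtree")
raxml <- read.raxml(raxml_file)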

#build a data frame with the original tip names in one column and some made-up name parts in the others
N <- length(raxml@phylo$tip.label) #save some typing
name <- data.frame(old_tip = raxml@phylo$tip.label, first = rep("A", N), second = seq_len(N), stringsAsFactors = FALSE)

#create a second raxml object for easy comparison
raxml2<-raxml

#build the new names once from the elements of name, then find the row of name that matches each tip
new_label <- paste(name$first, name$second)
idx <- match(raxml@phylo$tip.label, name$old_tip)

#data frame to confirm which names are used: old name pulled from the raxml object, new name next to it
check <- data.frame(old_tip = raxml@phylo$tip.label, new_tip = new_label[idx], stringsAsFactors = FALSE)

#This is the part that actually changes the tip labels
raxml2@phylo$tip.label <- new_label[idx]

#plot the original tree
ggtree(raxml) + geom_label(aes(label=bootstrap, fill=bootstrap)) + geom_tiplab() +
  scale_fill_continuous(low='darkgreen', high='red') + theme_tree2(legend.position='right')

#plot the tree with tip labels changed
ggtree(raxml2) + geom_label(aes(label=bootstrap, fill=bootstrap)) + geom_tiplab() +
  scale_fill_continuous(low='darkgreen', high='red') + theme_tree2(legend.position='right')


gc...@connect.hku.hk

Dec 8, 2016, 9:33:32 PM
to ggtree
Thanks for sharing, but this is not the ggtree way.

Try something like this:


lb = raxml@phylo$tip.label
d = data.frame(label=lb, label2 = paste("AA", substring(lb, 1, 5)))
ggtree(raxml) %<+% d + geom_tiplab(aes(label=label2))

The %<+% operator is defined in ggtree and documented in the vignette. Please read the vignette for more details.
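For the use case in the original post (project codes mapped to full names), a minimal sketch of the same idea with a lookup table might look like the following. The small tree, the codes other than "RMK001", the placeholder names, and the column name full_name are all made up for illustration; the one thing to keep is that the first column of the table holds the current tip labels, since that is what %<+% matches on.

```r
library(ggtree)   # ggtree(), geom_tiplab() and the %<+% operator
library(ape)      # read.tree(), used here only to build a tiny stand-in tree

# Hypothetical tree with project codes as tip labels
tree <- read.tree(text = "((RMK001:1,RMK002:1):1,RMK003:2);")

# Hypothetical lookup table; in practice this could come from
# read.csv() on your own file. First column = current tip labels.
lookup <- data.frame(
  label     = c("RMK001", "RMK002", "RMK003"),
  full_name = c("Metarhizium majus ARSEF 1914",
                "Placeholder species B",
                "Placeholder species C"),
  stringsAsFactors = FALSE
)

# Sanity check: every tip has an entry and no code is listed twice
unmatched <- setdiff(tree$tip.label, lookup$label)
stopifnot(length(unmatched) == 0, anyDuplicated(lookup$label) == 0)

# Attach the table to the tree and draw the full names at the tips;
# xlim() leaves room on the right for the longer labels
p <- ggtree(tree) %<+% lookup +
  geom_tiplab(aes(label = full_name)) +
  xlim(0, 5)
print(p)
```

The tree object itself is left untouched here; only the plotted label changes, so the original codes stay available for joining other data later.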

Ryan Kepler

Dec 9, 2016, 10:55:34 AM
to ggtree
That's great, I didn't realize you could do that with geom_tiplab.  Thanks for the 