NCBI tree rendering


irina.n...@gmail.com

Jun 24, 2016, 4:01:28 PM
to The ETE toolkit
Hello,

I have a list of ~6000 taxids from NCBI taxonomy. I created a tree from them and I need to generate an image. When I use all of them, my program aborts with messages like "QPainter::setOpacity: Painter not active". I do get a picture if I use a smaller list of nodes.

So I think I can probably collapse the nodes into higher level nodes, so that only the genus nodes are shown without species/subspecies.

When I generate the tree, I tried intermediate_nodes=False, collapse_subspecies=True, but my test tree still looks quite big.

What is a meaningful way to collapse this tree? And is there a way to export it in an edge format, e.g. (node1, node2)? I have always used the networkx library before, but it doesn't have the NCBI module, which I really need. I am not sure networkx can read the Newick format.

I basically need to generate a nice image of the NCBI tree that I have. What is the best way for me to do this, with your library or another one?

Also, in the image I was able to generate, the taxids are shown twice - is there a way to fix this? I'm attaching it.

Thank you very much! Irina.
[Attachment: inline image 1]

Jaime Huerta Cepas

Jun 26, 2016, 5:28:47 AM
to eteto...@googlegroups.com
Hi Irina, 
> I have a list of ~6000 taxids from NCBI taxonomy. I created a tree from them and I need to generate an image. When I use all of them, my program aborts with messages like "QPainter::setOpacity: Painter not active". I do get a picture if I use a smaller list of nodes.
This is expected when exporting PNG images. There is a limit on the maximum image size. You could try exporting as SVG (although the resulting file might be very large).

> So I think I can probably collapse the nodes into higher level nodes, so that only the genus nodes are shown without species/subspecies.

Yes, that's a better option.
 
> When I generate the tree, I tried intermediate_nodes=False, collapse_subspecies=True, but my test tree still looks quite big.
How many leaves does your tree have? If there are many (e.g. >2000 leaves), you may need to collapse at a higher rank (e.g. family). I'm not sure whether you need to show all species taxids. If so, there is little you can do. Large trees are difficult to visualise.

 
> What is a meaningful way to collapse this tree? And is there a way to export it in an edge format, e.g. (node1, node2)? I have always used the networkx library before, but it doesn't have the NCBI module, which I really need. I am not sure networkx can read the Newick format.

Try collapsing by rank=family or similar. You can also export a parent-child table easily:

# t is the tree built from the taxids; prints one "parent child" row per node
for node in t.iter_descendants():
    print(node.up.name, node.name)
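
For the "(node1, node2)" edge format asked about, the same walk can collect pairs instead of printing them. What follows is only a sketch under assumptions: the helper names parent_child_edges and write_edge_table are made up for illustration, and the nodes are stand-in objects with hypothetical labels so that it runs without ETE installed. With a real ETE tree you would pass t.iter_descendants() instead.

```python
import csv
import io
from types import SimpleNamespace


def parent_child_edges(nodes):
    """Collect (parent_name, child_name) pairs, skipping nodes without a parent."""
    return [(node.up.name, node.name) for node in nodes if node.up is not None]


def write_edge_table(edges, handle):
    """Write one tab-separated 'parent<TAB>child' row per edge."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(edges)


# Stand-in nodes with hypothetical labels; they only mimic the .name and .up
# attributes used in the loop above. They are NOT real ETE objects.
root = SimpleNamespace(name="family_X", up=None)
genus_a = SimpleNamespace(name="genus_A", up=root)
genus_b = SimpleNamespace(name="genus_B", up=root)
species = SimpleNamespace(name="species_A1", up=genus_a)

edges = parent_child_edges([genus_a, genus_b, species])

# An in-memory buffer is used here; an open file handle works the same way.
buffer = io.StringIO()
write_edge_table(edges, buffer)
print(buffer.getvalue(), end="")
```

A list of tuples like this can then be loaded into a graph library, for example with networkx's add_edges_from.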

 
> I basically need to generate a nice image of the NCBI tree that I have. What is the best way for me to do this, with your library or another one?
You can export the Newick using ETE and try uploading it to iTOL v3, which handles large trees well.
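
On the networkx side of the question: a graph library does not have to read Newick itself if the exported string is first turned into (parent, child) pairs. Below is a minimal pure-Python sketch of that conversion. It is not part of ETE, the function name newick_to_edges is made up for illustration, and it assumes plain unquoted labels with optional branch lengths. Internal nodes without a label get a placeholder name.

```python
def newick_to_edges(newick):
    """Parse a simple Newick string into a list of (parent, child) labels.

    Assumes unquoted labels; branch lengths such as ":0.1" are skipped.
    Unnamed internal nodes get placeholder labels like "_internal_1".
    """
    text = newick.strip().rstrip(";")
    edges = []
    pos = 0
    unnamed = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),:":
            pos += 1
        label = text[start:pos].strip()
        if pos < len(text) and text[pos] == ":":
            # Skip the branch length that follows the colon.
            while pos < len(text) and text[pos] not in "(),":
                pos += 1
        return label

    def read_node():
        nonlocal pos, unnamed
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(read_node())
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses in Newick string")
                if text[pos] == ",":
                    pos += 1
                elif text[pos] == ")":
                    pos += 1
                    break
                else:
                    raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        label = read_label()
        if not label:
            unnamed += 1
            label = f"_internal_{unnamed}"
        # Children were parsed first, so edges come out bottom-up.
        edges.extend((label, child) for child in children)
        return label

    read_node()
    if pos != len(text):
        raise ValueError("trailing characters after the root node")
    return edges


edges = newick_to_edges("((A:1,B:1)C:2,D:3)E;")
print(edges)
```

Quoted labels, comments in square brackets and other Newick extensions are deliberately left out, so a proper parser is the safer choice for anything beyond simple trees.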


hope it helps, 
best, 
-jaime

irina.n...@gmail.com

Jun 27, 2016, 1:08:36 PM
to The ETE toolkit
Hi Jaime,

Thank you! If I want to specify the lowest rank to show, is there an option to do that when building the tree? Or do I have to iterate over all the IDs, check their ranks, select the ones I want, and then pass them to get_topology?

Thank you!
Irina.

Jaime Huerta Cepas

Jun 28, 2016, 3:52:27 AM
to eteto...@googlegroups.com
Hi Irina, 
You can use the rank_limit option in get_topology:

Signature: NCBITaxa.get_topology(self, taxids, intermediate_nodes=False, rank_limit=None, collapse_subspecies=False, annotate=True)

However, many nodes may lack a rank. They will also appear in the returned topology. You can prune them manually, as explained in this discussion:


irina.n...@gmail.com

Jun 28, 2016, 11:38:11 AM
to The ETE toolkit
Thank you! So if I set rank_limit="genus", will it leave ONLY genus nodes and "no rank" nodes? Or will it leave genus and all the nodes above it, like family, etc.?

Also, in the image the taxid label for each node appears twice. How do I fix it? You can see it in my picture in the original post.
Thank you!

siarhei...@gmail.com

Jan 28, 2017, 1:09:33 AM
to The ETE toolkit
Hi Irina,

Sorry to hijack your question, but how did you convert the NCBI taxonomy into a large tree that you can visualize?

I am trying to use http://itol.embl.de/ to visualize a large tree with a few thousand taxids, where some of the leaves are going to be highlighted. iTOL requires trees in Newick, Nexus or PhyloXML format.

With the NCBI taxonomy browser https://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi I can get a tree in phylip format, but I don't know how to convert it to Newick, Nexus or PhyloXML without having to write my own parser. I should say that I don't really care about accurate branch lengths for now; they can all be of equal length for my current project.

I would appreciate any tips or suggestions.

Thanks,
Sergei