Hi,
I'm new to using ETE and am particularly keen on applying the NCBI Taxonomy functions (get_lineage, get_rank etc) to metagenomic data.
One thing I have been trying to do is determine the lowest common ancestor (LCA) of the two txids associated with each read pair (one for the forward read, one for the reverse read) in a large file (over 1M read pairs).
My understanding is I could do the following:
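from ete2 import NCBITaxa
# get_topology is a method of an NCBITaxa instance, not of the class
ncbi = NCBITaxa()
tree = ncbi.get_topology([txid1, txid2])
# nodes in the returned tree are named by their taxid as a string
common_ancestor = tree.get_common_ancestor(str(txid1), str(txid2))
print common_ancestor.name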
However, this doesn't work when one taxon is an ancestor of the other, for example "root" (txid1=1) and "Viruses" (txid2=10239): the resulting tree contains only the node "10239", with no node "1".
=> Is there a way of getting the LCA for two taxa, even if they're on the same branch? Here, the result I want is "1" as "10239" is a descendant of "1"
Also, is there a better/quicker way of going about this?
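For instance, one possible approach would be to skip the tree and compare the two lineages directly, taking the last taxid they share. This is only a minimal sketch, assuming get_lineage returns the taxids ordered from the root down to the query taxon. The helper name lca_from_lineages and the hard-coded lineages are hypothetical, for illustration only:

```python
def lca_from_lineages(lineage_a, lineage_b):
    """Return the deepest taxid shared by two root-to-taxon lineages, or None."""
    lca = None
    # Walk both lineages from the root and stop at the first disagreement.
    for a, b in zip(lineage_a, lineage_b):
        if a != b:
            break
        lca = a
    return lca

# Hypothetical lineages, standing in for what get_lineage(1) and
# get_lineage(10239) are assumed to return (root first).
root_lineage = [1]
viruses_lineage = [1, 10239]
print(lca_from_lineages(root_lineage, viruses_lineage))
```

With ETE, the two lists would presumably come from get_lineage(txid1) and get_lineage(txid2) on an NCBITaxa instance. For over 1M read pairs, caching each txid's lineage in a dict might avoid repeated lookups, though I have not measured this.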
Many thanks,
Carlijn