What are the units for UniFrac distances

Colin Brislawn

Oct 17, 2016, 2:55:22 PM

to Qiime 1 Forum

Good morning,

What are the units associated with the UniFrac distance metric? Is it some measure of branch length, or is it a 'percent shared branch length' and thus unitless?

A similar question was asked, but not answered, here:

https://groups.google.com/d/msg/qiime-forum/TVuaXFN9TUM/8A7R8BlbBdgJ

Thank you for your time,

Colin

Jamie Morton

Oct 17, 2016, 5:55:52 PM

to Qiime 1 Forum

Hi Colin,

That's actually a very interesting question. I wouldn't think it is unitless, since it is proportional to the shared branch length.

I've forwarded this to one of the developers of UniFrac, so expect to hear a confirmation soon.

Cheers,

Jamie

Colin Brislawn

Oct 20, 2016, 1:30:19 PM

to qiime...@googlegroups.com

Feedback from Tony:

It doesn't have units - it's a proportion of the overall branch length.

So while the branch length may have units, the units cancel when calculating the proportion of branch length shared, and thus the metric is unitless.
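To make the cancellation concrete, here is a minimal sketch of unweighted UniFrac computed as (branch length unique to one sample) / (branch length observed in either sample). The tree encoding, the function name `unweighted_unifrac`, and the toy tree are assumptions made for illustration; this is not the QIIME implementation.

```python
import math

def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length found in only one sample.

    branches: list of (length, set_of_tips_below_branch) pairs.
    sample_a, sample_b: sets of tip names present in each sample.
    """
    unique = 0.0
    observed = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            observed += length      # branch leads to at least one sample
            if in_a != in_b:
                unique += length    # branch leads to exactly one sample
    return unique / observed

# Hypothetical toy tree ((A:1,B:2):0.5,(C:1,D:3):0.5)
branches = [
    (1.0, {"A"}), (2.0, {"B"}), (0.5, {"A", "B"}),
    (1.0, {"C"}), (3.0, {"D"}), (0.5, {"C", "D"}),
]
sample_a = {"A", "B", "C"}
sample_b = {"B", "C", "D"}

d = unweighted_unifrac(branches, sample_a, sample_b)

# Express the same tree in a different branch-length unit (100x larger numbers).
rescaled = [(length * 100, tips) for length, tips in branches]
d_rescaled = unweighted_unifrac(rescaled, sample_a, sample_b)

print(d, d_rescaled)
```

Both calls return the same value, because the branch-length unit appears in the numerator and the denominator and divides out.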

Thanks!

Colin

PS. Out of curiosity, what are the units for branch length?

Jamie Morton

Oct 21, 2016, 1:17:14 PM

to Qiime 1 Forum

Here's a response from Jon, who's an expert in phylogenetics:

In a parsimony tree it is precisely that -- the branch length is the minimum number of nucleotide or amino acid changes necessary to get between nodes on a given topology. The parsimony tree is chosen to minimize that total distance.
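As a rough illustration of counting changes on a fixed topology, the sketch below uses Fitch's small-parsimony algorithm to get the minimum total number of changes for a rooted binary tree. The nested-tuple tree encoding, the function names, and the toy alignment are assumptions for this example, and it reports the total tree length rather than individual branch lengths.

```python
def fitch(node, states):
    """Return (candidate_state_set, min_changes) for the subtree at node.

    node: a tip name (str) or a (left, right) tuple of child nodes.
    states: dict mapping tip name -> character at one alignment site.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    common = lset & rset
    if common:
        # Children can agree on a state, so no extra change is needed here.
        return common, lcost + rcost
    # Children disagree: one change is required on a branch below this node.
    return lset | rset, lcost + rcost + 1

def parsimony_length(tree, alignment):
    """Sum the per-site minimum change counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {tip: seq[i] for tip, seq in alignment.items()}
        total += fitch(tree, site)[1]
    return total

# Hypothetical four-taxon example
tree = (("A", "B"), ("C", "D"))
alignment = {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "ATGA"}

print(parsimony_length(tree, alignment))
```

Here the result is a plain count of substitutions, which is why a parsimony tree's lengths are in units of "number of changes".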

In a ML or Bayesian tree, the branch lengths are going to be defined by some sort of model. So you might have the same number of nucleotide changes between A and B and between B and C, but if the changes on B to C have a lower likelihood in your model, that branch will be longer. Typically, in ML trees you will see the branches in units of evolutionary distance according to the model used (e.g. GTR or Tamura-Nei). A lot of Bayesian trees will incorporate an explicit rate parameter in the model, so the branch length will be in units of time. In the Bayesian trees you're actually fitting branch lengths rather than calculating them, so the branch length may not actually precisely reflect the modeled evolutionary distance -- and if it's a problematic branch that isn't being reconstructed with high posteriors, the difference can often be quite large.
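A small sketch of how a model can turn the same number of observed changes into different distances. This swaps in the simpler Kimura two-parameter (K80) pairwise distance rather than GTR or Tamura-Nei, and it is a pairwise distance formula, not a full ML branch-length fit. The function name and toy sequences are assumptions; sequences are assumed to be aligned, equal length, and limited to A, C, G, T.

```python
import math

PURINES = {"A", "G"}

def k80_distance(seq1, seq2):
    """Kimura two-parameter distance in expected substitutions per site."""
    n = len(seq1)
    transitions = 0
    transversions = 0
    for x, y in zip(seq1, seq2):
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)

ref = "AAAAAAAAAA"
one_transition = "GAAAAAAAAA"    # A -> G
one_transversion = "CAAAAAAAAA"  # A -> C

d_ts = k80_distance(ref, one_transition)
d_tv = k80_distance(ref, one_transversion)
print(d_ts, d_tv)
```

Both pairs differ at exactly one site out of ten, yet the two distances are not equal, and both exceed the raw 0.1 proportion because the model corrects for unobserved multiple substitutions.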
