Hello Léo-Paul,
The comparison tree does use the average branch length, although the
implementation in Biodiverse keeps zero-length branches as zero, so
it is actually the average of the non-zero branch lengths. In most
cases a tree will not have such branches, though.
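As a rough illustration of that averaging rule, here is a minimal Python sketch. It is not Biodiverse code: the function name, the branch names and the lengths are all made up for the example, and the tree is reduced to a simple mapping of branch name to length.

```python
def comparison_branch_lengths(branch_lengths):
    """Build the lengths for a comparison tree (hypothetical helper).

    Every non-zero branch is given the mean of the non-zero lengths;
    zero-length branches are kept at zero.
    """
    nonzero = [length for length in branch_lengths.values() if length > 0]
    mean_nonzero = sum(nonzero) / len(nonzero) if nonzero else 0.0
    return {
        name: (mean_nonzero if length > 0 else 0.0)
        for name, length in branch_lengths.items()
    }


# Made-up observed tree, lengths in My; branch "b" has zero length.
observed = {"a": 60.0, "b": 0.0, "c": 20.0, "d": 10.0}
alt = comparison_branch_lengths(observed)
print(alt)
```

With these made-up lengths the mean is taken over three branches rather than four, so each non-zero branch on the comparison tree gets 30 instead of 22.5.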
Your interpretation is correct for RPD (relative phylogenetic
diversity), which is the ratio of PD for the observed tree (PD_obs)
to PD for the comparison tree (PD_alt). It perhaps needs more
thought, but it should
be possible to apply the threshold logic from CANAPE to the PD_obs,
PD_alt and RPD scores to identify regions with branches that are
longer than the expected 30 My.
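To make the ratio concrete, here is a small Python sketch. It is only an illustration under simplifying assumptions: PD is taken as the plain sum of the branch lengths found in a cell, any rescaling by total tree length is ignored, and the helper name, branch names and lengths are invented for the example.

```python
def pd_score(branch_lengths, branches_in_cell):
    """Sum the lengths of the branches present in a cell (hypothetical helper)."""
    return sum(branch_lengths[name] for name in branches_in_cell)


# Made-up trees with the same topology; lengths in My.
observed = {"a": 60.0, "b": 20.0, "c": 10.0}
comparison = {"a": 30.0, "b": 30.0, "c": 30.0}  # each branch set to the mean

cell = {"a", "b"}  # branches found in one cell

pd_obs = pd_score(observed, cell)
pd_alt = pd_score(comparison, cell)
rpd = pd_obs / pd_alt  # above 1 means longer branches than on the comparison tree
print(pd_obs, pd_alt, rpd)
```

Both made-up trees have the same total length here, so dividing each PD score by its tree length would leave the ratio unchanged.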
The reason this interpretation will not apply to CANAPE is that for
phylogenetic endemism (PE) the branch lengths are downweighted by
their geographic ranges. For example, a relatively long branch
(e.g. 60 My) might be widespread. If it is found in 10 cells then
its range-weighted length will be 60/10=6. The same branch on the
alternate tree will be given a weighted length of 30/10=3. However,
if another branch of length 120 is found in 20 cells then its
weighted length will also be 6, although its weighted length on the
alternate tree will be 30/20=1.5. The two branches have the same
weighted length on the observed tree despite one being twice as
long as the other, so the weighted lengths no longer map directly
onto branch ages.
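The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the weighting described above, not Biodiverse code; the function and branch names are invented, and the comparison-tree length of 30 is the value used in the example.

```python
def range_weighted(length, n_cells):
    """Divide a branch length by the number of cells it is found in."""
    return length / n_cells


ALT_LENGTH = 30.0  # branch length on the alternate tree, from the example

# (observed length in My, number of cells the branch is found in)
branches = {"branch_1": (60.0, 10), "branch_2": (120.0, 20)}

results = {}
for name, (length, n_cells) in branches.items():
    obs = range_weighted(length, n_cells)
    alt = range_weighted(ALT_LENGTH, n_cells)
    results[name] = (obs, alt, obs / alt)
    print(name, obs, alt, obs / alt)
```

Both branches get an observed weighted length of 6, but the ratios to the alternate tree differ (2 and 4), because the range weighting cancels out of the ratio and only the observed length relative to 30 remains.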
One approach that could be used to assess the branch ages without
correcting for ranges is to use the CANAPE regions to define the
analysis windows, and then explore the branch lengths of the
observed tree more directly using some of the PD branch and clade
summary indices.
https://github.com/shawnlaffan/biodiverse/wiki/Indices#phylogenetic-diversity-node-list
https://github.com/shawnlaffan/biodiverse/wiki/Indices#pd-clade-contributions
There are two ways to do this.
The first is to use a GIS to generate polygons of the CANAPE
regions, and then use these polygons to define the neighbourhoods
for a spatial analysis.
https://github.com/shawnlaffan/biodiverse/wiki/SpatialConditions#sp_points_in_same_poly_shape
(this documentation needs a lot of work)
The second approach is to cluster the CANAPE regions, and set the
system to calculate the indices I listed above for each node. This
would involve finding the relevant branch on the cluster dendrogram
that relates to a CANAPE region, which might be difficult if your
study region is large and complex.
More details of the cluster analysis are at
http://biodiverse-analysis-software.blogspot.com/2016/04/more-canape-how-to-restrict-your.html
That blog post describes a means to use analysis results to define
neighbourhoods, but it does not provide a means to define
non-contiguous regions. In other words, it will treat all cells
that pass the threshold as a single set, regardless of whether they
are distinct regions or not.
The above approaches might help identify regions with branches that
pre- or post-date some event, but I am not certain about that. If
your question does not need the RPD/RPE scores, then you could also
generate your own tree and use that in a post-CANAPE spatial or
cluster analysis. That is, run CANAPE using your original tree, and
then calculate the PD scores for the CANAPE regions using some other
tree. Biodiverse will use whichever tree is selected at the top of
the window when it runs its analyses, so there is a high degree of
flexibility there.
You can also visualise several of the clade indices on the tree, so
if you know which branches are interesting then you can see what
their contributions are to the analysis for some neighbourhood.
http://biodiverse-analysis-software.blogspot.com/2017/09/visualise-spatial-analysis-results-on.html
Hopefully that all helps. I am sure there are points that I have
missed and/or that need further clarification, so please do ask any
follow-up questions.
Regards,
Shawn.