some problem about analysis

Mark Dong

Sep 30, 2023, 5:30:37 AM

to Biodiverse Users

Hello Shawn,
It's a pleasure to discuss this amazing software with you again! To be honest, I am having some difficulty calculating "phylogenetic diversity". First I imported the data containing family, species and nomenclature, plus latitude and longitude, then I imported the tree file, and finally I ticked only "phylogenetic diversity" for the spatial analysis. However, the result shows no colours, only white. I don't know whether the problem is in my data files or in the analysis options I chose. Could you help me find where I went wrong? Do you also have a tutorial on spatial analysis? I would really appreciate your generous help!
Regards,
Mark.

Shawn Laffan

Sep 30, 2023, 5:48:27 AM

to biodiver...@googlegroups.com, Mark Dong

Hello Mark,

The labels in the tree need to exactly match the labels in the spatial data, otherwise there will be no matches.

Have you opened the view labels tab to explore what links and what does not? And also what labels are in the spatial data and what are in the tree (control click on a tree branch to see its list of descendants).

If the names are similar then it is worth trying the remapping process.

There are some details here: https://biodiverse-analysis-software.blogspot.com/2017/04/matching-spatial-tree-matrix-and.html
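
As a rough illustration of the matching requirement described above, the two label lists can also be compared outside Biodiverse. This is only a sketch: the label values are hypothetical, and in practice they would be read from the spatial data file and the tree file.

```python
# Hypothetical label sets; in practice these would be read from the
# spatial data file and from the tree file.
spatial_labels = {"Genus_a species_one", "Genus_a species_two", "Genus_b species_three"}
tree_labels = {"Genus_a_species_one", "Genus_a species_two", "Genus_c species_four"}


def compare_labels(spatial, tree):
    """Return labels shared by both sets and those found in only one of them."""
    return {
        "matched": sorted(spatial & tree),
        "spatial_only": sorted(spatial - tree),
        "tree_only": sorted(tree - spatial),
    }


report = compare_labels(spatial_labels, tree_labels)
for key, labels in report.items():
    print(key, labels)
```

Here "Genus_a species_one" and "Genus_a_species_one" differ only by an underscore, so they do not match exactly. Near misses like this are the kind of case where remapping is worth trying.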

Regards,
Shawn.

--
You received this message because you are subscribed to the Google Groups "Biodiverse Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to biodiverse-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/biodiverse-users/954b7ea2-440e-4ddf-96cf-3a75ce703e18n%40googlegroups.com.

Mark Dong

Oct 7, 2023, 6:40:39 AM

to Biodiverse Users

Hello Shawn,
Sorry to bother you again, but I have become very curious while using the software! What does "Create Tree from Labels" base the tree on, and does the resulting tree reflect the evolutionary relationships between species? How does it differ from a sequence-based tree? If you can answer my questions, I will really appreciate it!
Regards,
Mark.

Shawn Laffan

Oct 7, 2023, 5:43:35 PM

to biodiver...@googlegroups.com, Mark Dong

Hello Mark,

The "Create Tree from Labels" option simply converts the label components into a tree. Each label column used at import will be converted into a level of the tree. All branches in the tree are assigned a length of 1.

For example, if data were imported from a text file with different columns for family, genus and species then the result would be a taxonomic tree.

If only one label column was used then the tree will be a rake, where all terminals connect to the root.

If multiple columns were used then the labels will have colons separating the components, for example "family1:genus2:species_a", "family1:genus2:species_b".
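
The conversion described above can be sketched roughly as follows. This is an illustration only, not Biodiverse's own code: the nested-dict structure, the function names, the zero-length root and the naming of internal nodes by their partial label are all assumptions made for the example.

```python
def tree_from_labels(labels, sep=":"):
    """Build a nested-dict tree where each label component is one level.

    Every branch gets a length of 1. The root is given length 0 (an assumption).
    """
    root = {"name": "root", "length": 0, "children": {}}
    for label in labels:
        node = root
        path = []
        for part in label.split(sep):
            path.append(part)
            name = sep.join(path)  # hypothetical naming of internal nodes
            node = node["children"].setdefault(
                name, {"name": name, "length": 1, "children": {}}
            )
    return root


def terminal_depths(node, depth=0):
    """Map each terminal name to the summed branch length from the root."""
    if not node["children"]:
        return {node["name"]: depth}
    depths = {}
    for child in node["children"].values():
        depths.update(terminal_depths(child, depth + child["length"]))
    return depths


labels = [
    "family1:genus2:species_a",
    "family1:genus2:species_b",
    "family1:genus3:species_c",
]
tree = tree_from_labels(labels)
print(terminal_depths(tree))

# One label column only: every terminal hangs directly off the root (a rake).
rake = tree_from_labels(["sp_a", "sp_b"])
print(terminal_depths(rake))
```

With three label columns every terminal sits three unit-length branches from the root. With one column each terminal is a single branch from the root.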

The taxonomic tree is as close as this process will get to representing evolutionary relationships. To assess those properly one would need an actual phylogeny where the branch lengths represent things like time, number of base pairs, morphological features and the like.

Regards,
Shawn.

Mark Dong

Oct 8, 2023, 6:17:30 AM

to Biodiverse Users

Hello Shawn,

I think I got the point! Thank you for your kindness!

Regards,

Mark.

Shawn Laffan

Oct 8, 2023, 5:45:42 PM

to biodiver...@googlegroups.com, Mark Dong

Not a problem Mark.

Regards,
Shawn.

Mark Dong

Oct 11, 2023, 10:04:42 AM

to Biodiverse Users

Hello Shawn,
I have run into some difficulties with a specific data analysis and would appreciate advice from someone as experienced as you.
I want to study the spatial phylogenetic indices of an area, but I only have coarse species distribution data. For example, I only know that a certain species occurs somewhere within an administrative region of 1400 square kilometers, with no precise latitude and longitude, so I use the central latitude and longitude of the administrative region where the specimen was collected as the location of the species. I would like to ask:
1. Is it meaningful to calculate PE and PD this way?
2. What grid cell size would you recommend?
3. If I divide the administrative region into raster cells and assume that each species recorded in the region is present in all of its cells, is it feasible to calculate the biodiversity indices?
I will definitely consider your suggestions carefully as a guide for my continued learning!
Regards,
Mark.

Shawn Laffan

Oct 11, 2023, 11:08:58 PM

to biodiver...@googlegroups.com

Hello Mark,

This is one of those questions to which there is no simple answer.

In an ideal world we have detailed locations of all the taxa, preferably as points. The reality is that we rarely do.

When we have point data they tend to be sparsely sampled - this is one of the reasons cell sizes for many analyses tend to be relatively large.

Sometimes we have range polygons, such as from the IUCN. These tend to overestimate distributions, or at least generalise the boundaries. This is partly because the nominal map scale and the purpose they are generated for do not require a high level of detail.

Sometimes we have rasters generated using species distribution models (also known as habitat suitability models or environmental niche models). These also tend to overestimate ranges, and in some cases underestimate them. They might have high spatial resolutions, but remember that precision is not the same as accuracy.

You have the fourth case, where the spatial location of a taxon is known to some administrative unit. Ideally such units are small, much smaller than the resolution wanted for an analysis. In the worst case you might have something the size of Western Australia (2.6 million km^2) which spans a wide range of biomes. In your case 1400 km^2 is equivalent to a square of approximately 37.5 km x 37.5 km, so it might be reasonable to analyse the data at something like a 50 km resolution.

It is probably worth trying several resolutions to assess sensitivity, for example 10, 25, 50 etc, although bear in mind that taxon ranges in Biodiverse are in cell units so the numbers are not directly comparable across cell sizes (a taxon range in one 50 km cell might span anywhere from one to twenty-five 10 km cells).
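
The arithmetic behind the equivalent square and the cell-count comparison above can be checked with a few lines. This is a minimal sketch that assumes square cells whose sizes nest exactly. The function name is hypothetical.

```python
import math

# Side length of a square with the same area as the 1400 km^2 admin unit.
area_km2 = 1400
side_km = math.sqrt(area_km2)
print(f"{side_km:.1f} km")  # 37.4 km, i.e. roughly the 37.5 km quoted above


def fine_cells_per_coarse_cell(coarse_km, fine_km):
    """Number of fine square cells that tile one coarse square cell.

    Assumes the coarse size is an exact multiple of the fine size.
    """
    per_side = coarse_km // fine_km
    return per_side ** 2


print(fine_cells_per_coarse_cell(50, 10))  # 25
print(fine_cells_per_coarse_cell(50, 25))  # 4
```

So a taxon recorded in a single 50 km cell could occupy up to 25 cells at 10 km, which is why range-based indices are not directly comparable across resolutions.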

An alternative is to use the administrative units as the spatial analysis units. If your data are a table then you can use the names of the units to define the cells (referred to in Biodiverse as a text_group). The results will not plot spatially (something to fix one day) but the indices will all be calculated and can be exported from Biodiverse and then reattached to the spatial data using a join of some sort.
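
The "reattach using a join" step above might look something like the sketch below once the indices have been exported. The column names, unit names and values are all hypothetical and do not reflect the actual Biodiverse export format. The join key is simply the administrative unit name.

```python
import csv
import io

# Hypothetical export of indices, keyed by admin unit name.
indices_csv = io.StringIO(
    "unit_name,PD,PE\n"
    "County A,12.0,3.5\n"
    "County B,8.0,1.25\n"
)

# Hypothetical attribute table of the admin unit polygons.
units_csv = io.StringIO(
    "unit_name,area_km2\n"
    "County A,1400\n"
    "County B,1250\n"
    "County C,1600\n"
)

# Index the exported results by unit name for fast lookup.
indices = {row["unit_name"]: row for row in csv.DictReader(indices_csv)}

# Left join: keep every spatial unit, attach indices where a match exists.
joined = []
for unit in csv.DictReader(units_csv):
    match = indices.get(unit["unit_name"], {})
    joined.append({**unit, "PD": match.get("PD"), "PE": match.get("PE")})

for row in joined:
    print(row)
```

Units with no analysis result (here "County C") keep empty index values, which makes unmatched names easy to spot. A GIS table join on the same key would do the equivalent.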

Note that the units will not be an issue for PD since it is just the sum of branch lengths within the specified neighbourhood (usually sp_self_only() which is each cell in isolation). The endemism indices will be affected since now the ranges are estimated as the number of administrative units. The amount they are affected depends on how much variation there is in the areas of the admin units. If they are all close in size then it will not make much difference. If they vary widely (e.g. Western Australia and Belize) then there are clearly issues. Using this approach also implies that each taxon spans the full area of each admin unit, but that is also assumed with equal area grid cells.
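
To make the difference concrete, here is a small sketch of the two kinds of index using administrative units as the analysis units. The tree, the occurrences and the function names are hypothetical. PD is taken as the sum of branch lengths linking a unit's taxa to the root, and PE uses a common range-weighted formulation (each branch length divided by the number of units the branch occurs in). That formulation is an assumption here, and Biodiverse's own implementation may differ in detail.

```python
# Hypothetical tree: child -> (parent, branch length). The root has no entry.
tree = {
    "sp_a": ("genus1", 1.0),
    "sp_b": ("genus1", 1.0),
    "sp_c": ("genus2", 2.0),
    "genus1": ("root", 1.0),
    "genus2": ("root", 1.0),
}

# Hypothetical occurrences: admin unit -> set of taxa recorded there.
occurrences = {
    "unit1": {"sp_a", "sp_b"},
    "unit2": {"sp_b", "sp_c"},
    "unit3": {"sp_c"},
}


def branches_for(taxa):
    """Collect every branch on the paths from the given terminals to the root."""
    found = set()
    for node in taxa:
        while node in tree:
            found.add(node)
            node = tree[node][0]
    return found


def phylo_diversity(unit):
    """PD: sum of branch lengths for one unit (each unit in isolation)."""
    return sum(tree[b][1] for b in branches_for(occurrences[unit]))


# Range of a branch = number of units in which any of its descendants occurs.
branch_range = {}
for taxa in occurrences.values():
    for branch in branches_for(taxa):
        branch_range[branch] = branch_range.get(branch, 0) + 1


def phylo_endemism(unit):
    """PE: branch lengths weighted by the inverse of the branch range."""
    return sum(tree[b][1] / branch_range[b] for b in branches_for(occurrences[unit]))


for unit in occurrences:
    print(unit, phylo_diversity(unit), phylo_endemism(unit))
```

PD for a unit depends only on which taxa are present, so the size of the unit does not enter the calculation. PE depends on `branch_range`, which is counted in units, so one very large unit counts the same as one very small unit.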

Another option, if you are confident that each taxon spans all (or nearly all) of each admin unit in which it is found, is to import the data directly from polygons to square cells in Biodiverse. This means you do not need to first calculate the centroids.
https://biodiverse-analysis-software.blogspot.com/2018/12/import-polygon-and-polyline-data.html

Sorry there are no concrete recommendations but hopefully there is enough for you to start some experimentation to see what approach gives meaningful results that help answer the question(s) you are asking.

Regards,
Shawn.
