Gi* Statistics

João Afonso Poester

Apr 5, 2023, 5:00:16 PM
to Biodiverse Users
Dear Shawn, I am trying to use the hotspot analysis to assess the environmental correlates of my cluster analysis. I am interested in obtaining the z-scores for each cluster and each variable, as reported in some papers using the method, such as Gonzalez-Orozco et al. (2014), "Quantifying Phytogeographical Regions of Australia Using Geospatial Turnover in Species Composition".
How can I see these scores for each cluster? 

Thank you in advance, João Afonso. 

Shawn Laffan

Apr 5, 2023, 7:20:19 PM
to biodiver...@googlegroups.com
Hello João,

You might have done several of the steps below but I'll list them anyway as they will be of use to others. 

Step 1. 
Attach the environmental variables to your basedata as group properties.  If the data are in raster files then you can use the method described here: http://biodiverse-analysis-software.blogspot.com/2022/05/importing-group-properties-directly.html

To verify the properties have been loaded correctly, open a View Labels tab and Ctrl + left-click (or middle-click) on a cell to open the popup window.  Then set the drop-down list to PROPERTIES (which will not exist unless properties have been attached).

Step 2.
Choose the Group property Gi* statistics as a calculation to run when setting up your analysis.
https://github.com/shawnlaffan/biodiverse/wiki/Indices#group-property-gi-statistics

Step 3.
The results are under a list called GPPROP_GISTAR_LIST, with one value per group property.  To view them you need to change the selection in the drop-down list at the lower left of the map.  For a Cluster analysis this defaults to "Cluster"; for a spatial analysis it defaults to SPATIAL_CONDITIONS.


One point to note is that the z-score plotting legend has a bug in versions 4.1 and 4.2.  The values are correct but the colours are reversed.  This is fixed in the soon-to-be-released 4.3.
https://github.com/shawnlaffan/biodiverse/issues/857

Hopefully that all helps, but please do ask more questions if needed. 


Regards,
Shawn.

Shawn Laffan

May 13, 2023, 6:37:09 PM
to João Afonso Poester, biodiver...@googlegroups.com
Hello João,

(I'm including the list in this response as it is broadly relevant). 

In this case you should leave the spatial conditions as sp_select_all(), as this will compare each group with every other group.

The Cluster analysis spatial conditions are useful when you need to cluster some number of regions separately before merging them into one larger cluster tree. 
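
To illustrate what comparing each group with every other group means for the S2 (Simpson) metric mentioned in the question, here is a minimal Python sketch. It is not Biodiverse code: the function name s2_dissimilarity and the toy groups are hypothetical, and it assumes the usual form S2 = 1 - A / (A + min(B, C)), where A is the number of labels shared by two groups and B and C are the numbers unique to each.

```python
from itertools import combinations


def s2_dissimilarity(labels_a, labels_b):
    """Simpson (S2) dissimilarity between two sets of labels (species)."""
    a = len(labels_a & labels_b)   # labels shared by both groups
    b = len(labels_a - labels_b)   # labels found only in the first group
    c = len(labels_b - labels_a)   # labels found only in the second group
    denom = a + min(b, c)
    if denom == 0:                 # at least one group has no labels
        return None
    return 1 - a / denom


# Hypothetical toy data: group name -> set of labels in that group.
groups = {
    "g1": {"sp1", "sp2", "sp3"},
    "g2": {"sp2", "sp3"},
    "g3": {"sp4", "sp5"},
}

# With no spatial restriction, every pair of groups is compared.
pairs = {
    (g, h): s2_dissimilarity(groups[g], groups[h])
    for g, h in combinations(sorted(groups), 2)
}
```

The resulting pairwise values are what a linkage method such as WPGMA then works from. A more restrictive spatial condition would simply leave some of these pairs out.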

Regards,
Shawn.


On 14/05/2023 01:57, João Afonso Poester wrote:



Hello Shawn, I was able to use the Gi* statistics.
However, I am still working on some cluster analyses of species distribution data.
I want to produce a regionalization of my study area using the S2 (Simpson) metric and WPGMA.
Should I use any spatial conditions in my analysis, or just leave "sp_select_all()" to cluster cells based on pairwise distances?

Again, thanks for all the help, João Afonso

João Afonso Poester

Jun 2, 2023, 2:42:59 PM
to Biodiverse Users
Hello Shawn, about the Gi* statistics: when the z-score is calculated, does it use only the properties of the groups from the basedata, or does it use all of the groups from the raster?

Thanks for all the help, João Afonso. 

João Afonso Poester

Jun 2, 2023, 2:44:43 PM
to Biodiverse Users
Another thing: if a group falls in the sea, for example, where my raster has no data, how does it calculate the mean of the property for the group?

Shawn Laffan

Jun 3, 2023, 4:21:17 AM
to biodiver...@googlegroups.com, João Afonso Poester
Hello João,

The Gi* statistic compares the local values against the global mean and standard deviation, so the z-score is against all the data across all groups.

If you have no data in a cell then there are no local values to compare against the global values, so the result is undef (nodata).
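
As a rough illustration of the calculation described above, here is a minimal Python sketch. It is not Biodiverse's own code: it assumes binary neighbourhood weights and the Ord & Getis (1995) form of the statistic, it assumes that nodata values are left out of the global mean and standard deviation, and the function name gi_star and the toy data are hypothetical.

```python
import math


def gi_star(values, neighbours):
    """Gi* z-scores for one property, using binary weights.

    values:     group -> property value, or None where there is no data
    neighbours: group -> groups in its neighbourhood (including itself)
    """
    valid = [v for v in values.values() if v is not None]
    n = len(valid)
    mean = sum(valid) / n                                      # global mean
    sd = math.sqrt(sum(v * v for v in valid) / n - mean ** 2)  # global SD

    scores = {}
    for group, nbrs in neighbours.items():
        local = [values[j] for j in nbrs if values.get(j) is not None]
        w = len(local)  # sum of binary weights (equals the sum of squared weights)
        if w == 0 or w == n or sd == 0:
            # No local data (or a degenerate comparison): result is nodata.
            scores[group] = None
            continue
        expected = w * mean
        spread = sd * math.sqrt((n * w - w * w) / (n - 1))
        scores[group] = (sum(local) - expected) / spread
    return scores


# Hypothetical toy data; "sea" has no value for the property.
values = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 10.0, "sea": None}
neighbours = {"a": ["a", "b"], "d": ["c", "d"], "sea": ["sea"]}
scores = gi_star(values, neighbours)
```

In this toy case the neighbourhood around "d" scores above zero (local values higher than the global mean), the one around "a" scores below zero, and "sea" returns None because it has no local values to compare.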

I can send links to useful references if that would help. 

Regards,
Shawn.

João Afonso Poester

Jun 3, 2023, 12:45:42 PM
to Biodiverse Users
I understand it now, thank you very much, Shawn. I would like the references.

Shawn Laffan

Jun 5, 2023, 12:09:24 AM
to biodiver...@googlegroups.com, João Afonso Poester
Hello João,

I have attached a slide which has the formula with some annotations.  Hopefully it is useful.


WRT references, a good one is Getis & Ord (1996) as it includes a worked example.  Unfortunately it is a book chapter, and I do not think there is an e-book version.  If you have access to inter-library loans then that is an option, but I realise many on the list will not. 

Getis, A. and J.K. Ord. 1996. "Local Spatial Statistics: An Overview". Chapter 14 in Longley, P. and M. Batty (eds), Spatial Analysis: Modelling in a GIS Environment.


Otherwise there are the two main references:

Getis, A. and J.K. Ord. 1992. "The Analysis of Spatial Association by Use of Distance Statistics".  Geographical Analysis 24(3). https://doi.org/10.1111/j.1538-4632.1992.tb00261.x
Ord, J.K. and A. Getis. 1995. "Local Spatial Autocorrelation Statistics: Distributional Issues and an Application". Geographical Analysis 27(4). https://doi.org/10.1111/j.1538-4632.1995.tb00912.x


Regards,
Shawn.
gi_star_formula.pdf