Projection for contiguous USA


Lorenz Eck

Oct 5, 2023, 10:59:33 AM
to Openspace List
Hey,

Recently, I downloaded a shapefile from the US Census Bureau and utilized its geometry column, which is in the EPSG:4269 coordinate reference system (NAD83), to construct my dataframe. Subsequently, I merged this shapefile with my dataset and imported it into GeoDa for further analysis. While attempting to perform K-Means clustering, I encountered a message indicating that the X and Y coordinates were not projected when I tried to include them as variables. After some research, I learned that GeoDa requires data to be in the WGS84 geographic coordinate system (EPSG:4326) (as mentioned here: https://geodacenter.github.io/questions.html#projection). To address this requirement, I reprojected the dataframe to EPSG:4326 (WGS84) using GeoPandas' to_crs method. However, after the reprojection, I am unsure about the next steps in my analysis. I've attempted to follow the instructions outlined in the GeoDa documentation (https://geodacenter.github.io/workbook/01_datawrangling_2/lab1b.html). Nevertheless, I encountered difficulties when trying to reproject the data. 

Consequently, I have two questions:

  1. I attempted to reproject the data to EPSG:5071 (https://epsg.io/5071), but the projection isn't working. Is it possible that this issue arises because EPSG:5071 uses NAD83(HARN) / Conus Albers?

  2. Could someone kindly recommend suitable projections that are compatible with GeoDa and effectively preserve distances when analyzing data for the contiguous USA?

I would greatly appreciate any guidance. Many thanks!

Luc Anselin

Oct 5, 2023, 2:05:58 PM
to openspa...@googlegroups.com
The message in K-Means suggests that the Euclidean distances will be computed on the
original decimal degrees, which is inaccurate (how much, depends on the scale of your
analysis). K-Means will still work, but the impact of geographic distance will be inaccurate.
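As a rough illustration of the size of that distortion, the sketch below compares two pairs of points that are both exactly one decimal degree apart, one near the southern edge of the contiguous USA and one near the northern border. It is not GeoDa code: the function names and sample coordinates are made up for illustration, and the Earth is treated as a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def degree_distance(lon1, lat1, lon2, lat2):
    """Euclidean distance computed directly on decimal degrees."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


# Two pairs of points, each one degree of longitude apart (illustrative values).
south_deg = degree_distance(-100.0, 25.0, -99.0, 25.0)
north_deg = degree_distance(-100.0, 49.0, -99.0, 49.0)
south_km = haversine_km(-100.0, 25.0, -99.0, 25.0)
north_km = haversine_km(-100.0, 49.0, -99.0, 49.0)

print(f"degrees: south={south_deg:.2f}, north={north_deg:.2f}")
print(f"km:      south={south_km:.1f}, north={north_km:.1f}")
```

Both pairs are the same distance apart in degrees, yet the southern pair is roughly 100 km apart on the ground and the northern pair roughly 73 km. A clustering run on raw degrees treats the two as equal.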

There is some confusion in the terms: NAD83 and WGS84 are datums, not coordinates.
EPSG 4326 is still in decimal degrees, not in projected coordinates.

Is the original spatial layer a shape file or a geodataframe (from geopandas)? If a shape file,
does it contain a prj file? If so, then GeoDa should be able to reproject using the correct
proj4 notation in the CRS box that appears after a Save As command.

If the original file does not contain a CRS specification (you can tell because the CRS box
is blank when carrying out Save As), then you need to specify one (either for EPSG 4269 or
4326, depending on what your data are in) and save the original file with that CRS.

The next time you load the file (which now should have a proper CRS, not a blank), one
can reproject to any proj4 specification in the CRS box, as illustrated in the workbook.
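Whether a shapefile carries a CRS can also be checked outside GeoDa by looking for the .prj sidecar file next to the .shp. A minimal sketch using only the Python standard library follows; the helper name `read_prj` and the file name `counties.shp` are hypothetical.

```python
from pathlib import Path


def read_prj(shp_path):
    """Return the text of the .prj file next to a .shp, or None if missing or empty."""
    prj_path = Path(shp_path).with_suffix(".prj")
    if not prj_path.is_file():
        return None
    text = prj_path.read_text(encoding="utf-8", errors="replace").strip()
    return text or None


if __name__ == "__main__":
    wkt = read_prj("counties.shp")  # hypothetical file name
    if wkt is None:
        print("No usable .prj found: the CRS box will probably be blank.")
    else:
        print("CRS definition found:", wkt[:80])
```

If the function returns None, the layer has no CRS definition attached, and one has to be assigned before any reprojection can work.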

L.




Nicolas Cadieux

Oct 5, 2023, 4:07:31 PM
to openspa...@googlegroups.com
Hi,
If I understand Luc, you will be able to reproject if the shapefile contains a proper .prj. If you have difficulty doing this in GeoDa, you could try doing it in a GIS like QGIS. You will still need to specify the original file CRS if it’s missing.

If you have a project spanning the entire USA, then using latitudes and longitudes will be a problem, as a degree of latitude is not the same distance as a degree of longitude unless your data is all on the equator. Reprojecting to 5071 or something like 102005 (both using meters for X and Y) could be good options. (I don’t work much with data for the USA.) Conus Albers should not be an issue, but a shapefile without a proper .prj file could be a problem.
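For intuition about what a projection such as 102005 does to longitude/latitude values, here is a minimal sketch of the spherical equidistant conic formulas. The parameters (standard parallels at 33 and 45 degrees, origin at 39N, 96W) are assumed to match what epsg.io lists for ESRI:102005 and should be checked against that page. The sketch uses a sphere rather than the NAD83 ellipsoid, so its output will differ somewhat from what GeoDa or QGIS produce. The function name is hypothetical.

```python
import math

R = 6371000.0  # mean Earth radius in metres (sphere, not the GRS80 ellipsoid)

# Assumed ESRI:102005 parameters, in degrees
LAT_0, LON_0 = 39.0, -96.0   # projection origin
LAT_1, LAT_2 = 33.0, 45.0    # standard parallels


def eqdc_forward(lon, lat):
    """Project lon/lat in degrees to x/y in metres (spherical equidistant conic)."""
    phi, lam = math.radians(lat), math.radians(lon)
    phi0, lam0 = math.radians(LAT_0), math.radians(LON_0)
    phi1, phi2 = math.radians(LAT_1), math.radians(LAT_2)
    n = (math.cos(phi1) - math.cos(phi2)) / (phi2 - phi1)  # cone constant
    g = math.cos(phi1) / n + phi1
    rho0 = R * (g - phi0)        # radius of the parallel through the origin
    rho = R * (g - phi)          # radius of the parallel through the point
    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


x, y = eqdc_forward(-96.0, 40.0)  # one degree north of the origin
print(f"x={x:.1f} m, y={y:.1f} m")
```

Note that no map projection preserves all distances. The equidistant conic keeps distances true only along meridians and along the two standard parallels, while Conus Albers preserves area instead.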

If you happen to reproject data from 4269 (NAD83) to 4326 (WGS84), don’t be surprised if you don’t see any tangible differences. The original versions of NAD83 and WGS84 are considered identical by most software, as the differences don’t matter unless you’re a surveyor working at sub-metre (or cm) precision.

Nicolas Cadieux


Lorenz Eck

Oct 6, 2023, 4:39:05 AM
to Openspace List
Many thanks for your answers.  I've created the original spatial layer as a geodataframe using the shapefile from the US Census Bureau, and when I import it into GeoDa, everything appears to be in order:

[Screenshots: 1.PNG, 2.PNG]

When attempting to reproject it using ESRI:102005 (https://epsg.io/102005) and saving it as a GeoJSON file, the EPSG code changes to EPSG:4326:

[Screenshot: 3.PNG]
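One possible explanation, offered as an assumption rather than a confirmed diagnosis of GeoDa's behaviour: the current GeoJSON specification (RFC 7946) defines coordinates as WGS84 longitude/latitude, so software may convert back to EPSG:4326 when writing that format. A quick way to see what actually ended up in a GeoJSON file is to inspect the coordinate ranges. The sketch below uses only the Python standard library; the function names and the inline sample data are hypothetical.

```python
import json


def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def looks_like_degrees(feature_collection):
    """True if every coordinate fits within lon [-180, 180] and lat [-90, 90]."""
    for feature in feature_collection["features"]:
        geom = feature.get("geometry")
        # Geometries without a "coordinates" member (e.g. GeometryCollection) are skipped.
        if not geom or "coordinates" not in geom:
            continue
        for x, y in iter_positions(geom["coordinates"]):
            if abs(x) > 180 or abs(y) > 90:
                return False
    return True


# Inline sample standing in for json.load() on a real file.
sample = json.loads(
    '{"type": "FeatureCollection", "features": ['
    '{"type": "Feature", "properties": {}, '
    '"geometry": {"type": "Point", "coordinates": [-96.0, 39.0]}}]}'
)
print(looks_like_degrees(sample))
```

Values within the degree ranges suggest the file holds geographic coordinates, while values in the hundreds of thousands point to projected metres. Formats such as GeoPackage store the CRS alongside the data, which may be why the .gpkg export behaves differently.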

By employing the same approach, which involves copying the proj.4 definition from https://epsg.io/102005, and saving it as a .gpkg file, the results appear to be more adequate:

[Screenshots: 4.PNG, 5.PNG]

However, I have noticed that "Lower left" and "Upper right" appear to be incongruent with the projected bounds provided by https://epsg.io/102005.

Furthermore, upon attempting to utilize the downloaded proj4 file for reprojection, an error message is encountered:
[Screenshot: 6.PNG]

Moreover, when utilizing https://spatialreference.org/ref/esri/102005/, the outcome is a map oriented at a 90-degree angle. 

Many thanks,
Lorenz