IDL misreads attribute from shapefile

Brian McNoldy

Jan 20, 2025, 1:58:06 PM
to idl-pvwave
I am reading in a standard TIGER shapefile of U.S. counties (https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_county_20m.zip) using IDLffShape.  Almost everything reads in just fine.  Two of the attributes are called ALAND and AWATER (area of land and area of water, both in m^2).

Those do get read in, but if the value is too large, it sets the value to -2147483648.  This is clearly wrong.  When I read the same shapefile into QGIS and look at the attributes, the correct values are in there, so this is an IDL issue, not a shapefile issue.

How can I convince IDL that the attribute should not be a long integer?  In the attribute info vector, those ALAND and AWATER attributes say they are type 3 (long integer) but somehow QGIS is able to retrieve the correct values.  

For a very specific example, Miami-Dade County FL is FIPS code 12086.
In IDL, ALAND = -2147483648 and AWATER = 1378974993.  In QGIS, ALAND = 4917746437 and AWATER = 1378974993.  There are plenty of counties with an area larger than 2147 km^2, and IDL does not like that.  What I can't figure out is how to make IDL read in the attributes correctly.

Thanks,
Brian

Andrew Cool

Jan 20, 2025, 8:10:45 PM
to idl-pvwave
Hi Brian,

I've downloaded that zip file twice, and each time WINZIP refuses to unzip it, complaining that it's not a valid zip file.

The problem may not be of IDL's making...?

Perhaps you're seeing the first anti-science effects of the new Presidency... ;-)

And just now, 5 minutes later, the site is down for maintenance!

Andrew

Chris Torrence

Jan 21, 2025, 4:31:42 AM
to idl-p...@googlegroups.com
Hi Brian,

It sounds like that shapefile is technically "illegal", but you are getting lucky with QGIS. Shapefile integers are supposed to be long integers (signed 32 bit), so the maximum number should be 2LL^31 - 1, or 2147483647. However, since your sample number 4917746437 has the same number of digits (10), maybe it just gets stored correctly in the file, but then misread by IDL.
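For context on "stored correctly in the file": shapefile attributes live in the companion .dbf table, where numeric fields are written as fixed-width ASCII text, so a value only overflows once a reader converts that text to a 32-bit integer. The following is a minimal Python sketch (not IDL, and not how IDLffShape is implemented) that assumes the standard dBASE III header layout. The helper names `build_dbf` and `read_dbf` are hypothetical, and the tiny in-memory table with its field widths is a stand-in for the real county file.

```python
import struct

def build_dbf(fields, rows):
    """Build a minimal dBASE III table in memory (illustrative helper only)."""
    record_len = 1 + sum(width for _, _, width, _ in fields)
    header_len = 32 + 32 * len(fields) + 1
    # 32-byte header: version, date (YY MM DD), record count, header size, record size.
    out = struct.pack("<B3BIHH20x", 0x03, 125, 1, 20,
                      len(rows), header_len, record_len)
    for name, ftype, width, decimals in fields:
        # 32-byte field descriptor: name, type char, width, decimal count.
        out += struct.pack("<11sc4xBB14x", name.encode("ascii"),
                           ftype.encode("ascii"), width, decimals)
    out += b"\x0d"  # header terminator
    for row in rows:
        out += b" "  # deletion flag: a space means "not deleted"
        for (_, _, width, _), value in zip(fields, row):
            out += str(value).rjust(width).encode("ascii")
    return out + b"\x1a"  # end-of-file marker

def read_dbf(data):
    """Return (field descriptors, records as raw text) from dBASE III bytes."""
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    fields, pos = [], 32
    while data[pos] != 0x0D:
        name, ftype, width, decimals = struct.unpack_from("<11sc4xBB", data, pos)
        fields.append((name.split(b"\x00")[0].decode("ascii"),
                       ftype.decode("ascii"), width, decimals))
        pos += 32
    records = []
    for i in range(n_records):
        start = header_len + i * record_len + 1  # skip the deletion flag
        row = {}
        for name, _, width, _ in fields:
            row[name] = data[start:start + width].decode("ascii")
            start += width
        records.append(row)
    return fields, records

# Assumed layout for the demo: a 5-character ID plus two 14-character numeric fields.
fields = [("GEOID", "C", 5, 0), ("ALAND", "N", 14, 0), ("AWATER", "N", 14, 0)]
table = build_dbf(fields, [("12086", 4917746437, 1378974993)])

parsed_fields, parsed_rows = read_dbf(table)
aland_text = parsed_rows[0]["ALAND"]      # the digits exactly as stored
aland = int(aland_text)                   # Python ints do not overflow
fits_int32 = -2**31 <= aland <= 2**31 - 1
print(parsed_fields[1], repr(aland_text), aland, fits_int32)
```

The digits survive intact on disk; whether the value is read correctly depends only on the integer width the reader chooses when parsing the text.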

I can take a look at IDL's shapefile code and see if I can fool it into returning the correct value. It will have to return it as a 64-bit integer (otherwise it won't fit!), so I'll have to be very careful about backwards compatibility...

Cheers,
Chris
NV5 Geospatial Software

Brian McNoldy

Jan 21, 2025, 8:36:19 AM
to idl-pvwave
Thanks Chris... given the fields they are storing (areas of counties in square meters), long integer was probably not the best choice in the file design.  But I was surprised when I probed around QGIS and saw it had somehow successfully read those values.  If having that ZIP file would be useful and it continues to be unavailable on the census.gov website, I can email it privately (it's only 900 kb).

In QGIS, I opened that file and sorted the "ALAND" attribute by size, and the largest county is Yukon-Koyukuk, AK (FIPS code 02290) and that's listed as 377,034,650,847 m^2.  I also managed to find the data types it read in for each attribute.  ALAND and AWATER are stored as 14-digit 64-bit long integers.  IDL reads it in as a 32-bit integer.  That's the underlying key issue it seems: data type 14 rather than 3.

Brian

Chris Torrence

Jan 31, 2025, 4:29:28 PM
to idl-pvwave
Hi Brian,

This has been "fixed" for IDL 9.2. Fixed is in quotes, because those integers are really outside of the ESRI specification. In IDL 9.2, for large integer attributes (width >= 10) we now scan through the attribute values. As soon as we find one outside of the 32-bit integer range (-2147483648 to 2147483647) then we stop and change the datatype for that attribute to IDL type DOUBLE.
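As a rough illustration of the scan described above, here is a minimal Python sketch. It is not IDL's actual code: the function name `convert_integer_column` and the `"long"`/`"double"` labels are hypothetical, and only the rule itself (width >= 10, stop at the first value outside the 32-bit range, promote the attribute to double) comes from the description. The width threshold makes sense because a 9-character field can hold at most 999999999, which always fits in 32 bits.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def convert_integer_column(text_values, width):
    """Return ("long", ints) or ("double", floats) for one integer attribute.

    text_values are the raw ASCII field contents; width is the field width.
    """
    values = [int(t) for t in text_values]
    if width >= 10:  # narrower fields cannot exceed the 32-bit range
        for v in values:
            if not INT32_MIN <= v <= INT32_MAX:
                # First out-of-range value: stop scanning, promote the column.
                return "double", [float(x) for x in values]
    return "long", values

# Values quoted in this thread (Miami-Dade and Yukon-Koyukuk ALAND, Miami-Dade AWATER).
aland_kind, aland = convert_integer_column(["4917746437", "377034650847"], width=14)
awater_kind, awater = convert_integer_column(["1378974993"], width=14)
print(aland_kind, aland)
print(awater_kind, awater)
```

In this sketch a column stays an ordinary 32-bit integer attribute unless at least one of its values needs more room, which keeps well-behaved shapefiles reading exactly as before.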

I thought about returning a 64-bit integer for those values, but that leads to problems since a 64-bit int is no longer compatible with the ESRI standard. For example, if you read the data in as 64-bit integers and then tried to write it back out to a new shapefile, the underlying library would truncate the values. So the only viable solution is to just return the values as doubles, which have plenty of room to store huge integers, and will be successfully written back out to a new shapefile.
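To make "plenty of room" concrete: an IEEE 754 double has a 53-bit significand, so every integer with magnitude up to 2^53 (about 9.0e15) is represented exactly. A short Python check, using the largest ALAND value quoted in this thread and assuming the 14-character field width Brian reported from QGIS:

```python
largest_aland = 377034650847  # Yukon-Koyukuk, AK, in m^2 (value from this thread)

# Converting to a double and back loses nothing for a value this size.
as_double = float(largest_aland)
round_trips = int(as_double) == largest_aland

# Integers stay exact up to 2**53; the next integer up can no longer be stored.
exact_limit = 2**53
first_loss = float(exact_limit + 1) == float(exact_limit)

# The largest number a 14-digit field can hold is still below that limit.
fits_in_14_digits = 10**14 - 1 < exact_limit

print(round_trips, first_loss, fits_in_14_digits)
```

So any integer that fits in a 14-digit attribute field survives the conversion to double without rounding.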

Thanks again for the sample files and bringing this to our attention.
-Chris
NV5 Geospatial Software

Brian McNoldy

Feb 4, 2025, 2:28:52 PM
to idl-pvwave
Thanks Chris!  Good to know... and I'll keep my eye open for 9.2!
So the reason QGIS was able to read those values in as 64-bit integers is because it allows for non-standard attribute values?  That was the original confusion for me: two different interpretations of "simply" reading in attributes from a file.

Cheers,
Brian

Chris Torrence

Feb 6, 2025, 3:01:26 PM
to idl-pvwave
Hi Brian,

I just tried your file in QGIS. The interesting thing is that it reports those ALAND and AWATER attributes as type "Integer (64 bit)". That's fine as far as working with them within QGIS, but it certainly will cause problems with other software tools that expect standard shapefiles.

We could theoretically do something similar in IDL, but I hesitate to do this because people might have written software packages in IDL (like ENVI) that only expect the basic Shapefile types.

Anyway, the good news is that starting with IDL 9.2, you'll be able to read those fields into IDL with no problem, and they'll get exported out correctly (as double-precision floats) so the output files will play nicely with other tools.

Cheers,
Chris
