Bug in "where" function introduced after 8.6.1

Jeb Jones

Jan 21, 2025, 1:34:45 PM
to idl-pvwave
The where function worked fine up to at least IDL 8.6.1, but as of version 8.8.1 (and still in 9.0; I don't have a 9.1 installation to test on) it has a bug on arrays containing more than 2^31-1 elements. In such cases, the return values that tell you how many elements are in the result (the count argument and NCOMPLEMENT) are cast as LONG integers instead of LONG64. If the number happens to be less than 2^31, the value is still correct. If it is greater than or equal to 2^31, the value will be a large negative number instead of the correct positive number.
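As a quick sanity check on the numbers in the transcripts in this post, the negative count is what you get when the correct 64-bit count is read as a signed 32-bit value. The sketch below only reproduces the arithmetic; the variable names are illustrative.

```idl
; Worked check of the wraparound, using the counts from the transcripts.
n_total = 94000LL * 27000LL     ; 2538000000 elements in the array
n_true  = 94000LL * 23000LL     ; 2162000000 nonzero elements (rows 4000 and up)
n_wrap  = n_true - 2LL^32       ; same low 32 bits, read as a signed 32-bit value
print, n_wrap                   ; -2132967296
print, n_true gt 2LL^31 - 1     ; 1: the count exceeds the largest LONG
```

The complement count, 376000000, is below 2^31, which is consistent with NM still being correct in the 9.0.0 transcript.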

In a lot of places in my code, I check whether the number of items in the result of a "where" is greater than 0. If it is not, I skip any processing I would have done on those elements, because there should be no elements to process. Where should never return a negative count. When a large negative number is returned instead of a large positive number, my if test fails and those elements are never processed, leading to erroneous results. A workaround is simple, but changing a LOT of instances of this type of construct in my code is a pain.
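For illustration, here is one possible form of the fragile test and an alternative that does not depend on the sign of the count. This is a sketch on a tiny stand-in array, not the actual code from the affected programs; it assumes the NULL keyword is not set, so that where returns the scalar -1 when nothing matches.

```idl
; Small stand-in array; the real case needs more than 2^31-1 elements.
a = [0.0, 1.0, 0.0, 2.0]
w = where(a ne 0, n)

; Fragile on affected versions: n can wrap negative for very large results.
;   if n gt 0 then ...

; Alternative: inspect the index array itself rather than the count.
has_match = w[0] ne -1
if has_match then print, 'processing ', n_elements(w), ' elements'
```

In the 9.0.0 transcript the index array W still has the correct type and length, which is why testing W rather than N sidesteps the wrapped count.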

Correct:
IDL 8.6.1 (linux x86_64 m64).
(c) 2017, Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation.

IDL> a = fltarr(94000,27000)
IDL> a[,4000:]=1
IDL> w = where(a ne 0,n,ncomp=nm)
IDL> help, w,n,nm
W               LONG64    = Array[2162000000]
N               LONG64    =             2162000000
NM              LONG64    =              376000000

Incorrect:
IDL 9.0.0 (linux x86_64 m64).
(c) 2023, NV5 Geospatial Solutions, Inc.

IDL> a = fltarr(94000,27000)
IDL> a[*,4000:*]=1
IDL> w = where(a ne 0,n,ncomp=nm)
IDL> help, w,n,nm
W               LONG64    = Array[2162000000]
N               LONG      =  -2132967296
NM              LONG      =    376000000

Jeb Jones

Jan 21, 2025, 1:42:01 PM
to idl-pvwave
I don't know how the code got messed up (with various cutting/pasting I was doing), and I don't seem to be able to edit the post, but somehow the second command in the "correct" version got munged. It should read like this:

Correct:
IDL 8.6.1 (linux x86_64 m64).
(c) 2017, Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation.

IDL> a = fltarr(94000,27000)
IDL> a[*,4000:*]=1
IDL> w = where(a ne 0,n,ncomp=nm)
IDL> help, w,n,nm
W               LONG64    = Array[2162000000]
N               LONG64    =             2162000000
NM              LONG64    =              376000000

Incorrect:
IDL 9.0.0 (linux x86_64 m64).
(c) 2023, NV5 Geospatial Solutions, Inc.

IDL> a = fltarr(94000,27000)
IDL> a[*,4000:*]=1
IDL> w = where(a ne 0,n,ncomp=nm)
IDL> help, w,n,nm
W               LONG64    = Array[2162000000]
N               LONG      =  -2132967296
NM              LONG      =    376000000

Chris Torrence

Jan 22, 2025, 3:08:04 PM
to idl-p...@googlegroups.com
Hi Jeb,

I just confirmed that this is indeed a bug. Sorry that you encountered that - I’m not sure how that one slipped through the cracks but we will fix it for IDL 9.2.

In the meantime, probably the best solution is to always use the L64 keyword to force the result to be 64-bit integers.
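A minimal sketch of that suggestion, using a scaled-down version of the array from the original report so it runs on any machine. Whether the counts come back as LONG64 on the affected versions is taken from the advice above, not verified here.

```idl
; Scaled-down stand-in for fltarr(94000,27000); same layout, 1000x smaller per axis.
a = fltarr(94, 27)
a[*, 4:*] = 1

; /L64 asks WHERE to return 64-bit integers in all cases.
w = where(a ne 0, n, ncomplement=nm, /L64)
help, w, n, nm
print, n, nm        ; 2162 nonzero elements, 376 zero elements
```

For the full-size array the call is the same; only the fltarr dimensions and the subscript range change.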

Thanks for reporting this!

-Chris
NV5 Geospatial Software

Chris Torrence <gort...@gmail.com>: Jan 21 09:31AM

Hi Brian,
 
It sounds like that shapefile is technically "illegal", but you are getting lucky with QGIS. Shapefile integers are supposed to be long integers (32 bit), so the maximum number should be 2LL^32, or 4294967296. However, since your sample number 4917746437 has the same number of digits, maybe it just gets stored correctly in the file, but then misread by IDL.
 
I can take a look at IDL's shapefile code and see if I can fool it into returning the correct value. It will have to return it as a 64-bit integer (otherwise it won't fit!), so I'll have to be very careful about backwards compatibility...
 
Cheers,
Chris
NV5 Geospatial Software
Brian McNoldy <brian....@gmail.com>: Jan 21 05:36AM -0800

Thanks Chris... given the fields they are storing (areas of counties in
square meters), long integer was probably not the best choice in the file
design. But I was surprised when I probed around QGIS and saw it had
somehow successfully read those values. If having that ZIP file would be
useful and it continues to be unavailable on the census.gov website, I can
email it privately (it's only 900 kb).
 
In QGIS, I opened that file and sorted the "ALAND" attribute by size, and
the largest county is Yukon-Koyukuk, AK (FIPS code 02290) and that's listed
as 377,034,650,847 m^2. I also managed to find the data types it read in
for each attribute. ALAND and AWATER are stored as 14-digit 64-bit long
integers. IDL reads it in as a 32-bit integer. That's the underlying key
issue it seems: data type 14 rather than 3.
 
Brian
 

Jeb Jones

Jan 23, 2025, 11:40:52 AM
to idl-pvwave
Thank you Chris. I wasn't even aware of the L64 keyword until now, but I will use it.

-JJ

Chris Torrence

Jan 30, 2025, 5:05:06 PM
to idl-pvwave
Hi Jeb,

This was indeed a subtle bug, introduced in 2019 to fix a different bug with 32-bit IDL. Anyway, it is now fixed for IDL 9.2. If there are more than 2^31 elements then N and NCOMPLEMENT will always be returned as 64-bit integers.

In the meantime, the L64 keyword should fix the issue.
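For code with many where calls, one way to apply the keyword without touching every argument list might be a thin wrapper. The function name where_l64 is hypothetical (it is not part of IDL); this is a sketch that assumes output keywords such as NCOMPLEMENT and COMPLEMENT are passed through by reference via _REF_EXTRA.

```idl
; Hypothetical helper, not part of IDL: same call shape as WHERE,
; but with /L64 always set so results are 64-bit integers.
function where_l64, array, count, _ref_extra=extra
  compile_opt idl2
  ; Forward any other keywords (COMPLEMENT, NCOMPLEMENT, NULL, ...) unchanged.
  return, where(array, count, /L64, _extra=extra)
end
```

A call then looks like w = where_l64(a ne 0, n, ncomplement=nm), so existing calls could be converted with a search and replace on the function name.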

Thanks again for reporting this,
Chris
NV5 Geospatial Software
