Faster REVERSE() for 3d images

126 views
Skip to first unread message

Wayne Landsman

unread,
Jan 24, 2020, 11:55:13 AM1/24/20
to idl-pvwave
I have lots of data cubes that need to be descrambled before they can be analyzed.
I was trying to figue out why the I/O was so slow and the culprit turned out to be my use of the REVERSE() function for 3d images.
So I wrote my own 4 line code and found a factor of ~20 speed improvement.

function reverse3,im
;Much faster than REVERSE(,1) for 3d images
dim = size(im,/dim)
for i=0,dim[2]-1 do $
     im[0,0,i] = rotate(im[*,*,i],5)    ;Rotate(,5) =  Reverse(,1) for 2d
return,im
end

Below is a sample test to show the speed improvement.

IDL> a = randomn(seed,128,2048,50)
IDL> acopy=a
IDL> tic & b = reverse(a,1) & toc
% Time elapsed: 1.7325780 seconds.
IDL> tic & b1 = reverse3(acopy) & toc 
% Time elapsed: 0.080199957 seconds.
IDL> array_equal(b,b1)
   1
IDL> print,!version
{ x86_64 linux unix linux 8.5 Jul  7 2015      64      64}
 

REVERSE() is a standard IDL function, although the source code is available.   
The code was updated to handle 3D in 1991.
It comments that the code could be faster but that it needs to conserve memory.
I would think that computer memory is slightly more available now than in 1991 ;-)

--Wayne

alx

unread,
Jan 24, 2020, 1:23:13 PM1/24/20
to idl-pvwave
An improvement can be noticed (not so efficient as yours), when a permutation matrix is used instead of the REVERSE function.

IDL> a = randomn(seed,128,2048,50)
IDL> tic & b=reverse(a,1) & toc
% Time elapsed: 1.4530001 seconds.

IDL> perm=rotate(identity(128),5)
IDL> tic & b1=reform(perm#reform(a,(a.dim)[0],a.length/(a.dim)[0],/OVER),a.dim,/OVER) & toc
% Time elapsed: 0.37500000 seconds.

IDL> array_equal(b,b1)
   1
IDL> print,!version
{ x86_64 Win32 Windows Microsoft Windows 8.7.2 Feb  7 2019 (363260)      64      64}

alx

Brian G

unread,
Jan 25, 2020, 1:28:33 PM1/25/20
to idl-pvwave
Here is an approach that uses shared memory.  The SHMMAP procedure lets you specify an offset into the block of shared memory, so you can have multiple IDL variables pointing to different parts of the same memory.  It's a trick I used in ENVIBandMathRaster to avoid making copies of each band of the raster being manipulated.

pro shm_reverse, data
  compile_opt idl2

 
DataSize = [ 0, 1, 2, 4, 4, 8, 8, 0, 0, 16, 0, 0, 2, 4, 8, 8 ]
  dim
= size(data, /dim)
  type = size(data, /
type)

  shmmap
, 'myCube', dim, Type=type, Get_OS_Handle=handle
  cube
= shmvar('myCube')
  shmunmap
, 'myCube'
  cube
[0] = data[*]

 
for i = 0, dim[2]-1 do begin
    shmmap
, 'slice', dim[0:1], Type=type, OS_Handle=handle, Offset=i*dim[0]*dim[1]*DataSize[type]
    slice
= shmvar('slice')
    shmunmap
, 'slice'
    slice
[0, 0] = rotate(slice, 5)
    slice
= 0
  endfor

  data
[0] = cube[*]
end


The procedure does an in-place reversal.  It starts by allocating a 3D block of memory the same size as the input, and copied the input array into shared memory.  It then iterates over the last dimension, and creates a 2D shared memory variable, which is offset to the beginning of the Nth 2D slice of the 3D cube in shared memory.  It uses ROTATE to perform the flip on the 2D array, and then releases the shared memory for the next loop iteration.  It finishes by copying the reversed 3D cube back into the input variable.

There are limits to this approach, I've heard on Linux that you can make an OS setting the controls how much shared memory any given process can allocate, so if you had huge arrays you might run into trouble there.  Some of the time of this procedure is the copying into and out of shared memory, so if you could start with a shared memory block as you are loading your data you could avoid that hit.
Timing on IDL 8.7.2 on my 10 year old Core i7 920:
IDL> a = randomn(seed, 128, 2048, 50)
IDL> tic & b = reverse(a, 1) & toc
% Time elapsed: 0.92100000 seconds.
IDL> tic & shm_reverse, a & toc
% Time elapsed: 0.22000003 seconds.
IDL> array_equal(b, a)
   1

Wayne Landsman

unread,
Jan 27, 2020, 12:43:01 PM1/27/20
to idl-pvwave
Thanks for this.   I wasn't aware of the a.length and a.dim methods for IDL variables.      They don't seem to be listed as methods for IDL_Variable
Are these undocumented methods?    --Wayne

John Correira

unread,
Jan 27, 2020, 12:50:53 PM1/27/20
to idl-pvwave
That's because they are not methods but variable attributes:

Wayne Landsman

unread,
Jan 27, 2020, 3:16:36 PM1/27/20
to idl-pvwave
Thanks!    I failed to notice the introduction of variable attributes in IDL 8.4.

About the existing REVERSE() code.    It actually is quite fast for rotating about the Y or Z axes, but more than 20 times slower for rotating about the X axis.  L3Harris should probably update the code.   --Wayne

IDL> im = randomn(seed,128,4096,50) 
IDL> tic & c = reverse(im,3) & toc 
% Time elapsed: 0.11316800 seconds.
IDL> tic & c = reverse(im,2) & toc
% Time elapsed: 0.11547899 seconds.
IDL> tic & c = reverse(im,1) & toc
% Time elapsed: 3.1253319 seconds.

Brian G

unread,
May 11, 2020, 4:25:24 PM5/11/20
to idl-pvwave
I am pleased to let you know that I have port the Reverse() function from the lib folder into idl.dll, where it is much uniformly performant.  This will be included in the upcoming IDL 8.8 release.
Here are my numbers running the same array:
IDL> im = randomn(seed, 128, 4096, 50)
IDL> tic & c = reverse(im, 1) & toc
% Time elapsed: 0.094000101 seconds.
IDL> tic & c = reverse(im, 2) & toc
% Time elapsed: 0.052000046 seconds.
IDL> tic & c = reverse(im, 3) & toc
% Time elapsed: 0.078000069 seconds.

In porting this to C code, I also added a couple extra features:
  • The OVERWRITE keyword now behaves the same as other functions, such as Reform().  The PRO code version could not have 2 IDL variables share the same memory, so it would use TEMPORARY() to move the input variable to the return value. Now both the input array and return value will share the same memory, and you can call it as !null = Reverse(arr, 1) if you only want 1 variable.
  • Any value in [1, 8] is now allowed for the second argument, regardless of the input array dimensions.  If you pass in a dimension index greater than the number of dimensions, the input array is returned.  This simplifies code when dealing with BSQ images, you don't have to check if your pixel buffer is 2D or 3D to "flip" on the 3rd (bands) dimension.
  • Negative dimension index values are allowed.  If you want to flip on the last dimension, call it as Reverse(arr, -1).
Attached are some performance graphs that compare the old and new versions of Reverse().  The solid lines do not specify the OVERWRITE keyword, the dashed lines do. The bright colors are the new performance, the pastel versions are the old performance.  All the arrays are of Long64, so the largest arrays consumed 8GB of memory.  The old implementation could take upwards of a minute on my machine for the very large arrays, the new implementation is always under 10 seconds, even for 2^30 elements.

Brian
IDL Tech Lead
1d_perf-blend.png
2d_perf-blend.png
3d_perf-blend.png
Reply all
Reply to author
Forward
0 new messages