Numpy Jagged Arrays

Chris Uchytil

Dec 29, 2016, 10:04:04 PM
to Numba Public Discussion - Public
Not sure if this should be a NumPy question, but it relates to Numba tangentially. I am passing an array into my kernel that is composed of 1D arrays, each of which is a flattened 3D array representing a box. The problem is that the boxes are not all the same size. To create the array and send it to the kernel, I am doing something along these lines.
(self.imported_object_info_dict is a dictionary of info on each box object, like its dimensions)

imported_object_gpu_objects = []
for object_key in self.imported_object_info_dict.keys():
    imported_object_gpu_objects.append(self.imported_object_info_dict[object_key]['cpuobject'])

When I want to turn this list of arrays into a single array, I just do this:

np.array(imported_object_gpu_objects, dtype=np.float32)

and send that numpy array to the kernel.

If all the objects in the imported_object_gpu_objects list have the same dimensions, it works perfectly. Unfortunately, the objects will very rarely all be the same size, and when they aren't I get this error: "ValueError: setting an array element with a sequence". From what I've seen online people recommend doing stuff like setting the dtype = object. Will this work with Numba? Are there other ways to pass in jagged arrays or is there another package other than numpy that can create arrays that I can pass to Numba that are jagged?

Chris Uchytil

Dec 30, 2016, 2:05:27 PM
to Numba Public Discussion - Public
I currently have a for loop within the kernel that goes through each box. Would there be any issues if I moved the for loop outside the kernel and had a for loop something like this?

for obj in object_list:
    kernel[16,16](obj,
                  stuff...)

I've never seen a kernel launched inside a for loop before, so I'm not sure whether this is a good idea. Could there be any issues if I were to do this?

Siu Kwan Lam

Dec 30, 2016, 2:58:31 PM
to Numba Public Discussion - Public
> From what I've seen online people recommend doing stuff like setting the dtype = object. Will this work with Numba? 

No, you cannot pass arbitrary objects to Numba. Numba needs to know how to compile the dtype into a machine representation (beyond just a pointer).
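
To illustrate the NumPy side of this, here is a minimal sketch (the array names and sizes are made up for the example). It only shows what NumPy produces for ragged input; it does not call Numba. With a numeric dtype the conversion fails, and with dtype=object the result is a 1D array whose elements are references to separate Python objects rather than a contiguous block of float32 values.

```python
import numpy as np

# Two hypothetical flattened boxes of different sizes.
a = np.zeros(8, dtype=np.float32)    # e.g. a 2x2x2 box
b = np.zeros(27, dtype=np.float32)   # e.g. a 3x3x3 box

# Ragged input cannot form a rectangular float32 array.
try:
    np.array([a, b], dtype=np.float32)
    ragged_error = None
except ValueError as exc:
    ragged_error = exc

# dtype=object succeeds, but the array only holds references to the
# two original arrays; there is no single numeric buffer behind it.
obj_arr = np.array([a, b], dtype=object)
print(type(ragged_error).__name__, obj_arr.dtype, obj_arr.shape)
```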

> Are there other ways to pass in jagged arrays or is there another package other than numpy that can create arrays that I can pass to Numba that are jagged?

I can think of two ways:

1. Pad the subarrays to the maximum length.
2. Flatten the array into 1D and keep a separate array holding the start index of each subarray. The same approach applies in CUDA C code, since an int** is a pointer-to-pointer and we want to avoid pointer chasing.
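
A minimal sketch of option 2 in NumPy, assuming each box is already a 1D float32 array. The helper name flatten_with_offsets and the example sizes are hypothetical. Both outputs have plain numeric dtypes, so they are the kind of arrays that can be handed to a kernel; inside the kernel, the code handling box i would read the range from offsets[i] up to offsets[i + 1].

```python
import numpy as np

def flatten_with_offsets(boxes):
    """Pack variable-length 1D arrays into one flat array plus start offsets."""
    lengths = np.array([len(b) for b in boxes], dtype=np.int64)
    # offsets[i] is where box i starts; offsets[-1] is the total length.
    offsets = np.zeros(len(boxes) + 1, dtype=np.int64)
    np.cumsum(lengths, out=offsets[1:])
    flat = np.concatenate(boxes).astype(np.float32)
    return flat, offsets

# Two hypothetical flattened boxes of different sizes.
boxes = [np.arange(8, dtype=np.float32),    # e.g. a 2x2x2 box
         np.arange(27, dtype=np.float32)]   # e.g. a 3x3x3 box
flat, offsets = flatten_with_offsets(boxes)

# Box i is the slice flat[offsets[i]:offsets[i + 1]].
second_box = flat[offsets[1]:offsets[2]]
```

Storing one extra entry at the end of offsets means the length of every box, including the last one, can be computed as offsets[i + 1] - offsets[i] without a special case.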

> Would there be any issues if I moved the for loop outside the kernel and had a for loop something like this?

Any control-flow construct outside of the kernel should not affect the kernel. Given that kernel calls are asynchronous with respect to host execution, you may find that the loop ends before all the kernel calls are done.

--
Siu Kwan Lam
Software Engineer
Continuum Analytics

Chris Uchytil

Dec 31, 2016, 3:06:31 PM
to Numba Public Discussion - Public
I might give padding the array a shot. The only worry is that I will potentially be wasting a good deal of memory on the array padding. I guess we'll see how it works. Thanks for the advice.
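
For reference, a minimal sketch of the padding approach together with a quick estimate of how much memory the padding costs. The helper name pad_boxes, the zero fill value, and the box sizes are assumptions for illustration. The lengths array is kept alongside the padded array so that the code consuming each row knows where the real data ends.

```python
import numpy as np

def pad_boxes(boxes, fill=0.0):
    """Pad variable-length 1D arrays into one 2D float32 array."""
    lengths = np.array([len(b) for b in boxes], dtype=np.int64)
    padded = np.full((len(boxes), lengths.max()), fill, dtype=np.float32)
    for i, b in enumerate(boxes):
        padded[i, :len(b)] = b   # real data first, padding after it
    return padded, lengths

# Three hypothetical flattened boxes: 2x2x2, 3x3x3 and 4x4x4.
boxes = [np.ones(8, dtype=np.float32),
         np.ones(27, dtype=np.float32),
         np.ones(64, dtype=np.float32)]
padded, lengths = pad_boxes(boxes)

# Fraction of the padded array that holds no real data.
wasted_fraction = 1.0 - lengths.sum() / padded.size
print(padded.shape, wasted_fraction)
```

If the wasted fraction turns out to be large for the real box sizes, the flattened layout with a separate index array avoids the overhead at the cost of slightly more indexing logic.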