Recommended manner of dealing with small arrays; or compile-time constant arrays


Eelco Hoogendoorn

Sep 6, 2017, 4:52:33 AM
to Numba Public Discussion - Public
I frequently find myself faced with arrays that are 'big' along some dimensions, but small (2, 3, 4) along others. In an ideal world, we could inform numba that we would like to specialize our functions on the shape of our arrays along an axis, so that some array dimensions become compile-time constants; but in the absence of that, how do we efficiently deal with small vector sizes?

The function below is obviously silly; there are easier ways to do an elementwise multiplication of an array. But you end up with similar patterns when, for instance, implementing a raytracer:

import numpy as np

def process_element(a, b):
    r = np.empty_like(a)  # what happens here? I would really want this to be a stack allocation, but the size is not known at compile time, so I guess not?
    for i in range(len(a)):  # this is bad; len(a) is not a compile-time constant, so this loop will never be unrolled
        r[i] = a[i] * b[i]
    return r


def process_array(arr):
    output = np.empty_like(arr)
    for i in range(len(arr)):
        output[i] = process_element(arr[i], arr[i])
    return output
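One way to get the "specialize on size" behavior I'm after might be a factory that captures the length in a closure. This is a pure-Python sketch (the factory name and the explicit `r` output argument are my own invention); the idea is that a JIT such as numba's njit treats closed-over values as compile-time constants, so the loop bound becomes fixed:

```python
def make_process_element(n):
    # hypothetical factory: 'n' is captured in the closure, so a JIT that
    # freezes closure variables at compile time can unroll the loop
    def process_element(a, b, r):
        for i in range(n):
            r[i] = a[i] * b[i]
        return r
    return process_element  # e.g. wrap with numba.njit(...) when numba is present

# one specialized function per small size
mul3 = make_process_element(3)
```

The cost is one compiled specialization per distinct size, which is fine when the small sizes are 2, 3, or 4.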




Struct arrays can help here, I suppose: if we define a dtype([(np.int, 3)]) and make a len==1 array of that, would numba stack-allocate that 'scalar'? And would it unroll loops over the length of that structure? Is there an idiomatic pattern for approaching this?
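For reference, the structured-dtype idea needs a field name to be valid NumPy; a minimal sketch (the field name 'v' is purely illustrative) looks like this:

```python
import numpy as np

# structured dtype with a fixed-size length-3 subarray per record;
# whether numba stack-allocates a single record is exactly the open question
vec3 = np.dtype([('v', np.int64, 3)])
arr = np.zeros(4, dtype=vec3)
arr['v'][:, 0] = 7  # the subarray field behaves like a (4, 3) view
```

Accessing the field gives an ordinary ndarray view, so plain NumPy code keeps working on it.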

And in general, is there a way to allocate array-like objects on the stack? Or should we always work with views into larger blocks of memory allocated on the heap? (In the example above: make r an output arg.)
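The output-argument variant I have in mind would look roughly like this (a pure-Python sketch with hypothetical `_into` names; the caller owns the one heap allocation and the inner function only writes through views):

```python
def process_element_into(a, b, out):
    # no allocation here: 'out' is a row of a caller-owned buffer
    for i in range(len(a)):
        out[i] = a[i] * b[i]

def process_array_into(a, b, out):
    # a single up-front allocation by the caller; inner calls reuse views
    for i in range(len(a)):
        process_element_into(a[i], b[i], out[i])
    return out
```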

Most examples and compelling benchmarks of numba I've seen appear to focus on more straightforward array processing, and not on 'complex' data structures. Am I simply asking too much of numba in trying to write something like a raytracer in it and expecting it to be competitive in expressiveness and performance with existing high-performance languages? Or can someone point me to a project that does tackle the concerns raised here?

Regards,
Eelco

Stanley Seibert

Sep 6, 2017, 11:34:04 AM
to Numba Public Discussion - Public
Hi Eelco,

We've talked about stack allocation of small arrays, as we've also seen the need in ray-tracing-like projects that we've done with Numba. We don't have a solution for this at the moment, and you correctly identify that more complex data structures are not handled well in Numba. This is something we want to improve, but I don't have an ETA for you.

As for alternatives, Cython is the usual go-to when you want compiled code, more complex data structures, and to be callable from Python.

--
You received this message because you are subscribed to the Google Groups "Numba Public Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to numba-users+unsubscribe@continuum.io.
To post to this group, send email to numba...@continuum.io.
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/numba-users/c4033061-452a-45be-b8ea-c60648dfd013%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Eelco Hoogendoorn

Sep 6, 2017, 3:12:00 PM
to Numba Public Discussion - Public
Hi Stanley,

Thanks for the quick feedback. I suppose it is in part a limitation that can be designed around, but it does feel like an aspect with potential for improvement.

How are tuples handled? Do they not compile to arrays of compile-time-constant size, allocated on the stack? It feels like x=1; y=2; z=3 and v=(1,2,3) are minimally different from a language/compiler perspective, but I might be missing a ton of details.

Regards,
Eelco



Stanley Seibert

Sep 6, 2017, 3:15:25 PM
to Numba Public Discussion - Public
Tuples are handled like C structs, with the size fixed at compile time. This does allow LLVM to do quite a bit of optimization with them, but there is currently no "ndarray view on a tuple" that would let you easily do numerical operations on tuples (beyond the usual things Python tuples allow).
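Concretely, the tuple style means writing the small vector operations out in full. A minimal sketch (the function name is illustrative):

```python
def mul3(a, b):
    # element-wise product of two length-3 tuples; since the size is fixed
    # in the source, there is no loop left to unroll, and under a JIT the
    # tuple can live as a stack value like a C struct
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2])
```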


Eelco Hoogendoorn

Sep 6, 2017, 4:57:31 PM
to Numba Public Discussion - Public
Hmm, I thought tuples would help me a lot, but on reflection not so much: you lose all ability to write code that is generic over the number of vector components by being restricted to Python's immutable tuple semantics.
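One possible way to recover genericity over the component count, sketched here as an assumption rather than an established numba idiom, is to generate the fully unrolled tuple expression per size (the generated function could then in principle be jitted):

```python
def make_tuple_mul(n):
    # build an unrolled element-wise multiply for length-n tuples;
    # hypothetical workaround for tuples' fixed, non-generic arity
    body = ", ".join(f"a[{i}] * b[{i}]" for i in range(n))
    namespace = {}
    exec(f"def mul(a, b):\n    return ({body},)", namespace)
    return namespace["mul"]
```

The source stays generic over n while each generated function has a size fixed at definition time.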

Thanks again for the input.
