Hi Victor,
I'm not an expert on CuPy, but I think it comes at the GPU acceleration problem from a different (and equally useful) angle than Numba. Numba is primarily designed to compile custom algorithms written in Python for targets like the CPU and GPU. Numba includes a "GPU device array" object, which is fairly barebones and exists simply because we couldn't write a compiler for GPU kernels without a place to put GPU data.
On the other hand, CuPy looks like a fairly complete reimplementation of the NumPy API on the GPU. This is great, because NumPy's API is familiar and flexible. Constructing custom kernels to fuse operations together is something CuPy also appears to support, though with less flexibility than Numba.
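To make the "reimplementation of the NumPy API" point concrete: because CuPy tracks NumPy so closely, a common community pattern is to write routines against a single namespace (often called `xp`) and point it at either library. This is a hedged sketch of that pattern, not code from either project; it runs as-is with NumPy, and the comment shows the one-line swap for the GPU.

```python
# Sketch of the NumPy/CuPy API-compatibility pattern. Swap the import for
# "import cupy as xp" and the same function runs on the GPU (assuming a
# working CUDA install), since CuPy mirrors these NumPy routines.
import numpy as xp

def normalize(a):
    """Scale an array to zero mean and unit variance."""
    return (a - xp.mean(a)) / xp.std(a)

a = xp.asarray([1.0, 2.0, 3.0, 4.0])
b = normalize(a)
print(xp.allclose(xp.mean(b), 0.0), xp.allclose(xp.std(b), 1.0))
```

The point isn't this particular function; it's that code written in NumPy terms often ports to the GPU with no algorithmic changes at all.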
It is interesting to note that there are now several reimplementations of the NumPy array on the GPU, each created incidentally to support some larger GPU computing project:
- PyTorch tensors
- CuPy ndarrays
- TensorFlow tensors
- Numba DeviceArrays
- PyCUDA DeviceAllocations
We are hugely in favor of an initiative where all of these implementations agree on a common way of sharing their GPU device pointer, memory layout, and element type, so that data from one package can be passed to any other. Numba has a private implementation of this concept, but we would happily abandon it for a shared standard. CuPy seems like an excellent Python-based GPU container, and we'd love to have Numba support reading and writing data in it.
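For a sense of how small such a standard could be, here is a hypothetical sketch of the metadata one library would need to expose for another to adopt its GPU data. Nothing here is an existing protocol, and every name is invented; it just packages the three pieces of information mentioned above (device pointer, memory layout, element type).

```python
# Hypothetical descriptor for sharing GPU arrays between libraries.
# These names are illustrative only -- no such standard exists yet.
from typing import Tuple

class GPUArrayDescription:
    """Minimal metadata needed to interpret another library's GPU buffer."""

    def __init__(self, device_pointer: int, shape: Tuple[int, ...],
                 strides: Tuple[int, ...], typestr: str):
        self.device_pointer = device_pointer  # raw CUDA device address
        self.shape = shape                    # dimensions, NumPy-style
        self.strides = strides                # memory layout, in bytes
        self.typestr = typestr                # element type, e.g. "<f4"

# A C-contiguous 4x4 array of little-endian float32 at some device address:
desc = GPUArrayDescription(0x7F00_0000, (4, 4), (16, 4), "<f4")
```

Given something like this, each package could construct its own array object around a foreign buffer without copying the data.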
That said, none of the above interop exists today, so I would make the following suggestion: CuPy sounds like a good choice for basic NumPy-like GPU computations. If you run into things that are hard to express in CuPy, Numba would be a great tool for solving that problem (and hopefully we'll have a way to share data with CuPy in the near future). If you want to write more complex GPU code where you have direct control over the operations of individual CUDA threads, Numba is the better starting point.