Perhaps this has been asked before but I didn't find it in the archives. I have some code within a prange block that calls a simple function to aid in the aggregation but every time it is called, I get a segfault. I'm not very familiar with OpenMP-- perhaps this is not allowed?
Here's some code to repro:
cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil: cdef double total = 0. cdef Py_ssize_t T = X.shape[0] cdef Py_ssize_t t for t in xrange(T): total += X[t] return total
cpdef test_parallel(np.ndarray[np.double_t, ndim=1] X): cdef Py_ssize_t T = X.shape[0] cdef double total = 0. cdef Py_ssize_t t for t in prange(T, nogil=True): total += sum_elements(X) print total
And to run the code: $ python -c "import scipy; import mytest; my_test.test_parallel(scipy.ones(5, dtype=scipy.float64))"
Running Intel-based 64bit Ubuntu 11.04 with Cython 0.15.1 and python 2.7.1 and gcc 4.5.2.
Thanks for your eyes on my code! -- Jake Biesinger Graduate Student Xie Lab, UC Irvine
On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> wrote:
> Hi Cythoners, > Perhaps this has been asked before but I didn't find it in the archives. I > have some code within a prange block that calls a simple function to aid in > the aggregation but every time it is called, I get a segfault. I'm not very > familiar with OpenMP-- perhaps this is not allowed? > Here's some code to repro: > cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef nogil functions (it tries to acquire a buffer without the GIL, which is invalid). The compiler should give an error for this code, but unfortunately it doesn't (this is a known bug).
Instead you could try out memoryviews, which will be new in the upcoming Cython release. For that you'd have to get a copy of Cython on github: https://github.com/cython/cython
You basically have to write
cdef double sum_elements(double[:] X) nogil: ...
and the same goes for your test_parallel function. If your buffers are contiguous you could specify 'double[::1]', which is the same as 'mode="c"' in the current buffer support. The buffers can also be sliced, transposed and some other tricks can be done without the GIL. The change from one syntax to another is unfortunate, but it basically looks cleaner and is more powerful. The older syntax will be deprecated (in the upcoming release or one or two releases thereafter) and maybe even removed eventually.
> cdef double total = 0. > cdef Py_ssize_t T = X.shape[0] > cdef Py_ssize_t t > for t in xrange(T): > total += X[t] > return total
> cpdef test_parallel(np.ndarray[np.double_t, ndim=1] X): > cdef Py_ssize_t T = X.shape[0] > cdef double total = 0. > cdef Py_ssize_t t > for t in prange(T, nogil=True): > total += sum_elements(X) > print total
> And to run the code: > $ python -c "import scipy; import mytest; > my_test.test_parallel(scipy.ones(5, dtype=scipy.float64))" > Running Intel-based 64bit Ubuntu 11.04 with Cython 0.15.1 and python 2.7.1 > and gcc 4.5.2. > Thanks for your eyes on my code! > -- > Jake Biesinger > Graduate Student > Xie Lab, UC Irvine
> On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> > wrote: > > Hi Cythoners, > > Perhaps this has been asked before but I didn't find it in the archives. > I > > have some code within a prange block that calls a simple function to aid > in > > the aggregation but every time it is called, I get a segfault. I'm not > very > > familiar with OpenMP-- perhaps this is not allowed? > > Here's some code to repro: > > cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
> Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef > nogil functions (it tries to acquire a buffer without the GIL, which > is invalid). The compiler should give an error for this code, but > unfortunately it doesn't (this is a known bug).
> Instead you could try out memoryviews, which will be new in the > upcoming Cython release. For that you'd have to get a copy of Cython > on github: https://github.com/cython/cython
> and the same goes for your test_parallel function. If your buffers are > contiguous you could specify 'double[::1]', which is the same as > 'mode="c"' in the current buffer support. The buffers can also be > sliced, transposed and some other tricks can be done without the GIL. > The change from one syntax to another is unfortunate, but it basically > looks cleaner and is more powerful. The older syntax will be > deprecated (in the upcoming release or one or two releases thereafter) > and maybe even removed eventually.
How would you specify multidimensional arrays in this syntax? Something like
cdef double sum_elements(double[:,:,:] X) nogil:
for a 3 dimensional array? The updated syntax is certainly more succinct (a good thing IMHO). I'll go grab the source and give it a shot. Is there any additional documentation on this? Or is the source readable enough to figure out?
It seems another option would be to inline the function completely, but perhaps that also tries to create the new buffer? inlining segfaults as well on my machine.
> On Sat, Oct 22, 2011 at 8:36 AM, mark florisson <markflorisso...@gmail.com> > wrote:
>> On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> >> wrote: >> > Hi Cythoners, >> > Perhaps this has been asked before but I didn't find it in the archives. >> > I >> > have some code within a prange block that calls a simple function to aid >> > in >> > the aggregation but every time it is called, I get a segfault. I'm not >> > very >> > familiar with OpenMP-- perhaps this is not allowed? >> > Here's some code to repro: >> > cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
>> Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef >> nogil functions (it tries to acquire a buffer without the GIL, which >> is invalid). The compiler should give an error for this code, but >> unfortunately it doesn't (this is a known bug).
>> Instead you could try out memoryviews, which will be new in the >> upcoming Cython release. For that you'd have to get a copy of Cython >> on github: https://github.com/cython/cython
>> and the same goes for your test_parallel function. If your buffers are >> contiguous you could specify 'double[::1]', which is the same as >> 'mode="c"' in the current buffer support. The buffers can also be >> sliced, transposed and some other tricks can be done without the GIL. >> The change from one syntax to another is unfortunate, but it basically >> looks cleaner and is more powerful. The older syntax will be >> deprecated (in the upcoming release or one or two releases thereafter) >> and maybe even removed eventually.
> How would you specify multidimensional arrays in this syntax? Something > like > cdef double sum_elements(double[:,:,:] X) nogil: > for a 3 dimensional array?
> The updated syntax is certainly more succinct (a > good thing IMHO). I'll go grab the source and give it a shot. Is there any > additional documentation on this? Or is the source readable enough to > figure out? > It seems another option would be to inline the function completely, but > perhaps that also tries to create the new buffer? inlining segfaults as > well on my machine.
Do you have an example? It shouldn't segfault. The buffer syntax also shouldn't create two buffers (in fact, they acquire buffers, they don't allocate them, although such functionality is also added).
> On 22 October 2011 17:41, Jacob Biesinger <jake.biesin...@gmail.com> > wrote: > > On Sat, Oct 22, 2011 at 8:36 AM, mark florisson < > markflorisso...@gmail.com> > > wrote:
> >> On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> > >> wrote: > >> > Hi Cythoners, > >> > Perhaps this has been asked before but I didn't find it in the > archives. > >> > I > >> > have some code within a prange block that calls a simple function to > aid > >> > in > >> > the aggregation but every time it is called, I get a segfault. I'm > not > >> > very > >> > familiar with OpenMP-- perhaps this is not allowed? > >> > Here's some code to repro: > >> > cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
> >> Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef > >> nogil functions (it tries to acquire a buffer without the GIL, which > >> is invalid). The compiler should give an error for this code, but > >> unfortunately it doesn't (this is a known bug).
> >> Instead you could try out memoryviews, which will be new in the > >> upcoming Cython release. For that you'd have to get a copy of Cython > >> on github: https://github.com/cython/cython
> >> and the same goes for your test_parallel function. If your buffers are > >> contiguous you could specify 'double[::1]', which is the same as > >> 'mode="c"' in the current buffer support. The buffers can also be > >> sliced, transposed and some other tricks can be done without the GIL. > >> The change from one syntax to another is unfortunate, but it basically > >> looks cleaner and is more powerful. The older syntax will be > >> deprecated (in the upcoming release or one or two releases thereafter) > >> and maybe even removed eventually.
> > How would you specify multidimensional arrays in this syntax? Something > > like > > cdef double sum_elements(double[:,:,:] X) nogil: > > for a 3 dimensional array?
> Indeed. You can find the hudson documentation here:
> > The updated syntax is certainly more succinct (a > > good thing IMHO). I'll go grab the source and give it a shot. Is there > any > > additional documentation on this? Or is the source readable enough to > > figure out? > > It seems another option would be to inline the function completely, but > > perhaps that also tries to create the new buffer? inlining segfaults as > > well on my machine.
> Do you have an example? It shouldn't segfault. The buffer syntax also > shouldn't create two buffers (in fact, they acquire buffers, they > don't allocate them, although such functionality is also added).
Maybe I'm using it wrong. I was under the impression from the docs that to inline a function, you just add the inline keyword to the function header:
Inlining the summation with the memory view syntax makes it run 25-50% faster:
**** With ndarray[] syntax ****
In [4]: %timeit test.test_parallel(s) 10000 loops, best of 3: 221 us per loop
In [5]: %timeit test.test_parallel_inlined_sum_elements(s) 1000 loops, best of 3: 198 us per loop
**** With memoryview syntax **** In [5]: %timeit test.test_parallel(s) 10000 loops, best of 3: 216 us per loop
In [6]: %timeit test.test_parallel_inlined_sum_elements(s) 1000 loops, best of 3: 130 us per loop
On a different note, in the memory view syntax, the array typing seems a bit more ambiguous. I have some numpy boolean arrays which are represented as np.ndarray[np.int8_t, ndim=2] in the previous syntax. I can now use char [:,:] to get the striding right, but doesn't the size of char (or int or whatever) depend on the system (32bit vs 64)? A chart with the 1:1 correspondence of each C type with each numpy type might be helpful, if there is such a mapping.
> On Sat, Oct 22, 2011 at 9:50 AM, mark florisson <markflorisso...@gmail.com> > wrote:
>> On 22 October 2011 17:41, Jacob Biesinger <jake.biesin...@gmail.com> >> wrote: >> > On Sat, Oct 22, 2011 at 8:36 AM, mark florisson >> > <markflorisso...@gmail.com> >> > wrote:
>> >> On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> >> >> wrote: >> >> > Hi Cythoners, >> >> > Perhaps this has been asked before but I didn't find it in the >> >> > archives. >> >> > I >> >> > have some code within a prange block that calls a simple function to >> >> > aid >> >> > in >> >> > the aggregation but every time it is called, I get a segfault. I'm >> >> > not >> >> > very >> >> > familiar with OpenMP-- perhaps this is not allowed? >> >> > Here's some code to repro: >> >> > cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
>> >> Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef >> >> nogil functions (it tries to acquire a buffer without the GIL, which >> >> is invalid). The compiler should give an error for this code, but >> >> unfortunately it doesn't (this is a known bug).
>> >> Instead you could try out memoryviews, which will be new in the >> >> upcoming Cython release. For that you'd have to get a copy of Cython >> >> on github: https://github.com/cython/cython
>> >> and the same goes for your test_parallel function. If your buffers are >> >> contiguous you could specify 'double[::1]', which is the same as >> >> 'mode="c"' in the current buffer support. The buffers can also be >> >> sliced, transposed and some other tricks can be done without the GIL. >> >> The change from one syntax to another is unfortunate, but it basically >> >> looks cleaner and is more powerful. The older syntax will be >> >> deprecated (in the upcoming release or one or two releases thereafter) >> >> and maybe even removed eventually.
>> > How would you specify multidimensional arrays in this syntax? Something >> > like >> > cdef double sum_elements(double[:,:,:] X) nogil: >> > for a 3 dimensional array?
>> Indeed. You can find the hudson documentation here:
>> > The updated syntax is certainly more succinct (a >> > good thing IMHO). I'll go grab the source and give it a shot. Is there >> > any >> > additional documentation on this? Or is the source readable enough to >> > figure out? >> > It seems another option would be to inline the function completely, but >> > perhaps that also tries to create the new buffer? inlining segfaults as >> > well on my machine.
>> Do you have an example? It shouldn't segfault. The buffer syntax also >> shouldn't create two buffers (in fact, they acquire buffers, they >> don't allocate them, although such functionality is also added).
> Maybe I'm using it wrong. I was under the impression from the docs that to > inline a function, you just add the inline keyword to the function header: > cdef inline double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil: > though according > to http://groups.google.com/group/cython-users/browse_thread/thread/8fc8... > perhaps the inline doesn't actually happen for np.ndarray's. According to > my tests, this bug *looks* fixed with the new memoryviews: > Inlining the summation with the memory view syntax makes it run 25-50% > faster: > **** With ndarray[] syntax **** > In [4]: %timeit test.test_parallel(s) > 10000 loops, best of 3: 221 us per loop > In [5]: %timeit test.test_parallel_inlined_sum_elements(s) > 1000 loops, best of 3: 198 us per loop > **** With memoryview syntax **** > In [5]: %timeit test.test_parallel(s) > 10000 loops, best of 3: 216 us per loop > In [6]: %timeit test.test_parallel_inlined_sum_elements(s) > 1000 loops, best of 3: 130 us per loop
Wow, that's a large difference between inlining and not inlining.
> On a different note, in the memory view syntax, the array typing seems a bit > more ambiguous. I have some numpy boolean arrays which are represented as > np.ndarray[np.int8_t, ndim=2] in the previous syntax. I can now use char > [:,:] to get the striding right, but doesn't the size of char (or int or > whatever) depend on the system (32bit vs 64)? A chart with the 1:1 > correspondence of each C type with each numpy type might be helpful, if > there is such a mapping. > Thanks for the quick responses!
You can still use your numpy types as you were doing before: 'np.int8_t[:, :] myarray' will work just fine.
> > **** With ndarray[] syntax **** > > In [4]: %timeit test.test_parallel(s) > > 10000 loops, best of 3: 221 us per loop > > In [5]: %timeit test.test_parallel_inlined_sum_elements(s) > > 1000 loops, best of 3: 198 us per loop > > **** With memoryview syntax **** > > In [5]: %timeit test.test_parallel(s) > > 10000 loops, best of 3: 216 us per loop > > In [6]: %timeit test.test_parallel_inlined_sum_elements(s) > > 1000 loops, best of 3: 130 us per loop
> Wow, that's a large difference between inlining and not inlining.
Yeah, thanks to the cython devs for that syntax sugar!
> > On a different note, in the memory view syntax, the array typing seems a > bit > > more ambiguous. I have some numpy boolean arrays which are represented > as > > np.ndarray[np.int8_t, ndim=2] in the previous syntax. I can now use char > > [:,:] to get the striding right, but doesn't the size of char (or int or > > whatever) depend on the system (32bit vs 64)? A chart with the 1:1 > > correspondence of each C type with each numpy type might be helpful, if > > there is such a mapping. > > Thanks for the quick responses!
> You can still use your numpy types as you were doing before: > 'np.int8_t[:, :] myarray' will work just fine.
On 22 October 2011 16:36, mark florisson <markflorisso...@gmail.com> wrote:
> On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> wrote: >> Hi Cythoners, >> Perhaps this has been asked before but I didn't find it in the archives. I >> have some code within a prange block that calls a simple function to aid in >> the aggregation but every time it is called, I get a segfault. I'm not very >> familiar with OpenMP-- perhaps this is not allowed? >> Here's some code to repro: >> cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
> Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef > nogil functions (it tries to acquire a buffer without the GIL, which > is invalid). The compiler should give an error for this code, but > unfortunately it doesn't (this is a known bug).
> Instead you could try out memoryviews, which will be new in the > upcoming Cython release. For that you'd have to get a copy of Cython > on github: https://github.com/cython/cython
> and the same goes for your test_parallel function. If your buffers are > contiguous you could specify 'double[::1]', which is the same as > 'mode="c"' in the current buffer support. The buffers can also be > sliced, transposed and some other tricks can be done without the GIL. > The change from one syntax to another is unfortunate, but it basically > looks cleaner and is more powerful. The older syntax will be > deprecated (in the upcoming release or one or two releases thereafter) > and maybe even removed eventually.
>> cdef double total = 0. >> cdef Py_ssize_t T = X.shape[0] >> cdef Py_ssize_t t >> for t in xrange(T): >> total += X[t] >> return total
>> cpdef test_parallel(np.ndarray[np.double_t, ndim=1] X): >> cdef Py_ssize_t T = X.shape[0] >> cdef double total = 0. >> cdef Py_ssize_t t >> for t in prange(T, nogil=True): >> total += sum_elements(X) >> print total
>> And to run the code: >> $ python -c "import scipy; import mytest; >> my_test.test_parallel(scipy.ones(5, dtype=scipy.float64))" >> Running Intel-based 64bit Ubuntu 11.04 with Cython 0.15.1 and python 2.7.1 >> and gcc 4.5.2. >> Thanks for your eyes on my code! >> -- >> Jake Biesinger >> Graduate Student >> Xie Lab, UC Irvine
tmp = np.asarray(<np.double_t[:T,:K]> phi) ^ ------------------------------------------------------------ vb_mf.pyx:102:49: Can only create cython.array from pointer
> On 22 October 2011 16:36, mark florisson <markflorisso...@gmail.com> > wrote: > > On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> > wrote: > >> Hi Cythoners, > >> Perhaps this has been asked before but I didn't find it in the archives. > I > >> have some code within a prange block that calls a simple function to aid > in > >> the aggregation but every time it is called, I get a segfault. I'm not > very > >> familiar with OpenMP-- perhaps this is not allowed? > >> Here's some code to repro: > >> cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
> > Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef > > nogil functions (it tries to acquire a buffer without the GIL, which > > is invalid). The compiler should give an error for this code, but > > unfortunately it doesn't (this is a known bug).
> > Instead you could try out memoryviews, which will be new in the > > upcoming Cython release. For that you'd have to get a copy of Cython > > on github: https://github.com/cython/cython
> > and the same goes for your test_parallel function. If your buffers are > > contiguous you could specify 'double[::1]', which is the same as > > 'mode="c"' in the current buffer support. The buffers can also be > > sliced, transposed and some other tricks can be done without the GIL. > > The change from one syntax to another is unfortunate, but it basically > > looks cleaner and is more powerful. The older syntax will be > > deprecated (in the upcoming release or one or two releases thereafter) > > and maybe even removed eventually.
> >> cdef double total = 0. > >> cdef Py_ssize_t T = X.shape[0] > >> cdef Py_ssize_t t > >> for t in xrange(T): > >> total += X[t] > >> return total
> >> cpdef test_parallel(np.ndarray[np.double_t, ndim=1] X): > >> cdef Py_ssize_t T = X.shape[0] > >> cdef double total = 0. > >> cdef Py_ssize_t t > >> for t in prange(T, nogil=True): > >> total += sum_elements(X) > >> print total
> >> And to run the code: > >> $ python -c "import scipy; import mytest; > >> my_test.test_parallel(scipy.ones(5, dtype=scipy.float64))" > >> Running Intel-based 64bit Ubuntu 11.04 with Cython 0.15.1 and python > 2.7.1 > >> and gcc 4.5.2. > >> Thanks for your eyes on my code! > >> -- > >> Jake Biesinger > >> Graduate Student > >> Xie Lab, UC Irvine
On 22 October 2011 19:53, Jacob Biesinger <jake.biesin...@gmail.com> wrote:
>>> You can still use your numpy types as you were doing before: >>> 'np.int8_t[:, :] myarray' will work just fine.
>> Ah, of course. Thanks!
> Can memoryviews be used as if they were np.ndarrays for slicing? I > expected: > cdef np.double_t[:,:] phi = np.ones((T,K)) > phi[:] = 0. > to fill the entire array with zeros, but instead I get a compile-time error: > phi[:] = 0. > ^ > ------------------------------------------------------------ > vb_mf.pyx:101:19: Cannot convert 'double' to memoryviewslice
You can slice typed memoryviews, but the problem is that slice assignment (and vector operations) are not supported yet. You can get your original object back though, using myarray.base. However, it may have been sliced in the meantime, so depending on what you're doing that may not be valid.
So usually its easier to initialize the array first using numpy vector assignment, and then acquiring a typed memoryview.
Slice assignment will definitely be supported in the future, it's just that someone needs to find the time. Of course, contributions are most welcome. The good thing with typed memoryviews is that you can do (basically) everything without the GIL: passing around, transposing, indexing, slicing, etc. Of course we're not going to re-implement all of numpy here :). It would be great if NumPy had a nogil API that worked on raw pointers + stride info, though.
> phi[:,:] = 0. also doesn't work. Perhaps a cast to a numpy array?
> numpy_array = np.asarray(<np.double_t[:T,:K]> phi) > numpy_array[:] = 0. > Nope. That gives me: > tmp = np.asarray(<np.double_t[:T,:K]> phi) > ^ > ------------------------------------------------------------ > vb_mf.pyx:102:49: Can only create cython.array from pointer
Yes, this kind of casting syntax is useful to convert C data (i.e., C pointers or arrays) to memoryviews, which can be coerced to numpy.ndarrays (all without copying any data).
> On Sat, Oct 22, 2011 at 11:46 AM, mark florisson <markflorisso...@gmail.com> > wrote:
>> On 22 October 2011 16:36, mark florisson <markflorisso...@gmail.com> >> wrote: >> > On 22 October 2011 16:05, Jacob Biesinger <jake.biesin...@gmail.com> >> > wrote: >> >> Hi Cythoners, >> >> Perhaps this has been asked before but I didn't find it in the >> >> archives. I >> >> have some code within a prange block that calls a simple function to >> >> aid in >> >> the aggregation but every time it is called, I get a segfault. I'm not >> >> very >> >> familiar with OpenMP-- perhaps this is not allowed? >> >> Here's some code to repro: >> >> cdef double sum_elements(np.ndarray[np.double_t, ndim=1] X) nogil:
>> > Basically, the buffer syntax (np.ndarray[...]) is not allowed for cdef >> > nogil functions (it tries to acquire a buffer without the GIL, which >> > is invalid). The compiler should give an error for this code, but >> > unfortunately it doesn't (this is a known bug).
> From the warning messages there, I take it that using memoryviews is more > efficient than ndarray buffers?
It depends on how you're using them. Passing them around should typically be cheaper as it doesn't reacquire the buffers. Indexing should have the same speed. Slicing (and indexing with x < ndim dimensions) should be faster as it won't go through NumPy.
The most important thing about memoryslices is that they are more elegant to write, and that a lot more functionality is available without the GIL. This seems to be especially important for cython.parallel users.
>> > Instead you could try out memoryviews, which will be new in the >> > upcoming Cython release. For that you'd have to get a copy of Cython >> > on github: https://github.com/cython/cython
>> > and the same goes for your test_parallel function. If your buffers are >> > contiguous you could specify 'double[::1]', which is the same as >> > 'mode="c"' in the current buffer support. The buffers can also be >> > sliced, transposed and some other tricks can be done without the GIL. >> > The change from one syntax to another is unfortunate, but it basically >> > looks cleaner and is more powerful. The older syntax will be >> > deprecated (in the upcoming release or one or two releases thereafter) >> > and maybe even removed eventually.
>> >> cdef double total = 0. >> >> cdef Py_ssize_t T = X.shape[0] >> >> cdef Py_ssize_t t >> >> for t in xrange(T): >> >> total += X[t] >> >> return total
>> >> cpdef test_parallel(np.ndarray[np.double_t, ndim=1] X): >> >> cdef Py_ssize_t T = X.shape[0] >> >> cdef double total = 0. >> >> cdef Py_ssize_t t >> >> for t in prange(T, nogil=True): >> >> total += sum_elements(X) >> >> print total
>> >> And to run the code: >> >> $ python -c "import scipy; import mytest; >> >> my_test.test_parallel(scipy.ones(5, dtype=scipy.float64))" >> >> Running Intel-based 64bit Ubuntu 11.04 with Cython 0.15.1 and python >> >> 2.7.1 >> >> and gcc 4.5.2. >> >> Thanks for your eyes on my code! >> >> -- >> >> Jake Biesinger >> >> Graduate Student >> >> Xie Lab, UC Irvine
> On 22 October 2011 19:53, Jacob Biesinger <jake.biesin...@gmail.com> > wrote: > >>> You can still use your numpy types as you were doing before: > >>> 'np.int8_t[:, :] myarray' will work just fine.
> >> Ah, of course. Thanks!
> > Can memoryviews be used as if they were np.ndarrays for slicing? I > > expected: > > cdef np.double_t[:,:] phi = np.ones((T,K)) > > phi[:] = 0. > > to fill the entire array with zeros, but instead I get a compile-time > error: > > phi[:] = 0. > > ^ > > ------------------------------------------------------------ > > vb_mf.pyx:101:19: Cannot convert 'double' to memoryviewslice
> You can slice typed memoryviews, but the problem is that slice > assignment (and vector operations) are not supported yet. You can get > your original object back though, using myarray.base. However, it may > have been sliced in the meantime, so depending on what you're doing > that may not be valid.
> So usually its easier to initialize the array first using numpy vector > assignment, and then acquiring a typed memoryview.
> Slice assignment will definitely be supported in the future, it's just > that someone needs to find the time. Of course, contributions are most > welcome. The good thing with typed memoryviews is that you can do > (basically) everything without the GIL: passing around, transposing, > indexing, slicing, etc. Of course we're not going to re-implement all > of numpy here :). It would be great if NumPy had a nogil API that > worked on raw pointers + stride info, though.
Is the plan to include element-wise operations, s.t. I could write something like:
phi[t,:] /= X[i,t,:] + Y[:,k]
without the GIL? That would be very powerful. On my current project, a lot of time has gone into rewriting numpy slices and element-wise operations as ugly for loops. How involved do you think implementing this would be?
I'm also following the discussion<http://mail.python.org/pipermail/pypy-dev/2011-October/008601.html>over on pypy about the micronumpy port-- they are trying to tackle similar issues. I'm Excited for faster numerical python! I wonder if cython's upcoming python backend will somehow support these memoryviews as well? Any hope of that making pypy's numpy port a bit easier/more comprehensive?
> On Sat, Oct 22, 2011 at 12:08 PM, mark florisson <markflorisso...@gmail.com> > wrote:
>> On 22 October 2011 19:53, Jacob Biesinger <jake.biesin...@gmail.com> >> wrote: >> >>> You can still use your numpy types as you were doing before: >> >>> 'np.int8_t[:, :] myarray' will work just fine.
>> >> Ah, of course. Thanks!
>> > Can memoryviews be used as if they were np.ndarrays for slicing? I >> > expected: >> > cdef np.double_t[:,:] phi = np.ones((T,K)) >> > phi[:] = 0. >> > to fill the entire array with zeros, but instead I get a compile-time >> > error: >> > phi[:] = 0. >> > ^ >> > ------------------------------------------------------------ >> > vb_mf.pyx:101:19: Cannot convert 'double' to memoryviewslice
>> You can slice typed memoryviews, but the problem is that slice >> assignment (and vector operations) are not supported yet. You can get >> your original object back though, using myarray.base. However, it may >> have been sliced in the meantime, so depending on what you're doing >> that may not be valid.
>> So usually its easier to initialize the array first using numpy vector >> assignment, and then acquiring a typed memoryview.
>> Slice assignment will definitely be supported in the future, it's just >> that someone needs to find the time. Of course, contributions are most >> welcome. The good thing with typed memoryviews is that you can do >> (basically) everything without the GIL: passing around, transposing, >> indexing, slicing, etc. Of course we're not going to re-implement all >> of numpy here :). It would be great if NumPy had a nogil API that >> worked on raw pointers + stride info, though.
> Is the plan to include element-wise operations, s.t. I could write something > like: > phi[t,:] /= X[i,t,:] + Y[:,k] > without the GIL? That would be very powerful.
Yes, and preferably it will use SIMD whenever it can, and will avoid temporary buffers.
> On my current project, a lot > of time has gone into rewriting numpy slices and element-wise operations as > ugly for loops. How involved do you think implementing this would be?
Implementing just that (without any regard for SIMD) wouldn't be so hard, the memoryview infrastructure is already there along with a lot of functionality. It would take time, of course, but you can probably do it in a weekend.
> I'm also following the discussion over on pypy about the micronumpy port-- > they are trying to tackle similar issues. I'm Excited for faster numerical > python! I wonder if cython's upcoming python backend will somehow support > these memoryviews as well?
Probably not, in fact, I'm not even sure how pypy handles the buffer interface (I think they're still working on python 2.7, which introduced the buffer interface?).
> Any hope of that making pypy's numpy port a bit > easier/more comprehensive? > Thanks again.
Hmm, as much as that would make things easier for a lot of people, it's really not what they want. They want users to write their normal Python NumPy code, and to make it fast with their JIT. For that they're reimplementing it in RPython I believe, and they don't want to just call into the NumPy C API or into C libraries, because the JIT cannot reach there.
Our goals share some similarity but they are different. We only have to re-implement very selective and simple parts of NumPy, such as indexing, slicing etc, and it will make most things fast for many users. This does mean that you cannot use NumPy functionality without (expensive) coercion back to NumPy if you have the GIL. If you don't have the GIL you can't do it at all without reacquiring the GIL (which can be very expensive).
For pypy this is not good enough (or they believe it isn't), so they have to go through a more involved process. I really don't think Cython's pypy backend does (or would do if fully complete) what they want. At some point if plain NumPy runs on PyPy, you might be able to use ctypes memoryviews. None of that is currently implemented though, and this situation may just lead people to choose between Cython and PyPy. How the rest of the scientific libraries are going to be ported or interface with PyPy's NumPy remains an open question.
> On 23 October 2011 03:36, Jacob Biesinger <jake.biesin...@gmail.com> wrote: >> On Sat, Oct 22, 2011 at 12:08 PM, mark florisson <markflorisso...@gmail.com> >> wrote:
>>> On 22 October 2011 19:53, Jacob Biesinger <jake.biesin...@gmail.com> >>> wrote: >>> >>> You can still use your numpy types as you were doing before: >>> >>> 'np.int8_t[:, :] myarray' will work just fine.
>>> >> Ah, of course. Thanks!
>>> > Can memoryviews be used as if they were np.ndarrays for slicing? I >>> > expected: >>> > cdef np.double_t[:,:] phi = np.ones((T,K)) >>> > phi[:] = 0. >>> > to fill the entire array with zeros, but instead I get a compile-time >>> > error: >>> > phi[:] = 0. >>> > ^ >>> > ------------------------------------------------------------ >>> > vb_mf.pyx:101:19: Cannot convert 'double' to memoryviewslice
>>> You can slice typed memoryviews, but the problem is that slice >>> assignment (and vector operations) are not supported yet. You can get >>> your original object back though, using myarray.base. However, it may >>> have been sliced in the meantime, so depending on what you're doing >>> that may not be valid.
>>> So usually its easier to initialize the array first using numpy vector >>> assignment, and then acquiring a typed memoryview.
>>> Slice assignment will definitely be supported in the future, it's just >>> that someone needs to find the time. Of course, contributions are most >>> welcome. The good thing with typed memoryviews is that you can do >>> (basically) everything without the GIL: passing around, transposing, >>> indexing, slicing, etc. Of course we're not going to re-implement all >>> of numpy here :). It would be great if NumPy had a nogil API that >>> worked on raw pointers + stride info, though.
>> Is the plan to include element-wise operations, s.t. I could write something >> like: >> phi[t,:] /= X[i,t,:] + Y[:,k] >> without the GIL? That would be very powerful.
> Yes, and preferably it will use SIMD whenever it can, and will avoid > temporary buffers.
>> On my current project, a lot >> of time has gone into rewriting numpy slices and element-wise operations as >> ugly for loops. How involved do you think implementing this would be?
> Implementing just that (without any regard for SIMD) wouldn't be so > hard, the memoryview infrastructure is already there along with a lot > of functionality. It would take time, of course, but you can probably > do it in a weekend.
BTW, if you're interested in that, raise an issue on cython-dev and we can give you pointers. Don't feel obligated though :)
>> I'm also following the discussion over on pypy about the micronumpy port-- >> they are trying to tackle similar issues. I'm Excited for faster numerical >> python! I wonder if cython's upcoming python backend will somehow support >> these memoryviews as well?
> Probably not, in fact, I'm not even sure how pypy handles the buffer > interface (I think they're still working on python 2.7, which > introduced the buffer interface?).
>> Any hope of that making pypy's numpy port a bit >> easier/more comprehensive? >> Thanks again.
> Hmm, as much as that would make things easier for a lot of people, > it's really not what they want. They want users to write their normal > Python NumPy code, and to make it fast with their JIT. For that > they're reimplementing it in RPython I believe, and they don't want to > just call into the NumPy C API or into C libraries, because the JIT > cannot reach there.
> Our goals share some similarity but they are different. We only have > to re-implement very selective and simple parts of NumPy, such as > indexing, slicing etc, and it will make most things fast for many > users. This does mean that you cannot use NumPy functionality without > (expensive) coercion back to NumPy if you have the GIL. If you don't > have the GIL you can't do it at all without reacquiring the GIL (which > can be very expensive).
> For pypy this is not good enough (or they believe it isn't), so they > have to go through a more involved process. I really don't think > Cython's pypy backend does (or would do if fully complete) what they > want. At some point if plain NumPy runs on PyPy, you might be able to > use ctypes memoryviews. None of that is currently implemented though, > and this situation may just lead people to choose between Cython and > PyPy. How the rest of the scientific libraries are going to be ported > or interface with PyPy's NumPy remains an open question.