Hi,
I've been struggling lately to get a reference to an extension class instance inside a function declared nogil, to be used in prange. I want to do something like this (in parallel):
cdef Nuclide[:] nuclides  # pre-populated before
cdef bint compute_all_nuclides(double param) nogil:
    cdef Nuclide nuclide
    for i in range(nuclides.shape[0]):
        nuclide = nuclides[i]
        compute_nuclide(nuclide, param)
The above doesn't compile; Cython reports "Function declared nogil has Python locals or temporaries", "Assignment of Python object not allowed without gil" and "Cannot access buffer with object dtype without gil". The last two messages seem reasonable (or is accessing a view of extension types without the GIL just not implemented, so that it falls back to Python object access?), so I've tried:
cdef compute_all_nuclides(double param) nogil:
    cdef Nuclide nuclide
    for i in range(nuclides.shape[0]):
        with gil:
            nuclide = nuclides[i]
        compute_nuclide(nuclide, param)
...but it still fails with "Function declared nogil has Python locals or temporaries." Is this supposed to apply to extension types, too? Are extension types supported as nogil function arguments but unsupported as locals (even if just referenced)? Does Python object refcounting need the GIL? (What about atomic integers then?) Nuclide cannot easily be converted to a C/C++ type, since it contains methods intended to be callable from Python space.
I've also tried to use libcpp.list as a container, but this probably isn't supposed to work with extension types, and it crashes the Cython compiler. (It would also solve only part of my problem.)
Regards, and sorry for asking so many questions,
Matěj
On Sat, Aug 25, 2012 at 5:01 PM, Matěj Laitl <ma...@laitl.cz> wrote:
> ...but it still fails with "Function declared nogil has Python locals or
> temporaries." Is this supposed to apply to extension types, too? Extension
> types are supported in nogil function arguments, but unsupported as locals
> (even if just referenced)? Does Python object refcounting need GIL?
Yes, refcounting is the issue.
> (what about atomic integers then?)
Should CPython choose to adopt them, that would solve this issue.
> Nuclide cannot be easily converted to C/C++
> type - contains methods intended to be callable from Python space.
> I've also tried to use libcpp.list as a container, but this isn't probably
> supposed to work with extension types and crashes the Cython compiler. (and
> would solve just the part of my problem)
Yeah, you'll have refcounting issues trying to do that too.
What you really want is a borrowed reference, but we haven't yet
figured out the best way to add that to Cython. (It's both a question
of syntax and making it hard for the user to shoot themselves in the
foot.) You could do things with NumPy arrays, but even easier is to do
from cpython.list cimport PyList_GET_ITEM
cdef list nuclides = [...]
with nogil:
    for i in range(10):
        compute_nuclide(<Nuclide>nuclides[i], param)
You can also do (again, untested):
cdef compute_all_nuclides(double param) with gil:
    cdef Nuclide nuclide
    with nogil:
        for i in range(nuclides.shape[0]):
            with gil:
                nuclide = nuclides[i]
            compute_nuclide(nuclide, param)
> cdef class Nuclide:
>     cdef readonly int i
> [...]
> with nogil:
>     for i in range(10):
>         a = a + (<Nuclide>ptr[i]).i
Interesting. I didn't even know that that worked. Makes sense.
Methods are still a different kind of beast, though, because someone must
own the object they are being called on, and taking ownership requires
holding the GIL to increase the refcount.
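The ownership point can be observed from plain Python. What follows is only a minimal sketch, assuming CPython (sys.getrefcount is CPython-specific and its absolute values differ between versions); the Nuclide class and the helper name are hypothetical stand-ins, not the extension type from this thread. It shows that merely letting one more container own an object changes the object's reference count, and that counter update is the write that has to happen under the GIL.

```python
import sys


class Nuclide:
    """Hypothetical stand-in for the extension type discussed in this thread."""

    def __init__(self, i):
        self.i = i


def refcount_delta_when_held(obj):
    """Return how much obj's refcount grows while one extra container owns it."""
    before = sys.getrefcount(obj)
    holder = [obj]  # taking ownership: the list increments the refcount
    during = sys.getrefcount(obj)
    del holder  # releasing ownership decrements it again
    return during - before


nuclide = Nuclide(1)
print(refcount_delta_when_held(nuclide))
```

Only the difference between the two readings is meaningful here; the absolute counts include temporary references made by the call itself.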
> On Sat, Aug 25, 2012 at 5:01 PM, Matěj Laitl <ma...@laitl.cz> wrote:
> > Hi,
> > I've been struggling lately to get a reference to an Extension class
> > instance inside a method declared nogil to be used in prange. I want to do
> > something like (in parallel):
> > (...)
> > Does Python object refcounting need GIL?
> Yes, refcounting is the issue.
> > (what about atomic integers then?)
> Should CPython choose to adopt them, that would solve this issue.
I see. There must be some road-block I'm not getting, otherwise I don't see why they're not adopted.
> What you really want is a borrowed reference, but we haven't yet
> figured out the best way to add that to Cython. (It's both a question
> of syntax and making it hard for the user to shoot themselves in the
> foot.) You could do things with NumPy arrays, but even easier is to do
> from cpython.list cimport PyList_GET_ITEM
> cdef list nuclides = [...]
> with nogil:
>     for i in range(10):
>         compute_nuclide(<Nuclide>nuclides[i], param)
Interesting. Does this mean that list indexing is GIL-free after cimporting PyList_GET_ITEM? It seems to circumvent refcounting, but since the list already holds a reference to each nuclide, that should be no problem.
> You can also do (again, untested):
> cdef compute_all_nuclides(double param) with gil:
>     cdef Nuclide nuclide
>     with nogil:
>         for i in range(nuclides.shape[0]):
>             with gil:
>                 nuclide = nuclides[i]
>             compute_nuclide(nuclide, param)
Thanks, this works, although I'm getting significant overhead; I'm not sure whether it is caused by GIL handling or by something else.
> On 25. 8. 2012 Robert Bradshaw wrote:
>> from cpython.list cimport PyList_GET_ITEM
>> cdef list nuclides = [...]
>> with nogil:
>>     for i in range(10):
>>         compute_nuclide(<Nuclide>nuclides[i]), param)
> Interesting, this means that list indexing is GIL-free after cimporting
> PyList_GET_ITEM?
Nope, I think that was just a left-over from Robert's mail edits. He might
have meant to use it instead of indexing.
Note that even if you manage to get at the typed object reference without
acquiring the GIL, not holding the GIL will prevent you from doing many
interesting things with it.
If your problem really is as simple as keeping data in an array and
processing it in parallel, it might turn out to be way easier to drop
Python classes altogether and use C structs with functions instead.
Remember that your internal data structures don't have to map to Python
space 1:1. It's really not uncommon to use highly specialised low-level C
data structures internally and then write a pythonic wrapper that fakes a
totally object oriented API around them.
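The split between low-level storage and a Pythonic facade might look roughly like the sketch below. It is plain Python using ctypes as a stand-in for a Cython `cdef struct`, so it only illustrates the shape of the design, not its performance. The struct fields (half_life, mass_number) and the class names NuclideData and NuclideTable are assumptions made up for illustration; nothing in this thread specifies what a Nuclide actually holds.

```python
import ctypes


class NuclideData(ctypes.Structure):
    """Plain C layout: no per-item Python object and no per-item refcount.

    The two fields are hypothetical examples of read-only parameters.
    """

    _fields_ = [("half_life", ctypes.c_double), ("mass_number", ctypes.c_int)]


class NuclideTable:
    """Pythonic facade over one contiguous block of NuclideData structs."""

    def __init__(self, rows):
        rows = list(rows)
        # One allocation holds every record back to back.
        self._data = (NuclideData * len(rows))()
        for index, (half_life, mass_number) in enumerate(rows):
            self._data[index].half_life = half_life
            self._data[index].mass_number = mass_number

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # Python-space access converts a record to ordinary objects on demand.
        item = self._data[index]
        return {"half_life": item.half_life, "mass_number": item.mass_number}

    def total_mass_number(self):
        # Stands in for the low-level loop that would run without the GIL.
        return sum(item.mass_number for item in self._data)


table = NuclideTable([(5730.0, 14), (12.32, 3)])
print(len(table), table[0], table.total_mass_number())
```

In Cython the inner loop would presumably become a nogil function over a pointer to the struct array, while a class like this one stays as the only part visible from Python space.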
> Thanks, this works. (although I'm getting significant overhead, not sure
> whether caused by GIL handling or something different)
Overhead compared to what?
When you stick data in Nuclide, you'll still need to follow that pointer and jump wildly around in memory, which is not very efficient for the CPU. Sticking the data directly in a NumPy array will always be significantly faster just due to how the hardware (and Python lists) work.
In summary: Don't attempt C++-style programming in languages that aren't C++.
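To make the layout argument concrete, here is a small sketch in plain Python, assuming CPython and using the standard array and ctypes modules instead of NumPy so that it runs without extra packages. The helper name and the sample values are made up for illustration. It shows that a typed array is one contiguous block in which element k is found by address arithmetic alone, whereas a list of Nuclide objects stores pointers that each have to be followed.

```python
import array
import ctypes


def read_through_raw_address(values, k):
    """Read element k of a typed double array by computing its address directly."""
    address, length = values.buffer_info()
    if not 0 <= k < length:
        raise IndexError(k)
    # Contiguous layout: element k sits exactly k * itemsize bytes past the start.
    return ctypes.c_double.from_address(address + k * values.itemsize).value


half_lives = array.array("d", [5730.0, 12.32, 4.468e9])  # illustrative sample values
print([read_through_raw_address(half_lives, k) for k in range(len(half_lives))])
```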
Overhead of the parallel code versus the sequential code. (prange was used in both cases, but -fopenmp was passed only in the first.) The parallel code took twice the wall-clock time and ten times the CPU time. This probably means that prange was used too low in the call stack.
> When you stick data in Nuclide, you'll still need to follow that pointer
> and jump wildly around in memory, which is not very efficient for the
> CPU. Sticking the data directly in a NumPy array will always be
> significantly faster just due to how the hardware (and Python lists) work.
Nuclide is just a container for a couple of read-only parameters; the actual result is written to a prepared Cython memory view (I just omitted that in the example). There are only a few Nuclide instances, and the same instances are shared by all threads. They very likely sit on a memory page that doesn't get written to, so I see no cache-coherency problems.
The code in compute_nuclide(nuclide, param) is the expensive part: its total time is approximately 1000 times greater than the total time of compute_all_nuclides(param).
> In summary: Don't attempt C++-style programming in languages that aren't
> C++.
What do you mean by C++-style programming here? Writing results to objects?
cdef nogil functions work: refcounting is maintained by NumPy's
object array. As long as we hold a reference to the nuclides array,
the elements are (indirectly) owned.
Python methods won't work: they return a Python object, and their
parameters are passed around as Python objects,
the creation of which requires refcounting under the GIL.
import numpy
cimport numpy
cdef class Nuclide:
    cdef readonly int i
    def __init__(self, i):
        self.i = i
    cdef void cmethod(Nuclide self, int i) nogil:
        pass
On Sun, Aug 26, 2012 at 5:20 AM, Stefan Behnel <stefan...@behnel.de> wrote:
> Feng Yu, 26.08.2012 07:21:
>> cdef class Nuclide:
>>     cdef readonly int i
>> [...]
>> with nogil:
>>     for i in range(10):
>>         a = a + (<Nuclide>ptr[i]).i
> Interesting. I didn't even know that that worked. Makes sense.
> Methods are still a different kind of beast, though, because someone must
> own the object they are being called on, and taking ownership requires
> holding the GIL to increase the refcount.