More explicitly, I have a temporary home-made C structure that holds
a pointer to an array. I prepare (using Cython) a numpy.ndarray using
the PyArray_NewFromDescr function. I can delete my temporary C structure
without freeing the memory holding the array, but I would like the
numpy.ndarray to become the owner of the data.
How can I do such a thing?
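For concreteness, a minimal sketch of the situation (the struct and
field names are made up, and PyArray_SimpleNewFromData stands in for
PyArray_NewFromDescr for brevity):

cimport numpy as cnp
cnp.import_array()

# hypothetical stand-in for my temporary home-made structure
cdef struct temp_buffer:
    double* data
    int n

cdef cnp.ndarray wrap(temp_buffer* buf):
    cdef cnp.npy_intp dims = <cnp.npy_intp>buf.n
    # the returned array points at buf.data but does NOT own it;
    # nothing frees that memory when the array is deallocated
    return cnp.PyArray_SimpleNewFromData(1, &dims, cnp.NPY_DOUBLE,
                                         <void*>buf.data)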
--
Fabrice Silva
You can't, really. numpy-owned arrays will be deallocated with numpy's
deallocator. This may not be the appropriate deallocator for memory
that your library allocated.
If at all possible, I recommend using numpy to create the ndarray and
passing its data pointer to your library. Sometimes the library's API
gets in the way of this. Otherwise, copy the data.
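For instance, a hedged sketch of that pattern (fill_values is a
hypothetical stand-in for your library's API):

cimport numpy as cnp
import numpy as np
cnp.import_array()

cdef extern from "mylib.h":
    void fill_values(double* out, int n)

def compute(int n):
    # numpy allocates and owns the buffer; the library just writes
    # into it, so there is nothing special to free afterwards
    cdef cnp.ndarray[cnp.double_t, ndim=1] arr = np.empty(n, dtype=np.float64)
    fill_values(&arr[0], n)  # assumes n > 0
    return arr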
Devs, looking into this, I noticed that we use PyDataMem_NEW() and
PyDataMem_FREE() (which is #defined to malloc() and free()) for
handling the data pointer. Why aren't we using the appropriate
PyMem_*() functions (or the PyArray_*() memory functions which default
to using the PyMem_*() implementations)? Using the PyMem_*() functions
lets the Python memory manager have an accurate idea how much memory
is being used, which can be important for the large amounts of memory
that numpy arrays can consume.
I assume this is intentional design. I just want to know the rationale
for it and would like it documented. I can certainly understand if it
causes bad interactions with the garbage collector, say (though hiding
information from the GC seems like a suboptimal approach).
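For illustration, a minimal sketch of the difference in question (not
numpy's actual internals):

from libc.stdlib cimport malloc, free              # what PyDataMem_NEW expands to
from cpython.mem cimport PyMem_Malloc, PyMem_Free  # tracked by Python's memory manager

def compare_allocators():
    cdef void* a = malloc(1024)        # invisible to the Python memory manager
    free(a)
    cdef void* b = PyMem_Malloc(1024)  # accounted for by Python's allocator
    PyMem_Free(b)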
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco
> How can one arbitrarily assume that an ndarray owns its data?
>
> More explicitly, I have a temporary home-made C structure that holds
> a pointer to an array. I prepare (using Cython) a numpy.ndarray using
> the PyArray_NewFromDescr function. I can delete my temporary C structure
> without freeing the memory holding the array, but I would like the
> numpy.ndarray to become the owner of the data.
>
> How can I do such a thing?
There is an excellent blog entry from Travis Oliphant that describes how to create an ndarray from existing data without a copy: http://blog.enthought.com/?p=62
The created array does not actually own the data, but its base attribute points to an object which frees the memory when the numpy array gets deallocated. I guess this is the behavior you want to achieve.
Here is a Cython implementation (for a uint8 array):
Gregor
"""
see 'NumPy arrays with pre-allocated memory', http://blog.enthought.com/?p=62
"""
import numpy as np
from numpy cimport import_array, ndarray, npy_intp, set_array_base, PyArray_SimpleNewFromData, NPY_DOUBLE, NPY_INT, NPY_UINT8
cdef extern from "stdlib.h":
void* malloc(int size)
void free(void *ptr)
cdef class MemoryReleaser:
cdef void* memory
def __cinit__(self):
self.memory = NULL
def __dealloc__(self):
if self.memory:
#release memory
free(self.memory)
print "memory released", hex(<long>self.memory)
cdef MemoryReleaser MemoryReleaserFactory(void* ptr):
cdef MemoryReleaser mr = MemoryReleaser.__new__(MemoryReleaser)
mr.memory = ptr
return mr
cdef ndarray frompointer(void* ptr, int nbytes):
import_array()
#cdef int dims[1]
#dims[0] = nbytes
cdef npy_intp dims = <npy_intp>nbytes
cdef ndarray arr = PyArray_SimpleNewFromData(1, &dims, NPY_UINT8, ptr)
#TODO: check for error
set_array_base(arr, MemoryReleaserFactory(ptr))
return arr
def test_new_array_from_pointer():
nbytes = 16
cdef void* mem = malloc(nbytes)
print "memory allocated", hex(<long>mem)
return frompointer(mem, nbytes)
> There is an excellent blog entry from Travis Oliphant that describes
> how to create an ndarray from existing data without a copy:
> http://blog.enthought.com/?p=62
> The created array does not actually own the data, but its base
> attribute points to an object which frees the memory when the numpy
> array gets deallocated. I guess this is the behavior you want to
> achieve.
> Here is a Cython implementation (for a uint8 array):
Even better: the addendum!
http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory/
Within Cython:
cimport numpy
numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
Seems OK.
Any objections to that?
--
Fabrice Silva
> On Thursday 15 December 2011 at 18:09 +0100, Gregor Thalhammer wrote:
>
>> There is an excellent blog entry from Travis Oliphant that describes
>> how to create an ndarray from existing data without a copy:
>> http://blog.enthought.com/?p=62
>> The created array does not actually own the data, but its base
>> attribute points to an object which frees the memory when the numpy
>> array gets deallocated. I guess this is the behavior you want to
>> achieve.
>> Here is a Cython implementation (for a uint8 array):
>
> Even better: the addendum!
> http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory/
>
> Within Cython:
> cimport numpy
> numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
>
> Seems OK.
> Any objections to that?
This is OK, but PyCObject is deprecated as of Python 3.1, so it's not portable to Python 3.2.
Gregor
My guess is then that the PyCapsule object is the way to go...
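A hedged sketch (untested) of the PyCapsule variant, following the same
pattern as Gregor's MemoryReleaser:

from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
from libc.stdlib cimport free
cimport numpy as cnp
cnp.import_array()

cdef void free_capsule(object capsule):
    # destructor: runs when the capsule (the array's base) is collected
    free(PyCapsule_GetPointer(capsule, NULL))

cdef cnp.ndarray frompointer(void* ptr, cnp.npy_intp nbytes):
    cdef cnp.ndarray arr = cnp.PyArray_SimpleNewFromData(
        1, &nbytes, cnp.NPY_UINT8, ptr)
    # the capsule becomes the base object and frees ptr on collection
    cnp.set_array_base(arr, PyCapsule_New(ptr, NULL, free_capsule))
    return arr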
--
Fabrice Silva
Another way: With recent NumPy you should be able to do something like
this in Cython
cdef class SomeBufferWrapper:
    ...
    def __getbuffer__(self, ...): ...
    def __releasebuffer__(self, ...): ...

arr = np.asarray(SomeBufferWrapper(buf))

and then __releasebuffer__ will be called when `arr` goes out of use.
See Cython docs.
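For concreteness, a fuller hedged sketch of this approach (simplified
to a 1-D byte buffer; the wrapper fields and factory are my own
invention, and the memory is freed in __dealloc__ rather than in
__releasebuffer__ so that multiple simultaneous views stay safe):

from libc.stdlib cimport free
from cpython.buffer cimport PyBuffer_FillInfo

cdef class SomeBufferWrapper:
    cdef void* buf
    cdef Py_ssize_t n

    def __getbuffer__(self, Py_buffer* view, int flags):
        # expose the raw memory as a 1-D byte buffer
        PyBuffer_FillInfo(view, self, self.buf, self.n, 0, flags)

    def __releasebuffer__(self, Py_buffer* view):
        pass  # nothing per-view to release

    def __dealloc__(self):
        # numpy keeps the wrapper alive via arr.base, so this runs
        # only after the last array view is gone
        if self.buf != NULL:
            free(self.buf)

cdef SomeBufferWrapper wrap(void* buf, Py_ssize_t n):
    cdef SomeBufferWrapper w = SomeBufferWrapper.__new__(SomeBufferWrapper)
    w.buf = buf
    w.n = n
    return w

From Cython, np.asarray(wrap(ptr, n)) then gives a uint8 array backed
by ptr, freed once the last reference disappears.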
Dag
> > How can I do such a thing?
> You can't, really. numpy-owned arrays will be deallocated with numpy's
> deallocator. This may not be the appropriate deallocator for memory
> that your library allocated.
Coming late to the battle, but I recently followed the same route and
came to similar conclusions: the OWNDATA flag is not suited to this, and
you will need your own deallocator.
I implemented demo code showing all the steps of this strategy for
binding an existing C library with Cython:
https://gist.github.com/1249305
In particular, the deallocator is in
https://gist.github.com/1249305#file_cython_wrapper.pyx
I hope that this code sample is useful.
Gael