You can use either, as long as you free with the one you allocated
with. The PyMem functions allocate memory on the Python heap, and are
optimized for allocating many small objects of similar size over and
over, but IIRC defer to the standard system malloc (plus some
bookkeeping) if the size is big enough (assuming a standard compile
that hasn't #defined it to go elsewhere). Personally, I tend to use
malloc/free. On this note, a useful pattern is
x = malloc(...)
try:
    # use x
finally:
    free(x)
It could be nice to encapsulate this in a context manager.
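To make the idea concrete, here is a rough sketch of such a context manager in pure Python using ctypes against the system libc (assumptions: a Unix-like libc that ctypes can locate; in Cython you would of course call malloc/free from libc.stdlib directly, and a hypothetical cython.malloc could hide all of this):

```python
import contextlib
import ctypes
import ctypes.util

# Hypothetical sketch only: locate the system libc and declare the
# malloc/free signatures so ctypes passes/returns pointers correctly.
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.malloc.restype = ctypes.c_void_p
_libc.malloc.argtypes = [ctypes.c_size_t]
_libc.free.argtypes = [ctypes.c_void_p]

@contextlib.contextmanager
def c_malloc(n):
    """Allocate n bytes; guarantee the free() even if the block raises."""
    ptr = _libc.malloc(n)
    if not ptr:
        raise MemoryError("malloc(%d) failed" % n)
    try:
        yield ptr
    finally:
        _libc.free(ptr)

with c_malloc(16) as p:
    ctypes.memset(p, 0x41, 16)          # use the buffer inside the block
    assert ctypes.string_at(p, 16) == b"A" * 16
# the buffer is freed here, exception or not
```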
- Robert
I think I'd prefer variable-sized arrays that would always get
deallocated on exit of the function (which could be implemented as C99
variable sized arrays, with alloca or with malloc, depending on the
size of the array and the availability of the respective
functionalities).
> - Robert
That wouldn't tackle every use case, such as for instance mallocing
stuff in a parallel section (until we get declarations in blocks!),
but special cases can still just malloc and use try blocks, as
demonstrated.
Absolutely.
>> I think I'd prefer variable-sized arrays that would always get
>> deallocated on exit of the function
Why? A context manager is much clearer and gives users total control over
the lifetime of the memory.
>> (which could be implemented as C99
>> variable sized arrays, with alloca or with malloc, depending on the
>> size of the array and the availability of the respective
>> functionalities).
That could still be done for a context manager, just like we do with
gil/nogil blocks today.
> That wouldn't tackle every use case, such as for instance mallocing
> stuff in a parallel section (until we get declarations in blocks!),
> but special cases can still just malloc and use try blocks, as
> demonstrated.
I would consider the usage of memory over the whole lifetime of a function
the special case, not the other way round.
Stefan
That is highly subjective; I think it would be harder to read and
would introduce more code blocks and nesting.
> and gives users total control over
> the lifetime of the memory.
>
Yes, but very often you don't need it. And if Cython supported
declarations in blocks, you'd get it for free. Supporting that
(disregarding the difficulties of doing so) would also be helpful in
identifying the scope and privatization rules in parallel blocks.
The thing is that a context manager would be very Cython-specific,
whereas most people are already familiar with arrays of variable size
from C or Java. Let's compare the following statements and decide which
is more aesthetically pleasing:
cdef int array1[m]
cdef double array2[n]

vs

cdef int *array1
cdef double *array2

with cython.malloc(sizeof(int) * m), cython.malloc(sizeof(double) * n) as array1, array2:
    ...
>
>>> (which could be implemented as C99
>>> variable sized arrays, with alloca or with malloc, depending on the
>>> size of the array and the availability of the respective
>>> functionalities).
>
>
> That could still be done for a context manager, just like we do with
> gil/nogil blocks today.
>
Sure (it was more of an observation than an argument).
>
>> That wouldn't tackle every use case, such as for instance mallocing
>> stuff in a parallel section (until we get declarations in blocks!),
>> but special cases can still just malloc and use try blocks, as
>> demonstrated.
>
>
> I would consider the usage of memory over the whole lifetime of a function
> the special case, not the other way round.
Yes, but the point is not where to deallocate the memory, the point is
that you very often don't care. You need it somewhere in the function,
and deallocation on return is fine (or, "at the end of the block").
Analogously, you don't 'del' your variables once you have stopped
using them.
I also gave this functionality some thought for memoryviews, e.g.
cdef int[:m, :n] myslice   # this gets you a view on a cython.array of size m * n
> Stefan
Not at all. It's the One Way To Do It in Python.
Stefan
FWIW, I think that the variable-sized array approach is much nicer than
using context managers.
Cython already supports C style fixed-array definitions and the C & and
* operators. All these would be written differently in Python, so I
think there's nothing wrong with having variable-length array
definitions in C style rather than with context managers.
Best,
-Nikolaus
--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
Some good discussion of the issues brought up here, but a comment:
On 12/10/11 8:31 AM, Aronne Merrelli wrote:
> My current use case is primarily extending python/NumPy by speeding up
> un-vectorizable calculations. So, any results from C-level calculations
> are written into NumPy arrays if I need to keep them.
There may well be a real need for allocating your own memory here, but
I've found that I can generally (i.e. in every use case I've had so far)
speed up non-vectorizable numpy calculations with pure Cython with no
need for custom memory allocation -- usually just small temporaries on
the stack.
See various examples in the wiki.
We'd need to know your use case, but you may be making things more
complicated than you need to.
Just a thought,
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Hi,
I would propose that most people using Cython are not familiar with C
and Java, but all are familiar with Python.
> Lets compare the following statements and decide which
> is more aesthetically pleasing:
>
> cdef int array1[m]
> cdef double array2[n]
>
> vs
>
> cdef int *array1
> cdef double *array2
>
> with cython.malloc(sizeof(int) * m), cython.malloc(sizeof(double) * n) as array1, array2:
I agree. I'd say that
cdef int[m] array1
cdef double[n] array2
is an even clearer way to declare m ints and n doubles.
However, arrays are a bit painful to deal with: they
can't be returned (short of copying them into another type) or
assigned or resized (perhaps that could be supported; not sure of the
syntax). Just think how many buffer overflows are due to a fixed-size
C array holding user data... it's the classic way to smash the stack.
Perhaps this is OK for function-scoped objects.
>>>> (which could be implemented as C99
>>>> variable sized arrays, with alloca or with malloc, depending on the
>>>> size of the array and the availability of the respective
>>>> functionalities).
>>
>>
>> That could still be done for a context manager, just like we do with
>> gil/nogil blocks today.
>>
>
> Sure (it was more of an observation than an argument).
>
>>
>>> That wouldn't tackle every use case, such as for instance mallocing
>>> stuff in a parallel section (until we get declarations in blocks!),
>>> but special cases can still just malloc and use try blocks, as
>>> demonstrated.
>>
>>
>> I would consider the usage of memory over the whole lifetime of a function
>> the special case, not the other way round.
>
> Yes, but the point is not where to deallocate the memory, the point is
> that you very often don't care. You need it somewhere in the function,
> and deallocation on return is fine (or, "at the end of the block").
> Analogously, you don't 'del' your variables once you have stopped
> using them.
>
> I also gave this functionality some thought for memoryviews, e.g.
>
> cdef int[:m, :n] myslice   # this gets you a view on a cython.array of size m * n
I think something like this would be great. The best solution would be
a chunk of memory that's optionally allocated on the stack (depending
on size and scope), but can be passed around and whose lifetime is
gc'd or tied to the lifetime of a Python object as needed.
- Robert
yup.
> I did originally try that but hit a roadblock somewhere
> - I think I had trouble getting returns from functions if they were not
> declared as pointers.
Well, I just said this in another note in this thread:
On 12/12/11 11:48 AM, Robert Bradshaw wrote:
> cdef int[m] array1
> cdef double[n] array2
>
> is an even clearer way to declare m ints and n doubles.
> However, arrays are a bit painful to deal with: they
> can't be returned (short of copying them into another type)
...
which may be what you ran into.
But I also meant that you may be able to do everything in your extension
with numpy arrays, and not have to deal with raw C arrays or pointers at
all.
It all depends on your use-case, of course.
Sure, but this discussion is for 'with cython.malloc():' vs arrays of
variable size. In either case the lifetime of the memory is bound to
the block or function. If that's not what you want, then you should
obviously use something else.
> Just think how many buffer overflows are due to a fixed-size
> C array holding user data... it's the classic way to smash the stack.
> Perhaps this is OK for function-scoped objects.
I don't understand: if variable-sized arrays don't fix that problem,
then the code is simply broken. No form of memory allocation will
protect you from that.
The memory would be always on the heap there, it would simply create a
view of a cython.array. That would be somewhat more heavy-weight than
a regular (variable-sized) array though, and it would require the GIL.
The advantage of malloc is that you can realloc, variable sized arrays
don't provide this ability. (I agree smashing the stack is a separate
issue, I was pointing out the problem with arrays in general.)
Perhaps we could infer when it's entirely local and no GIL/refcounting
is needed as an optimization. I still think there could be a case for
a Python list-like structure of primitives (in particular
automatically growing, which is a case that arrays/memory views don't
support).
- Robert
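For what it's worth, the standard library already models this for pure-Python code: the array module is a list-like, automatically growing container of C primitives, which is roughly the structure being asked for here (minus the C-level access Cython would add):

```python
from array import array

# Typed, contiguous storage of C ints -- a growable int[] in spirit.
buf = array('i')
for i in range(5):
    buf.append(i * i)      # amortized O(1) append, list-style growth

assert buf.tolist() == [0, 1, 4, 9, 16]
assert buf.itemsize in (2, 4, 8)   # fixed-size primitive slots, not objects
```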
I often encapsulate malloc in an extension class, so I can
rely on Python to clean it up for me.
cimport stdlib

cdef class buffer:
    cdef void *buf
    cdef readonly Py_intptr_t addr

    def __cinit__(buffer self, int n):
        self.buf = stdlib.malloc(n)
        if self.buf == NULL:
            raise MemoryError("malloc(%d) failed" % (n,))
        self.addr = <Py_intptr_t> self.buf

    def __dealloc__(buffer self):
        if self.buf != NULL:
            stdlib.free(self.buf)
This also applies to other resource allocation such as fopen/fclose.
When using Cython or C++, it is important to know that an
exception can cause resource leaks if not handled carefully. One of
the most important causes of memory leaks in C++ is use of new[]
and delete[] outside class constructors and destructors. It seems many
programmers are unaware that an exception can cause parts of the
code to be skipped. The error is even common in C++ textbooks,
which might be the reason it is so common.
Your code with try/finally works correctly of course, though it would
not be possible in C++ as there is no finally. Using class constructors
and destructors is an idiom that works in Cython and C++ alike.
It should perhaps be noted that simpler options exist. In C++
we can use std::vector and in Cython we can use numpy.ndarray.
AFAIK, we cannot put a C++ std::vector on the stack with Cython,
which limits their usefulness in ensuring proper clean-up (we must
call delete on them manually in Cython).
I can see that a context manager would be useful in Python code
using malloc/free with ctypes, as __del__ might not be called,
e.g. if there is a circular reference, unlike __enter__ and __exit__.
But what would the benefit be in Cython, when the C initialization
and clean-up methods __cinit__ and __dealloc__ are deterministic?
Sturla
The only language I know of that handles stack smashing and automatic
arrays gracefully is Fortran 90 (and later). An automatic array might be
placed on the stack or the heap depending on its size. The compilers are
usually smart enough to emit code that knows (or guess) when to use
malloc or alloca (or some equivalents).
Cython could do that too, if anyone cares to implement it ;-)
Sturla
I was thinking that cython.malloc could be implemented however it
liked, so you wouldn't be able to call realloc on such a pointer. In
any case, I think realloc is quite a special case, one that doesn't
need any special language support, especially considering that you're
going to free the pointer when the block exits. You can always resize
your numpy array or realloc your malloced pointer manually, and if you
really claim try/finally is too hard you can use a cdef class with a
destructor like Sturla demonstrated.
Hm, if we'd implement fused types for cdef classes a user could
reasonably easily write a cdef class with list-like behaviour for the
types needed. One could also resize a numpy array, and cython.array
could implement similar behaviour if needed. Often when I use lists I
already have objects, though; I don't often find myself needing a list
of primitive types in a situation where the conversion would be too
expensive.
> - Robert
I think users should be able to do that if they want.
> In
> any case, I think realloc is quite a special case, one that doesn't
> need any special language support, especially considering that you're
> going to free the pointer when the block exits. You can always resize
> your numpy array or realloc your malloced pointer manually
What's wrong with
with cython.malloc(x) as mem:
    # do stuff with mem[]
    mem.realloc(y)   # raise MemoryError on allocation failure
    # do more stuff with mem[]
?
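(cython.malloc and mem.realloc above are a hypothetical API, not anything Cython provides; a toy pure-Python model with ctypes, shown below, captures the semantics being proposed -- an owning block object whose memory can be resized and is freed exactly when the with block exits:)

```python
import ctypes
import ctypes.util

# Declare libc signatures so ctypes handles pointers correctly.
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.malloc.restype = ctypes.c_void_p
_libc.malloc.argtypes = [ctypes.c_size_t]
_libc.realloc.restype = ctypes.c_void_p
_libc.realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.free.argtypes = [ctypes.c_void_p]

class MallocBlock:
    """Toy model of the proposed `with cython.malloc(n) as mem` object."""
    def __init__(self, n):
        self.ptr = _libc.malloc(n)
        if not self.ptr:
            raise MemoryError
        self.size = n

    def realloc(self, n):
        p = _libc.realloc(self.ptr, n)
        if not p:
            raise MemoryError   # old block is still valid on failure
        self.ptr, self.size = p, n

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        _libc.free(self.ptr)    # deterministic, even if the block raised
        self.ptr = None

with MallocBlock(8) as mem:
    mem.realloc(64)             # grow (possibly moving); ptr/size updated
    assert mem.size == 64
```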
> and if you
> really claim try/finally is too hard you can use a cdef class with a
> destructor like Sturla demonstrated.
Destructors have the disadvantage that they are not guaranteed to get
called immediately when the current reference to the object dies. The
"with" statement is made to guarantee exactly this.
Stefan
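(That difference is observable in plain CPython: __exit__ runs exactly at block exit, while __del__ on an object caught in a reference cycle has to wait for the cycle collector. A small demo, with gc disabled so the collector can't run mid-example:)

```python
import gc

events = []

class Finalized:
    def __del__(self):
        events.append("del")

class Managed:
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        events.append("exit")

gc.disable()                 # keep the cycle collector out of the demo
obj = Finalized()
obj.cycle = obj              # reference cycle: refcounting alone can't free it
del obj
assert "del" not in events   # __del__ has NOT run, despite losing our reference

with Managed():
    pass
assert events[-1] == "exit"  # __exit__ ran exactly at block exit

gc.enable()
gc.collect()                 # the cycle collector finally calls __del__
assert "del" in events
```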
Users could trivially write such a class themselves, though. Again, I
think reallocing memory is quite specific and doesn't deserve any
special language support. A language doesn't have to tackle every
problem in the most convenient way possible, and variable-sized arrays
are simply much more intuitive and easier to use, since the type is
already part of the declaration and you don't see any ugly pointer
arithmetic.
>> In
>> any case, I think realloc is quite a special case, one that doesn't
>> need any special language support, especially considering that you're
>> going to free the pointer when the block exits. You can always resize
>> your numpy array or realloc your malloced pointer manually
>
>
> What's wrong with
>
> with cython.malloc(x) as mem:
>     # do stuff with mem[]
>     mem.realloc(y)   # raise MemoryError on allocation failure
>     # do more stuff with mem[]
>
> ?
>
How does cython.malloc() know the type? How can it know what to return
through indexing?
I think 'cdef int[:m, :n] myslice' would be better suited for that.
The memoryviewslice struct could set the memoryview object to NULL
and keep a pointer to the runtime type information. When it would then
be coerced to an object it could create a cython.array and it could go
through the buffer interface. It simply deallocates memory when all
references are lost, and it could work without the GIL.
>
>> and if you
>> really claim try/finally is too hard you can use a cdef class with a
>> destructor like Sturla demonstrated.
>
>
> Destructors have the disadvantage that they are not guaranteed to get called
> immediately when the current reference to the object dies. The "with"
> statement is made to guarantee exactly this.
Again, I would argue that I am extremely skeptical of your supposed
need to deallocate the memory immediately. Besides, the class wouldn't
have any reference cycles, so I don't see why it wouldn't deallocate
the object and memory immediately.
> Stefan
I still think there could be a case for
a Python list-like structure of primitives (in particular
automatically growing,