Unpack a C array into a numpy ndarray

bruno pinçon

Mar 13, 2021, 11:10:28 AM

to python-cffi

Hi all,

I'm discovering this great tool, many thanks to the contributors.

My question concerns the best way to unpack an array allocated by the C code into a numpy array (ffi.unpack works well but gives a list and I need numpy arrays).

Suppose I want to interface a simple C function which builds an array whose dimensions cannot be known before the call, something like the testA header given just below (nres being the length of the malloc'ed array res):

import numpy as np
from cffi import FFI
ffi = FFI()
ffi.cdef("""
         int testA(double **res, int *nres);
         void free_mem(void *ptr);
         """)
C = ffi.dlopen("./libtestA.so")
res = ffi.new("double **")
nres = ffi.new("int *")
flag = C.testA(res, nres)

My guess, which seems to work, is to do the following:

r = ffi.new("double[]", nres[0])
nbytes = nres[0]*ffi.sizeof("double")
ffi.memmove(r,res[0],nbytes)
buf = ffi.buffer(r,nbytes)
y = np.frombuffer(buf, dtype=float)

(An alternative would be to create an ndarray of the right size and type and do the memmove from the C array into the "buffer" of the ndarray.) Finally I free the C array with:

# free memory allocated by testA (using stdlib malloc)
# (free_mem is just a wrapper to stdlib free)
C.free_mem(res[0])
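
The alternative mentioned above, copying straight into a preallocated ndarray, could be sketched roughly as follows. This is only an illustration under assumptions: `copy_to_ndarray` is a hypothetical helper name, and an array created with ffi.new stands in for the memory returned by testA.

```python
import numpy as np
from cffi import FFI

ffi = FFI()

def copy_to_ndarray(src, n):
    """Copy n doubles from the C pointer src into a new NumPy array.

    Hypothetical helper: the result owns its memory, so the C array
    can be freed right after the call.
    """
    y = np.empty(n, dtype=np.float64)
    # ffi.from_buffer exposes the ndarray's memory as a cdata char array,
    # which ffi.memmove accepts as the destination.
    ffi.memmove(ffi.from_buffer(y), src, n * ffi.sizeof("double"))
    return y

# Stand-in for the array produced by testA (assumption for the demo).
c_array = ffi.new("double[]", [1.0, 2.5, 4.0])
y = copy_to_ndarray(c_array, 3)
```

This still copies the data once, but it avoids the intermediate cffi array `r`.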

My questions are:

i/ Is this a good way to do that?

ii/ Is there already something (in cffi or in an additional package) to do this (like ffi.unpack)?

iii/ My code is "manual". Is it possible to write a function which does this for different kinds of arrays (various int and float arrays) by converting the "ffi ctype" to the right "ndarray dtype"? Maybe going the long way through ffi.unpack (to get a list, then transforming it with the np.array function) when there are no directly equivalent types (I am thinking of some long double on 80/81 bits).

iv/ Is there a shorter way to free the C array memory than calling a wrapper around free?
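
Regarding question iii/, one possible sketch of a generic converter is shown below. It is not part of cffi: `dtype_for` and `as_ndarray` are hypothetical names, and the rule used here (float and double mapped by name, every other supported primitive treated as an integer identified by its size and signedness) is an assumption that deliberately rejects long double and complex types.

```python
import numpy as np
from cffi import FFI

ffi = FFI()

_FLOAT_DTYPES = {"float": np.float32, "double": np.float64}

def dtype_for(item_ctype):
    """Map a cffi primitive ctype to a NumPy dtype (hypothetical helper)."""
    name = item_ctype.cname
    if name in _FLOAT_DTYPES:
        return np.dtype(_FLOAT_DTYPES[name])
    if (item_ctype.kind != "primitive" or name == "long double"
            or "_Complex" in name):
        raise TypeError("no direct NumPy equivalent for %r" % name)
    # Remaining primitives are treated as integers: casting -1 shows
    # whether the type is signed, ffi.sizeof gives its width in bytes.
    size = ffi.sizeof(item_ctype)
    signed = int(ffi.cast(item_ctype, -1)) < 0
    return np.dtype("%s%d" % ("i" if signed else "u", size))

def as_ndarray(ptr, n):
    """View n items behind ptr as an ndarray, without copying."""
    item = ffi.typeof(ptr).item
    buf = ffi.buffer(ptr, n * ffi.sizeof(item))
    return np.frombuffer(buf, dtype=dtype_for(item))

# Demo on cffi-owned memory (stand-in for an array coming from C).
src = ffi.new("int32_t[]", [1, 2, 3])
a = as_ndarray(src, 3)
```

The resulting array shares memory with the C array, so the C array has to outlive it.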

Many thanks for advice

Bruno

Matthias Geier

Mar 14, 2021, 6:01:04 AM

to pytho...@googlegroups.com

Hi Bruno.

If your library function testA() allocates memory for the resulting
"double" array, there is no need for you to allocate memory of the
same size again (what you did in your variable "r").

You are making an unnecessary allocation and an unnecessary copy of
your data, do you really want to do that?

You can simply use ffi.buffer() (and then np.frombuffer()) on "res"
(or probably "res[0]"?).

Of course you can then only call free_mem() when you are done using
the NumPy array.

You can use ffi.gc() to automatically call free_mem() for you:
https://cffi.readthedocs.io/en/latest/ref.html#ffi-gc

If you put all this in a helper function, you have to make sure that
the result of ffi.gc() is kept alive beyond the scope of the helper
function, otherwise the memory might be freed prematurely.
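
A minimal sketch of such a helper follows, under several assumptions: it runs on a POSIX system where ffi.dlopen(None) gives access to the C library's malloc and free, `fake_testA` is a made-up stand-in for the real testA, and `as_owned_ndarray` is a hypothetical name. The helper returns the ffi.gc() object alongside the array so that the caller can keep it alive explicitly.

```python
import numpy as np
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    void *malloc(size_t size);
    void free(void *ptr);
""")
libc = ffi.dlopen(None)  # assumption: POSIX, standard C library

def fake_testA(n):
    """Stand-in for testA: malloc an array of n doubles and fill it."""
    ptr = ffi.cast("double *", libc.malloc(n * ffi.sizeof("double")))
    if ptr == ffi.NULL:
        raise MemoryError("malloc failed")
    for i in range(n):
        ptr[i] = float(i)
    return ptr

def as_owned_ndarray(ptr, n, free_func):
    """Wrap the C array without copying; free_func runs when owner dies."""
    owner = ffi.gc(ptr, free_func)
    buf = ffi.buffer(owner, n * ffi.sizeof("double"))
    y = np.frombuffer(buf, dtype=np.float64)
    # Return the owner too, so the caller decides how long it lives.
    return y, owner

y, owner = as_owned_ndarray(fake_testA(4), 4, libc.free)
```

As long as `owner` is referenced, the memory behind `y` stays valid; the deallocation function runs once `owner` is collected.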

cheers,
Matthias

bruno pinçon

Mar 14, 2021, 6:46:27 AM

to python-cffi

Hi Matthias,

thanks for your answers. Yes, the copy is, a priori, what I want (or what I wanted) to do: I read somewhere that it is dangerous to use (for a Python object) a chunk of memory which has been allocated "outside" Python. But maybe this is not always true, in particular if stdlib malloc has been used? I like this idea (of not copying) as I want to work on big arrays ;-). I used to use the old Scilab software (a kind of free Matlab), which had its own peculiar memory management, and such copies were mandatory to interface C code.

PS: I already read the ffi-gc help page but I was not sure I had understood everything. I will try again ;-).

cheers

Bruno

Matthias Geier

Mar 15, 2021, 5:57:58 AM

to pytho...@googlegroups.com

Hi Bruno.

On Sun, Mar 14, 2021 at 11:46 AM bruno pinçon wrote:
>
> Hi Matthias,
> thanks for your answers. Yes, the copy is, a priori, what I want (or what I wanted) to do: I read somewhere that it is dangerous to use (for a Python object) a chunk of memory which has been allocated "outside" Python.

It is only dangerous if you don't know when the memory is freed.

Of course you have to be careful, but if you control both the
allocation and the deallocation (via C API functions), it shouldn't be
a problem.

> But maybe this is not always true, in particular if stdlib malloc has been used?

You don't really have to worry whether malloc is used or not, you just
have to always use "matching" functions for allocation and
deallocation.

If your C API provides both, I would assume that those are the correct
functions to use.

> I like this idea (of not copying) as I want to work on big arrays ;-). I used to use the old Scilab software (a kind of free Matlab), which had its own peculiar memory management, and such copies were mandatory to interface C code.

Yes, I remember the same from the Matlab MEX API, you always had to
copy data because Matlab arrays had (have?) a copy-on-write mechanism.
I'm very glad that with NumPy arrays this is much more straightforward
and less wasteful!

> PS: I already read the ffi-gc help page but I was not sure I had understood everything. I will try again ;-).

Yes, ffi.gc() is exactly the tool to make sure memory allocated with a
certain C API function is freed at the right time with the appropriate
deallocation function.

But, as I said in my previous e-mail, you have to make sure that the
Python variable returned from ffi.gc() is alive long enough.

cheers,
Matthias

bruno pinçon

Mar 15, 2021, 11:04:13 AM

to pytho...@googlegroups.com

Hi Matthias,

maybe the fog about cffi magic is beginning to disappear for me ;-) Many thanks.

If I have understood (not sure at this point) :

I can transform directly my C array res[0] of length nres[0] into a buffer with :

buf = ffi.buffer(res[0],ffi.sizeof("double")*nres[0])

then get a numpy array with :

y = np.frombuffer(buf, dtype=float)

So I don't need a copy of the array res[0] (good).

After that, as soon as y and buf are no longer referenced, Python's garbage collector can free the underlying objects but won't be able to free the memory used by my C array res[0] (as Python does not own it)?

For that purpose (avoiding memory leaks...), I should use ffi.gc before these 2 instructions, something like:

new_res = ffi.gc(res[0], C.free_mem)
buf = ffi.buffer(new_res, ffi.sizeof("double")*nres[0])
y = np.frombuffer(buf, dtype=float)

and now Python has ownership of new_res and will be able to free the memory used by the C array res[0]?

I have tested this (using big arrays, with top running in another console to watch the memory usage of the python3 process involved) and indeed it seems to work (no drift in memory usage).

Lastly, as the C code uses stdlib malloc, is my wrapper (free_mem) around stdlib free necessary? That is, is it possible to tell ffi.gc that I want to use stdlib free directly?

Well, it has been a long time since I last dealt with linking various libs into a program. I see that the API mode seems interesting (I have only tried ABI so far), and I discovered that it seems possible with distutils to create a Python package containing C code in an OS-independent way (that is, compiling the C code and the C interface code without using other tools such as autotools, cmake, ...). That looks really interesting (a long time ago I wrote interfaces to sparse solvers (umfpack, taucs cholesky) for Scilab but my interface build only worked on Linux...).

Cheers

Bruno

Armin Rigo

Mar 15, 2021, 2:28:58 PM

to pytho...@googlegroups.com

Hi Bruno,

On Mon, 15 Mar 2021 at 16:04, bruno pinçon <bruno....@gmail.com> wrote:
> new_res = ffi.gc(res[0], C.free_mem)
> buf = ffi.buffer(new_res, ffi.sizeof("double")*nres[0])
> y = np.frombuffer(buf, dtype=float)

Right, but if you know that free_mem() should be called immediately
after these three lines, it is simpler to call it explicitly. For
example:

try:
    buf = ffi.buffer(res[0], ffi.sizeof("double")*nres[0])
    y = np.frombuffer(buf, dtype=float)
finally:
    C.free_mem(res[0])

(with the try: finally: to ensure that even if an exception occurs, it
is still freed). This has the advantage of not relying on the
promptness of the Python GC: if you run on PyPy instead of CPython,
then the previous version would call free_mem() later than expected.

Another note: there is no way to call the free() function directly,
because notably on Windows there is no such thing as "the" free()
function, with every libc providing its own and Windows having
multiple APIs of its own in addition. But another reason is the extra
symmetry: if you call some C API that itself invokes malloc() and
documents that you should call free(), then the call to free() should
be done in the same way, to ensure that you're really calling the
right thing.

A bientôt,

Armin.

bruno pinçon

Mar 15, 2021, 4:12:24 PM

to python-cffi

Thanks Armin, but I'm a little troubled by :

try:
    buf = ffi.buffer(res[0], ffi.sizeof("double")*nres[0])
    y = np.frombuffer(buf, dtype=float)
finally:
    C.free_mem(res[0])

so maybe I am missing something... The finally clause will be executed even if all goes well in the try clause. So the values of my y ndarray will be lost (as soon as the corresponding memory is reused for another object), no?

Bruno

PS: I just tried cffi with structs, this tool is really great ;-)

Armin Rigo

Mar 16, 2021, 2:44:55 AM

to pytho...@googlegroups.com

Hi Bruno,

On Mon, 15 Mar 2021 at 21:12, bruno pinçon <bruno....@gmail.com> wrote:
> Thanks Armin, but I'm a little troubled by :
>
> try:
>     buf = ffi.buffer(res[0], ffi.sizeof("double")*nres[0])
>     y = np.frombuffer(buf, dtype=float)
> finally:
>     C.free_mem(res[0])
>
> so maybe I am missing something... The finally clause will be executed even if all goes well in the try clause.
> So the values of my y ndarray will be lost (as soon as the corresponding memory is reused for
> another object), no?

Oops, I thought that np.frombuffer() made a copy and you could free
the memory afterwards. If np.frombuffer() is a no-copy operation,
then yes, that's wrong! You need to make sure that the result of
ffi.gc() remains alive as long as you use `y`. Sorry for my bogus
comment.
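
For completeness, the try/finally pattern does become safe if the data is copied explicitly before the memory is freed. The sketch below illustrates that variant under assumptions: a POSIX system where ffi.dlopen(None) exposes malloc and free, a malloc'ed array standing in for the result of testA, and `copy_then_free` as a hypothetical helper name.

```python
import numpy as np
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    void *malloc(size_t size);
    void free(void *ptr);
""")
libc = ffi.dlopen(None)  # assumption: POSIX, standard C library

def copy_then_free(ptr, n, free_func):
    """Copy n doubles out of ptr, then release the C memory immediately."""
    try:
        buf = ffi.buffer(ptr, n * ffi.sizeof("double"))
        # .copy() gives the ndarray its own memory, independent of ptr
        y = np.frombuffer(buf, dtype=np.float64).copy()
    finally:
        free_func(ptr)
    return y

# Stand-in for the array allocated by testA (assumption for the demo).
n = 3
ptr = ffi.cast("double *", libc.malloc(n * ffi.sizeof("double")))
assert ptr != ffi.NULL
for i in range(n):
    ptr[i] = i * 1.5
y = copy_then_free(ptr, n, libc.free)
```

This costs one copy but does not depend on when the garbage collector runs, which is the point made above about PyPy.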

A bientôt,
Armin.

bruno pinçon

Mar 16, 2021, 7:15:43 AM

to python-cffi

No problem Armin, and thanks to Matthias and you for helping me. cffi is really a great tool. I should investigate it more and more (in particular the API mode) but I'm already able to easily interface some relatively complicated C code (compared with what I did with Scilab a long time ago). Being able to manipulate C variables through cdata objects on the Python side is amazing.