Passing struct by value


Eli Bendersky

Mar 2, 2013, 9:16:35 AM
to pytho...@googlegroups.com
Hello,

Suppose I have this C code:

typedef struct {
  int a, b;
} Data;

void func(Data d);

Is my understanding correct that once this is consumed by ffi.cdef, I can only create "Data*" objects, not "Data" objects? If so, how is the passing-by-value to func done? Does cffi magically know that a "Data*" object created with ffi.new is a value? What about passing it to an actual Data* argument?

The question is basic, but I can't find a place in the documentation that explicitly mentions this (adding some discussion would be nice :-)

Thanks in advance,
Eli

Armin Rigo

Mar 2, 2013, 12:05:56 PM
to pytho...@googlegroups.com
Hi Eli,

On Sat, Mar 2, 2013 at 3:16 PM, Eli Bendersky <eli...@gmail.com> wrote:
> Is my understanding correct that once this is consumed by ffi.cdef, I can
> only create "Data*" objects, not "Data" objects?

You can do 'ffi.new("Data *")[0]'.

Did you try 'ffi.new("Data")'? Maybe it makes sense to allow that
too, actually, if Data is a struct type. I guess it would serve the
principle of least surprise. So far ffi.new() only supports pointers
and array types. Supporting struct types might also, as a bonus, mean
that we can kill a special corner case: right now, you have to do
p=ffi.new("Data *"), but then both p and p[0] actually keep the object
alive, which is strange.


A bientôt,

Armin.
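
For readers who want to try the `[0]` approach, here is a minimal sketch. It does not use the `Data`/`func` pair from the question (that would need a compiled library); instead it assumes a POSIX system and borrows `inet_ntoa` from the C library, which happens to take a `struct in_addr` by value. The simplified struct declaration and the `ffi.dlopen(None)` call are assumptions made for illustration.

```python
import socket
import sys

from cffi import FFI

ffi = FFI()
# Simplified declarations for illustration; the real headers use typedefs.
ffi.cdef("""
    struct in_addr { uint32_t s_addr; };
    char *inet_ntoa(struct in_addr in);   /* takes the struct by value */
""")
# Assumption: a POSIX system where the C library is reachable this way.
libc = ffi.dlopen(None)

# ffi.new("struct in_addr *") allocates one struct and returns a pointer to it.
addr_ptr = ffi.new("struct in_addr *")
# s_addr is stored in network byte order, so build it from the packed bytes.
addr_ptr.s_addr = int.from_bytes(socket.inet_aton("192.0.2.1"), sys.byteorder)

# addr_ptr[0] is the struct itself, which is what a by-value parameter expects.
text = ffi.string(libc.inet_ntoa(addr_ptr[0])).decode("ascii")
print(text)
```

A function declared with a `struct in_addr *` parameter would instead take `addr_ptr` directly, without the `[0]`.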

Eli Bendersky

Mar 2, 2013, 9:51:24 PM
to pytho...@googlegroups.com
> On Sat, Mar 2, 2013 at 3:16 PM, Eli Bendersky <eli...@gmail.com> wrote:
>> Is my understanding correct that once this is consumed by ffi.cdef, I can
>> only create "Data*" objects, not "Data" objects?
>
> You can do 'ffi.new("Data *")[0]'.

So if I have a C function taking a Data by value, I should first create a new Data* with ffi.new, then "dereference" it with [0], and pass the result to the function? This should at least be documented :-)

> Did you try 'ffi.new("Data")'?

The documentation states "The ctype is usually some constant string describing the C type. It must be a pointer or array type." and AFAICS this is enforced - ffi.new('Data') causes:

    TypeError: expected a pointer or array ctype, got 'struct $Data'

> Maybe it makes sense to allow that
> too, actually, if Data is a struct type.  I guess it would serve the
> principle of least surprise.  So far ffi.new() only supports pointers
> and array types.  Supporting struct types might also, as a bonus, mean
> that we can kill a special corner case: right now, you have to do
> p=ffi.new("Data *"), but then both p and p[0] actually keep the object
> alive, which is strange.

Yes, I agree. But it should be very clear how the resulting values differ (the one from Data and the one from Data*). In general, I think that cffi (or any other C-interfacing library) should optimize for mapping C types rather than mapping Python types. Once we're in the land of Python, everything is easy. But since the goal of using such a library is ultimately to interface with C code, it's nice to make it easy to create (and introspect, for debuggability) all possible C types.

This is subjective, of course. My first impression of CFFI was that ffi.cdef is awesome in the sense that it allows me not to re-create the C declarations in Python, and very importantly does type checking on the actual calls (instead of ctypes' segfaults when wrong arguments are passed). *However*, once it came to actually creating the objects to pass to the function pulled from the .so, it was somewhat nontrivial to figure out how to create such arguments. This may be just an issue of documentation, of course.

Eli
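
The difference between the two kinds of value can be checked interactively. A small sketch, assuming a cffi version in which `ffi.new()` still accepts only pointer and array types (the behaviour reported in this message; the exact error text may differ between versions):

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("typedef struct { int a, b; } Data;")

# ffi.new("Data *") allocates one Data and returns a pointer that owns it.
p = ffi.new("Data *", {"a": 1, "b": 2})
# Indexing with [0] yields a struct cdata that refers to the same memory.
d = p[0]

print(ffi.typeof(p).kind, ffi.typeof(d).kind)

# Writing through the struct is visible through the pointer: no copy was made.
d.a = 7
print(p.a)

# Asking for the bare struct type is rejected in the versions assumed here.
try:
    ffi.new("Data")
    new_struct_allowed = True
except TypeError:
    new_struct_allowed = False
print(new_struct_allowed)
```

`ffi.typeof()` is the introspection hook here: it reports `pointer` for `p` and `struct` for `d`, which is the distinction a by-value parameter cares about.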



Armin Rigo

Mar 3, 2013, 3:16:07 AM
to pytho...@googlegroups.com
Hi Eli,

On Sun, Mar 3, 2013 at 3:51 AM, Eli Bendersky <eli...@gmail.com> wrote:
> Yes, I agree. But it should be very clear how the resulting values differ
> (the one from Data and the one from Data*). In general, I think that cffi
> (or any other C-interfacing library) should optimize for mapping C types
> rather than mapping Python types. Once we're in the land of Python,
> everything is easy. But since the goal of using such a library is ultimately
> to interface with C code, it's nice to make it easy to create (and introspect,
> for debuggability) all possible C types.

I think your point is only about passing structs by value to calls.
In most cases, what we want with a struct is instead to get a pointer
to it. For example this is the case in order to pass it by pointer to
a call, to store it inside another field, and so on. But I'll fix
ffi.new() to allow "Data".

If the lack of clarity is actually more general, please give me a few other
examples :-)


A bientôt,

Armin.

Eli Bendersky

Mar 3, 2013, 11:46:40 AM
to pytho...@googlegroups.com
Yes, you're right. The main confusion was about the struct by value vs. by reference issue. But I hope you see that it's a fundamental issue that can put other things in question. If you fix ffi.new to allow cleanly separating structs created by value, that can help.

Eli



Bradley Froehle

Mar 3, 2013, 4:09:38 PM
to pytho...@googlegroups.com, ar...@tunes.org
Hi Armin,


On Saturday, March 2, 2013 9:05:56 AM UTC-8, Armin Rigo wrote:
> On Sat, Mar 2, 2013 at 3:16 PM, Eli Bendersky <eli...@gmail.com> wrote:
>> Is my understanding correct that once this is consumed by ffi.cdef, I can
>> only create "Data*" objects, not "Data" objects?
>
> You can do 'ffi.new("Data *")[0]'.

I'm glad I found this thread because I was running into some issues allocating just a structure.  For example, the first thing I tried seems to actually release the memory at the next garbage collection::

   ffi.new("Data[]", 1)[0]

Also, I find the 'ffi.new("Data *")' syntax quite confusing.  Shouldn't this just create enough memory to hold a pointer?  That is, this seems like a direct translation of the line 'Data *ptr;' in C which would not allocate an entire struct's worth of memory.

Regards,
Brad

Armin Rigo

Mar 4, 2013, 3:29:33 AM
to pytho...@googlegroups.com
Hi Bradley,

On Sun, Mar 3, 2013 at 10:09 PM, Bradley Froehle <brad.f...@gmail.com> wrote:
> Also, I find the 'ffi.new("Data *")' syntax quite confusing. Shouldn't this
> just create enough memory to hold a pointer? That is, this seems like a
> direct translation of the line 'Data *ptr;' in C which would not allocate an
> entire struct's worth of memory.

We went through various solutions in the past. The idea is that
ffi.new("Data *") should be equivalent to:

Data *ptr = malloc(sizeof(Data));

It has been found less confusing than calling this operation
ffi.new("Data"), because it returns a <cdata 'Data *'>. It is natural
in this context: say you want to do the equivalent of the following C
code:

int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);

struct dirent *entry = malloc(sizeof(struct dirent));
struct dirent **result = malloc(sizeof(struct dirent *));
readdir_r(dirp, entry, result);

Then it's in Python:

entry = ffi.new("struct dirent *")
result = ffi.new("struct dirent **")
readdir_r(dirp, entry, result)

We could possibly change it again --- this thread gives the idea that
malloc() is not the right abstraction at all, and instead we should
just go with "variable declarations". This would mean that we
translate this C code:

struct dirent entry;
struct dirent *result;
readdir_r(dirp, &entry, &result);

into:

entry = ffi.var("struct dirent")
result = ffi.var("struct dirent *")
readdir_r(dirp, ffi.addressof(entry), ffi.addressof(result))

But we already tweaked and changed this particular interface a few
times (though not like that so far), so I guess I'll need more input
to know if it looks like a good idea or not...


A bientôt,

Armin.
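
The `readdir_r` pattern above depends on the platform's `struct dirent` layout, so the following sketch shows the same out-parameter idea with a simpler function. It assumes a POSIX system where `ffi.dlopen(None)` exposes the C library's `strtol`; `ffi.new("char **")` plays the role of declaring `char *end;` and then passing `&end`.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("long strtol(const char *nptr, char **endptr, int base);")
# Assumption: a POSIX system where the C library is reachable this way.
libc = ffi.dlopen(None)

# Keep a reference to the buffer so it stays alive while `end` points into it.
text = ffi.new("char[]", b"42abc")
# One pointer-sized slot, zero-initialized: the C equivalent is `char *end;`.
end = ffi.new("char **")

# C equivalent: value = strtol(text, &end, 10);
value = libc.strtol(text, end, 10)

# end[0] is the `char *` that strtol stored: the first unparsed character.
rest = ffi.string(end[0])
print(value, rest)
```

Here the "malloc" reading of `ffi.new()` is what makes the call site short: the object returned for `"char **"` is already the address that C code would obtain with `&end`.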

Eli Bendersky

Mar 4, 2013, 8:25:58 AM
to pytho...@googlegroups.com
Yes, this is a good summary of things. I don't think that natively supporting by-value structures is too disruptive. You can think of it in stages - will adding ffi.new('struct Data') support require changing other places? If not then the second stage is simply renaming "new" to something. It may be "var", or it may be "define" or "create". It doesn't really matter.

P.S. is the limitation on structs with bitfields and on unions passed by value related? ctypes seems to handle them in most cases, and to *replace* ctypes, cffi would have to, as well.

Eli


Armin Rigo

Mar 4, 2013, 9:38:00 AM
to pytho...@googlegroups.com
Hi Eli,

On Mon, Mar 4, 2013 at 2:25 PM, Eli Bendersky <eli...@gmail.com> wrote:
> P.S. is the limitation on structs with bitfields and on unions passed by
> value related? ctypes seems to handle them in most cases, and to *replace*
> ctypes, cffi would have to, as well.

Ah, no, it's not related. The issue is that the underlying libffi
doesn't support them. In ctypes, which also uses libffi, the code
apparently forgets to check for them and passes them to libffi
anyway, pretending each one is a regular struct with no bitfields. So
basically, ctypes tells libffi lies, and on x86-32 it doesn't matter at
all (only the total size of the struct/union matters). On x86-64 it
matters in some cases: http://bugs.python.org/issue16575 . So yes,
ctypes "handles them in most cases"... on x86-32.

Note that it's a limitation that can be worked around if you use
ffi.verify(): define your own C functions with a simpler signature, as
a wrapper to the real functions.


A bientôt,

Armin.

Eli Bendersky

Mar 4, 2013, 9:56:38 AM
to pytho...@googlegroups.com
On Mon, Mar 4, 2013 at 6:38 AM, Armin Rigo <ar...@tunes.org> wrote:
> Hi Eli,
>
> On Mon, Mar 4, 2013 at 2:25 PM, Eli Bendersky <eli...@gmail.com> wrote:
>> P.S. is the limitation on structs with bitfields and on unions passed by
>> value related? ctypes seems to handle them in most cases, and to *replace*
>> ctypes, cffi would have to, as well.
>
> Ah, no, it's not related.  The issue is that the underlying libffi
> doesn't support them.  In ctypes, which also uses libffi, the code
> apparently forgets to check for them and passes them to libffi
> anyway, pretending each one is a regular struct with no bitfields.  So
> basically, ctypes tells libffi lies, and on x86-32 it doesn't matter at
> all (only the total size of the struct/union matters).  On x86-64 it
> matters in some cases: http://bugs.python.org/issue16575 .  So yes,
> ctypes "handles them in most cases"... on x86-32.

Ah, so I guess libffi didn't bother to implement the x64 calling convention for unions and structs with bitfields... that sucks. I guess we should at least reject them gracefully in ctypes then (updated issue 16575).

> Note that it's a limitation that can be worked around if you use
> ffi.verify(): define your own C functions with a simpler signature, as
> a wrapper to the real functions.

I see, so you can use ffi.verify to define a new C function that wraps the imported one and lets the compiler handle the argument lowering. ffi.verify seems like an even less appropriate name now ;-) It's essentially a way to sneak "inline C" into Python code.

Eli
 
 