Hi Marco,
Sorry for not answering you until now.
On 28 September 2016 at 10:00, Marco Gario <
marco...@gmail.com> wrote:
> ABI vs API
> ----------
>
> My understanding of the main criticism about ABI usage, is its
> limitations in dealing with positioning of fields in
> structs. However, most of the APIs that we use have very simple
> structs, and mostly operate through opaque objects. Are there
> other cases that can lead to problems?
>
> The *huge* benefit that I see on using the ABI is that we can
> have literally the same code work for Python 3/Python 2, CPython/
> PyPy, and OSx/Linux/Win. I've tried with a different wrapper [3],
> and I was very excited to see the same code work out-of-the-box
> on OSx (I didn't try it on Win).
>
> Where can I find a more detailed discussion of the problems with ABI?
See
http://cffi.readthedocs.io/en/latest/overview.html#abi-versus-api .
I'm surprised that you see the benefit of ABI to be portability. It
is actually the other way around: the API mode is more portable. You
have to use a C compiler, but that's the only drawback (which exists
mainly on Windows). (Note for example that Python 3 added a "stable
ABI" which means you don't even need to recompile when upgrading the
version of CPython.)
The benefit of the API mode is that it is generally much safer,
because it works at the level of C instead of at the level of the
machine's ABI. That is the main reason why I generally push for API
mode instead of ABI mode.
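For concreteness, the difference between the two modes can be sketched
as follows. (This is only a sketch: "libexample.so", "example.h" and
the function add() are made-up placeholder names, not a real library.)

```python
# A minimal sketch contrasting ABI mode and API mode in CFFI.
# All library/header/function names here are hypothetical.
from cffi import FFI

# ABI mode: the declaration is trusted verbatim at the machine level;
# a wrong type here (e.g. "long" instead of "int") silently corrupts
# data at runtime instead of raising an error.
ffi_abi = FFI()
ffi_abi.cdef("int add(int, int);")
# lib = ffi_abi.dlopen("libexample.so")   # loads at runtime, no compiler

# API mode: the declaration is checked by a real C compiler against
# the real header, so a mismatch becomes a build-time error.
ffi_api = FFI()
ffi_api.cdef("int add(int, int);")
ffi_api.set_source("_example_cffi", '#include "example.h"')
# ffi_api.compile()   # invokes the C compiler once, at build time
```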
But, more generally, this is also an answer to some of your questions.
Let's start with the .h file containing lots of small macros doing
some extra checks. If you use the ABI mode you can't use the macros
at all. In the API mode, you can use them as if they were functions.
You don't need to dig inside the .h file to find out what the real
functions invoked by these macros are.
That's also a reason for portability: the authors of the library can
change what is a function and what is a macro, and some internal
details in the .h file, without you needing to adapt---as long as the
changes are done in a way that should be invisible to typical C
programs using the header.
That also answers the question of copy-pasting the whole .h. Yes,
there are various tools for various use cases outside the scope of
CFFI, but there is no general way to do that. In your case, for
example, it wouldn't work when you have macros in (this version of)
the .h file. For that case you
really have to write in the cdef a line that looks like a function
declaration, and this exact line is not in the .h. More precisely, I
mean that if the .h contains something like:
    #define foo(a, b, c) ((a) == NULL ? -1 : _internal_foo(a, b, c))
    int _internal_foo(T *a, int b, long c);
Then what you'd like to have in API mode is:
    cdef("""
        int foo(T *a, int b, long c);
    """)
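The full API-mode build script for that cdef could look roughly like
this (a sketch only: the header name "thelib.h" and module name
"_thelib_cffi" are made up):

```python
# Sketch of an API-mode build script; "thelib.h" and "_thelib_cffi"
# are hypothetical names, not from a real library.
from cffi import FFI

ffibuilder = FFI()

# Declare foo() as if it were a plain function, and T as an opaque
# type.  Even though foo is really a macro in the header, the C
# compiler will expand it when the module is built.
ffibuilder.cdef("""
    typedef ... T;
    int foo(T *a, int b, long c);
""")

# The compiler sees the real header, so the macro is resolved there,
# not in the Python-side declarations.
ffibuilder.set_source("_thelib_cffi", '#include "thelib.h"')

# To build:  ffibuilder.compile(verbose=True)   # invokes the C compiler
```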
Finally, as pointed out by Daniel Holth, usually the CFFI wrapper part
should be hidden from the rest of the Python program. For example, if
you have a lot of opaque pointers, then ideally you shouldn't pass
around cdata pointer objects through the rest of the Python program,
but only wrappers in the form of instances of a class. For example,
if your .h has these lines:
    struct foo *create_foo(void);
    int length_of_foo(struct foo *);
    void destroy_foo(struct foo *);
Then you'd write this in your Python library:
    class Foo(object):
        def __init__(self):
            h = lib.create_foo()
            if h == ffi.NULL:
                raise Exception("oops")
            self._handle = ffi.gc(h, lib.destroy_foo)

        def length(self):
            return lib.length_of_foo(self._handle)
Doing that has the advantage of giving a library that feels a bit more
Pythonic; but also, it lets you more easily hide repetitive behavior,
or tweak it later. In the example above, Foo hides the destruction of
the object. You can also transform the result of functions at this
point:
    class Foo(object):
        def call_that_returns_msat_term(self, arg):
            return _msat_term(lib.call_that_returns_msat_term(self._handle, arg))

    def _msat_term(cdata):
        return (cdata.foo, cdata.bar)  # or an instance with a __repr__,
                                       # or anything Python-like
Or, as you mention unicode/bytes:
    def stuff_with_text(self, mytext):
        # e.g. in this library, ascii encoding is fine
        return lib.stuff_with_text(self._handle, mytext.encode('ascii'))
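Going the other direction, when the C side hands back a "char *",
ffi.string() copies it into a Python bytes object that you can decode
at the same boundary. A self-contained sketch (using ffi.new() to
stand in for a C function's return value):

```python
from cffi import FFI

ffi = FFI()

# Stand-in for a "char *" that some C function would return.
c_str = ffi.new("char[]", b"hello")

# ffi.string() copies up to the first NUL byte into a bytes object;
# decode it here so the rest of the program only ever sees str.
text = ffi.string(c_str).decode('ascii')
assert text == 'hello'
```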
Yes, what I'm describing is more work than a bare dlopen() link: you
need to look at the documentation (or reverse-engineer from the
header) to find a proper C signature for something that may be a
macro; and you need to write three lines of Python code for every C
function. This process is not automatic, but you get a much more
Python-friendly wrapper in the end.
In order to automate this process, you could look around: there are
some CFFI wrapper generators that can guess what should be written
from looking at the header, and automatically write the boilerplate I
described above---in specific cases. They always have to assume a lot
of things and certainly won't work when given a random header. It's
likely not worth it unless the C API that you need to expose has
several hundred functions and you can't go with an
"add-them-as-I-need-them" approach. If these conditions are not met,
then you'll spend much more time tweaking the binding generator than
just writing directly N times 3 lines of code...
A bientôt,
Armin.