Idea: implicit "self" for function pointers in structs

Matthew Honnibal

Feb 22, 2015, 4:24:45 PM
to cython...@googlegroups.com
Hi,

I'd like to propose a new piece of syntactic sugar. I thought I'd see what people thought about this before I put in the effort to write a proper CEP.

A C struct can contain one or more function pointers. This has long been used in "object oriented C" code, as a kind of light-weight class. I find myself sometimes wanting to do this, even though Cython allows cdef classes. The appeal is:

* Simple semantics. There's no interaction with the Python ref counting, vtable, etc. There's no implicit life-cycle.
* You can store the struct instances in C data structures, such as arrays and vectors.
* You can manually manage data locality.
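
As a rough illustration of the last two points, here is a minimal C sketch of what such structs look like once they are stored by value in a contiguous array. The names `Foo`, `identity`, `square` and `sum_bars` are made up for this example and are not from any existing code.

```c
#include <stddef.h>

/* Hypothetical "lightweight class": plain data plus one function pointer. */
typedef struct Foo Foo;
struct Foo {
    int (*bar)(Foo *self);  /* "method" slot */
    int data;
};

int identity(Foo *self) { return self->data; }
int square(Foo *self)   { return self->data * self->data; }

/* Sum the result of calling each element's own bar(). The elements live
   by value in one contiguous array; no refcounting is involved. */
int sum_bars(Foo *items, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += items[i].bar(&items[i]);
    }
    return total;
}
```

A caller would build something like `Foo items[3] = { { identity, 2 }, { square, 3 }, { identity, 5 } };` and pass it to `sum_bars(items, 3)`.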

So, the advantage is it's just a function pointer --- nothing special. The downside is the syntax for calling the function:

cdef struct Foo:
    int (*bar)(Foo* self) except -1
    int _data

cdef int bar1(Foo* self) except -1:
    return self._data

cdef int bar2(Foo* self) except -1:
    return self._data ** 2

def main():
    cdef Foo foo1 = Foo(bar=bar1, _data=2)
    cdef Foo foo2 = Foo(bar=bar2, _data=2)

    print foo1.bar(&foo1)
    print foo2.bar(&foo2)


The explicit passing of the "self" argument is verbose, and really stands out in Python code. It's also brittle: if I swap some variable names around, I might pass the wrong instance.

My idea is to apply a little bit of syntactic sugar. If a struct has a function pointer member, and the first argument is named "self", and typed as a pointer to the struct type, then Cython will automatically fill in the argument.

The Cython code:

    foo1.bar()

Would generate the C code:

    foo1.bar(&foo1)
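
For reference, a plain C sketch of the desugared form, assuming the struct from the example above with the `except -1` clause left out. `call_bar` is a hypothetical helper that only shows what the sugar would expand to; it is not something Cython generates today.

```c
typedef struct Foo Foo;
struct Foo {
    int (*bar)(Foo *self);
    int _data;
};

int bar1(Foo *self) { return self->_data; }
int bar2(Foo *self) { return self->_data * self->_data; }

/* What the proposed `foo.bar()` would amount to: the receiver is named
   once and passed as the first argument, so the two cannot get out of sync. */
int call_bar(Foo *foo) { return foo->bar(foo); }
```

With explicit passing, a mismatched call such as `foo1.bar(&foo2)` compiles without complaint and reads the other instance's data, which is the brittleness described above.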

Advantages
----------------

This small piece of syntactic sugar would address the main situation in which I'm tempted to write a native C++ extension. Lots of small Python objects are a problem for performance sensitive code. I can't put them in an array or vector, and the Python list is slow and memory intensive. If I want to work in a block where I'm trying to release the GIL, Python objects are particularly bad.

But I do want something like a class, in some of these situations. When these performance-sensitive blocks have to execute complicated logic, I need to pair up data and functions in some way. I can elaborate on the situations when I need this, if asked.

The other way to address the need is to allow weak references to Python C extensions. I noticed Stefan raises this idea on his website somewhere. I think my proposal is much more light-weight, but is still enough to get the job done in most situations.

I think the implicit self argument can be implemented with full backwards compatibility. Calls to the function will be unambiguous, because you can't have a variable number of pointer arguments.

Disadvantages
--------------------

It's not Python and it's not C. It's a Cython-specific thing, so it may be surprising.
 
There's substantial functionality overlap between this and cdef classes. Extending support for "C with classes" introduces another way to write certain pieces of code.

It might be confusing to Python programmers new to C and Cython. I can imagine finding the distinction between a struct with a function pointer and a C extension type to be quite subtle, at first glance.


Thoughts?

---

Matthew Honnibal

Robert Bradshaw

Feb 24, 2015, 4:26:32 AM
to cython...@googlegroups.com
Interesting proposal.

While I can see how this would be nice, my initial reaction is that
it's a bit too magical, especially keying off the argument name.
Maybe it could be marked explicitly, with something like
int (*bar)(implicit self) except -1. Note that there is already
syntactic sugar that lets one write

cdef struct Foo:
    int bar(Foo* self) except -1

instead, and it's automatically (unambiguously) interpreted as a
function pointer.


> Advantages
> ----------------
>
> This small piece of syntactic sugar would address the main situation in
> which I'm tempted to write a native C++ extension. Lots of small Python
> objects are a problem for performance sensitive code. I can't put them in an
> array or vector, and the Python list is slow and memory intensive. If I want
> to work in a block where I'm trying to release the GIL, Python objects are
> particularly bad.

In my experience Python lists are pretty efficient. The refcounting
and lack of types are unfortunate, though.

> But I do want something like a class, in some of these situations. When
> these performance-sensitive blocks have to execute complicated logic, I need
> to pair up data and functions in some way. I can elaborate on the situations
> when I need this, if asked.
>
> The other way to address the need is to allow weak references to Python C
> extensions. I noticed Stefan raises this idea on his website somewhere. I
> think my proposal is much more light-weight, but is still enough to get the
> job done in most situations.

I think that'd be preferable. Doesn't solve the allocation or memory
issues though.

> I think the implicit self argument can be implemented with full backwards
> compatibility. Calls to the function will be unambiguous, because you can't
> have a variable number of pointer arguments.

There are varargs, and C++ allows for overloads. There's also the
issue of a signature being changed so that it requires an extra
argument: an unmodified call site would not be flagged as an error,
because self was optional.
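
A small C sketch of the varargs case, under the assumption that the compiler would decide whether to fill in self by counting arguments. The variadic member `total` and both helper functions are hypothetical; they spell out the two possible readings of a Cython call `a.total(&b, NULL)`.

```c
#include <stdarg.h>
#include <stddef.h>

typedef struct Foo Foo;
struct Foo {
    /* Variadic "method": self, then more Foo pointers, ended by NULL. */
    int (*total)(Foo *self, ...);
    int data;
};

int total_impl(Foo *self, ...)
{
    va_list ap;
    int sum = self->data;
    Foo *other;
    va_start(ap, self);
    while ((other = va_arg(ap, Foo *)) != NULL) {
        sum += other->data;
    }
    va_end(ap);
    return sum;
}

/* Reading 1: self was passed explicitly, so `&b` is self. */
int read_as_explicit(Foo *a, Foo *b) { return a->total(b, (Foo *)NULL); }

/* Reading 2: self is implicit, so `&b` is the first extra argument. */
int read_as_implicit(Foo *a, Foo *b) { return a->total(a, b, (Foo *)NULL); }
```

Both readings are valid C and give different results, so the argument count alone cannot tell them apart.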

> Disadvantages
> --------------------
>
> It's not Python and it's not C. It's a Cython-specific thing, so therefore
> may be surprising.
>
> There's substantial functionality overlap between this and cdef classes.
> Extending support for "C with classes" introduces another way to write
> certain pieces of code.
>
> It might be confusing to Python programmers new to C and Cython. I can
> imagine finding the distinction between a struct with a function pointer and
> a C extension type to be quite subtle, at first glance.

Yes, yes, and yes.

The other difference between structs and classes is pass-by-value vs.
pass-by-reference.
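
To make that difference concrete, a minimal C sketch (all names are illustrative): a struct argument is copied at the call, so a "method" that mutates state only works through a pointer.

```c
typedef struct Foo Foo;
struct Foo {
    int (*bar)(Foo *self);
    int data;
};

int get_data(Foo *self) { return self->data; }

/* Receives a copy: the write is lost when the function returns. */
void set_by_value(Foo f, int v) { f.data = v; }

/* Receives the address: the caller's instance is updated. */
void set_by_pointer(Foo *f, int v) { f->data = v; }
```

Plain assignment behaves the same way: `Foo copy = a;` duplicates the data and the function pointer, and later changes to `copy` do not affect `a`.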

You could consider using the experimental_cpp_class_def=True directive

https://github.com/cython/cython/blob/master/tests/run/cpp_classes_def.pyx

too. Of course that's C++ only, but might be fine.

Nils Bruin

Feb 24, 2015, 8:42:55 PM
to cython...@googlegroups.com
On Sunday, February 22, 2015 at 1:24:45 PM UTC-8, Matthew Honnibal wrote:
>     print foo1.bar(&foo1)
>     print foo2.bar(&foo2)
>
> The explicit passing of the "self" argument is verbose, and really stands out in Python code. It's also brittle: if I swap some variable names around, I might pass the wrong instance.

Can't you solve this entirely outside Cython, via the C preprocessor, by defining bar(foo) to be a macro that expands to foo.bar(&foo)? It moves away from the object-oriented notation, but it does solve the brittleness. It's also shorter, since it saves you typing a period. Something like the following, perhaps:

header.h:
#define bar(foo) ((foo).bar(&(foo)))

in your Cython file:

cdef extern from "header.h":
    int bar(Foo) except -1
...
print bar(foo1)
print bar(foo2)
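
On the C side, the macro approach can be sketched as below. This is a standalone illustration with made-up names (`demo`, the `data` field), not output from Cython.

```c
typedef struct Foo Foo;
struct Foo {
    int (*bar)(Foo *self);
    int data;
};

int bar1(Foo *self) { return self->data; }
int bar2(Foo *self) { return self->data * self->data; }

/* Function-like macro with the same name as the member. The preprocessor
   does not re-expand `bar` inside its own replacement, so this is not
   recursive. The argument appears twice, so pass a plain variable. */
#define bar(foo) ((foo).bar(&(foo)))

int demo(void)
{
    Foo foo1 = { bar1, 3 };
    Foo foo2 = { bar2, 3 };
    return bar(foo1) + bar(foo2);
}
```

One caveat: once the macro is defined, a direct call written as `foo1.bar(&foo1)` is also picked up by the preprocessor and no longer compiles, so the two call styles cannot be mixed in the same translation unit.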