How to add a hook that calls ``setjmp`` before calling a C function?

TitanSnow

Sep 4, 2017, 9:09:35 AM
to python-cffi
Hello,

I'm the author of ffilupa, a Lua binding built with cffi. I have a question about adding a hook to the C code that cffi generates.

The following is copied from the Lua reference manual:

If an error happens outside any protected environment, Lua calls a panic function (see lua_atpanic) and then calls abort, thus exiting the host application. Your panic function can avoid this exit by never returning (e.g., doing a long jump to your own recovery point outside Lua).

The call chain is:

my python code --cffi--> cffi wrapper code(1) --> lua API --> panic function(2), then abort

To avoid exiting, I need to setjmp at (1), and longjmp from (2) to (1)

I found that the wrapper code generated by cffi looks like this:

static PyObject *
_cffi_f_lua_pushboolean(PyObject *self, PyObject *args)
{
  lua_State * x0;
  int x1;
  Py_ssize_t datasize;
  PyObject *arg0;
  PyObject *arg1;

  if (!PyArg_UnpackTuple(args, "lua_pushboolean", 2, 2, &arg0, &arg1))
    return NULL;

  datasize = _cffi_prepare_pointer_call_argument(
      _cffi_type(8), arg0, (char **)&x0);
  if (datasize != 0) {
    if (datasize < 0)
      return NULL;
    x0 = (lua_State *)alloca((size_t)datasize);
    memset((void *)x0, 0, (size_t)datasize);
    if (_cffi_convert_array_from_object((char *)x0, _cffi_type(8), arg0) < 0)
      return NULL;
  }

  x1 = _cffi_to_c_int(arg1, int);
  if (x1 == (int)-1 && PyErr_Occurred())
    return NULL;

  Py_BEGIN_ALLOW_THREADS
  _cffi_restore_errno();
  { lua_pushboolean(x0, x1); }
  _cffi_save_errno();
  Py_END_ALLOW_THREADS

  (void)self; /* unused */
  Py_INCREF(Py_None);
  return Py_None;
}

I want to add a hook that calls setjmp before calling the Lua API, like this:

PyThreadState *volatile _save;  /* volatile: assigned after setjmp, read after longjmp */
if (setjmp(global_env) == 0){
  _save = PyEval_SaveThread();
  _cffi_restore_errno();
  { lua_pushboolean(x0, x1); }
  _cffi_save_errno();
  PyEval_RestoreThread(_save);
}else{
  _cffi_save_errno();
  PyEval_RestoreThread(_save);
  PyErr_SetString(...);
}

So that I can longjmp to it from the panic function.

How can I add such a hook? If there is no official way, is there a way to hack cffi to do it?

Armin Rigo

Sep 4, 2017, 1:28:42 PM
to pytho...@googlegroups.com
Hi TitanSnow,

On 4 September 2017 at 15:09, TitanSnow <tttnn...@gmail.com> wrote:
> my python code --cffi--> cffi wrapper code(1) --> lua API --> panic
> function(2), then abort
>
> To avoid exiting, I need to setjmp at (1), and longjmp from (2) to (1)

You can do that by writing the logic entirely in C. For example (rough code):


ffibuilder.set_source("my_lua_binding", """
#include <setjmp.h>
#include <lua.h>

int wrap_call_stuff(int arg)
{
    jmp_buf buf;   /* the panic function must be able to reach this buffer */
    if (setjmp(buf) == 0) {
        call_stuff_from_lua(arg);
        return 0;   /* ok */
    }
    else {
        return -1;  /* oops, error: we got here through longjmp */
    }
}
""")
ffibuilder.cdef("""
    int wrap_call_stuff(int arg);
""")


A bientôt,

Armin.

TitanSnow

Sep 4, 2017, 9:03:17 PM
to python-cffi
Thanks! But there are still some problems:

  • The thread state and GIL release in CPython. Does that cause trouble?
  • I need to let the Python caller know that a panic happened. There are two ways:
    • return an unexpected value, but that is a lot of work across many C functions with different return types
    • raise an exception, but cffi releases the GIL before calling the C function, so I cannot raise an exception from the C function

I would rather raise an exception than return an unexpected value, so how can I do that?

TitanSnow

Sep 5, 2017, 12:25:53 AM
to python-cffi
I found that code like this can raise an exception in a C function called by cffi:

static void awd(void){
    PyGILState_STATE gstate;
    gstate = PyGILState_Ensure();
    PyErr_SetString(PyExc_RuntimeError, "awd");
    PyGILState_Release(gstate);
}

result:

>>> lib.awd()
RuntimeError: awd

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: <built-in method awd of CompiledLib object at 0x7f572cbe7278> returned a result with an error set

The result shows that the exception is raised successfully, but cffi does not call ``PyErr_Occurred`` to check whether an exception occurred in the C function, and it returns a result anyway.

I found that this hack to cffi fixes the problem:

--- a/recompiler.py
+++ b/recompiler.py
@@ -722,6 +722,7 @@ class Recompiler:
         prnt('  (void)self; /* unused */')
         if numargs == 0:
             prnt('  (void)noarg; /* unused */')
+        prnt('  if(PyErr_Occurred()) return NULL;')
         if result_code:
             prnt('  return %s;' %
                  self._convert_expr_from_c(tp.result, 'result', 'result type'))

This patch means that, after calling a C function, the wrapper calls ``PyErr_Occurred`` and returns NULL if an exception is set.

After applying this patch, no SystemError is raised:

>>> lib.awd()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: awd

Is my patch meaningful for cffi? Should I open an issue/PR for cffi on "call ``PyErr_Occurred`` after calling a C function"?

Armin Rigo

Sep 5, 2017, 2:34:57 AM
to pytho...@googlegroups.com
Hi,

On 5 September 2017 at 06:25, TitanSnow <tttnn...@gmail.com> wrote:
> I found that code like this can raise an exception in a C function called by cffi:
>
> static void awd(void){
> PyGILState_STATE gstate;
> gstate = PyGILState_Ensure();
> PyErr_SetString(PyExc_RuntimeError, "awd");
> PyGILState_Release(gstate);
> }
>
> result:
>
>>>> lib.awd()
> RuntimeError: awd

Right, but that's a hack. In general, you should avoid using any of
the CPython C API; for example, I'm quite unsure what occurs with
the above hack on top of PyPy.

The proper way to do it is to add another level of wrappers, this time
in Python. They would call the C functions, check for the return
value, and raise an appropriate Python exception.

More generally, it's considered a good approach to not expose the C
functions directly from modules written using CFFI, but instead to
expose a Python API that internally uses the C functions (either just
as simple Python wrappers, or with classes and stuff). This allows us
to hide the "C-ness" of the underlying C functions, e.g. the fact that
they typically signal errors by returning a value.

So adding a call to PyErr_Occurred() after calling C functions would
be a wrong solution from that point of view.
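
A minimal sketch of the Python wrapper layer described above, assuming a C wrapper that returns 0 on success and -1 on error, as in the earlier ``wrap_call_stuff`` example. ``LuaPanicError``, ``raise_on_error`` and the pure-Python stand-in for ``lib.wrap_call_stuff`` are hypothetical names used only for illustration; with a real compiled cffi module one would pass ``lib.wrap_call_stuff`` instead of the stand-in.

```python
import functools


class LuaPanicError(RuntimeError):
    """Raised when the C wrapper reports an error (hypothetical name)."""


def raise_on_error(c_func):
    """Wrap a C function that returns 0 on success and nonzero on error."""
    @functools.wraps(c_func)
    def wrapper(*args):
        status = c_func(*args)
        if status != 0:
            # Translate the C-style status code into a Python exception.
            raise LuaPanicError(
                f"{c_func.__name__} failed with status {status}")
    return wrapper


# Stand-in for lib.wrap_call_stuff so the sketch runs without a compiled
# module (assumption: it mimics the 0 / -1 convention of the C wrapper).
def wrap_call_stuff(arg):
    return -1 if arg < 0 else 0


# Public Python API: callers see exceptions, not C status codes.
call_stuff = raise_on_error(wrap_call_stuff)
```

Callers then use ``call_stuff(...)`` and handle ``LuaPanicError`` with an ordinary ``try``/``except``, without ever seeing the C-level return value.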


A bientôt,

Armin.

TitanSnow

Sep 5, 2017, 3:13:03 AM
to python-cffi
Yep, on PyPy it causes this crash:

>>>> lib.awd()
RPython traceback:
  File "pypy_module_cpyext.c", line 17127, in wrapper_second_level__star_2_6
  File "pypy_module_cpyext_1.c", line 37175, in from_ref
Fatal RPython error: AssertionError
Aborted (core dumped)

I agree with signaling errors by returning a value, but I'm wondering whether it is worth making it possible to call the Python C API in a C function called by cffi

Armin Rigo

Sep 5, 2017, 8:17:49 AM
to pytho...@googlegroups.com
Hi,

On 5 September 2017 at 09:13, TitanSnow <tttnn...@gmail.com> wrote:
> I agree with signaling errors by returning a value, but I'm wondering whether it
> is worth making it possible to call the Python C API in a C function called by cffi

No, the general idea of CFFI is to be entirely independent of the
Python C API. One goal is to be compatible with other Python
implementations---even though PyPy could possibly fix that problem
now, we don't want to close CFFI to yet other implementations. And
well, it is one of the real motivations to make a clean break from the
*massively huge* and ever-growing Python C API.

If you really want, you can always design and export your own small
custom API to C, using callbacks and handles. For example, you can
pass a list object into C with ffi.new_handle(), and append items to
it with a callback whose Python implementation just does
"x.append(y)".


A bientôt,

Armin.

TitanSnow

Sep 5, 2017, 11:10:23 PM
to python-cffi
I got it. Thanks so much!

By the way, I found that each cffi call releases the GIL. Will it cause a small performance loss on short C function calls?
Can I specify which C functions need the GIL to be released and which do not?

Armin Rigo

Sep 6, 2017, 3:16:05 AM
to pytho...@googlegroups.com
Hi,

On 6 September 2017 at 05:10, TitanSnow <tttnn...@gmail.com> wrote:
> By the way, I found that each cffi call releases the GIL. Will it cause a small
> performance loss on short C function calls?
> Can I specify which C functions need the GIL to be released and which do not?

No, you can't. I never measured how much time is lost, but my guess
is that it's lost in the noise in anything but microbenchmarks. If
you really have a performance problem, try PyPy instead of CPython;
doing so will make the CFFI calls be JIT-compiled away to little more
than an assembler CALL instruction. The GIL also needs to be released
in PyPy, but that is done much more efficiently, too.

The problem with calling C functions without releasing the GIL is that
it would deadlock in various cases, like if the C function indirectly
calls back Python.
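
One such case can be simulated in pure Python. This is only a sketch under stated assumptions: a plain ``threading.Lock`` stands in for the GIL, the callback is assumed to run on a second thread, and timeouts are used so that the "deadlock" shows up as a ``False`` result instead of hanging. All names here are hypothetical.

```python
import threading

gil = threading.Lock()     # stand-in for the GIL (assumption: a plain lock)
done = threading.Event()   # set once the Python callback has run


def python_callback():
    # The callback needs the "GIL" before it can run any Python code.
    if gil.acquire(timeout=0.2):
        done.set()
        gil.release()


def c_function_waiting_for_callback():
    # Models a C function that triggers a callback on another thread
    # and blocks until that callback has finished.
    worker = threading.Thread(target=python_callback)
    worker.start()
    finished = done.wait(timeout=0.5)
    worker.join()
    return finished


# Case 1: the caller keeps the lock while inside the "C function".
# The callback can never acquire it, so the wait times out.
with gil:
    held_result = c_function_waiting_for_callback()

# Case 2: the caller releases the lock first, so the callback can run.
done.clear()
released_result = c_function_waiting_for_callback()

print(held_result, released_result)  # False True
```

In the first case a real program would block forever rather than time out, which is the kind of situation that releasing the GIL around every call avoids.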


A bientôt,

Armin.

TitanSnow

Sep 6, 2017, 6:19:15 AM
to python-cffi
Thanks again!

TitanSnow

Sep 22, 2017, 12:11:17 AM
to python-cffi
Hi,
I have just done a simple benchmark of the performance loss from releasing and acquiring the GIL. It shows that the performance loss is measurable.

The C code is

#define Py_LIMITED_API
#include <Python.h>
static void voidfunc(void){}
static PyObject* with_gil(PyObject *self, PyObject *args){
    Py_BEGIN_ALLOW_THREADS
    {voidfunc();}
    Py_END_ALLOW_THREADS
    Py_XINCREF(Py_None);
    return Py_None;
}
static PyObject* without_gil(PyObject *self, PyObject *args){
    voidfunc();
    Py_XINCREF(Py_None);
    return Py_None;
}
static PyMethodDef meths[] = {
    {"with_gil", with_gil, METH_NOARGS, "with gil"},
    {"without_gil", without_gil, METH_NOARGS, "without gil"},
    {0, 0, 0, 0}
};
static struct PyModuleDef mod = {
    PyModuleDef_HEAD_INIT,
    "gilbrench",
    0,
    -1,
    meths
};
PyMODINIT_FUNC
PyInit_gilbrench(void){
    return PyModule_Create(&mod);
}

The only difference between ``with_gil`` and ``without_gil`` is that the former has ``Py_BEGIN/END_ALLOW_THREADS`` and the latter doesn't (so, despite its name, ``with_gil`` is the one that releases and re-acquires the GIL).

The benchmark is:

>>> timeit.timeit('func()', globals={'func':gilbrench.without_gil}, number=10**9)
32.65148849999969
>>> timeit.timeit('func()', globals={'func':gilbrench.with_gil}, number=10**9)
37.30776556799901

It shows that the performance loss is measurable: on my machine, the difference is about 500 ms per 10**8 calls.

My suggestion is to provide a way to mark a C function as not needing the GIL to be released -- some C functions are certain never to call back into Python
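
The figures above can be checked with a few lines of arithmetic. This sketch uses only the two timings reported in the session; the variable names are chosen here for illustration and are not part of the original benchmark.

```python
# Timings from the timeit session above, in seconds for 10**9 calls each.
without_release = 32.65148849999969   # gilbrench.without_gil
with_release = 37.30776556799901      # gilbrench.with_gil (releases the GIL)
calls = 10**9

extra_seconds = with_release - without_release
extra_ns_per_call = extra_seconds / calls * 1e9
extra_ms_per_1e8_calls = extra_seconds / calls * 10**8 * 1000
relative_overhead = extra_seconds / without_release

print(f"{extra_ns_per_call:.2f} ns per call")               # 4.66 ns per call
print(f"{extra_ms_per_1e8_calls:.0f} ms per 10**8 calls")   # 466 ms per 10**8 calls
print(f"{relative_overhead:.1%} relative overhead")         # 14.3% relative overhead
```

So releasing and re-acquiring the GIL costs roughly 4.7 ns per call in this measurement, which matches the figure of about 500 ms per 10**8 calls.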

TitanSnow

Armin Rigo

Sep 22, 2017, 4:09:19 AM
to pytho...@googlegroups.com
Hi,

On 22 September 2017 at 06:11, TitanSnow <tttnn...@gmail.com> wrote:
> I have just done a simple benchmark of the performance loss from releasing
> and acquiring the GIL. It shows that the performance loss is measurable.

That's precisely what I meant: it's a microbenchmark. A
microbenchmark showing less than 13% difference usually translates to
real-life performance losses of at most 1-2%, which I personally
consider lost in the noise. If you really care about that, here are
four options:

1. use PyPy. Your CFFI code will be magically *many times* faster.

2. hack at cffi/recompiler.py.

3. contribute to CPython a more efficient version of the macro
Py_BEGIN_ALLOW_THREADS.

4. use C trickery:

ffi.set_source("foo", """
/* usual declarations here */

#undef Py_BEGIN_ALLOW_THREADS
#undef Py_END_ALLOW_THREADS
#define Py_BEGIN_ALLOW_THREADS /* nothing */
#define Py_END_ALLOW_THREADS /* nothing */
""")


A bientôt,

Armin.

TitanSnow

Sep 22, 2017, 6:55:03 AM
to python-cffi
Hi,
Sorry for taking up so much of your time on this. I'm really grateful for your explanation.

As for PyPy, I'm surprised by its performance. My work runs 2.5x faster on PyPy. That's amazing, although it becomes very slow when running under coverage :)

TitanSnow