Cython and conditional compilation

Florian Weimer

Feb 10, 2023, 1:45:25 AM
to cython...@googlegroups.com, c-std-...@lists.linux.dev
Cython/Includes/cpython/version.pxd contains this recommendation:

# Python version constants
#
# It's better to evaluate these at runtime (i.e. C compile time) using
#
#      if PY_MAJOR_VERSION >= 3:
#           do_stuff_in_Py3_0_and_later()
#      if PY_VERSION_HEX >= 0x02070000:
#           do_stuff_in_Py2_7_and_later()
#
# than using the IF/DEF statements, which are evaluated at Cython
# compile time. This will keep your C code portable.

I have seen one case, in an older version of breezy, where this only
works because the compiler accepts calls to undeclared functions
(implicit function declarations) and optimizes out the branches not
taken:

cdef object _split_first_line_unicode(Py_UNICODE *line, int len,
                                      Py_UNICODE *value, Py_ssize_t *value_len):
    cdef int i
    for i from 0 <= i < len:
        if line[i] == c':':
            if line[i+1] != c' ':
                raise ValueError("invalid tag in line %r" %
                                 PyUnicode_FromUnicode(line, len))
            memcpy(value, &line[i+2], (len-i-2) * sizeof(Py_UNICODE))
            value_len[0] = len-i-2
            if PY_MAJOR_VERSION >= 3:
                return PyUnicode_FromUnicode(line, i)
            return PyUnicode_EncodeASCII(line, i, "strict")
    raise ValueError("tag/value separator not found in line %r" %
                     PyUnicode_FromUnicode(line, len))

The PyUnicode_EncodeASCII still ends up in the generated C code, but
it's no longer part of the Python 3 API, hence the implicit function
declaration.
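
A minimal C sketch of the effect described above, not taken from breezy or from Cython's actual output. `DEMO_MAJOR_VERSION`, `demo_new_api` and `demo_legacy_api` are hypothetical stand-ins for `PY_MAJOR_VERSION`, `PyUnicode_FromUnicode` and `PyUnicode_EncodeASCII`, chosen so that the example builds without Python.h. It assumes the Cython-level `if` ends up as an ordinary C `if`, so both branches reach the C compiler.

```c
/* Hypothetical stand-in for PY_MAJOR_VERSION (assumption for this sketch). */
#define DEMO_MAJOR_VERSION 3

/* Available in every "version". */
static int demo_new_api(void) { return 3; }

#if DEMO_MAJOR_VERSION < 3
/* Only declared for old "versions", like an API removed in Python 3. */
static int demo_legacy_api(void) { return 2; }
#endif

/* Ordinary C `if`: both branches are compiled. The commented-out call
   names a function that is not declared when DEMO_MAJOR_VERSION >= 3,
   so enabling it would be an implicit function declaration, even
   though that line can never execute. */
static int pick_with_runtime_if(void)
{
    if (DEMO_MAJOR_VERSION >= 3) {
        return demo_new_api();
    }
    /* return demo_legacy_api(); */
    return -1;
}

/* Preprocessor guard: the legacy call is removed before compilation,
   so no declaration is needed for it. */
static int pick_with_preprocessor(void)
{
#if DEMO_MAJOR_VERSION >= 3
    return demo_new_api();
#else
    return demo_legacy_api();
#endif
}
```

With a compiler that treats implicit function declarations as errors, only the second form stays buildable once the legacy call is written out.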

It's since been fixed in breezy, but maybe it makes sense to revise the
recommendation in version.pxd?

Thanks,
Florian

D Woods

Feb 17, 2023, 3:03:54 PM
to cython-users
So the principle that Cython tries to follow for its own code is that the version of Python used to generate the .c code isn't necessarily the same as the one you compile the module for. We like it if the generated C code doesn't depend on the version of Python used to run Cython. Obviously other users may have different views than us, but that's still our recommendation.

In this case, I'd probably use the ability to insert small bits of literal C code instead:

```
cdef extern from "some_file.h":
    """
    /* Pick the right call at C compile time, so only one branch is compiled. */
    static PyObject *wrap_PyUnicode_FromUnicode(Py_UNICODE *line, int i) {
    #if PY_MAJOR_VERSION >= 3
        return PyUnicode_FromUnicode(line, i);
    #else
        return PyUnicode_EncodeASCII(line, i, "strict");
    #endif
    }
    """
    object wrap_PyUnicode_FromUnicode(Py_UNICODE *, int)
```

That keeps the advantage of being able to generate your C sources once. Obviously some people are more comfortable writing C than others, but that's definitely how I'd approach it.

So I don't think we want to substantially change our recommendation. Ideally we'd quite like to get rid of `DEF` and `IF` but we're not really there yet with a suitable replacement for everyone's needs. (Please don't everyone debate it here, again...)

Hopefully that makes sense.

Florian Weimer

Feb 20, 2023, 9:35:12 AM
to D Woods, cython-users
* D. Woods:

> So I don't think we want to substantially change our recommendation.

But it's going to stop working in practice in many cases if the guarded
code uses functions that are not available in all Python versions. You
will have to use conditional compilation for that. Even today, this
approach of compiling both conditionals only works if compiling with
optimization or with lazy binding (the latter is increasingly not the
default anymore). So I think the current advice is unnecessarily
confusing to programmers.

> Ideally we'd quite like to get rid of `DEF` and `IF` but we're not
> really there yet with a suitable replacement for everyone's needs.

Couldn't Cython translate “if PY_MAJOR_VERSION >= 3:” to #if
automatically?

Thanks,
Florian

Robert Bradshaw

Feb 22, 2023, 12:44:57 AM
to cython...@googlegroups.com, D Woods
On Mon, Feb 20, 2023 at 6:35 AM Florian Weimer <fwe...@redhat.com> wrote:
>
> * D. Woods:
>
> > So I don't think we want to substantially change our recommendation.
>
> But it's going to stop working in practice in many cases if the guarded
> code uses functions that are not available in all Python versions. You
> will have to use conditional compilation for that. Even today, this
> approach of compiling both conditionals only works if compiling with
> optimization or with lazy binding (the latter is increasingly not the
> default anymore). So I think the current advice is unnecessarily
> confusing to programmers.

I don't think we want to step away from our desire to generate
Python-version agnostic C files. I'd probably go with the
macro-defined common API as well. Even better would be to avoid a
dependence on the C API at all, e.g.

    if PY_MAJOR_VERSION >= 3:
        return line[:i]
    else:
        return line[:i].encode('ASCII', 'strict')

(True, this creates an unneeded intermediate in Python 2, but IMHO
that's not the end of the world these days.)

> > Ideally we'd quite like to get rid of `DEF` and `IF` but we're not
> > really there yet with a suitable replacement for everyone's needs.
>
> Couldn't Cython translate “if PY_MAJOR_VERSION >= 3:” to #if
> automatically?

That's an interesting idea, but I think it'd be far from trivial to
actually implement. (E.g. what if the if statement contains multiple
clauses, some of which are available at compile time and some of which
are not?)
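
The mixed-clause difficulty can be sketched in C. This is only an illustration under stated assumptions: `DEMO_MAJOR_VERSION` is a hypothetical stand-in for `PY_MAJOR_VERSION`, and `use_fast_path` is a made-up runtime value. A condition such as "version >= 3 and use_fast_path" cannot become a single `#if`, because the preprocessor only knows the first clause, so a translator would have to split the condition itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for PY_MAJOR_VERSION (assumption for this sketch). */
#define DEMO_MAJOR_VERSION 3

/* Source-level intent:
       if DEMO_MAJOR_VERSION >= 3 and use_fast_path: branch A
       else:                                         branch B
   The version clause moves into #if, the runtime clause stays a C `if`,
   and the else branch has to be reachable from both. */
static int choose_path(bool use_fast_path)
{
#if DEMO_MAJOR_VERSION >= 3
    if (use_fast_path) {
        return 1;  /* branch A: may use APIs that exist only in version >= 3 */
    }
#endif
    (void)use_fast_path;  /* avoids an unused-parameter warning when #if is false */
    return 0;             /* branch B: must build on every version */
}
```

Even in this small case the split depends on the shape of the condition (`and` versus `or`, nesting, `elif` chains), which gives a sense of why a general automatic translation is not straightforward.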