(GSoC project?) Using gcc-python-plugin for pxd generation


Dag Sverre Seljebotn

Mar 21, 2012, 3:11:50 PM
to cython...@googlegroups.com, cytho...@codespeak.net
[CC-ing cython-users because this is a remarkably simple opportunity to
enter into Cython development for whoever wants to, and because I
believe it's a very good candidate for a GSoC project.]

So at PyCon, Mark and I talked with David Malcolm, who is the author of
the Python plugin for gcc [1]. It lets you very easily hook your own
Python script into the compilation pipeline of the gcc already installed
on your system. I.e., given that gccpython.so is built, one can do

LD_PRELOAD=python27.so gcc -fplugin=/path/to/gccpython.so \
    -fplugin-arg-python-script=myscript.py -DMYFLAG=1 -o foo.o -c foo.c

Then, given myscript.py (example in footnote [2]), one gets direct
access to the internal AST of gcc.

An obvious use case is to use this to parse C header files and generate
pxd files from them. One simply creates a .c file that includes a given
header file, then runs the above line with a Python script that walks
the abstract syntax tree and emits the pxd file.
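
As a rough illustration of the first half of that workflow, the sketch
below writes a stub .c file that includes the header and assembles the
gcc command line shown above. The helper names (write_stub,
plugin_command), the header name foo.h and the script name emit_pxd.py
are hypothetical, the plugin path is the same placeholder as in the
command above, and the command is only built, not run. The LD_PRELOAD
setting would still have to be supplied through the environment.

```python
import os
import tempfile


def write_stub(header, directory):
    """Write a .c file whose only content is an #include of `header`."""
    stub_path = os.path.join(directory, "stub.c")
    with open(stub_path, "w") as f:
        f.write('#include "%s"\n' % header)
    return stub_path


def plugin_command(stub_path, script, plugin="/path/to/gccpython.so"):
    """Build (but do not run) the gcc invocation as an argument list."""
    obj_path = os.path.splitext(stub_path)[0] + ".o"
    return [
        "gcc",
        "-fplugin=%s" % plugin,                    # placeholder path
        "-fplugin-arg-python-script=%s" % script,  # script that walks the AST
        "-o", obj_path,
        "-c", stub_path,
    ]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    stub = write_stub("foo.h", workdir)
    print(" ".join(plugin_command(stub, "emit_pxd.py")))
```

Keeping the command as a list makes it easy to hand to subprocess later
without worrying about shell quoting.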

Existing work on auto-generation of pxd files has mostly used gccxml,
which is a pain to compile and, if my understanding is correct, rather
unsupported. This approach is much more elegant, and will often work
with the gcc the user already has installed (although an extension
module must be compiled, and the GCC development headers must be present
on the system).

For prior work, there is already cwrap,

https://github.com/enthought/cwrap

which separates the backend and the frontend. So one could write a
gcc-python-plugin frontend in addition to the current gccxml frontend.
OTOH, I worry that this project is a tad over-engineered and wouldn't
oppose something that just used gcc-python-plugin and emits declarations
directly as strings.
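
To make the "declarations directly as strings" idea concrete, here is a
minimal sketch of the emitting side only. It assumes the tree walk has
already collected function declarations into a plain record; FuncDecl
and emit_pxd are hypothetical names, not part of gcc-python-plugin or
cwrap, and real headers would also need structs, typedefs, enums and so
on.

```python
from collections import namedtuple

# Hypothetical stand-in for what a tree walk would collect; these are
# NOT gcc-python-plugin types. `params` is a list of (ctype, name) pairs.
FuncDecl = namedtuple("FuncDecl", "name return_type params")


def emit_pxd(header, decls):
    """Return pxd text declaring `decls` as coming from `header`."""
    lines = ['cdef extern from "%s":' % header]
    for d in decls:
        args = ", ".join("%s %s" % (ctype, name) for ctype, name in d.params)
        lines.append("    %s %s(%s)" % (d.return_type, d.name, args))
    if not decls:
        # An extern block needs a body even when nothing is declared.
        lines.append("    pass")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    decls = [
        FuncDecl("add", "int", [("int", "a"), ("int", "b")]),
        FuncDecl("norm", "double", [("double*", "v"), ("int", "n")]),
    ]
    print(emit_pxd("foo.h", decls))
```

With this split, the plugin script would only need to fill in the
records, and the string formatting stays trivially testable outside gcc.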

I think this makes for a *great* project for a GSoC student, or anyone
else. It is something that Cython users are really craving, and it
doesn't require digging into the dirty internals of Cython. I'd be happy
to be a GSoC mentor for this.

I sat down with David Malcolm and made sure I could make this do what I
wanted to do, so I may be able to provide further details to anyone
interested.

Dag

[1]
http://gcc-python-plugin.readthedocs.org/en/latest/index.html
https://fedorahosted.org/gcc-python-plugin/

[2]
# This script is adapted from another use case so not everything makes
# immediate sense, but it shows how easy it is to get access to the
# guts of the GCC AST.

import gcc
from gccutils import get_src_for_loc, cfg_to_dot, invoke_dot

def on_pass_execution(p, fn):
    if p.name == '*free_lang_data':
        # The '*free_lang_data' pass is called once, rather than
        # per-function, and occurs immediately after
        # "*build_cgraph_edges", which is the pass that initially
        # builds the callgraph.
        #
        # So at this point we're likely to get a good view of the
        # callgraph before further optimization passes manipulate it.
        for u in gcc.get_translation_units():
            for decl in u.block.vars:  # vars means decls
                print('%r:%s:%s' % (decl.location, decl, decl.type))


gcc.register_callback(gcc.PLUGIN_PASS_EXECUTION,
                      on_pass_execution)

Robert Bradshaw

Mar 21, 2012, 4:29:22 PM
to cython...@googlegroups.com
+1, this sounds like a great, self-contained useful project for
someone who can't/doesn't want to dive into the internals of the
compiler itself (admittedly a hard task for many students during a
single summer).

I've heard clang has great bindings too, especially for C++, but it is
of course not as commonly used as gcc.
