Hi all,
Thanks for all your great work on Cython; it's really a fantastic tool! I'm a maintainer of freud, a Python/Cython/C++ package for analyzing particle-based simulations. I am currently transitioning our build from vanilla setuptools+Cython to CMake and scikit-build, which basically results in direct command line calls to `cython`, `g++`, etc. So far the process has mostly been smooth, but I'm running into a hiccup with our code coverage metrics: the new builds severely under-report coverage.
I'm performing an in-place build so that the generated cpp files are right next to the pyx files, e.g. `freud/box.{pyx,cpp}` using our box module as an example. Meanwhile, tests are in `tests/test*.py`, and I run them using a command like `coverage run -m unittest tests/test*.py` from the root of the repo. By running the compilation steps manually and inspecting the Cython-generated C++, I've been able to track down the problem: the two builds encode different paths to the pyx files in the generated source. In particular, the filename passed to the `__Pyx_TraceCall` macro needs to be `freud/box.pyx` for coverage to be generated correctly. With scikit-build, however, the build happens inside a nested CMake-generated folder structure, and the file somehow just gets encoded as `box.pyx`; as a result, the Cython profiling tool doesn't register the traces to the correct lines of code.
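As a rough illustration of why the bare file name matters, here is a small sketch. It reflects my reading of the symptom (a recorded relative path has to resolve from the directory where the tests run), not anything I've confirmed in the coverage tooling itself. The helper `resolves_from` and the temporary directory layout are made up for the example.

```python
import os
import tempfile


def resolves_from(root, recorded_path):
    """Return True if recorded_path names an existing file relative to root."""
    return os.path.isfile(os.path.join(root, recorded_path))


with tempfile.TemporaryDirectory() as repo_root:
    # Hypothetical layout mirroring the repo: <root>/freud/box.pyx
    os.makedirs(os.path.join(repo_root, "freud"))
    with open(os.path.join(repo_root, "freud", "box.pyx"), "w") as f:
        f.write("# placeholder\n")

    # Path as encoded by the working build.
    full_path_found = resolves_from(repo_root, "freud/box.pyx")
    # Path as encoded by the scikit-build build.
    bare_name_found = resolves_from(repo_root, "box.pyx")

print(full_path_found, bare_name_found)  # True False
```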
I am able to force the coverage to work by manually modifying the `__pyx_f` variable in the Cython-generated C++ file and then rerunning the compilation. I could insert an additional step in our CMake scripts that does some regex replacement to work around the issue, but that seems rather dangerous since it depends on internal details of how Cython implements its profiling. Do you have any suggestions on how I might proceed? One possibly more robust fix might be to encode absolute rather than relative paths, but since the `--inplace` option just copies the files after compilation, I'm not sure the relevant absolute path information would be available at the stage where Cython generates the C++ code anyway.
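For concreteness, the regex replacement step I have in mind would look roughly like the sketch below. It is only a stopgap: it assumes the `__pyx_f` array keeps the layout I see in the generated file today, which is exactly the internal detail I'd rather not depend on. The function name `patch_pyx_f` and the sample text are made up for illustration.

```python
import re

# Matches the `static const char *__pyx_f[] = { ... }` initializer.
# Assumption: the array keeps this exact declaration (an internal detail).
_PYX_F_ARRAY = re.compile(
    r'(static const char \*__pyx_f\[\] = \{)(.*?)(\})',
    re.DOTALL,
)


def patch_pyx_f(source, bare_name, package_path):
    """Rewrite the quoted entry bare_name inside __pyx_f to package_path.

    Text outside the __pyx_f initializer is left untouched.
    """
    def fix_entries(match):
        head, body, tail = match.groups()
        body = body.replace('"%s"' % bare_name, '"%s"' % package_path)
        return head + body + tail

    return _PYX_F_ARRAY.sub(fix_entries, source, count=1)


# Made-up stand-in for the generated C++ file.
sample = '''static const char *__pyx_f[] = {
  "box.pyx",
};
/* unrelated mention of "box.pyx" outside the array */
'''

patched = patch_pyx_f(sample, "box.pyx", "freud/box.pyx")
print(patched)
```

Because the entry is matched together with its surrounding quotes, running the step twice leaves an already-patched file unchanged.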
Any tips you have would be appreciated!
Thanks,
Vyas
For completeness, here is the complete set of differences I note between the file generated via `Cython.Build.cythonize` and the one generated by the `cython` call that scikit-build performs:
- Cython generates a variable `static const char *__pyx_f[]` that is used in all subsequent calls to `__Pyx_TraceCall`, which looks to be the macro that implements the modifications to the Python call stack used for tracing. In the working version, the variable looks like
    static const char *__pyx_f[] = {
        "freud/box.pyx",
        ...
    }
whereas in the non-working version it contains just `"box.pyx"`.
- The working version also generates an additional variable
    static const char __pyx_k_freud_box_pyx[] = "freud/box.pyx";
This variable ultimately gets tossed into another array
    static __Pyx_StringTabEntry __pyx_string_tab[] = {..}
that gets used only once, in `__Pyx_InitGlobals`. I'm guessing this is populating some global symbol table? It doesn't appear to be relevant to my issue, but I'm noting it since this variable is not defined in the compiled code that produces incorrect coverage results.
- Using `Cython.Build.cythonize` in setup.py adds a big "Cython Metadata" comment block to the top of the generated file. As I would expect, adding or removing this metadata doesn't appear to have any effect, since it is a comment that the preprocessor should strip before the compiler ever sees it, but I don't see any obvious options to the `cython` command line executable that would add it. What controls the presence of this metadata comment block?
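In case it helps anyone reproduce the first difference above, here is a sketch of a helper that lists the entries of `__pyx_f` so the two generated files can be compared directly. The name `pyx_f_entries` and the two sample strings are made up, and the pattern assumes the declaration looks the way I quoted it.

```python
import re


def pyx_f_entries(generated_source):
    """Return the string literals inside the __pyx_f initializer, in order."""
    match = re.search(
        r'static const char \*__pyx_f\[\] = \{(.*?)\}',
        generated_source,
        re.DOTALL,
    )
    if match is None:
        # No __pyx_f array found in this text.
        return []
    return re.findall(r'"([^"]*)"', match.group(1))


# Made-up stand-ins for the two generated files.
working = 'static const char *__pyx_f[] = {\n  "freud/box.pyx",\n};\n'
broken = 'static const char *__pyx_f[] = {\n  "box.pyx",\n};\n'

print(pyx_f_entries(working))  # ['freud/box.pyx']
print(pyx_f_entries(broken))   # ['box.pyx']
```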