compiling Cython files in multiple working directories?


Tom Swirly

Jun 24, 2016, 12:51:24 PM
to cython-users
In my project, I have a small amount of actual .pyx code and a huge amount of generated code.  It makes grep essentially useless and is slowing me down fairly seriously.


If this were C++, I'd move all the generated code into a separate hierarchy and just add a -I to my compile line.

Unfortunately, I'm not finding a way to do this in Cython.  

Indeed, all the documentation talks about the working directory, implying that all the .pyx files involved in a compilation are always visible from one single directory root.

----

As usual, I have a tiny demo, and it's here:  https://github.com/rec/simple-cython/blob/master/simple.pyx#L3-L4   Works, but fails if I uncomment that line.

Is there a way I can change just setup.py file to also allow that .pyx in the other directory to be found?

Thanks in advance!

--
     /t

http://radio.swirly.com - art music radio 24/7 366/1000

Tom Swirly

Jun 24, 2016, 12:58:10 PM
to cython-users

Chris Barker

Jun 24, 2016, 3:48:33 PM
to cython-users
As I understand it, for ease of writing Cython, it really works best to put the generated code next to the *.pyx code. Though I'm sure patches would be considered :-)



On Fri, Jun 24, 2016 at 9:50 AM, Tom Swirly <t...@swirly.com> wrote:
In my project, I have a small amount of actual .pyx code and a huge amount of generated code.  It makes grep essentially useless

I'm really confused, your code is .pyx, the generated code is *.c (or *.cpp) -- can't you put a wildcard in to your grep command???? I do that all the time.

if you have a bunch of hand-written C code as well, it's not hard to put that in a separate dir.

though, if you are OK with running cython yourself, rather than using cythonize and having distutils run it, you could put the generated code anywhere and have it included in the extension in your setup.py.

 
As usual, I have a tiny demo, and it's here:  https://github.com/rec/simple-cython/blob/master/simple.pyx#L3-L4   Works, but fails if I uncomment that line.

can't tell what you are trying to do there.
 
Is there a way I can change just setup.py file to also allow that .pyx in the other directory to be found?

the usual way is for your pyx file to be in the place that a python file would be in your package structure. if you have more than one pyx, you probably want a package.

HTH, 
 - CHB

 


 
Thanks in advance!

--
     /t

http://radio.swirly.com - art music radio 24/7 366/1000

--

---
You received this message because you are subscribed to the Google Groups "cython-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cython-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Tom Swirly

Jun 24, 2016, 5:00:56 PM
to cython-users
Thanks for the reply!  I feel a little bad taking you away from your work, more important than my flashing lights. :-)

I came up with an ugly but fast and effective workaround so there's no real need to read but others might find it helpful.


On Fri, Jun 24, 2016 at 3:47 PM, Chris Barker <chris....@noaa.gov> wrote:
As I understand it, for ease of writing Cython, it really works best to put the generated code next to the *.pyx code. Though I'm sure patches would be considered :-)



On Fri, Jun 24, 2016 at 9:50 AM, Tom Swirly <t...@swirly.com> wrote:
In my project, I have a small amount of actual .pyx code and a huge amount of generated code.  It makes grep essentially useless

I'm really confused, your code is .pyx, the generated code is *.c (or *.cpp) -- can't you put a wildcard in to your grep command???? I do that all the time.

Right now I have generated .pyx files and hand-edited .pyx (and .cpp and .h files but those I can move) in the same directory structure.  There's no obvious grep pattern that will eliminate only the generated .pyx files though I could craft one fairly easily. 

But I really want a separate directory for generated files - it's not just that I'm bad at using grep.  :-D

Right now, some of these directories contain a dozen generated .pyx files and only a few hand-written ones.  It's visual and mental clutter.

I shouldn't be writing generated things into random areas in my source directories and then having a bunch of specific .gitignores to not check them in - generated files all should go into the build subdirectory, just like object files and other temporaries.


Here's the planned structure.

 
if you have a bunch of hand-written C code as well, it's not hard to put that in a separate dir.

though, if you are OK with running cython yourself, rather than using cythonize and having distutils run it, you could put the generated code anywhere and have it included in the extension in your setup.py.

 
As usual, I have a tiny demo, and it's here:  https://github.com/rec/simple-cython/blob/master/simple.pyx#L3-L4   Works, but fails if I uncomment that line.

can't tell what you are trying to do there.

As the code is, it works.

Uncomment that one single line, it fails... because it can't find that sub2.pyx in the directory named subd.  

I want to "add that one directory to my PYXPATH" (yes, I know that idea doesn't exist :-D ).
In Python, I'd add a directory to PYTHONPATH or, at runtime, to sys.path.
In C or C++, I'd add it using a -I command line flag.
In Java I'd add it to my CLASSPATH
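
For readers looking for the closest existing equivalent of a "PYXPATH": the `cython` command line has an `-I` / `--include-dir` option, and `cythonize()` accepts an `include_path` list of directories to search. Below is a minimal sketch, assuming the `src/pyx` and `build/genfiles` layout described later in this thread. The helper names are hypothetical, and whether this resolves the failing `include` in the demo repository has not been checked here.

```python
from pathlib import Path

# Assumed layout, taken from the directory names used later in the thread.
INCLUDE_DIRS = ["src/pyx", "build/genfiles"]


def cythonize_kwargs(root="."):
    """Build keyword arguments for Cython.Build.cythonize (hypothetical helper)."""
    return {"include_path": [(Path(root) / d).as_posix() for d in INCLUDE_DIRS]}


def build_extensions(pattern="*.pyx", root="."):
    """Run cythonize with the extra search directories.

    Cython is imported lazily so that this module can be imported
    on a machine where Cython is not installed.
    """
    from Cython.Build import cythonize

    return cythonize(pattern, **cythonize_kwargs(root))
```

In a `setup.py`, the result of `build_extensions()` would be passed as `ext_modules=` to `setup()`.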


(I do need to package the thing into parts once it's in its full glory (thanks for putting that on my radar!) but that doesn't really fix the issue.)

----

I came up with a kludge, but one that will work 100% with little effort (my favorite type).

Since I'm currently only using include to build up my main code (this is an include-only project for the C++ code too) I don't actually have to have my Cython working directory be the same as my Python root...

I'll have a tiny .pyx file at the root that delegates to one in the pyx directory... then have long, ugly but unambiguous names for each include...

So where today I have:

include "timedata/color/color_base.pyx"
include "timedata/color/color.pyx"


I change it to:


include "src/pyx/timedata/color/color_base.pyx"
include "build/genfiles/timedata/color/color.pyx"


I can move everything into the right location, have it all working as I like, and the only cost is those ugly path names.  

One day if I come up with a patch to allow the mythical "PYXPATH" I can change those paths, but for today I'm good and can hit my goal of "no intermediate files in the source directories".
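
The workaround above could also be scripted, so that the long path names are produced rather than typed. This is only a sketch of the idea: `resolve_include` and `write_top_level_pyx` are hypothetical helpers, not part of Cython, and the two root directories are the ones named in this message.

```python
from pathlib import Path


def resolve_include(relative_name, roots):
    """Return the first existing root/relative_name, searching roots in order."""
    for root in roots:
        candidate = Path(root) / relative_name
        if candidate.is_file():
            return candidate.as_posix()
    raise FileNotFoundError(f"{relative_name} not found under any of {list(roots)}")


def write_top_level_pyx(target, relative_names, roots):
    """Write a .pyx file that consists only of include statements."""
    lines = [f'include "{resolve_include(name, roots)}"' for name in relative_names]
    Path(target).write_text("\n".join(lines) + "\n")
    return lines
```

Called with the roots `["src/pyx", "build/genfiles"]` from the project root, this emulates a search path at generation time: each short name is looked up in the hand-written tree first and then in the generated tree.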

---

Thanks for your thoughts, have a great weekend!


Chris Barker

Jun 27, 2016, 7:44:01 PM
to cython-users
I commented in your GitHub project, but thought I'd put it here, too.

note that I'm still unsure where these generated *.pyx files are coming from...

and I agree that you A) don't want generated files in git, and B) don't want to have to list them individually in .gitignore, but can't you put:

package/cython_code/*.c

in your .gitignore? and put hand-written C in another place...

-CHB


A suggestion:

I'm still not sure what your goals are, but you should probably start with a  python package structure in mind, then go from there.


With Cython, it's easiest to keep your Cython files (.pyx) in line with the python package structure, so, for a simple package:

```
project_dir/
    setup.py
    README
    etc, etc....
    package/
        __init__.py
        a_module.py
        a_cython_module.pyx  
```

yes, this does mean that cython is going to dump its generated C code in that package dir :-( -- but that's not SO bad, is it?

and if you want to keep your cython separate from your python (which you might if this is big), you can do something like:

```
    package/
        __init__.py
        a_module.py
        cy_modules/
            __init__.py
            a_cython_module.pyx
            b_cython_module.pyx 
``` 
then put imports into that top level __init__.py so you can get the code from one place.
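
To illustrate how the layout above maps onto module names, here is a minimal sketch that walks a package directory and derives the dotted name each `.pyx` file would be built as. `pyx_modules` is a hypothetical helper and uses only the standard library.

```python
from pathlib import Path


def pyx_modules(package_dir):
    """Return sorted (dotted_module_name, source_path) pairs for .pyx files."""
    package_dir = Path(package_dir)
    base = package_dir.parent  # dotted names are relative to the project dir
    pairs = []
    for pyx in sorted(package_dir.rglob("*.pyx")):
        dotted = ".".join(pyx.relative_to(base).with_suffix("").parts)
        pairs.append((dotted, pyx.as_posix()))
    return pairs
```

For the second layout this yields names such as `package.cy_modules.a_cython_module`. Each pair could then be handed to `Extension(name, [path])` in `setup.py`, and the top-level `__init__.py` can re-export from `package.cy_modules` so callers import from one place.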



Ian Henriksen

Jun 28, 2016, 1:11:15 AM
to cython-users, t...@swirly.com
It sounds like what you want is really just the --output-file flag. See
Using that flag you can specify a different output directory or specify
the output file extension to be something like .cxx rather than .cpp.
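
As a rough illustration of the flag, the sketch below builds a `cython` command line that writes the generated source into a build tree mirroring the source tree. `cython_command` and the directory names are hypothetical. `--output-file` and `--cplus` are real options of the `cython` command, but the output directory may need to exist before the command is run.

```python
from pathlib import Path


def cython_command(pyx_path, source_root, build_root, cplus=False, suffix=None):
    """Build an argv list for `cython` that writes its output under build_root."""
    pyx_path = Path(pyx_path)
    if suffix is None:
        suffix = ".cpp" if cplus else ".c"
    # Mirror the path below source_root inside build_root, with the new suffix.
    output = Path(build_root) / pyx_path.relative_to(source_root).with_suffix(suffix)
    command = ["cython"]
    if cplus:
        command.append("--cplus")
    command += ["--output-file", output.as_posix(), pyx_path.as_posix()]
    return command
```

The returned list can be passed to `subprocess.run(..., check=True)`, and the output path then goes into the `sources` of the corresponding `Extension`.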

Putting together a set of compiled Python modules from an alternative
directory structure is certainly possible. I imagine it's even doable
just using distutils, though that's not something I've done before.
Using distutils for stuff like that can be fiddly.

We have a similar setup currently in use in the dynd-python codebase.
There we have distutils offload the build process to CMake which, in turn,
calls the Cython compiler on the command line and uses a separate
set of CMake configurations to link against the correct Python runtime.
The approach certainly has its flaws, but it has been working very well
for us for some time. It currently requires Cython to build from source.
We've talked a little about doing C++ source releases that don't require
Cython, but we're not currently set up for that, and it's not a particularly
high priority. The output from Cython is written to the same build output
directory used by the rest of the build. That way the generated files and
the original source are kept entirely separate. If you're interested in
the source code that does that, here are some relevant parts:

The CMake setup for calling back to Cython (a major rewrite of an older
commonly available set of Cython interop functions for CMake):
A few places where the CMake/Cython interop is actually used:
The part where distutils is used to call into CMake:

I wouldn't consider this at all a part of dynd-python's public interface,
but it demonstrates a pretty versatile approach. The project is
BSD licensed, so you can re-use the code pretty easily.

C++ style directory structures and build systems work fine for building
Python modules if you go through the effort to make them work right.
Having to deal with the distinct directory and build systems can be
a pain, but it's not impossible.

Best of luck,
Ian Henriksen

Chris Barker

Jun 28, 2016, 11:42:44 AM
to cython-users, t...@swirly.com
hmm,

a thought here:

distutils creates a "build" dir for, well, building a package before installing it, etc.

perhaps the generated cython files should be put there, rather than in the source dir.

-CHB



Ian Henriksen

Jul 1, 2016, 12:31:29 AM
to cython-users, t...@swirly.com
On Tuesday, June 28, 2016 at 9:42:44 AM UTC-6, Chris Barker wrote:
hmm,

a thought here:

distutils creates a "build" dir for, well, building a package before installing it, etc.

perhaps the generated cython files should be put there, rather than in the source dir.

-CHB


Right. That's what we're doing in dynd-python, and it certainly applies in this case.
This might be good to have as an option. The only downside is that it runs
against the current best-practice of providing source releases that do not rely
on Cython. There is probably a way to have all of this magically hook in to
distutils and make Cython compilation output the C and C++ sources into the
build directory somewhere while still allowing for source releases that don't
require running Cython, but the details are still really fuzzy to me.

Best,
Ian Henriksen

Robert Bradshaw

Jul 1, 2016, 1:24:31 AM
to cython...@googlegroups.com
On Fri, Jun 24, 2016 at 2:00 PM, Tom Swirly <t...@swirly.com> wrote:

I want to "add that one directory to my PYXPATH" (yes, I know that idea doesn't exist :-D ).
In Python, I'd add a directory to PYTHONPATH or, at runtime, to sys.path.
In C or C++, I'd add it using a -I command line flag.
In Java I'd add it to my CLASSPATH

Use PYTHONPATH just like you would in Python, which is respected by cimport just as it is by import. In other words, treat your .pyx and .pxd files just like .py files. 

So you'd have something like

src/a.py
src/b.pyx
src/b.pxd
generated/c.py
generated/d.pyx
generated/d.pxd

Put the path to "generated" into your PYTHONPATH and you'll be able to import c from a and cimport d from b. Put "src" in your PYTHONPATH if you need to (c)import the other way around.
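
One way to apply this without changing the shell environment is to set PYTHONPATH only for the build subprocess. The sketch below assumes the `src` and `generated` directory names from the example above. `env_with_pythonpath` is a hypothetical helper.

```python
import os


def env_with_pythonpath(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    parts = [os.path.abspath(d) for d in extra_dirs]
    if existing:
        parts.append(existing)  # keep whatever was already on the path
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env
```

A build could then be started with `subprocess.run([sys.executable, "setup.py", "build_ext", "--inplace"], env=env_with_pythonpath(["generated", "src"]), check=True)`.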
 