Compiler error on Python 3.x when using Docstrings defined in .pxi files

Kevin Sheppard

Feb 11, 2016, 12:46:37 PM
to cython-users
I am using some composition to build multiple modules.  The docstring is module-dependent and this seems to cause an error.  My code has the following pattern (greatly reduced):

=== file.pxi

FUNC_DOCSTRING = """
Something here
"""

=== module.pxd

include "file.pxi"

def func():
    FUNC_DOCSTRING
    
    return None



When I compile code like this, I see the error


Compiler crash traceback from this point on:
  File "Cython/Compiler/Visitor.py", line 183, in Cython.Compiler.Visitor.TreeVisitor._visit (/home/ilan/minonda/conda-bld/work/Cython-0.23.4/Cython/Compiler/Visitor.c:4646)
  File "/data/unixhome/ksheppard/anaconda/envs/py35/lib/python3.5/site-packages/Cython/Compiler/AnalysedTreeTransforms.py", line 74, in visit_FuncDefNode
    if not self.all_docstrings and '>>>' not in node.doc:
TypeError: a bytes-like object is required, not 'str'
Traceback (most recent call last):
  File "setup.py", line 196, in <module>
    ext_modules = cythonize(extensions)
  File "/data/unixhome/ksheppard/anaconda/envs/py35/lib/python3.5/site-packages/Cython/Build/Dependencies.py", line 877, in cythonize
    cythonize_one(*args)
  File "/data/unixhome/ksheppard/anaconda/envs/py35/lib/python3.5/site-packages/Cython/Build/Dependencies.py", line 997, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: ./randomstate/dsfmt.pyx


The offending code is 

if not self.all_docstrings and '>>>' not in node.doc:
    return node


and the problem is that node.doc is a bytes object (technically String.EncodedString) while '>>>' is a str, so the `in` test fails on Python 3.  This appears to be caused by Cython reading the .pxi file as binary (I think, 'rb').
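
To illustrate the mismatch outside of Cython, here is a minimal plain-Python sketch. The names doc_as_bytes and contains_doctest_marker are made up for this example and are not part of the Cython compiler; the bytes value only stands in for what node.doc appears to hold in the traceback above.

```python
# Stand-in for node.doc when the docstring reaches the check as bytes
# (an assumption for illustration, based on the traceback).
doc_as_bytes = b"\nSomething here\n"


def contains_doctest_marker(doc):
    """Mimic the compiler's check: is the str '>>>' present in doc?"""
    return '>>>' in doc


try:
    contains_doctest_marker(doc_as_bytes)
    error_message = None
except TypeError as exc:
    # On Python 3, testing a str for membership in bytes raises TypeError.
    error_message = str(exc)

print(error_message)
```

On Python 3 the membership test raises TypeError because a str cannot be searched for inside a bytes object. On Python 2 the same test passes silently, which would explain why the crash only shows up when the compiler runs under 3.x.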


The actual code that crashes is in this module: https://github.com/bashtage/ng-numpy-randomstate/tree/1-11-release

Specifically 


and 



Any ideas on how to fix this?

Thanks,
Kevin

Kevin Sheppard

Feb 12, 2016, 2:53:16 AM
to cython-users
My example code is a bit wrong.  It should use DEF so that FUNC_DOCSTRING is a compile-time constant:

=== file.pxi

DEF FUNC_DOCSTRING = """
Something here
"""

Kevin Sheppard

Feb 12, 2016, 2:53:31 AM
to cython-users
One quick solution is to patch AnalysedTreeTransforms.py and change the line from

        if not self.all_docstrings and '>>>' not in node.doc:
            return node

to

        doc = node.doc.decode() if hasattr(node.doc, 'decode') else node.doc
        if not self.all_docstrings and '>>>' not in doc:
            return node

It would probably be better to be more specific when deciding to call decode, rather than just decoding whenever the object has a decode attribute.
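
A more specific variant might check the type directly instead of using hasattr. This is only a sketch: as_text and has_doctest_marker are hypothetical helper names, not functions in Cython, and the UTF-8 default is an assumption on my part since I do not know what encoding the compiler uses for these strings.

```python
def as_text(doc, encoding="utf-8"):
    """Return doc as text, decoding only genuine byte strings.

    The encoding default is an assumption made for this sketch.
    """
    if isinstance(doc, bytes):
        return doc.decode(encoding)
    return doc


def has_doctest_marker(doc):
    """Return True if the docstring contains a doctest prompt."""
    if doc is None:
        return False
    return u'>>>' in as_text(doc)


print(has_doctest_marker(b"\n>>> func()\n"))
print(has_doctest_marker(u"\nSomething here\n"))
```

Using isinstance also covers subclasses of bytes, so it would not depend on which string wrapper class the compiler hands over, and text strings pass through untouched.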

Stefan Behnel

Feb 12, 2016, 3:04:24 AM
to cython...@googlegroups.com
Kevin Sheppard wrote on 11.02.2016 at 16:32:
> I am using some composition to build multiple modules. The docstring is
> module-dependent and this seems to cause an error.

Why would a docstring for the same function code differ?


> My code has the following pattern (greatly reduced):
>
> === file.pxi
>
> DEF FUNC_DOCSTRING = """
> Something here
> """
>
> === module.pxd
>
> include "file.pxi"
>
> def func():
>     FUNC_DOCSTRING
>
>     return None
>
>
> When I compile code like this, I see the error
>
>
> Compiler crash traceback from this point on:
> File "Cython/Compiler/Visitor.py", line 183, in
> Cython.Compiler.Visitor.TreeVisitor._visit
> (/home/ilan/minonda/conda-bld/work/Cython-0.23.4/Cython/Compiler/Visitor.c:4646)
> File
> "/data/unixhome/ksheppard/anaconda/envs/py35/lib/python3.5/site-packages/Cython/Compiler/AnalysedTreeTransforms.py",
> line 74, in visit_FuncDefNode
> if not self.all_docstrings and '>>>' not in node.doc:
> TypeError: a bytes-like object is required, not 'str'

Using unprefixed strings in DEF compile time variables is generally a bad
idea because they depend on the Python version that you run the *compiler*
with. Python 3.5 in your case.

Any reason you can't use u"unicode" strings here? (or compile in Py3 mode,
or use the unicode literals future import?)

Stefan

Kevin Sheppard

Feb 12, 2016, 12:14:19 PM
to cython-users, stef...@behnel.de
Thanks so much for the help.  Simply prefixing the strings in the .pxi files with u, as in u""" doc string """, fixed the issue.



> Why would a docstring for the same function code differ?



I am using C-style includes to generate a number of NumPy-like RandomState objects, each with a different underlying generator. These bring a lot of modern features that are missing from NumPy, including better performance, independent streams (for use in clusters) and the ability to jump/advance the generator a large number of steps without actually generating the random numbers.  The function that implements the jump has the same signature for every generator, but its implementation differs across the underlying pseudo-random number generators, so I wanted docstrings that explain what is being done on a generator-by-generator basis.

I also am trying to eventually get this into NumPy so I have to stick to the NumPy implementation for the MT19937 generator and pass the NumPy tests -- this puts some additional constraints on the implementation.  For this reason I also need it to work on Py 2.7, 3.4 and 3.5.


 

> Using unprefixed strings in DEF compile time variables is generally a bad
> idea because they depend on the Python version that you run the *compiler*
> with. Python 3.5 in your case.
>
> Any reason you can't use u"unicode" strings here? (or compile in Py3 mode,
> or use the unicode literals future import?)



Thanks for this.  I am no string/unicode expert when it comes to the differences in Py 2.x and 3.x.
 
Thanks,
Kevin
