Automatic translation of python to cython

1,240 views
Skip to first unread message

Sid

unread,
Jul 7, 2014, 10:50:00 PM7/7/14
to cython...@googlegroups.com
Hi,

Is there a tool out there than can do type inference on python (that doesn't use unsupported constructs like generators, ec.) and automatically convert it into some form of cython?

Thanks

Stefan Behnel

unread,
Jul 8, 2014, 12:38:15 AM7/8/14
to cython...@googlegroups.com
Sid, 08.07.2014 04:49:
> Is there a tool out there than can do type inference on python (that
> doesn't use unsupported constructs like generators, ec.) and automatically
> convert it into some form of cython?

Even better: you can simply take your Python code and let Cython compile
it. No changes necessary.

Generators are supported in Cython, BTW (or maybe you meant something else?).

Stefan

Sid

unread,
Jul 8, 2014, 12:45:03 AM7/8/14
to cython...@googlegroups.com, stef...@behnel.de
Right, but I'm guessing Cython doesn't do automatic type inference?

Didn't know that support for generators had been added. Thanks for the info.

Robert Bradshaw

unread,
Jul 8, 2014, 12:47:59 AM7/8/14
to cython...@googlegroups.com
On Mon, Jul 7, 2014 at 9:45 PM, Sid <tmf...@gmail.com> wrote:
> Right, but I'm guessing Cython doesn't do automatic type inference?

Cython does some type inference.

> Didn't know that support for generators had been added. Thanks for the info.

Yep, nearly everything is supported these days.

> On Tuesday, July 8, 2014 12:38:15 AM UTC-4, Stefan Behnel wrote:
>>
>> Sid, 08.07.2014 04:49:
>> > Is there a tool out there than can do type inference on python (that
>> > doesn't use unsupported constructs like generators, ec.) and
>> > automatically
>> > convert it into some form of cython?
>>
>> Even better: you can simply take your Python code and let Cython compile
>> it. No changes necessary.
>>
>> Generators are supported in Cython, BTW (or maybe you meant something
>> else?).
>>
>> Stefan
>>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "cython-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cython-users...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

1989lzhh

unread,
Jul 8, 2014, 4:10:57 AM7/8/14
to cython...@googlegroups.com
I have thought this for a while. Use Cython code as a backend to translate python code is perfect choice. Since cython did the dirty work, you don't have to worry about the reference, you can switch from primary type to object model easily. 
Now I am already trying to do this translation named cyjit https://github.com/liuzhenhai/cyjit
(inspired by numba.jit)
Now I am writing type inference part by using control flow analysis (working in progress, although I am not familiar with compiler theory) to determine variable type in each control flow block.






发自我的 iPhone

1989lzhh

unread,
Jul 8, 2014, 4:31:55 AM7/8/14
to cython...@googlegroups.com


发自我的 iPhone

> 在 Jul 8, 2014,12:47,Robert Bradshaw <robe...@gmail.com> 写道:
>
>> On Mon, Jul 7, 2014 at 9:45 PM, Sid <tmf...@gmail.com> wrote:
>> Right, but I'm guessing Cython doesn't do automatic type inference?
>
> Cython does some type inference.
I tried the inference and find the type inference is too aggressive for now, it is possible to do type inference like follow:
a=1 #a is int
a=[] #a is object
a=1.0 #a is back to double
This can be done by using different named variable to identify different type, like:
cdef int a_int, object a_object, double a_double
a_int=1
a_object=[]
a_double=1.0
The difficult part is handle control flow.

Daπid

unread,
Jul 8, 2014, 4:35:43 AM7/8/14
to cython...@googlegroups.com
On 8 July 2014 04:49, Sid <tmf...@gmail.com> wrote:
Is there a tool out there than can do type inference on python (that doesn't use unsupported constructs like generators, ec.) and automatically convert it into some form of cython?

Shedskin is an experiment to fully translate a subset of Python to C++ without any type annotations. This is a central feature, as the author wanted to see how far can you go without them (and the answer seems to be quite far).

https://code.google.com/p/shedskin/

They have a bunch of successful programs working, but due to the experimental nature and slow development cycle, I wouldn't trust it for production.


/David.

Stefan Behnel

unread,
Jul 8, 2014, 5:46:54 AM7/8/14
to cython...@googlegroups.com
1989lzhh, 08.07.2014 10:31:
> I tried the inference and find the type inference is too aggressive for now, it is possible to do type inference like follow:
> a=1 #a is int
> a=[] #a is object
> a=1.0 #a is back to double
> This can be done by using different named variable to identify different type, like:
> cdef int a_int, object a_object, double a_double
> a_int=1
> a_object=[]
> a_double=1.0

Yes, we decided at least a years ago that it should eventually be done that
way but it's still not implemented. Actually, the source level names
shouldn't change, only the internal cnames of the symtab entries (i.e.
different sections would use different entries for the same variable).

If you want to give it a try, we'd certainly appreciate it.

Stefan

Mark Florisson

unread,
Jul 8, 2014, 6:08:15 AM7/8/14
to cython...@googlegroups.com
On 8 July 2014 09:31, 1989lzhh <1989...@gmail.com> wrote:
>
>
> 发自我的 iPhone
>
>> 在 Jul 8, 2014,12:47,Robert Bradshaw <robe...@gmail.com> 写道:
>>
>>> On Mon, Jul 7, 2014 at 9:45 PM, Sid <tmf...@gmail.com> wrote:
>>> Right, but I'm guessing Cython doesn't do automatic type inference?
>>
>> Cython does some type inference.
> I tried the inference and find the type inference is too aggressive for now, it is possible to do type inference like follow:
> a=1 #a is int
> a=[] #a is object
> a=1.0 #a is back to double
> This can be done by using different named variable to identify different type, like:
> cdef int a_int, object a_object, double a_double
> a_int=1
> a_object=[]
> a_double=1.0
> The difficult part is handle control flow.
>

It may be easier to translate to SSA first, to make merges of dataflow
and variable naming explicit. Alternatively you could make the type
inferencer aware of the reaching definitions and exploit that, and
rename the entries based on that same information.

You may want to look into the cartesian product algorithm. The idea is
to infer for every definition the set of possible types, and take the
cartesian product for the input types for function/method calls. E.g.
for obj.method(x) where obj has type {Foo, Bar} and x has type {Spam,
Ham} you can analyze Foo.method(Spam), Foo.method(Ham),
Bar.method(Spam) and Bar.method(Ham). If you find it helpful, here is
an example CPA implementation (the fixed point that drives the whole
thing is in 'infer_graph'):

https://github.com/markflorisson88/flypy/blob/master/flypy/compiler/typing/inference.py

This can however blow up quite quickly for very polymorphic code (it
is still exponential in the arity of the function).

Mark Florisson

unread,
Jul 8, 2014, 6:15:11 AM7/8/14
to cython...@googlegroups.com
On 8 July 2014 11:08, Mark Florisson <markflo...@gmail.com> wrote:
> On 8 July 2014 09:31, 1989lzhh <1989...@gmail.com> wrote:
>>
>>
>> 发自我的 iPhone
>>
>>> 在 Jul 8, 2014,12:47,Robert Bradshaw <robe...@gmail.com> 写道:
>>>
>>>> On Mon, Jul 7, 2014 at 9:45 PM, Sid <tmf...@gmail.com> wrote:
>>>> Right, but I'm guessing Cython doesn't do automatic type inference?
>>>
>>> Cython does some type inference.
>> I tried the inference and find the type inference is too aggressive for now, it is possible to do type inference like follow:
>> a=1 #a is int
>> a=[] #a is object
>> a=1.0 #a is back to double
>> This can be done by using different named variable to identify different type, like:
>> cdef int a_int, object a_object, double a_double
>> a_int=1
>> a_object=[]
>> a_double=1.0
>> The difficult part is handle control flow.
>>
>
> It may be easier to translate to SSA first, to make merges of dataflow
> and variable naming explicit. Alternatively you could make the type
> inferencer aware of the reaching definitions and exploit that, and
> rename the entries based on that same information.

(The difficulty is of course to assign different definitions
(different cnames) to a common definition that can be used after a
control flow join).

刘振海

unread,
Jul 8, 2014, 8:31:35 AM7/8/14
to cython...@googlegroups.com
    Yes, If the type inference is done on cython side, the source level names shouldn't change.

If you want to give it a try, we'd certainly appreciate it.
    Since I am not familiar with cython's codebase, I thought visiting python code to emit cython code maybe a easy start. 

刘振海

unread,
Jul 8, 2014, 9:07:16 AM7/8/14
to cython...@googlegroups.com
2014-07-08 18:08 GMT+08:00 Mark Florisson <markflo...@gmail.com>:
On 8 July 2014 09:31, 1989lzhh <1989...@gmail.com> wrote:
>
>
> 发自我的 iPhone
>
>> 在 Jul 8, 2014,12:47,Robert Bradshaw <robe...@gmail.com> 写道:
>>
>>> On Mon, Jul 7, 2014 at 9:45 PM, Sid <tmf...@gmail.com> wrote:
>>> Right, but I'm guessing Cython doesn't do automatic type inference?
>>
>> Cython does some type inference.
> I tried the inference and find the type inference is too aggressive for now, it is possible to do type inference like follow:
> a=1 #a is  int
> a=[] #a is object
> a=1.0 #a is back to double
> This can be done by using different named variable to identify different type, like:
> cdef int a_int, object a_object, double a_double
> a_int=1
> a_object=[]
> a_double=1.0
> The difficult part is handle control flow.
>
Hi mark,
    It's been a long time since you show up on the public last time. How are you?
    I have been following you on github for almost two years and learned many many fresh knowledge from you.Here I want to thank you!
It may be easier to translate to SSA first, to make merges of dataflow
and variable naming explicit. Alternatively you could make the type
inferencer aware of the reaching definitions and exploit that, and
rename the entries based on that same information.
   Actually I have no idea about how to translate python ast to SSA form, I will google it and try to find out. 
   I think reading numba's corresponding code will be a good start.  


You may want to look into the cartesian product algorithm. The idea is
to infer for every definition the set of possible types, and take the
cartesian product for the input types for function/method calls. E.g.
for obj.method(x) where obj has type {Foo, Bar} and x has type {Spam,
Ham} you can analyze Foo.method(Spam), Foo.method(Ham),
Bar.method(Spam) and Bar.method(Ham). If you find it helpful, here is
an example CPA implementation (the fixed point that drives the whole
thing is in 'infer_graph'):

    https://github.com/markflorisson88/flypy/blob/master/flypy/compiler/typing/inference.py

This can however blow up quite quickly for very polymorphic code (it
is still exponential in the arity of the function).
Thanks for your information, For now I want to implement type inference on variable first.
The signature of function can be set by user.
                                                                                         Yours,
                                                                                         Liu Zhenhai

Mark Florisson

unread,
Jul 8, 2014, 10:20:27 AM7/8/14
to cython...@googlegroups.com
Hi Liu,

I'm very well, thank you :)

>>
>> It may be easier to translate to SSA first, to make merges of dataflow
>> and variable naming explicit. Alternatively you could make the type
>> inferencer aware of the reaching definitions and exploit that, and
>> rename the entries based on that same information.
>
> Actually I have no idea about how to translate python ast to SSA form, I
> will google it and try to find out.
> I think reading numba's corresponding code will be a good start.

I would recommend Engineering a Compiler, which is quite an accessible
yet comprehensive introduction overview of intermediate
representations and SSA. Possibly yes, I don't know if numba still
uses SSA. In any event, it may be easier to read up control and
dataflow analysis and reaching definitions which is already in Cython.
That should already make it possible implement this without tackling
this other beast.

>>
>> You may want to look into the cartesian product algorithm. The idea is
>> to infer for every definition the set of possible types, and take the
>> cartesian product for the input types for function/method calls. E.g.
>> for obj.method(x) where obj has type {Foo, Bar} and x has type {Spam,
>> Ham} you can analyze Foo.method(Spam), Foo.method(Ham),
>> Bar.method(Spam) and Bar.method(Ham). If you find it helpful, here is
>> an example CPA implementation (the fixed point that drives the whole
>> thing is in 'infer_graph'):
>>
>>
>> https://github.com/markflorisson88/flypy/blob/master/flypy/compiler/typing/inference.py
>>
>> This can however blow up quite quickly for very polymorphic code (it
>> is still exponential in the arity of the function).
>
> Thanks for your information, For now I want to implement type inference on
> variable first.
> The signature of function can be set by user.
>

I see. As Robert mentioned there's already some type inference in
Cython, which I believe is here:

https://github.com/cython/cython/blame/master/Cython/Compiler/TypeInference.py

IIRC Vitja worked on typing individual variable definitions
previously, but perhaps those results are not utilized in the code
generator?

Daniele Nucera

unread,
Jul 2, 2023, 3:27:17 PM7/2/23
to cython-users
I know I'm late in this group, and I don't know if this is even of interest, I'm trying to write some library that leverages type hinting to infer variables types in python code to add cdef lines with type definition of variables:


I'm currently working/improving it, and it's a WIP, but I'm trying to work on it just for fun.

Nuc

Stefan Behnel

unread,
Jul 3, 2023, 7:58:33 AM7/3/23
to cython...@googlegroups.com
Daniele Nucera schrieb am 02.07.23 um 21:21:
> I know I'm late in this group, and I don't know if this is even of
> interest, I'm trying to write some library that leverages type hinting to
> infer variables types in python code to add cdef lines with type definition
> of variables:
>
> https://github.com/nucccc/markarth
>
> I'm currently working/improving it, and it's a WIP, but I'm trying to work
> on it just for fun.

Interesting. You should also consider generating .pxd files instead:

https://docs.cython.org/en/latest/src/tutorial/pure.html#augmenting-pxd

That way, you can avoid the need to change or duplicate the original source
code.

Note also that users can write their Cython type annotations in Python
syntax, so there is less need to use intermediate code generation.

https://docs.cython.org/en/latest/src/tutorial/pure.html#pep-484-type-annotations

But it can still be nice to have (different?) ways to automatically add
type annotations to Python code in order to make it faster than Cython does
for regular Python code. Sort-of a "C-ify" tool where you could configure
different degrees of safety vs. speed.

Stefan

Daniele Nucera

unread,
Jul 3, 2023, 4:54:08 PM7/3/23
to cython-users
My current idea is the latest one! Adding annotations that make the code faster "C-ifying" it. My starting point was that if I have a "for i in range..." then it would be nice to have some program capable of spotting those variables and "C-ify" them by adding a dedicated "cdef int i". Just for performance purposes.

I will take a look at hinting support as I admit my ignorance on the topic. I have a long way to go, but I'm hoping to put some effort into the project!

Nuc
Reply all
Reply to author
Forward
0 new messages