import os
import libjulia as jl
jl_home = os.environ["JULIA_HOME"]
jl.init(jl_home)
assert jl.eval_string("1") == 1
import julia
assert julia.eval("1") == 1
1. Startup time
Scripts in Python should start up quickly, but I guess loading and JIT-compiling the PyCall.jl code takes a little time.
On Thursday, March 20, 2014 1:46:36 AM UTC-4, Kenta Sato wrote:
> 1. Startup time
> Scripts in Python should start up quickly, but I guess loading and JIT-compiling the PyCall.jl code takes a little time.

You can configure Julia to precompile PyCall, in which case this overhead goes away.
> 2. Overhead of calling Julia
> Once a program starts running, we want a fast FFI system to call Julia from your Python program.
> Using Cython, I can avoid the extra overhead of calling Julia code.

It's not clear to me what precisely you mean by "extra overhead". The conversion between Python and Julia types has to happen somewhere, either on the Python side or on the Julia side. What makes you think that this will be faster if it happens to execute on the Cython side?
> I measured the evaluation time of a Julia literal using IPython:

I'm not sure that this is a fair comparison, since your code is not as full-featured as pyjulia. In particular, you are not dynamically determining the return type of eval("1") and are not converting it to a native Python object, as I understand it.
assert jl.eval_string("1") is 1
In exchange for these type-conversion features, it looks like pyjulia has less than a factor of 2 overhead compared to the bare-bones evaluation, which seems pretty good.
In [1]: run sample.py
OK

In [2]: import numpy as np
In [3]: rs = np.random.RandomState(0)
In [4]: arr = rs.randn(10000)
In [5]: sum = jl.get_base_function("sum")
In [6]: arr
Out[6]:
array([ 1.76405235, 0.40015721, 0.97873798, ..., 0.51687218,
-0.03292069, 1.29811143])
In [7]: sum(arr)
Out[7]: -184.33720158265817
In [8]: arr.sum()
Out[8]: -184.33720158265783
In [9]: timeit sum(arr)
100000 loops, best of 3: 5.44 µs per loop
In [10]: timeit arr.sum()
100000 loops, best of 3: 18.3 µs per loop
In [1]: import julia
In [2]: import numpy as np
In [3]: rs = np.random.RandomState(0)
In [4]: arr = rs.randn(10000)
In [5]: sum = julia.eval("Base.sum")
In [6]: arr
Out[6]:
array([ 1.76405235, 0.40015721, 0.97873798, ..., 0.51687218,
-0.03292069, 1.29811143])
In [7]: sum(arr)
Out[7]: -184.33720158265817
In [8]: timeit sum(arr)
1000 loops, best of 3: 392 µs per loop
> Using low level tools like Cython and C extensions, I can touch the internal fields of NumPy arrays. Also, the NumPy C API is well documented (http://docs.scipy.org/doc/numpy/reference/c-api.html).
> I think it is not so difficult to implement zero-copy conversion between arrays in Julia and NumPy.

Consider the fact that NumPy arrays are typically in row-major order. To represent this without a copy in Julia, you need to define a new AbstractArray type in Julia that wraps around your NumPy array. It's going to be extremely tricky to define a new subtype using the Julia C API, and will require a lot of digging around in the Julia internals.
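To make the layout issue concrete, here is a small NumPy-only sketch (no Julia binding is assumed): a freshly created NumPy array is row-major, and obtaining the column-major layout that a plain Julia Array expects forces a copy.

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
print(a.flags['C_CONTIGUOUS'])   # True -- NumPy's default layout is row-major
print(a.flags['F_CONTIGUOUS'])   # False

# Reinterpreting this buffer as a Julia Array{Float64,2} without copying would
# effectively yield the 3x2 transpose, since Julia arrays are column-major.
# Producing column-major data on the Python side requires a copy:
b = np.asfortranarray(a)
print(b.flags['F_CONTIGUOUS'])          # True
print(a.ctypes.data == b.ctypes.data)   # False -- a new buffer was allocated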
Even for column-major arrays, you can't just convert it into a Julia object and forget about it, because you have to be careful about garbage collection--you have to make sure that the Julia object is not garbage-collected before the Python object, or vice versa. Again, this is a surmountable difficulty, but the easiest solution is defining a wrapper type in Julia that holds a reference to the Python object.
Matters are even trickier going in the opposite direction (passing a Julia array to Python without making a copy), because Julia's garbage collection is not simply reference counting, so there is no easy way for the Python object to "hold a reference" to a Julia object. (In PyCall, this is implemented by keeping a global Julia dictionary of objects that are needed by Python, and the Python destructor is set up to remove the object from the Julia dictionary.)

It would be easier to define the AbstractArray type by writing a little glue code in Julia, but that is starting to go in the direction of reimplementing PyCall. Again, all of these are surmountable difficulties in theory, but I think it may be more tricky than you think, and I think you will find yourself re-implementing a lot of stuff in PyCall.
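As a minimal illustration of the scheme described above (a global Julia dictionary plus a Python-side destructor; this is not PyCall's actual code), the Python proxy can register a finalizer that tells Julia to drop its dictionary entry. julia_retain and julia_release are hypothetical binding calls, assumed to add and delete entries in a global Julia dict keyed by an integer handle.

import weakref

class JuliaRef:
    """Python-side handle to a Julia value rooted in a global Julia dictionary."""

    def __init__(self, handle, release):
        self._handle = handle
        # When this proxy is collected on the Python side, ask Julia to remove
        # the corresponding dictionary entry so the Julia value can be freed.
        self._finalizer = weakref.finalize(self, release, handle)

# Usage sketch (hypothetical binding API):
#   handle = julia_retain(jl_value)         # Julia side: __refs[handle] = value
#   ref = JuliaRef(handle, julia_release)   # on collection: delete!(__refs, handle)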
> In addition, Cython supports the new buffer protocol written in PEP 3118 (http://legacy.python.org/dev/peps/pep-3118/, http://docs.cython.org/src/userguide/memoryviews.html).

This runs into the same issues; you need to define a new Julia type to wrap a buffer. (There is an issue for PyCall to support this protocol: https://github.com/stevengj/PyCall.jl/issues/38)
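For what it's worth, a memoryview already exposes the metadata such a Julia wrapper type would have to consume; this is plain CPython plus NumPy, with no Julia binding assumed.

import numpy as np

arr = np.zeros((2, 3), dtype=np.float64)
view = memoryview(arr)    # PEP 3118 buffer view

print(view.format)        # 'd' -- C double, i.e. Float64
print(view.itemsize)      # 8
print(view.shape)         # (2, 3)
print(view.strides)       # (24, 8) -- row-major strides, in bytes
print(view.readonly)      # False

# A zero-copy Julia wrapper would have to honor exactly this metadata
# (pointer, format, shape, strides) rather than assuming Julia's own
# dense column-major layout.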
> I know there is a suitable mechanism to call Python functions from C code (http://docs.python.org/3.3/extending/extending.html#calling-python-functions-from-c).
> I think this and the ccall method in Julia can solve the problem, but I'm not sure at the current moment.

The main difficulty here is that creating a Julia Function object from the C API looks extremely hairy: the jl_function_t type is fairly Julia-specific, much more than just a wrapper around a C function pointer. (And again, you need to keep a reference to the Python object in the Julia Function object to prevent the former from being garbage collected.) Probably this requires some modification to the Julia C API. (Again, it would be easier to implement this on the Julia side, which is what PyCall does.)
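For reference, the Python half of that ccall idea is straightforward with ctypes, which can expose a Python callable as a plain C function pointer; this is only a sketch, the Julia line in the comment is indicative rather than tested, and it sidesteps the jl_function_t issue above entirely.

import ctypes

# C-callable wrapper with signature double (*)(double)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

def square(x):
    return x * x

c_square = CALLBACK(square)                            # C-callable thunk around the Python function
addr = ctypes.cast(c_square, ctypes.c_void_p).value    # raw address to hand to Julia

# On the Julia side one could then call through the pointer, e.g.
#   ccall(convert(Ptr{Void}, addr), Cdouble, (Cdouble,), 2.0)
# Note that c_square must be kept alive on the Python side for as long as Julia
# may call through it, which is exactly the garbage-collection coupling discussed above.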
I don't want to discourage you too much; it's great that you are interested in this. I just want to make sure that you understand the scope of the problem here. Part of the difficulty is that, unlike CPython, Julia is not really set up to be easily extensible from the C side... extending Julia with C code is really designed to be done by calling C from Julia rather than the other way around, and the C API is currently very minimal. Julia is similar to PyPy in this way.

In some sense this is a good thing: writing the glue code in the high-level language is more flexible, and doesn't constrain future implementation choices the way CPython's API has constrained CPython (necessitating the breakage of backward compatibility in PyPy). On the other hand, it means that when you interface Julia with another high-level language, it is much easier if the *other* high-level language has a well-defined C API. Interfacing Julia with PyPy would be harder than with CPython, for example (it would probably require one to write glue code in both languages rather than just in one).
I'm sorry for my poor explanation; I'm afraid you may have misunderstood my prototype program. It also does type conversion between Julia and Python, and the returned value is an ordinary Python object.
Yes, the difference is not so large for primitive values. But in other benchmarks, pyjulia is much slower than my program or NumPy's utilities.
The utilities that the Julia C API exports seem to be enough to create arrays that are usable from Julia.
The previous benchmark used a `jl_array_t*` as the argument to the `sum` function, and it worked without any special wrapper type in Julia. I agree that the row-major versus column-major difference for multidimensional arrays is confusing.
Yes, garbage collection is a source of concern for me. I think Julia should have a mechanism to protect objects from garbage collection on the Julia side, so that objects allocated in Julia can be kept alive for a long time.
A data buffer passed from Python to Julia can have its ownership handled thanks to the `jl_ptr_to_array` function, which takes an `own_buffer` argument indicating ownership of the data.
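For concreteness, here is roughly what that looks like when driven from Python through ctypes instead of Cython. This is an untested sketch: the entry points used (jl_init, jl_eval_string, jl_apply_array_type, jl_ptr_to_array_1d, jl_call1, jl_unbox_float64) are part of the Julia C API, but their exact signatures differ between Julia versions, and the library path is an assumption.

import ctypes
import numpy as np

lib = ctypes.CDLL("libjulia.so", mode=ctypes.RTLD_GLOBAL)   # path is system-specific

# Declare the few C API entry points we use (signatures per the 0.3-era API).
lib.jl_init.argtypes = [ctypes.c_char_p]
lib.jl_eval_string.argtypes = [ctypes.c_char_p]
lib.jl_eval_string.restype = ctypes.c_void_p
lib.jl_apply_array_type.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
lib.jl_apply_array_type.restype = ctypes.c_void_p
lib.jl_ptr_to_array_1d.argtypes = [ctypes.c_void_p, ctypes.c_void_p,
                                   ctypes.c_size_t, ctypes.c_int]
lib.jl_ptr_to_array_1d.restype = ctypes.c_void_p
lib.jl_call1.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
lib.jl_call1.restype = ctypes.c_void_p
lib.jl_unbox_float64.argtypes = [ctypes.c_void_p]
lib.jl_unbox_float64.restype = ctypes.c_double

lib.jl_init(None)    # NULL lets Julia locate its own install directory

arr = np.ascontiguousarray(np.random.randn(1000), dtype=np.float64)

float64_t = ctypes.c_void_p.in_dll(lib, "jl_float64_type")   # element type object
array_t = lib.jl_apply_array_type(float64_t, 1)               # Array{Float64,1}

# own_buffer = 0: NumPy keeps ownership, so Julia must not free the data.
jl_arr = lib.jl_ptr_to_array_1d(array_t, arr.ctypes.data_as(ctypes.c_void_p),
                                arr.size, 0)

jl_sum = lib.jl_eval_string(b"sum")
result = lib.jl_unbox_float64(lib.jl_call1(jl_sum, jl_arr))
print(result, arr.sum())   # the two sums should agree

# arr (and its buffer) must outlive jl_arr, and a real binding would also root
# jl_arr against Julia's GC (e.g. JL_GC_PUSH on the C side) between calls.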
I think it would be a great journey to implement the full feature set of something like PyCall.jl. In my use case, what I really want is a programming language that eliminates the bottlenecks in my Python code. I've used C/C++ to solve those problems, but I think Julia is a more suitable language for that purpose. So I want to create a lightweight, fast binding library with enough functionality to implement my ideas.