WIP: PyCall module to call Python functions from Julia

Steven G. Johnson

Feb 15, 2013, 10:23:20 AM
to juli...@googlegroups.com
Hi all,

I've been playing around with a little module to call Python functions
from Julia, and a basic proof-of-concept is working:

https://github.com/stevengj/PyCall.jl

For example:

require("PyCall")
using PyCall
pyinitialize() # initialize the Python interpreter
math = pyimport("math") # import the Python math module
pycall(math["sin"], Float64, 3.0) - sin(3.0) # returns 0.0

calls Python's math.sin function.

There are a few major to-do items (see the README on the github page)
for this to be really useful, but I thought I'd mention it on the
mailing list in case there are comments. (It's quite nice to be able
to write glue like this purely in Julia!)

--SGJ

John Myles White

Feb 15, 2013, 10:40:21 AM
to juli...@googlegroups.com
Having this will be an enormous gain for Julia. Thank you so much for doing this!

-- John

Stefan Karpinski

Feb 15, 2013, 10:44:43 AM
to Julia Dev
Yes, this is great. I find writing math["sin"] awkward, but since we don't allow overloading math.sin, it does seem to be the only option. Of course if this were a macro, then you could do @pycall math.sin(3.0)::Float64 or something like that.

Steven G. Johnson

Feb 15, 2013, 11:00:32 AM
to juli...@googlegroups.com
Stefan Karpinski wrote:
> Yes, this is great. I find writing math["sin"] awkward, but since we
> don't allow overloading math.sin, it does seem to be the only option. Of
> course if this were a macro, then you could do @pycall
> math.sin(3.0)::Float64 or something like that.

Some sort of macro to beautify the syntax seems like a nice idea,
although hopefully someday we can just overload ".".

One also wants to be able to look up other symbols, e.g. math["pi"] works
as well. Basically, Python objects behave semantically just like Dict
objects, so it made sense to me to have a similar syntax.
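
The Dict analogy is visible from the Python side too. A minimal sketch, using only the standard library, of looking up a module's symbols by string name (the helper name `lookup` is made up for illustration and is not part of PyCall):

```python
import math

def lookup(obj, name):
    """Fetch an attribute by its string name, much like obj["name"] above."""
    return getattr(obj, name)

# Functions and constants are looked up the same way.
sin = lookup(math, "sin")
pi = lookup(math, "pi")

# A module's namespace is literally a dict of name -> object.
namespace = vars(math)

print(sin(3.0) == math.sin(3.0))
print(namespace["pi"] == pi)
```

Both functions and plain values come back from the same lookup, which is why one indexing syntax can cover math["sin"] and math["pi"] alike.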


John Myles White

Feb 15, 2013, 11:01:55 AM
to juli...@googlegroups.com
Overloading "." is also something that DataFrames would benefit from.

-- John

Diego Javier Zea

Feb 15, 2013, 12:04:07 PM
to juli...@googlegroups.com
Fantastic package. It's great to have a Julia - Python interface!!!
Maybe it would be good to post the package on a CPython list in order to get feedback from the Python community.

P.S.: Overloading "." looks like a pythonic way to go.

Steven G. Johnson

Feb 15, 2013, 12:14:37 PM
to juli...@googlegroups.com
Diego Javier Zea wrote:
> Fantastic package. It's great to have a Julia - Python interface!!!
> Maybe it would be good to post the package on a CPython list in order to
> get feedback from the Python community.

Will do, once it's a little more mature (and doesn't require patching
the Julia source code, see https://github.com/JuliaLang/julia/issues/2312).

By the way, a lot of the credit here goes to Python for having a sane
and well-documented C API. It sets an example that other languages
(*cough*, *cough*) would do well to follow, as it greatly increases the
flexibility of Python.

--SGJ

Toivo Henningsson

Feb 15, 2013, 1:38:02 PM
to juli...@googlegroups.com
Nice work! Once it's a bit more mature, (and given that you are able to extract a listing of the identifiers available on the python side) I think that it should be quite feasible to generate a julia module on the go with natural looking wrappers, so that you could get something like

math = pyimport("math") # returns a freshly created wrapper module
math.sin(3)

In fact, such wrapping functionality could probably be pretty independent of the wrapped language, so perhaps there could be a JuliaWrap package that could be used by interfaces to many different ones.
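
As a rough illustration of the "listing of the identifiers" step, here is a sketch of what the Python side can report about a module using only built-in introspection. The function names are hypothetical, and this is not how PyCall itself enumerates members:

```python
import math

def public_names(module):
    """Names a wrapper generator would likely expose (skips _private ones)."""
    return sorted(n for n in dir(module) if not n.startswith("_"))

def split_by_kind(module):
    """Separate callables (functions) from plain values (constants)."""
    functions, values = [], []
    for name in public_names(module):
        target = functions if callable(getattr(module, name)) else values
        target.append(name)
    return functions, values

functions, values = split_by_kind(math)
print(functions[:5])
print(values)
```

A wrapper generator could emit one wrapper function per entry in `functions` and one constant per entry in `values`.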

Steven G. Johnson

Feb 15, 2013, 9:18:48 PM
to juli...@googlegroups.com
Toivo Henningsson wrote:
> Nice work! Once it's a bit more mature, (and given that you are able to extract a listing of the identifiers available on the python side) I think that it should be quite feasible to generate a julia module on the go with natural looking wrappers, so that you could get something like
>
> math = pyimport("math") # returns a freshly created wrapper module
> math.sin(3)

I agree that this would be nice, although for performance reasons
(mainly to tell Julia's JIT what type to expect) it would be good to
have the option to specify the return types of the functions (since
these cannot be determined by introspection of the library).

I'm thinking of something along the lines of

math = pyimport("math", [:sin => Float64, :cos => Float64, ...])

where the second argument is an optional Dict of return types (which are
used if available to tell Julia that math.sin always returns Float64,
etcetera).

--SGJ

John Myles White

Feb 15, 2013, 9:23:00 PM
to juli...@googlegroups.com
Seeing that hash just makes me long even more for a general mechanism to specify function return signatures in Julia.

-- John

Steven G. Johnson

Feb 15, 2013, 9:33:38 PM
to juli...@googlegroups.com
John Myles White wrote:
> Seeing that hash just makes me long even more for a general mechanism to specify function return signatures in Julia.

I'm not sure I follow you...what exactly are you longing for?

The problem here is not that the Julia specification of a function is
difficult, it is that Python doesn't provide a way to query the return
type of a function (since it might vary at runtime, although in practice
many functions will return fixed types). So, someone has to input those
return types manually if we want the compiler to take advantage of this
information. A Dict seems as good a format as any to hand-input this
information.
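
To make the point concrete, here is a small Python sketch (assuming Python 3) of a function whose return type depends on its argument, together with a hand-written table of declared return types. The names `halve`, `RETURN_TYPES` and `call_typed` are invented for illustration:

```python
import math

def halve(n):
    """Return type depends on the value: int for even n, float for odd n."""
    return n // 2 if n % 2 == 0 else n / 2.0

# Hypothetical table of declared return types, in the spirit of the Dict above.
RETURN_TYPES = {"sin": float, "cos": float}

def call_typed(module, name, *args):
    """Call module.name(*args), converting to the declared type when one is given."""
    result = getattr(module, name)(*args)
    declared = RETURN_TYPES.get(name)
    return result if declared is None else declared(result)

print(type(halve(4)).__name__, type(halve(3)).__name__)
print(call_typed(math, "sin", 3))
```

Nothing in `halve` itself tells a caller which type will come back, so the table has to be supplied by hand; functions missing from the table fall back to whatever Python returns.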

Of course, this is not too hard if one just wants to call a small number
of Python routines, but if you are calling a lot of routines (or
pre-packaging a Julia interface for a large Python module like SciPy) it
becomes tedious.

--SGJ

John Myles White

Feb 15, 2013, 9:39:02 PM
to juli...@googlegroups.com
I would like to be able to assert that a function (in pure Julia) always returns outputs of a given type. The compiler can infer this information in general in pure Julia, but it would be nice to be able to declare the return type so that misuse fails type checking.

-- John

Jeff Bezanson

Feb 15, 2013, 10:10:55 PM
to juli...@googlegroups.com
This is huge. I expect many people will be excited about this.

Steven G. Johnson

Feb 17, 2013, 1:05:09 AM
to juli...@googlegroups.com
Have added a bunch more functionality, including support for tuples and
arrays, with copy-free conversion of Julia Arrays into NumPy ndarrays
(the converse is not implemented yet; basically requires a subclass of
SubArray).

You can now do e.g.

using PyCall
nl = pyimport("numpy.linalg")
a = rand(100,100);
(pycall(nl["det"], Float64, a) - det(a)) / det(a)

(which returns a difference on the order of machine precision).

Note that, for this to work, the patch
https://github.com/JuliaLang/julia/pull/2317
is required (in order to load Python in a way that permits inter-library
dependencies).

--SGJ

PS. Note that calling NumPy's C API from Julia required obscene
contortions because of the way NumPy does things, but I was happy that
it was nevertheless possible without C glue.

Steven G. Johnson

Feb 17, 2013, 9:40:41 AM
to juli...@googlegroups.com
Steven G. Johnson wrote:
> Have added a bunch more functionality, including support for tuples and
> arrays, with copy-free conversion of Julia Arrays into NumPy ndarrays
> (the converse is not implemented yet; basically requires a subclass of
> SubArray).

I have a question about how to make this work best with Julia's garbage
collection.

No-copy passing of Julia data into Python is obviously ideal for large
data structures, and it is safe as long as Python keeps no references to
the Julia data after the Python routine exits. However, in the case
where Python keeps a reference to the data (e.g. returning some object
that encapsulates the data), this is a bit dangerous: the user needs to
keep the original Julia object around until the Python object is
garbage-collected, or terrible things (e.g. crashes) may result.
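
The no-copy idea, and why the owner's lifetime matters, can be illustrated with Python's own buffer protocol. This is only an analogy using the standard library (assuming Python 3), not what PyCall does internally:

```python
from array import array

data = array("d", [1.0, 2.0, 3.0])   # owns the memory
view = memoryview(data)               # shares it, no copy

data[0] = 42.0                        # a write through the owner...
first = view[0]                       # ...is visible through the view

# While a view exists, Python refuses to resize (and possibly move) the buffer.
try:
    data.append(4.0)
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()                        # drop the borrowed reference
data.append(4.0)                      # now resizing is allowed again
print(first, resize_blocked, len(data))
```

Within Python, the exporter knows about outstanding views and protects the memory. Across the Julia/Python boundary nothing does this automatically, which is the gap the question below is about.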

Any suggestions on how to embed a Julia reference inside a Python object
so that Julia's garbage collector knows that the reference exists until
the Python object is finalized?

Python calls a __del__ method when an object is finalized, so it seems
like I should somehow (a) attach a Julia function to this __del__ method
[not sure how?] that (b) releases the reference from the Julia garbage
collector [how?].

--SGJ

Isaiah Norton

Feb 17, 2013, 12:03:20 PM
to juli...@googlegroups.com
> (a) attach a Julia function to this __del__ method [not sure how?]

__del__ is tricky because of reference cycles. weakref might be more straightforward for this purpose:


This will let you set up a callback to be called during finalization (but before __del__). The callback could then use ctypes to call back into a Julia function pointer (from cfunction) which would do whatever needs to happen on the Julia side for (b). I just cloned PyCall to poke around at this, I'll let you know if I get anything working.
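
A minimal sketch of the weakref callback mechanism in plain Python follows. The class and function names are made up, and the ctypes/cfunction hop into Julia is left out; the callback here just records that it ran:

```python
import gc
import weakref

class Wrapper(object):
    """Stand-in for a Python object that wraps foreign data."""
    def __init__(self, name):
        self.name = name

finalized = []

def on_finalize(ref):
    # The callback receives the (now dead) weakref, not the original object.
    finalized.append(ref)

obj = Wrapper("a")
ref = weakref.ref(obj, on_finalize)
alive_before = ref() is obj

del obj          # drop the last strong reference
gc.collect()     # not needed on CPython's refcounting, but harmless

print(alive_before, len(finalized), ref() is None)
```

The weakref object itself must be kept alive (here by the variable `ref`); if it is discarded first, the callback is never invoked.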

Isaiah

Jameson Nash

Feb 17, 2013, 12:23:46 PM
to juli...@googlegroups.com
I would create a global const pygc dictionary which contains pointers as keys and arrays as values. (or just a list of objects, if you can manage to store the actual reference in the Python object for later retrieval)

I'm not very familiar with the Python API, but it appears that NumPy allows you to attach metadata to an object, including a free function:

Steven G. Johnson

Feb 17, 2013, 11:54:49 PM
to juli...@googlegroups.com
Steven G. Johnson wrote:
> Have added a bunch more functionality, including support for tuples and
> arrays, with copy-free conversion of Julia Arrays into NumPy ndarrays
> (the converse is not implemented yet; basically requires a subclass of
> SubArray).

ndarray -> Julia is now implemented as well, including a no-copy PyArray
type (though it cannot be used in most Julia functions without a copy
due to issue #2345).

Incidentally, I noticed that, while transpose uses FFTW's optimized
routines, permutedims does not (FFTW can do > 2-dimensional transposes
out of place as well); will have to play with this at some point.
permutedims is a useful routine to (hopefully quickly) convert C-order
arrays (the default in NumPy) to Fortran-order for Julia. For that
matter, FFTW should be able to do optimized copies for arbitrary
StridedArray-like types, at least for objects whose size is an integer
multiple of sizeof(Float32) or sizeof(Float64).
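
For readers unfamiliar with the two layouts, here is a pure-Python sketch of the index arithmetic behind a C-order to Fortran-order conversion. It is illustrative only: it uses neither NumPy nor FFTW, and the function names are invented:

```python
from itertools import product

def c_offset(index, shape):
    """Flat offset in row-major (C) order: the last index varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset

def f_offset(index, shape):
    """Flat offset in column-major (Fortran) order: the first index varies fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset

def c_to_fortran(flat, shape):
    """Reorder a flat C-order buffer into Fortran order (out of place)."""
    out = [None] * len(flat)
    for index in product(*(range(n) for n in shape)):
        out[f_offset(index, shape)] = flat[c_offset(index, shape)]
    return out

# The 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored row by row:
print(c_to_fortran([1, 2, 3, 4, 5, 6], (2, 3)))
```

For a 2-dimensional array this is exactly a transpose of the underlying buffer, which is why an optimized transpose routine is relevant to the conversion.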

--SGJ

Steven G. Johnson

Feb 18, 2013, 4:30:23 PM
to juli...@googlegroups.com
Isaiah Norton wrote:
> > (a) attach a Julia function to this __del__ method [not sure how?]
>
> __del__ is tricky because of reference cycles. weakref might be more
> straightforward for this purpose:

A weak reference is not what we want here, because it is only for cached
resources that are allowed to disappear. We need the Python object to
have a "real" reference to the Julia object, in that the Julia object
*must* not disappear as long as Python wrappers around it exist.

Jameson Nash wrote:
> I'm not very familiar with the Python API, but it appears that NumPy
> allows you to attach metadata to an object, including a free function:
>
> http://docs.scipy.org/doc/numpy/reference/c-api.array.html#auxiliary-data-with-object-semantics

This looks like it is only to attach metadata to the array elements (for
arrays of user-defined types), not to the array itself.

--SGJ

Isaiah Norton

Feb 18, 2013, 11:04:57 PM
to juli...@googlegroups.com
> A weak reference is not what we want here, because it is only for cached resources that are allowed to disappear. We need the Python object to have a "real" reference to the Julia object, in that the Julia object *must* not disappear as long as Python wrappers around it exist.

I don't know of a way to inject such a reference into a python object. If I understand Jameson's suggestion correctly, you can prevent Julia from gc'ing an object by keeping it in a strongly-referenced collection on the Julia side.

Using the weakref mechanism in Python is a clean way to get a notification when the Python object is dead so that the Julia object can be removed from this collection.

I'm not sure how to do this purely in Julia, but here is an example of getting a callback from Python->Julia when a Python object is finalized after pydecref:

Note that the cleanup callback passes the weakref object, not the original object.
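
The combination described here, a strongly-referenced collection plus a weakref notification, can be sketched entirely in Python. In this sketch the "Julia object" is replaced by an ordinary Python resource, and the class names `KeepAlive` and `Wrapper` are hypothetical:

```python
import gc
import weakref

class KeepAlive(object):
    """Hold strong references to resources until their wrappers die."""
    def __init__(self):
        self._entries = {}  # id(weakref) -> (weakref, resource)

    def register(self, wrapper, resource):
        # Keep the weakref itself alive too, or the callback never fires.
        ref = weakref.ref(wrapper, self._release)
        self._entries[id(ref)] = (ref, resource)

    def _release(self, ref):
        # Called with the weakref (not the wrapper) once the wrapper is dead.
        self._entries.pop(id(ref), None)

    def __len__(self):
        return len(self._entries)

class Wrapper(object):
    """Stand-in for a Python object that borrows memory owned elsewhere."""
    pass

keeper = KeepAlive()
wrapper = Wrapper()
keeper.register(wrapper, bytearray(16))
count_while_alive = len(keeper)

del wrapper
gc.collect()
count_after_death = len(keeper)
print(count_while_alive, count_after_death)
```

Keying the table on the weakref rather than on the wrapper matters precisely because the callback only ever sees the weakref.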

Isaiah

Steven G. Johnson

Feb 19, 2013, 8:28:02 AM
to juli...@googlegroups.com
Isaiah Norton wrote:
> Using the weakref mechanism in Python is a clean way to get a
> notification when the Python object is dead so that the Julia object can
> be removed from this collection.

Ah, I see what you mean. Julia keeps a weakref to the Python object,
not the other way around.

Steven G. Johnson

Feb 20, 2013, 9:08:35 PM
to juli...@googlegroups.com
My PyCall module has made a lot of progress, and now supports much
greater functionality and nicer syntax:

https://github.com/stevengj/PyCall.jl

(NOTE: You need to patch Julia git master as described here for most of
this to work: https://github.com/JuliaLang/julia/pull/2317)

Some examples:

julia> using PyCall
julia> pyinitialize() # will go away once #2378 is fixed

julia> @pyimport math
julia> math.sin(3) - sin(3)
0.0

julia> @pyimport scipy.special as s
julia> s.airy(4.2)
(0.0006274958683091624,-0.0013210006638876843,124.03800986864213,246.14599171178563)

julia> @pyimport numpy
julia> numpy.transpose(rand(3,5))
5x3 Float64 Array:
0.139254 0.492393 0.617765
0.971371 0.945398 0.55352
0.883429 0.477562 0.604881
0.622781 0.207569 0.918686
0.117342 0.531707 0.350668

julia> x = linspace(0,2*pi,100); y = sin(3*x + 4*cos(5*x));
julia> @pyimport pylab
julia> pylab.plot(x,y)
julia> pylab.show()
...opens plot window...

Please bang on it if you get a chance (especially once #2317 is
committed so you don't have to patch Julia).

--SGJ



Fernando Perez

Feb 20, 2013, 10:17:06 PM
to juli...@googlegroups.com
On Wed, Feb 20, 2013 at 6:08 PM, Steven G. Johnson <ste...@alum.mit.edu> wrote:
> julia> x = linspace(0,2*pi,100); y = sin(3*x + 4*cos(5*x));
> julia> @pyimport pylab
> julia> pylab.plot(x,y)
> julia> pylab.show()
> ...opens plot window...


Awesome! Let's get that julia IPython kernel going, and we'll have
native Julia notebooks with inline plots in no time flat. I'd love to
see the first julia section in our gallery
(https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks).

Cheers,

f

Rahul Dave

Feb 20, 2013, 10:32:29 PM
to juli...@googlegroups.com
Fernando,
Working on this as soon as I finish my ipython/nltk/pandas course this week (for librarians!). 

First try is to basically copy the matlab magic (using the python-matlab bridge) to create a simple %%julia magic talking to the existing webserver implementation...

The second one would be to use Avik's zmq2 module to communicate directly with the tornado front end.

The third one would be (once the notebook js/css is refactored according to the roadmap you guys posted) to communicate directly with julia using the websockets protocol, but this is not as flexible as the second one as you can't mix in pythonic and R'ic and other fun.

Thoughts?

Are there any plans for any direct file editing in the notebook? RStudio has this very nice capability of sending stuff back and forth from the repl to the source and vice versa. And from history to source. Would be lovely to have this in the notebook....in tomorrow's course I'm finally getting to modules and code organization and such, and will be using a text editor. But a browser based one would be so good for teaching....

The notebook rocks!
Thanks!
Rahul

-- 
Rahul Dave

Fernando Perez

Feb 21, 2013, 12:14:53 AM
to juli...@googlegroups.com
Hi Rahul,

On Wed, Feb 20, 2013 at 7:32 PM, Rahul Dave
<rahuldave.m...@gmail.com> wrote:
> Fernando,
> Working on this as soon as I finish my ipython/nltk/pandas course this week
> (for librarians!).

Great! BTW, I'll be giving a talk at MIT next Friday about IPython
(time/place TBD), if you can make it there I'd be happy to talk more
about the details there.

> First try is to basically copy the matlab magic (using the python-matlab
> bridge) to create a simple %%julia magic talking to the existing webserver
> implementation...

Yup. That would be useful in and of itself, as it would also let
python users use Julia as a way to speed up loopy code. I'd love to
have this also as an easy way of learning about Julia from the
comforts of code I already know well, the python side.

> The second one would be to use Avik's zmq2 module to communicate directly
> with the tornado front end.
>
> The third one would be (once the notebook js/css is refactored according to
> the roadmap you guys posted) to communicate directly with julia using the
> websockets protocol, but this is not as flexible as the second one as you
> can't mix in pythonic and R'ic and other fun.

Actually I don't think you should go to #3. The way we think about
the architecture, kernels must speak the zmq protocol
(http://ipython.org/ipython-doc/rel-0.13.1/development/messaging.html).
That's because the http server is a particular detail of running via
a web browser, but for example the Qt console could also be used for
Julia, and other, different web services could be built using a
different server architecture.

The protocol *is* the zmq/json layer, that's where your kernel should
sit. If you speak the protocol correctly, you'll be able to use every
compliant client, including the terminal, qt console and notebook.
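
To give a feel for what "speaking the protocol" involves, here is a rough Python sketch that builds and round-trips one JSON message. The field names follow the general message layout in the messaging document linked above as I understand it, so treat them as assumptions to check against that document. The helper `make_message` and the username are invented, and the ZeroMQ framing and sockets are omitted entirely:

```python
import json
import uuid

def make_message(msg_type, content, session, username="julia-user"):
    """Build a message dict with the header / parent_header / content layout."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "username": username,
            "session": session,
            "msg_type": msg_type,
        },
        "parent_header": {},
        "content": content,
    }

session = str(uuid.uuid4())
request = make_message("execute_request",
                       {"code": "1 + 1", "silent": False},
                       session)

wire = json.dumps(request)      # what would travel over the socket
decoded = json.loads(wire)      # what a kernel in any language would parse
print(decoded["header"]["msg_type"], decoded["content"]["code"])
```

Since the payload is plain JSON, a kernel written in any language only needs a ZeroMQ binding and a JSON library to take part.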

I'm sure we'll find places where we've made assumptions too close to
python for this to work perfectly, but that's the point of this
exercise: to identify them and fix them so the protocol is really
language-agnostic.

> Thoughts?
>
> Are there any plans for any direct file editing in the notebook? RStudio has
> this very nice capability of sending stuff back and forth from the repl to
> the source and vice versa. And from history to source. Would be lovely to
> have this in the notebook....in tomorrow's course I'm finally getting to
> modules and code organization and such, and will be using a text editor. But
> a browser based one would be so good for teaching....

Yup, we know we need it. Patience, patience :)

> The notebook rocks!

Glad you like it!

Cheers,

f

Stefan Karpinski

Feb 21, 2013, 10:58:34 AM
to Julia Dev
This is incredibly exciting stuff all around.

Stefan Karpinski

Feb 21, 2013, 11:01:10 AM
to Julia Dev
On Wed, Feb 20, 2013 at 9:08 PM, Steven G. Johnson <ste...@alum.mit.edu> wrote:

julia> using PyCall
julia> pyinitialize() # will go away once #2378 is fixed

Can't the pyinitialize() call go at the end of the PyCall module load? I know that might not be the best form, but it seems like it would be reasonable to do the pyinitialize there, and it can also set up an atexit handler to call pydeinitialize (or whatever it's called, I forget).

Stefan Karpinski

Feb 21, 2013, 2:54:55 PM
to Julia Dev
Hmm. That tempts me to want to put perl-style hooks into using so that you can write things like "using PyCall python=2.7.3" or something like that. But that's probably an unhealthy inclination.


On Thu, Feb 21, 2013 at 2:15 PM, Steven G. Johnson <steve...@gmail.com> wrote:

> No, pyinitialize() can't go in the module load. The user has to have the option to call pyinitialize() manually (after loading the module but before calling any Python functions), for example to specify a different version of Python to load.
>
> --SGJ

Gustavo Goretkin

Feb 21, 2013, 9:12:23 PM
to juli...@googlegroups.com
Does this mean we can use matplotlib from julia?

Steven G. Johnson

Feb 21, 2013, 11:46:20 PM
to juli...@googlegroups.com
Gustavo Goretkin wrote:
> Does this mean we can use matplotlib from julia?

Yes.

Steven G. Johnson

Feb 22, 2013, 12:00:35 PM
to juli...@googlegroups.com


On Thursday, February 21, 2013 11:01:10 AM UTC-5, Stefan Karpinski wrote:

Update: I found a workaround for #2378, so pyinitialize() is no longer needed (unless you want to override the default python version).

Diego Javier Zea

Feb 23, 2013, 9:10:38 PM
to juli...@googlegroups.com
Looks like a fantastic module. I'm short on time right now, but I'm going to use this with BioPython next week (or the week after) :D

lgautier

Feb 25, 2013, 4:21:20 AM
to juli...@googlegroups.com

Does this work by initializing Python with whatever initialization parameters are set if Python is not already whenever the first call to Python functionalities is made ?

That's a good idea. I will use it for the R interface.

Steven G. Johnson

Feb 25, 2013, 9:16:08 AM
to juli...@googlegroups.com


On Monday, February 25, 2013 4:21:20 AM UTC-5, lgautier wrote:
> Does this work by initializing Python with whatever initialization parameters are set if Python is not already whenever the first call to Python functionalities is made ?
 
Not sure I understand your question, but let me just explain in more detail.

Every (high-level) PyCall function first checks an "initialized" global to see whether Python is initialized (shared library loaded, Python interpreter started, etcetera).  If not, it calls pyinitialize() to initialize Python with the default parameters (via running the "python" executable to query the Python library name).  Optionally, the user can call pyinitialize manually in order to initialize using a different Python version or at a different time.

The problem with #2378 (which is now fixed anyway) had to do with calling the "python" executable from the @pyimport macro, which deadlocked with the parser. I worked around this by changing the @pyimport macro so that most of the heavy lifting is done from a pywrap function called by the macro.  That way, the initialization is moved from parse-time to run-time, which seemed like a good idea anyway.

--SGJ

R. Michael Weylandt

Feb 25, 2013, 10:08:25 AM
to juli...@googlegroups.com
On Mon, Feb 25, 2013 at 2:16 PM, Steven G. Johnson
<steve...@gmail.com> wrote:
>
> The problem with #2378 (which is now fixed anyway) had to do with calling
> the "python" executable from the @pyimport macro, which deadlocked with the
> parser.

I'm not sure I follow this bit (which is probably my fault for not
looking at the code): couldn't you have shelled out to something like
'env python' or 'which python' instead?

Michael

Steven G. Johnson

Feb 25, 2013, 11:26:11 AM
to juli...@googlegroups.com


On Monday, February 25, 2013 10:08:25 AM UTC-5, Michael Weylandt wrote:
> I'm not sure I follow this bit (which is probably my fault for not
> looking at the code): couldn't you have shelled out to something like
> 'env python' or 'which python' instead?

Bug #2378 meant that shelling out (the run command) deadlocked with the parser.  But anyway, it turned out nicer to do this at runtime rather than parse time.

Also, `which python` is not what I want.  I don't want the path of the python executable, I want the name (and possibly the path) of the libpython library.  I do this by running
    python -c "import distutils.sysconfig; print(distutils.sysconfig.get_config_var('LDLIBRARY'))"
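
The same query can be written as a short script. This sketch uses the standard-library `sysconfig` module rather than `distutils.sysconfig`, which is a substitution on my part; the `LIBDIR` lookup and the function name `libpython_info` are likewise additions for illustration:

```python
import sysconfig

def libpython_info():
    """Return (library file name, library directory) from the build configuration.

    Either value may be None on platforms that do not define the variable.
    """
    return (sysconfig.get_config_var("LDLIBRARY"),
            sysconfig.get_config_var("LIBDIR"))

name, directory = libpython_info()
print(name, directory)
```

The result depends on how the interpreter was built, so a caller should be prepared for `None` and fall back to some other way of locating the library.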

--SGJ

lgautier

Feb 25, 2013, 8:01:46 PM
to juli...@googlegroups.com


On Monday, February 25, 2013 3:16:08 PM UTC+1, Steven G. Johnson wrote:
> On Monday, February 25, 2013 4:21:20 AM UTC-5, lgautier wrote:
> > Does this work by initializing Python with whatever initialization parameters are set if Python is not already whenever the first call to Python functionalities is made ?
>
> Not sure I understand your question, but let me just explain in more detail.
>
> Every (high-level) PyCall function first checks an "initialized" global to see whether Python is initialized (shared library loaded, Python interpreter started, etcetera).  If not, it calls pyinitialize() to initialize Python with the default parameters (via running the "python" executable to query the Python library name).  Optionally, the user can call pyinitialize manually in order to initialize using a different Python version or at a different time.

Ok. The answer to the (long) question is then: "yes".
;-)

> The problem with #2378 (which is now fixed anyway) had to do with calling the "python" executable from the @pyimport macro, which deadlocked with the parser. I worked around this by changing the @pyimport macro so that most of the heavy lifting is done from a pywrap function called by the macro.  That way, the initialization is moved from parse-time to run-time, which seemed like a good idea anyway.
>
> --SGJ

Yes. The (conditional) initialization is definitely a run-time event.

Thanks for the answer.

L.

Steven G. Johnson

Mar 6, 2013, 12:26:36 AM
to juli...@googlegroups.com


On Monday, February 18, 2013 11:04:57 PM UTC-5, Isaiah wrote:
> Using the weakref mechanism in Python is a clean way to get a notification when the Python object is dead so that the Julia object can be removed from this collection.
>
> I'm not sure how to do this purely in Julia, but here is an example of getting a callback from Python->Julia when a Python object is finalized after pydecref:
>
> Note that the cleanup callback passes the weakref object, not the original object.

I ended up implementing a very similar weakref+callback+dictionary technique to prevent Julia from garbage-collecting an object until Python is done with it:
    https://github.com/stevengj/PyCall.jl/commit/be43093a83b67c89a4e944f01f5f2ec60a9b85e9
Thanks for the suggestion!

--SGJ