r backend working, with plots


James Casbon

Nov 28, 2009, 5:58:49 AM
to codenod...@googlegroups.com
Hi,

I managed to get R working quite easily using the rpy2 bindings.
Attached is a pic. You can pick up the code from the r-backend branch
on my github page. It has immediately clarified some issues around how
engines should be implemented, and raised some others about the
protocol.

Firstly, I think using python to wrap whatever engine we target is going
to be easier a lot of the time. This also means that we only need to solve
the protocol issues once and we can reuse the code many times. For
example, you really wouldn't want to implement something that pickles
an image in R when you already have that code in python. So I think
that is the way to go for most engines, IMHO.

Secondly, some obvious protocol enhancements that an R engine needs:
1. the ability to return multiple images from a single command (some
plot commands return 5 images)
2. support for an engine state image - R can reload its state from an
RData file. It would be great if engines supported save and restore
state methods that can be given a file to use.
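
A minimal sketch of what such save and restore methods might look like on the Python side. The class and method names here (Engine, save_state, restore_state, ToyPythonEngine) are assumptions made for illustration and are not codenode's actual API; the toy engine pickles a dict where an R engine would presumably write an RData file.

```python
import abc
import os
import pickle
import tempfile


class Engine(abc.ABC):
    """Hypothetical engine interface; names are illustrative only."""

    @abc.abstractmethod
    def evaluate(self, source):
        """Run source inside the engine."""

    @abc.abstractmethod
    def save_state(self, path):
        """Write the engine's current state to the file at path."""

    @abc.abstractmethod
    def restore_state(self, path):
        """Replace the engine's state with the one stored at path."""


class ToyPythonEngine(Engine):
    """Stand-in engine whose state is a dict of picklable variables."""

    def __init__(self):
        self.namespace = {}

    def evaluate(self, source):
        exec(source, self.namespace)

    def save_state(self, path):
        # Drop __builtins__, which exec() adds and which cannot be pickled.
        state = {k: v for k, v in self.namespace.items() if k != "__builtins__"}
        with open(path, "wb") as f:
            pickle.dump(state, f)

    def restore_state(self, path):
        with open(path, "rb") as f:
            self.namespace = pickle.load(f)


# Save the state of one engine and load it into a fresh one.
engine = ToyPythonEngine()
engine.evaluate("x = 41 + 1")
path = os.path.join(tempfile.mkdtemp(), "state.pkl")
engine.save_state(path)

fresh = ToyPythonEngine()
fresh.restore_state(path)
```

Passing the file path in from outside keeps the decision about where state lives with the caller rather than with each engine.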

We could very easily get a javascript engine going with these bindings to v8:
http://code.google.com/p/pyv8/

James
rincodenode.png

James Casbon

Nov 28, 2009, 6:03:20 AM
to codenod...@googlegroups.com
2009/11/28 James Casbon <cas...@gmail.com>:
> Secondly, some obvious protocol enhancements that an R engine needs:
>  1. the ability to return multiple images from a single command (some
> plot commands return 5 images)
>  2. support for an engine state image - R can reload its state from an
> RData file.  It would be great of engines supported a save and restore
> state methods that can be given a file to use.
>

I really want to be able to return urls as well for placing in the
page. Then you could use http://pygooglechart.slowchop.com/ to render
things.

Matthew Turk

Nov 28, 2009, 12:25:07 PM
to codenod...@googlegroups.com
Hi James,

Getting plots to be transported more clearly is something Dorian and I
worked on the other night -- the idea was that you'd set up one or
multiple "payloads," with the implicit conversion of stdout to a
payload of that type. This would be done inside the Interpreter
object, and then the async backend would properly format the cells.
So far we've got it working for a single image, as multiple images were
slightly more complicated to implement and would require more
javascript stuff.
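
As a rough illustration of the payload idea, here is a sketch in which an interpreter object collects explicit payloads and implicitly turns captured stdout into a text payload. The Payload and Interpreter classes, the emit hook, and the "text"/"image" labels are all assumptions for this example, not the code in the parallel-python-engine branch.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class Payload:
    kind: str     # e.g. "text" or "image"; these labels are assumptions
    data: object


class Interpreter:
    """Hypothetical interpreter that collects the payloads for one cell."""

    def __init__(self):
        # emit() is exposed to user code so it can add explicit payloads.
        self.namespace = {"emit": self.emit}
        self._payloads = []

    def emit(self, kind, data):
        self._payloads.append(Payload(kind, data))

    def run_cell(self, source):
        self._payloads = []
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, self.namespace)
        text = buffer.getvalue()
        if text:
            # Printed output is implicitly converted to a text payload;
            # it is placed first here purely for simplicity.
            self._payloads.insert(0, Payload("text", text))
        return self._payloads


payloads = Interpreter().run_cell("print('hello')\nemit('image', b'fake-png-bytes')")
```

Returning a list rather than a single value means the multiple-image case needs no special handling on the engine side.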

This is in my repo on github under parallel-python-engine, and
unfortunately as of now it's only implemented for the IPController
backend - I'm away for the weekend (it's a holiday here) but on Monday
I was going to take a look at this and put it into the standard
interpreter.

I'm not sure this is exactly what you were looking for, but I hope it
can be of some kind of help. I've added you to the wave where we
sketched some of this out.

-Matt
> --
> http://groups.google.com/group/codenode-devel?hl=en
> http://codenode.org

James Casbon

Nov 28, 2009, 12:56:14 PM
to codenod...@googlegroups.com
I was just hacking around with the idea of using html as the engine
output protocol. This means we do not need to anticipate potential
payloads in advance - the rule is engines return html which the client
side javascript attaches to the dom tree. An image is then just
encoded in the cell's html. For example, matplotlib plotting could be
achieved by base64 encoding the image and writing out an img tag:

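    import base64
    import os
    import sys
    import uuid

    from matplotlib.pyplot import savefig  # assumes the pyplot interface

    def show(fn=None, *args, **kwargs):
        # save the current figure under a unique file name
        fn = str(uuid.uuid4()) + '.png'
        realpath = os.path.join(os.path.abspath('.'), fn)
        savefig(realpath, dpi=80, **kwargs)

        # read the file back and emit it as a base64-encoded img tag
        with open(realpath, 'rb') as f:
            encoded = base64.b64encode(f.read()).decode('ascii')
        data = '<img src="data:image/png;base64,%s" />\n' % encoded
        sys.stdout.write(data)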

So then the kind of widgets that can be returned is up to the 'engine
context' implementor and not us here and now. Html seems the logical
choice for payload description.

In my case with R multiple images, I can add a div called something
like plotcollection or whatever, and the presentation is left totally
up to me in terms of css and javascript.
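
A small sketch of how several images could be wrapped in one such div. The helper names and the "plotcollection" class are only placeholders taken from the wording above; the byte strings stand in for real PNG data.

```python
import base64


def img_tag(png_bytes):
    """Return an <img> tag embedding PNG bytes as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return '<img src="data:image/png;base64,%s" />' % encoded


def plot_collection(images, css_class="plotcollection"):
    """Wrap several images in one <div>; the class name is only a suggestion."""
    tags = "\n".join(img_tag(png) for png in images)
    return '<div class="%s">\n%s\n</div>\n' % (css_class, tags)


# Placeholder bytes stand in for the PNG files a plot command would produce.
html = plot_collection([b"first", b"second"])
```

The client-side css and javascript can then target the div's class to lay the images out however the engine author likes.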

James






Dorian Raymer

Nov 29, 2009, 12:46:52 AM
to codenod...@googlegroups.com
Hey James,
Congratulations on making the first R engine! The image you posted was exciting to see!

I'm having trouble getting going with your r-engine branch:
2009-11-28 14:57:34-0800 [-] Unhandled Error
    Traceback (most recent call last):
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/application/app.py", line 445, in startReactor
        self.config, oldstdout, oldstderr, self.profiler, reactor)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/application/app.py", line 348, in runReactorWithLogging
        reactor.run()
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/internet/base.py", line 1166, in run
        self.mainLoop()
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/internet/base.py", line 1175, in mainLoop
        self.runUntilCurrent()
    --- <exception caught here> ---
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/internet/base.py", line 752, in runUntilCurrent
        f(*a, **kw)
      File "/Users/dorian/code/python/github/james/codenode/codenode/external/twisted/web/wsgi.py", line 299, in wsgiFinish
        self.request.finish()
      File "/Users/dorian/code/python/github/james/codenode/codenode/external/twisted/web/server.py", line 239, in finish
        http.Request.finish(self)
      File "/Users/dorian/code/python/github/james/codenode/codenode/external/twisted/web/http.py", line 805, in finish
        self._cleanup()
      File "/Users/dorian/code/python/github/james/codenode/codenode/external/twisted/web/http.py", line 575, in _cleanup
        self.channel.requestDone(self)
    exceptions.AttributeError: 'NoneType' object has no attribute 'requestDone'
   
2009-11-28 14:57:35-0800 [HTTPChannel,2,127.0.0.1] Unhandled Error
    Traceback (most recent call last):
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 460, in __bootstrap
        self.run()
      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 440, in run
        self.__target(*self.__args, **self.__kwargs)
    --- <exception caught here> ---
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/python/threadpool.py", line 210, in _worker
        result = context.call(ctx, function, *args, **kwargs)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/python/context.py", line 59, in callWithContext
        return self.currentContext().callWithContext(ctx, func, *args, **kw)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/twisted/python/context.py", line 37, in callWithContext
        return func(*args,**kw)
      File "/Users/dorian/code/python/github/james/codenode/codenode/external/twisted/web/wsgi.py", line 290, in run
        appIterator = self.application(self.environ, self.startResponse)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/core/handlers/wsgi.py", line 241, in __call__
        response = self.get_response(request)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/core/handlers/base.py", line 134, in get_response
        return self.handle_uncaught_exception(request, resolver, exc_info)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/core/handlers/base.py", line 154, in handle_uncaught_exception
        return debug.technical_500_response(request, *exc_info)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/views/debug.py", line 40, in technical_500_response
        html = reporter.get_traceback_html()
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/views/debug.py", line 114, in get_traceback_html
        return t.render(c)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/__init__.py", line 178, in render
        return self.nodelist.render(context)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/__init__.py", line 779, in render
        bits.append(self.render_node(node, context))
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/debug.py", line 81, in render_node
        raise wrapped
    django.template.TemplateSyntaxError: Caught an exception while rendering: No module named django_nose
   
    Original Traceback (most recent call last):
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/debug.py", line 71, in render_node
        result = node.render(context)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/debug.py", line 87, in render
        output = force_unicode(self.filter_expression.resolve(context))
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/__init__.py", line 572, in resolve
        new_obj = func(obj, *arg_vals)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/template/defaultfilters.py", line 687, in date
        return format(value, arg)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/dateformat.py", line 269, in format
        return df.format(format_string)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/dateformat.py", line 30, in format
        pieces.append(force_unicode(getattr(self, piece)()))
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/dateformat.py", line 175, in r
        return self.format('D, j M Y H:i:s O')
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/dateformat.py", line 30, in format
        pieces.append(force_unicode(getattr(self, piece)()))
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/encoding.py", line 71, in force_unicode
        s = unicode(s)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/functional.py", line 201, in __unicode_cast
        return self.__func(*self.__args, **self.__kw)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/translation/__init__.py", line 62, in ugettext
        return real_ugettext(message)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/translation/trans_real.py", line 286, in ugettext
        return do_translate(message, 'ugettext')
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/translation/trans_real.py", line 276, in do_translate
        _default = translation(settings.LANGUAGE_CODE)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/translation/trans_real.py", line 194, in translation
        default_translation = _fetch(settings.LANGUAGE_CODE)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/translation/trans_real.py", line 180, in _fetch
        app = import_module(appname)
      File "/Users/dorian/code/python/github/james/codenode/_venv/lib/python2.5/site-packages/django/utils/importlib.py", line 35, in import_module
        __import__(name)
    ImportError: No module named django_nose

I have installed all the dependencies and double checked the nose python package was installed in my virtual env, but I still see this error.

Any ideas on how to get past this?

-Dorian



Dorian Raymer

Nov 29, 2009, 2:05:03 AM
to codenod...@googlegroups.com
On Sat, Nov 28, 2009 at 9:56 AM, James Casbon <cas...@gmail.com> wrote:
> I was just hacking around with the idea of using html as the engine
> output protocol.  This means we do not need to anticipate potential
> payloads in advance - the rule is engines return html which the client
> side javascript attaches to the dom tree.  An image is then just
> encoded in the cell's html.  For example, matplotlib plotting could be
> achieved like by base64 encoding and an img tag:
>
> def show(fn=None, *args, **kwargs):
>    # save file
>    fn = str(uuid.uuid4()) + '.png'
>    realpath = os.path.join(os.path.abspath('.'), fn)
>    savefig(realpath, dpi=80, **kwargs)
>
>    # read and encode
>    data = rc="<img src=data:image/png;base64,%s />\n" %
> base64.b64encode(file(realpath).read())
>    sys.stdout.write(data)

I was wondering exactly how you embed image data directly into a page. Now I know!

I think the base64 encoding is the way to go, and we should do that as much as possible, where appropriate/instead of pickling.

The idea of returning an img tag is an interesting, effective way to indicate an image was produced by the engine.
In general, though, I think that the engines should not know about client presentation formatting details; this means they shouldn't ever deal with HTML. Some mechanism in the interpreter should attribute an 'output type', however, like 'image', etc. But, maybe we don't want to jump to using HTML to markup the output.
I believe this strategy will be more maintainable in the long run.
I would like to see only data returned from the engines.

The thing that returning HTML simplifies in this case is how the javascript decides what to do with an evaluation result. A robust way to classify engine output is something we have been thinking about for a while, and we have been getting by with light hacks in the meantime (assume it is always text unless there is an indication the output is a plot image).
I have been talking with Alex and Matthew about how there needs to be a "type" attribute describing each result, which means there needs to be a list of types that make sense for presentation: 'text', 'image', 'interact widget', 'widget x', etc.
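
To make the "type" attribute concrete, here is a sketch of results that carry only data plus a type label, serialised as JSON. The field names ("type", "encoding", "data") and the helper make_result are assumptions for illustration; nothing here reflects an agreed codenode format.

```python
import base64
import json


def make_result(output_type, data):
    """Build a JSON-serialisable result; the field names are assumptions."""
    if isinstance(data, bytes):
        # Binary data such as PNG bytes is base64 encoded so it survives JSON.
        return {
            "type": output_type,
            "encoding": "base64",
            "data": base64.b64encode(data).decode("ascii"),
        }
    return {"type": output_type, "encoding": "none", "data": data}


# One evaluation producing two results: some text and a (fake) image.
message = json.dumps([
    make_result("text", "mean = 3.2\n"),
    make_result("image", b"not really a png"),
])
decoded = json.loads(message)
```

The engine stays free of HTML in this arrangement, and the client only needs the list of known types to decide how to present each result.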
 


> So then the kind of widgets that can be returned is up to the 'engine
> context' implementor and not us here and now.  Html seems the logical
> choice for payload description.

From another perspective, the engine is supposed to be really good at being an engine, that is, it is a computation machine that given input, returns data as output.
It shouldn't affect the client presentation of the data, though; that would add complexity... the relationship between the javascript and the engine would be too coupled, becoming difficult to maintain and extend as time goes on.
 

> In my case with R multiple images, I can add a div called something
> like plotcollection or whatever, and the presentation is left totally
> up to me in terms of css and javascript.


I think that this statement makes sense, in the context of the notebook javascript that handles output from the engine.
So, what if we just do the work and add the general capability to have multi-part output (which plot collections/multi-part plots will be a specific case of)?

I have to agree that the ability to return any HTML from your engine function and slap it in the DOM is attractive sounding... and me imposing the rule of never allowing html produced in the engine to be rendered in the Notebook sounds a bit extreme. I think there might be a middle ground, though.

The middle ground has to do with centralizing what has become our convention/standard for Cell structure and Cell type classification (image cell, text cell, etc.) in the notebook.
The html for the cells is generated by javascript functions, and there are different functions for the different cell types. What if we made it so all output had a 'type', and the javascript always decided how to display the output based on this type attribute?
We could start by making errors (stuff that comes out of stderr) be its own type of output cell in the notebook.
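
The dispatch-on-type idea can be sketched as a small registry. In the notebook this logic would live in the javascript; Python is used here only as a stand-in, and the renderer names, css classes, and fallback rule are all assumptions.

```python
import html

RENDERERS = {}


def renders(output_type):
    """Register a function as the renderer for one output type."""
    def register(func):
        RENDERERS[output_type] = func
        return func
    return register


@renders("text")
def render_text(data):
    return '<pre class="output-text">%s</pre>' % html.escape(data)


@renders("error")
def render_error(data):
    # stderr output gets its own cell type, styled separately from text.
    return '<pre class="output-error">%s</pre>' % html.escape(data)


@renders("image")
def render_image(data):
    # data is assumed to be base64-encoded PNG data.
    return '<img src="data:image/png;base64,%s" />' % data


def render(result):
    # Unknown types fall back to plain text rather than failing.
    renderer = RENDERERS.get(result["type"], render_text)
    return renderer(result["data"])


cell = render({"type": "error", "data": "NameError: name 'x' is not defined"})
```

Adding a new cell type then means registering one more renderer, without touching the engines.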
To throw crazy ideas out, there could even be more explicit ways for the engine to trigger a Cell, or something, to be added by the javascript to the notebook.

Ideas have been brewing on the Wave you (James) were invited to.
We should figure out how we can publish Waves so that everyone on the list can view them, and possibly contribute if they have ideas. Good momentum has been building on Matthew's "codenode-transport" concept AND implementation.

I kind of hijacked this thread about the R engine, which is an awesome achievement :)
This is a hot discussion item, though, and I think this is the time to discuss it. Now that we have a few engine types, including one of another language, we have a good set of reference implementations to test what might eventually be called: the X protocol (where X might be codenode, or engine, or notebook, or something).

Also, we should start a project for collecting engine plugins, which may be just a dir in the root of the codenode repo for now. This could be a 'cookbook' of engine plugins that are supported by codenode but are not part of the codenode library code (as they are plugins ;).

-Dorian
 

James Casbon

Nov 29, 2009, 5:55:12 AM
to codenod...@googlegroups.com
2009/11/29 Dorian Raymer <deld...@gmail.com>:
> Hey James,
> Congratulations on making the first R engine! The image you posted was
> exciting to see!
>
> I'm having trouble getting going with your r-engine branch:
> 2009-11-28 14:57:34-0800 [-] Unhandled Error
>     Traceback (most recent call last):
snip
>
> Any ideas on how to get passed this?
>

Try disabling django_nose in installed apps? Or install it from here:
http://github.com/jbalogh/django-nose
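
One way to do the first option without editing the list by hand is to filter optional apps in the settings module. This is only a sketch: the helper drop_missing and the OPTIONAL_APPS tuple are made up for illustration, and the branch's real settings are not shown in this thread.

```python
import importlib.util

# Apps that are nice to have but should not break startup when absent.
OPTIONAL_APPS = ("django_nose",)


def drop_missing(apps, optional=OPTIONAL_APPS):
    """Return apps without any optional entry whose package is not importable."""
    return tuple(
        app for app in apps
        if app not in optional or importlib.util.find_spec(app) is not None
    )


# In a settings module this would be applied as, for example:
# INSTALLED_APPS = drop_missing(INSTALLED_APPS)
```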

James

James Casbon

Nov 29, 2009, 6:26:27 AM
to codenod...@googlegroups.com
OK, so I made a bit more of the 'html should be the transport mechanism
format' case on that wave.


Alex Clemesha

Nov 30, 2009, 12:31:47 AM
to codenod...@googlegroups.com
On Sat, Nov 28, 2009 at 2:58 AM, James Casbon <cas...@gmail.com> wrote:
> Hi,
>
> I managed to get R working quite easily using the rpy2 bindings.
> Attached is a pic.

Very cool!

Do the rpy2 bindings present a "perfect R interface" to a user?
Put another way, can a user who knows absolutely nothing about
Python use all existing R functionality via the bindings?
(Either way, from looking at the rpy2 docs, it looks like it pretty
much does everything a normal, everyday user needs - and is
quite awesome!).

 
> You can pick up the code from the r-backend branch
> on my github page. It has immediately clarified some issues around how
> engines should be implemented, and raised some others about the
> protocol.

This is an important topic right now (as you know), I'll start a
new thread with some of the current ideas in it. Looks like Dorian responded
with some good stuff related to the engines and a (much needed) protocol, in this thread already.

 

> Firstly, I think using python to wrap whatever engine is going to be
> easier a lot of the time.  This also means that we only need solve the
> protocol issues once and we can reuse the code many times. For
> example, you really wouldn't want to implement something that pickles
> an image in R when you already have that code in python.  So I think
> that is the way to go for most engines, IMHO.

Agreed. Are you familiar with anything similar to rpy2 but for Ruby?

I found this: http://pypi.python.org/pypi/rython - "rython transparently mixes Ruby code into Python"

 

> Secondly, some obvious protocol enhancements that an R engine needs:
>  1. the ability to return multiple images from a single command (some
> plot commands return 5 images)
>  2. support for an engine state image - R can reload its state from an
> RData file.  It would be great of engines supported a save and restore
> state methods that can be given a file to use.
>
> We could very easily get a javascript engine going with these bindings to v8:
> http://code.google.com/p/pyv8/

Nice! Soon we'll probably need to create (as Dorian mentioned) a
"repository" of different engine types. If we formalize and document the process of
creating engines it will be much easier for people with specific interests to come
in and create an engine of their specialty language, and everyone will benefit.


thanks,
Alex


 

James



--
Alex Clemesha
clemesha.org

James Casbon

Nov 30, 2009, 4:27:47 AM
to codenod...@googlegroups.com
2009/11/30 Alex Clemesha <clem...@gmail.com>:
>
>
> On Sat, Nov 28, 2009 at 2:58 AM, James Casbon <cas...@gmail.com> wrote:
>>
>> Hi,
>>
>> I managed to get R working quite easily using the rpy2 bindings.
>> Attached is a pic.
>
> Very cool!
>
> Do the rpy2 bindings present a "perfect R interface" to a user?
> Put in another way, can a user who know absolutely nothing about
> Python use all existing R functionality via the bindings?
> (Either way, from looking at the rpy2 docs, it looks like it pretty
> much does everything a normal, everyday user needs - and is
> quite awesome!).

I think so. You seem to be able to just eval any string.
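
For reference, a minimal sketch of evaluating an R string through rpy2's robjects interface. The wrapper name evaluate_r is made up for this example, and running it assumes both R and rpy2 are installed.

```python
def evaluate_r(source):
    """Evaluate a string of R code and return rpy2's wrapper for the result."""
    # Imported lazily because rpy2 needs a working R installation.
    import rpy2.robjects as robjects

    # robjects.r can be called with a string of R source code.
    return robjects.r(source)
```

With that in place, something like evaluate_r('sum(1:10)') should hand back the R result wrapped as a Python object, so a user only ever types R.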

>
>>
>> You can pick up the code from the r-backend branch
>> on my github page. It has immediately clarified some issues around how
>> engines should be implemented, and raised some others about the
>> protocol.
>
> This is an important topic right now (as you know), I'll start a
> new thread with some of the current ideas in it. Looks like Dorian responded
> with some good stuff related to the engines and a (much needed) protocol, in
> this thread already.
>
>
>>
>> Firstly, I think using python to wrap whatever engine is going to be
>> easier a lot of the time.  This also means that we only need solve the
>> protocol issues once and we can reuse the code many times. For
>> example, you really wouldn't want to implement something that pickles
>> an image in R when you already have that code in python.  So I think
>> that is the way to go for most engines, IMHO.
>
> Agreed. Are you familiar with anything similar to rpy2 but for Ruby?
>
> I found this: http://pypi.python.org/pypi/rython - "rython transparently
> mixes Ruby code into Python"

No, sorry. You should really steer clear of defined-by-implementation
languages, you know ;)

>
>
>>
>> Secondly, some obvious protocol enhancements that an R engine needs:
>>  1. the ability to return multiple images from a single command (some
>> plot commands return 5 images)
>>  2. support for an engine state image - R can reload its state from an
>> RData file.  It would be great of engines supported a save and restore
>> state methods that can be given a file to use.
>>
>> We could very easily get a javascript engine going with these bindings to
>> v8:
>> http://code.google.com/p/pyv8/
>
> Nice! Soon we'll probably need to create (as Dorian mentioned) a
> "repository" of different engine types. If we formalize and document the
> process of
> creating engines it will be much easier for people with specific interests
> to come
> in and create an engine of their specialty language, and everyone will
> benefit.

I got PyV8 going in a few minutes (after a lot longer spent building it!).

Both PyV8 and rython call the execution environment the context. This
is good terminology. So we should talk about engines and engine
contexts.
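
A toy sketch of that split, assuming an engine is the thing that knows how to run a language and a context is one isolated execution environment it hands out. The class and method names are illustrative and do not mirror the PyV8 or rython APIs.

```python
class Context:
    """One execution environment with its own namespace."""

    def __init__(self):
        self.namespace = {}

    def run(self, source):
        # Names defined by source stay inside this context only.
        exec(source, self.namespace)
        return self.namespace


class Engine:
    """Knows how to run a language and hands out fresh contexts."""

    def new_context(self):
        return Context()


# Two contexts from the same engine do not share variables.
engine = Engine()
first = engine.new_context()
second = engine.new_context()
first.run("x = 1")
second.run("x = 2")
```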

James