rOpenSci: make your ggplot2 plots shareable, collaborative, and with D3

M Sundquist

Apr 22, 2014, 8:26:45 AM
to ggp...@googlegroups.com
Hi all, 

We've just released a beta of a ggplot2-supporting library as part of the rOpenSci project. You can add "py$ggplotly" to a script, and make your ggplot2 plots into interactive, D3, web-based plots you can jointly edit with others. Here's a post and a couple of examples:


You can also keep your data and graphs together online. 

The library was just released and is still very much in beta. We'd love to hear what you think, how we can improve it, and if it's useful for you.

All the best,
Matt

Zack Weinberg

Apr 22, 2014, 10:30:33 AM
to M Sundquist, ggplot2
On Tue, Apr 22, 2014 at 8:26 AM, M Sundquist <matt.su...@gmail.com> wrote:
> Hi all,
>
> We've just released a beta of a ggplot2-supporting library as part of the
> rOpenSci project. You can add "py$ggplotly" to a script, and make your
> ggplot2 plots into interactive, D3, web-based plots you can jointly edit
> with others. Here's a post and couple of examples:

That's really cool, but the dependence on plot.ly is troublesome. For
scientific work that needs to stay accessible into the indefinite
future, I require a system that allows me to host all of the data,
scripts, generated images, etc. together on a server I control. D3 is
fine that way, but it appears that plot.ly isn't. What are you
actually using plot.ly for and can it be gotten rid of?

zw

Matthew Sundquist

May 2, 2014, 10:38:07 PM
to Zack Weinberg, ggplot2
Sorry for being slow! This went into the wrong folder. 

Thanks for writing back and checking it out. The library is quite new so it's super helpful to hear feedback.

Plotly in this case is meant to be a sharing, collaboration, and visualization layer. Like what GitHub is for sharing and collaborating on code, we'd love to do for sharing and collaborating on data and plots. 

So the collaboration use is that you could write to a plot with R, and someone else you're working with could write to the plot with matplotlib, MATLAB, or by copying and pasting data into the GUI. You can then get the data from that figure or the figure object itself, and find and fork other data and plots. For sharing, you can keep all your graphs in a profile you can come back to for your plots and data, like this (Rhett is a physicist who writes for Wired):


We don't have everything together for R yet, but here's how it works using matplotlib and Python as your sharing layer:


We would love to hear your feedback, thoughts, and advice. And Zack, I'm happy to do a video call if you'd like.

All the best,
M


William Beasley

May 5, 2014, 1:55:51 PM
to ggp...@googlegroups.com, Zack Weinberg
There are a lot of cool things about your software, including that most existing ggplot2 code can be adopted so easily, but I share many of Zack's reservations. Could you please correct me if I'm misunderstanding something about your service?

If development & support for plot.ly stops in the future, how will our research results continue to be accessible & reproducible? I'm comfortable with my current workflow (which is roughly R, ggplot2, knitr, & GitHub) because if development stops on the first three pieces, people can still download the software from the CRAN archives. And if GitHub stops operating, the repository's code, data, & reports can be distributed many other ways (e.g., a different git service, a different VCS, or simply serving the package as a zip on a little website). People (including me) can download the pieces and run the analyses on their local machine.

These fallback plans are important to me. How do plot.ly's long-term/distributed/offline capabilities compare to ggplot2's?

Zack Weinberg

May 5, 2014, 2:21:14 PM
to ggp...@googlegroups.com
On 2014-05-02 10:38 PM, Matthew Sundquist wrote:
> Sorry for being slow! This went into the wrong folder.
>
> Thanks for writing back and checking it out. The library is quite new so
> it's super helpful to hear feedback.
>
> Plotly in this case is meant to be a sharing, collaboration, and
> visualization layer. Like what GitHub is for sharing and collaborating
> on code, we'd love to do for sharing and collaborating on data and plots.

This is helpful but I am still not clear on where various pieces of data
and code are *stored*.

GitHub is unusual in (so far) maintaining its existence and
independence; the track record of similar collaboration services is not
good. They typically have a lifetime of less than five years, after
which they go out of business entirely, are purchased and shut down, or
reinvent themselves as something which does not serve their original
function. As such, in a scientific context I am only okay with using
them for secondary functions -- not the core function of preserving my
data and results long-term.

What this means for your project is: I'm only interested in using it if
*all of my data* and *all of the essential code to produce the plots*
can be stored on a server that I (or some entity that I trust to remain
in existence long-term) control. Essential code means code that, if it
went away, the plots would disappear or lose basic interactivity. It is
okay if the sharing-and-remixing part depends on an external service,
because that is a secondary function -- in principle someone could just
grab the raw data, do what they want with it, and put it up on their own
host. (It would be *better* if the sharing-and-remixing didn't depend
on an external service, but I recognize that that's a tall order in
today's Web.)

zw