Reproducible computational science


Brad

Apr 7, 2020, 3:57:27 PM
to leo-editor
Hello All,

As I see it, one of the more important trends in computational sciences is reproducibility. I have tried out a number of platforms that attempt to enable reproducibility and capture the provenance necessary to faithfully recapitulate computational analyses; however, I found them burdensome in terms of the imposed workflows.

I wonder if Leo could be a compelling platform for this use case.

The idea would be to have a sharable Leo file of a given format that would include enough information (exact code, data, specifics of the platform and libraries, etc.) such that the 'sharee' could exactly re-create the results of the 'sharer'.
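A minimal sketch of what recording the "specifics of the platform and libraries" could look like, assuming a Python analysis. The function names, the manifest keys, and the sample code and data are illustrative assumptions, not an existing Leo feature or format.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def package_versions(names):
    """Look up the installed version of each named package (None if absent)."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_manifest(code, data, packages):
    """Collect the provenance a 'sharee' would need to compare environments.

    The key names here are hypothetical; a real schema would need agreement.
    """
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": package_versions(packages),
        "code_sha256": hashlib.sha256(code.encode("utf-8")).hexdigest(),
        "data_sha256": hashlib.sha256(data).hexdigest(),
    }


# Illustrative inputs: a one-line script, a tiny CSV, and one package name
# that is assumed not to be installed.
manifest = build_manifest(
    code="print(1 + 1)\n",
    data=b"x,y\n1,2\n",
    packages=["surely-not-installed-pkg-xyz"],
)
manifest_json = json.dumps(manifest, indent=2)
```

The resulting JSON text could be stored in a node of the shared outline, so the recipient can compare it against their own environment before re-running anything.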

It seems that Leo is a rich enough platform that a 'schema' could be created to facilitate this kind of sharing.

Is anyone else interested in this use case?

Kind regards,
Brad
 

Thomas Passin

Apr 7, 2020, 8:14:14 PM
to leo-editor
It's interesting to me, anyway.  Could you talk about why you haven't found Jupyter notebooks to be satisfactory?  On other threads we have been discussing whether Leo, with the Viewrendered3 plugin, might be able to do much of what Jupyter does, and have some advantages besides.  Your question seems to fit right in.


Marcel Franke

Apr 8, 2020, 8:03:31 AM
to leo-editor


On Tuesday, April 7, 2020 at 21:57:27 UTC+2, Brad wrote:

I wonder if Leo could be a compelling platform for this use case.


Why do you think that?
What does Leo offer that other solutions lack?

The idea would be to have a sharable Leo file of a given format that would include enough information (exact code, data, specifics of the platform and libraries, etc.) such that the 'sharee' could exactly re-create the results of the 'sharer'.


Out of the box, that's barely possible even for regular apps. Binary data would be a problem, big data would be a significant problem, and libraries would be somewhat tricky.

Also, if you aim for real reproducibility, you would need a good deal more, such as a plain-text README with instructions, because Leo files are not easily readable without Leo. You would also need to preserve the Leo version used, which means delivering a dedicated folder structure would be better anyway.
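As a rough illustration of such a dedicated folder structure, the sketch below builds a small bundle and records a SHA-256 checksum for every file, so a recipient can check that nothing changed in transit. The file names (`README.txt`, `LEO_VERSION.txt`, `analysis.leo`, `SHA256SUMS`) and their placeholder contents are assumptions made for the example, not an established layout.

```python
import hashlib
import tempfile
from pathlib import Path


def write_checksums(root):
    """Hash every file under root and write the digests to SHA256SUMS."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name != "SHA256SUMS":
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    lines = [f"{digest}  {name}" for name, digest in digests.items()]
    (root / "SHA256SUMS").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return digests


def verify_checksums(root):
    """Return the relative paths that are missing or no longer match."""
    changed = []
    text = (root / "SHA256SUMS").read_text(encoding="utf-8")
    for line in text.splitlines():
        digest, name = line.split("  ", 1)
        target = root / name
        if not target.is_file():
            changed.append(name)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            changed.append(name)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "data").mkdir()
    # Placeholder contents only; none of these are real project files.
    (bundle / "README.txt").write_text("How to re-run the analysis.\n", encoding="utf-8")
    (bundle / "LEO_VERSION.txt").write_text("record the Leo version here\n", encoding="utf-8")
    (bundle / "analysis.leo").write_text("placeholder outline\n", encoding="utf-8")
    (bundle / "data" / "input.csv").write_text("x,y\n1,2\n", encoding="utf-8")

    recorded = write_checksums(bundle)
    unchanged = verify_checksums(bundle)

    # Simulate the data being altered after the bundle was sealed.
    (bundle / "data" / "input.csv").write_text("x,y\n1,3\n", encoding="utf-8")
    changed = verify_checksums(bundle)
```

Checksums only show that the files are the ones originally shared. They do not address the harder points raised here, such as large data or keeping a working Leo installation available for years.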

It seems that Leo is a rich enough platform that a 'schema' could be created to facilitate this kind of sharing.


 It entirely depends on what your aim is. Would it be used as an internal tool in a small group, with a small amount of data, for day-to-day use?
Yes, some work would be necessary, but it would be doable.

As a public tool, competing with something like Jupyter, for serious scientific work and data, with potentially thousands or tens of thousands of inexperienced users? No way. Leo is just not a good fit for this. Leo is just a small pet project where people explore interesting ideas while ignoring quality and stability, not a dedicated vision maintained by hundreds of dedicated experts. For serious reproducibility you must plan in terms of decades, and Leo is not able to deliver that. To be fair, something like Jupyter hardly delivers this either, but it is a big, well-maintained and well-documented project with a thriving community moving in this direction, so it will likely get there at some point.

Brad

Apr 8, 2020, 3:46:23 PM
to leo-editor
I use Jupyter notebooks for a lot of my analyses.
Though I realize a lot more is possible, my personal preference is not to use this platform beyond exploratory analyses where one can embed relatively short snippets of code into the notebook.

I know that one could zip a directory with Jupyter notebooks and data to satisfy some of my requirements, but it seemed to me that Leo, with its very versatile structure and its capability to naturally incorporate metadata in a structured manner, might offer some advantages.

Per Marcel's perceptive comments, I understand that Leo has only a fraction of the users of Jupyter notebooks. However, that doesn't mean that Jupyter notebooks are more capable for this task.

This is a hard problem and I was just suggesting that an 'out of the box' solution using something like Leo might be worth considering.

Kind regards,
Brad

Thomas Passin

Apr 8, 2020, 6:23:50 PM
to leo-editor
I think that Leo does have potential in this area, as long as it can export usable notebooks in Jupyter format.  That's not necessarily out of the question.  In fact, there is already an exporter (and an importer) for notebooks.  They don't seem to be able to bring in execution results and images so far.  But that could be fixed.

The export capability is needed so that Leo users would be able to share their work with the much larger number of people who use Jupyter notebooks.

We have found that the VR3 plugin (still in beta but nearly there) can be used to embed graphical output from calculations, even interactive output (e.g., from Bokeh), without much difficulty.  Right now it can only execute Python code, but there doesn't seem to be any reason that couldn't change.

What we need is an approach to converting a Leo tree of nodes that uses VR3 to/from a Jupyter notebook.  Since Jupyter notebooks for the most part either point to image files or embed output as data: urls, that should be very doable.  Even if it can't do everything that Jupyter can do, it would be able to do a useful subset.  Then we could have the benefits of working in Leo combined with the benefits of sharing Jupyter notebooks.
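To make the conversion idea concrete, here is a minimal sketch that turns a flat list of outline nodes into the JSON structure of a version 4 Jupyter notebook, using only the standard library. The `nodes` list is a hypothetical stand-in for a Leo tree, and the `kind` and `png` keys are assumptions made for the example; this is not the existing Leo exporter. In the notebook file itself, image output is stored as base64 text under an `image/png` key in the cell's outputs.

```python
import base64
import json

# Hypothetical stand-in for a Leo outline: one dict per node.
nodes = [
    {"headline": "Load data", "body": "values = [1, 2, 3]", "kind": "code"},
    {"headline": "Notes", "body": "The values are illustrative.", "kind": "markdown"},
    {
        "headline": "Plot",
        "body": "# plotting code would go here",
        "kind": "code",
        # Placeholder bytes, not a real PNG image.
        "png": b"placeholder image bytes",
    },
]


def node_to_cell(node):
    """Convert one node into a notebook cell dictionary."""
    if node["kind"] == "markdown":
        return {
            "cell_type": "markdown",
            "metadata": {},
            "source": f"## {node['headline']}\n\n{node['body']}",
        }
    outputs = []
    if "png" in node:
        # Embedded images are stored as base64 text keyed by MIME type.
        outputs.append({
            "output_type": "display_data",
            "data": {"image/png": base64.b64encode(node["png"]).decode("ascii")},
            "metadata": {},
        })
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": outputs,
        "source": f"# {node['headline']}\n{node['body']}",
    }


def outline_to_notebook(nodes):
    """Wrap the converted cells in the top-level notebook structure."""
    return {
        "cells": [node_to_cell(node) for node in nodes],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = outline_to_notebook(nodes)
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a file with an `.ipynb` extension should give something Jupyter can open. Going the other way, and preserving Leo's tree structure in the notebook, is the part this sketch leaves out.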

Offray Vladimir Luna Cárdenas

Apr 13, 2020, 6:18:53 PM
to leo-e...@googlegroups.com

Hi Brad,

I was thinking of combining something like the outline capabilities of Leo with the interactive capabilities of IPython/Jupyter, and I explored that possibility, but I found a lot of incidental complexity in the Python ecosystem [1]. So I finally developed a simpler prototype for interactive outlining, called Grafoscopio [2], using the Pharo live coding/programming/computing environment [3].

Like you, I think there is a lot of potential in such interactive outlining for complex reproducible research documents, and you can see something like that in the Org Mode world using Babel [4][4a]. I made my own prototype about the Panama Papers as reproducible research, as you can see in [5], using a fairly uncomplicated tech stack (described there).

Regarding reproducibility over a time frame of decades, you can already do this with Smalltalk, thanks to the image concept (which has been around since the 70s). You can freeze the execution state of your objects in the image and reopen them a decade later, as I did with the simulation I made for my Master's. It was just as I left it in my Master's presentation a decade ago; see [6]. I propose using Pharo/Smalltalk in tandem with functional package managers (such as Guix or Nix), so you can have a pretty reproducible environment and be more agile than the Jupyter/Python community, without the baggage of incidental complexity.

Cheers,

Offray


[1] http://mutabit.com/offray/static/blog/output/posts/grafoscopio-idea-and-initial-progress.html
[2] https://mutabit.com/grafoscopio/en.html
[3] https://pharo.org/
[4] https://www.youtube.com/watch?v=dljNabciEGg&feature=youtu.be
[4a] https://www.youtube.com/watch?v=GK3fij-D1G8
[5] https://mutabit.com/offray/blog/en/entry/panama-papers-1
[6] https://twitter.com/offrayLC/status/927313455543091200

--
You received this message because you are subscribed to the Google Groups "leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leo-editor+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/leo-editor/73dc42a5-f61d-4a44-81b9-b51861f3d876%40googlegroups.com.

Brad

Apr 14, 2020, 3:21:17 PM
to leo-editor
Hello Offray,

Thank you for the insights!

I will check out the resources you described.

Kind regards,
Brad

Kent Tenney

Apr 15, 2020, 2:21:58 PM
to leo-editor
I'm not interested in reproducibility, but I think the notion of
a Leo 'schema' is a great pattern: Domain Specific Leo,
a configuration of Leo dedicated to a problem domain
where the menus, buttons, scripts, commands etc have
been optimized for, in this case, reproducibility research.

Other candidates for a specialized Leo would be database
management, frameworks like Flask, Django, React etc

For each, the specialized Leo would be focused on files,
commands, queries, rendering, documentation etc
germane to the domain.

Leo is so expansive it strikes me as a programming language,
these instances which encapsulate specifics looking like
applications built with the Leo language.

Thanks,
Kent


Edward K. Ream

Apr 15, 2020, 5:24:50 PM
to leo-editor
On Wed, Apr 15, 2020 at 1:21 PM Kent Tenney <kte...@gmail.com> wrote:

Good to hear from you, Kent.

I'm not interested in reproducibility, but I think the notion of
a Leo 'schema' is a great pattern: Domain Specific Leo,
a configuration of Leo dedicated to a problem domain
where the menus, buttons, scripts, commands etc have
been optimized for, in this case, reproducibility research.

Other candidates for a specialized Leo would be database
management, frameworks like Flask, Django, React etc

For each, the specialized Leo would be focused on files,
commands, queries, rendering, documentation etc
germane to the domain.

Leo is so expansive it strikes me as a programming language,
these instances which encapsulate specifics looking like
applications built with the Leo language.

This is exactly what I had in mind when I talked about Leo outlines as graphs, scripts creating projections, etc.

Leonine scripts applied to Leo outlines is a new world, not just a new programming language.

Edward