Scala Kernel Discussion


Kyle Kelley

Mar 3, 2017, 8:14:52 PM
to jup...@googlegroups.com

On February 27, 2017, a group of us met to talk about Scala kernels and pave a path forward for Scala users. A YouTube video of the discussion is available here:


https://www.youtube.com/watch?v=0NRONVuct0E


What follows is a summary from the call, mostly in linear order from the video itself.

Attendees

  • Alexandre Archambault - Jupyter Scala, Ammonium

  • Ryan Blue (Netflix) - Toree

  • Gino Bustelo (IBM) - Toree

  • Joy Chakraborty (Bloomberg) - Spark Magic with Livy

  • Kyle Kelley (Netflix) - Jupyter

  • Haley Most (Cloudera) - Toree

  • Marius van Niekerk (Maxpoint) - Toree, Spylon

  • Peter Parente (Maxpoint) - Jupyter

  • Corey Stubbs (IBM) - Toree

  • Jamie Whitacre (Berkeley) - Jupyter

  • Tristan Zajonc (Cloudera) - Toree, Livy


Each person on the call has a preferred kernel and a preferred way of building and integrating it. We have a significant user-experience problem around installing and using Scala kernels, beyond just Spark usage. The overarching goal is to create a cohesive experience for Scala users when they use Jupyter.


When a Scala user comes to the Jupyter ecosystem (or even a Python developer already familiar with it), they face many options for kernels. Being confronted with that choice while trying to get things done creates new friction points for users. For examples, see https://twitter.com/chrisalbon/status/833156959150841856 and https://twitter.com/sarah_guido/status/833165030296322049.

What are our foundations for REPL libraries in Scala?


Toree was built on top of the Spark REPL, and its developers tried to reuse as much code as possible from Spark. For his jupyter-scala, Alex recognized that the Spark REPL was changing a lot from version to version. At the same time, Ammonite was created to assist in Scala scripting. To make big data frameworks such as Spark, Flink, and Scio work well in this environment, a fork of Ammonite called Ammonium was created. There is some trepidation in the kernel community about depending on a separate fork. We should make sure to unify with the originating Ammonite and contribute back, as part of a larger Scala community that can maintain these together.

Action Items:

  • Renew focus on Scala within Toree; improve outward messaging about how Toree provides a Scala kernel

  • Unify Ammonite and Ammonium (+alexandre....@gmail.com)

    • To be used in jupyter-scala, potentially for Spylon

There is more than one implementation of the Jupyter protocol in the Java stack


Toree has one, jupyter-scala has one, and the Clojure kernels have their own. People would like to see a stable Jupyter library for the JVM; some think it's better to have one per language. Regardless of the choice, we should have a well-supported Jupyter library.

Action Items:


  • Create an idiomatic Java library for the Jupyter messaging protocol - propose this as an incubation project within Jupyter
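
To ground what such a library needs to cover, here is a minimal sketch (JDK-only Scala; the object name is invented) of one core piece of the Jupyter messaging spec: signing the four JSON frames of a wire message (header, parent_header, metadata, content) with HMAC-SHA256 and hex-encoding the digest:

    import javax.crypto.Mac
    import javax.crypto.spec.SecretKeySpec

    // Sketch: compute the signature frame for a Jupyter wire message.
    // The signed frames follow the "<IDS|MSG>" delimiter on the socket.
    object MessageSigner {
      def sign(key: String, frames: Seq[String]): String = {
        val mac = Mac.getInstance("HmacSHA256")
        mac.init(new SecretKeySpec(key.getBytes("UTF-8"), "HmacSHA256"))
        frames.foreach(f => mac.update(f.getBytes("UTF-8")))
        mac.doFinal().map("%02x".format(_)).mkString  // hex digest
      }
    }

A full library would add the ZeroMQ socket handling (shell, iopub, stdin, control, heartbeat) and typed message classes on top of this.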

Decouple Spark from Scala in kernels


Decouple the language-specific parts from the computing framework to allow for other computing frameworks. This is paramount for R and Python. When we inevitably want to connect to a GPU cluster, we want to be able to reuse the same kernel foundations. The reason these end up coupled is that Spark does "slightly weird things" with how it wants its classes compiled. The thinking is that there is some amount of specialization, but that we can work around it. At the very least, we can bake Spark into the core while leaving room for other frameworks to have solid built-in support where necessary.
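
To make the "slightly weird things" concrete: the Spark REPL compiles each cell into classes that executors must be able to fetch at runtime, so the REPL's class output directory has to be wired into the Spark configuration. A sketch of that wiring (using the Spark 2.x property name; details vary by version):

    import java.nio.file.Files
    import org.apache.spark.SparkConf

    // Sketch: executors load classes that the driver's REPL compiles on
    // the fly, so the output directory is threaded through the conf. Any
    // kernel embedding Spark must replicate this, hence the coupling.
    val replOutputDir = Files.createTempDirectory("repl-classes").toFile
    val conf = new SparkConf()
      .setAppName("scala-kernel")
      .set("spark.repl.class.outputDir", replOutputDir.getAbsolutePath)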


An approach being worked on in Toree right now is lazy loading of Spark. One difference between jupyter-scala and Toree is that jupyter-scala can dynamically load Spark versions, whereas Toree is bound to a version of Spark at deployment. For end users who have operators/admins, a kernel can be configured per Spark version it will use, as is common for Python and R (see the kernelspec sketch below). Spark drives a lot of the interest in Scala kernels, and many kernels conflate the two. This results in poor messaging and a poor getting-started experience for users.
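
For example, an admin can publish one kernelspec per Spark version, pinning the version through the environment; a hypothetical kernel.json (paths are illustrative):

    {
      "argv": ["/opt/toree/bin/run.sh", "--profile", "{connection_file}"],
      "display_name": "Scala (Spark 2.1)",
      "language": "scala",
      "env": {"SPARK_HOME": "/opt/spark-2.1.0"}
    }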

Action Items:


  • Lazy-load Spark within Toree
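
The idea, roughly (a sketch of the approach, not Toree's actual implementation): defer creating the SparkSession until user code first touches it, so sessions that never use Spark never pay its startup cost:

    import org.apache.spark.sql.SparkSession

    // Sketch: `spark` is only materialized on first dereference, so a
    // notebook that stays pure Scala starts like a plain Scala kernel.
    object KernelState {
      lazy val spark: SparkSession = SparkSession.builder()
        .appName("scala-kernel")
        .getOrCreate()
    }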

Focus efforts within kernel communities


Larger in scope than just the Scala kernel: we need Jupyter to acknowledge fully supported kernels. By contrast, the whole Zeppelin community collaborates in one repository around their interpreters.


“Fragmentation of kernels makes it harder for large enterprises to adopt them.”

- Tristan Zajonc (Cloudera)


Beyond the technical question of what makes a supported kernel, we also need the messaging to end users to be simple and clear. There are several things we need to do to improve our messaging, organization, and technical underpinnings.

Action Items


  • On the Jupyter site provide blurbs and links to kernels for R, Python, and Scala

  • Create an organized effort around the Scala kernel, possibly by unifying under one organization while isolating projects in separate repositories

  • Align on a specification of what it takes to be acknowledged as a supported kernel

Visualization

We would like to push on the idea of mimetypes: a kernel outputs a chunk of JSON, and the frontend draws a beautiful visualization from it. Having these adopted in core Jupyter by default would go a long way towards simple, just-works visualization. The current landscape of visualization with the Scala kernels includes:



There is some worry about standardization around HTML outputs. Some libraries try to use frontend libraries that may not exist on the frontend, or that mismatch in version (jQuery, RequireJS, ipywidgets, Jupyter, IPython). In some frontends, at times dictated by the operating environment, HTML outputs must be rendered in null-origin iframes.
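
A sketch of the mimetype-based alternative: instead of emitting HTML that assumes particular frontend JavaScript, a kernel publishes a MIME bundle on display_data and each frontend renders the richest type it understands (the mimetype shown is one JupyterLab already renders; the data values are made up):

    // Sketch: a display_data MIME bundle. A frontend that knows the JSON
    // mimetype draws the chart; all others fall back to the text/plain key.
    val bundle: Map[String, String] = Map(
      "application/vnd.vegalite.v1+json" -> """{"mark": "bar", "data": {"values": []}}""",
      "text/plain"                       -> "<chart: 0 rows>"
    )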

Action Items

  • Continue involvement in Jupyter frontends to provide rich visualization out of the box with less configuration and less friction

Standardizing display and reprs for Scala


Since it’s likely that there will still be multiple kernels for the JVM, not just within Scala, we want to standardize the way you inspect objects on the JVM. IPython provides a way for libraries to integrate with it automatically on behalf of users. We want library developers to be able to follow a common scheme and be well represented regardless of the kernel.

Action Items:

  • Create a specification for object representation for JVM languages as part of the Jupyter project
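
One possible shape for such a specification (a hedged sketch; the names are invented, not an actual proposal): library authors register a displayer for their types, and any JVM kernel consults the registry to turn a result into a MIME bundle:

    // Sketch: libraries register how their objects render; kernels stay
    // ignorant of any particular library and just query the registry.
    trait Displayer[T] {
      def display(obj: T): Map[String, String]  // mimetype -> data
    }

    object Displayers {
      private var registry = Map.empty[Class[_], Displayer[_]]

      def register[T](cls: Class[T], d: Displayer[T]): Unit =
        registry += (cls -> d)

      def display(obj: Any): Map[String, String] =
        registry.get(obj.getClass) match {
          case Some(d) => d.asInstanceOf[Displayer[Any]].display(obj)
          case None    => Map("text/plain" -> obj.toString)
        }
    }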

--
Kyle Kelley (@rgbkrk; lambdaops.com)

Brian Granger

Mar 4, 2017, 11:15:03 AM
to Project Jupyter
Thanks for taking the lead on this Kyle!



--
Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgra...@calpoly.edu and elli...@gmail.com

MinRK

Mar 6, 2017, 11:58:40 AM
to Project Jupyter
This is awesome, thanks Kyle (and everyone)!


Alejandro Guerrero

Mar 6, 2017, 5:25:22 PM
to Project Jupyter
Thank you all for your thoughts and to Kyle for organizing!

Sorry I didn't attend the call, but I didn't receive an invite. I'd be happy to join further calls.

I am the co-creator of sparkmagic, which relies on Livy as the connection layer to Spark clusters. Sparkmagic provides Jupyter users with Python, Scala, and R kernels. All kernels have the same features:
  • SparkSQL magic
  • Automatic visualizations
  • Ability to capture Spark dataframes into Pandas dataframes to be visualized with any of Python's visualization libraries
I agree that it's important for all of us to try to build a consistent experience for all Jupyter users. We started sparkmagic because we wanted a platform that would:
  • Provide multiple language support for Spark at the same level
  • Provide a standardized visualization framework across kernels
  • Allow for users to change the Spark cluster that is being targeted from the same Jupyter installation, without complicated network setups
  • Have the installation be as straightforward as possible
  • Add a layer that could handle different authentication methods to clusters (Joy's work on Kerberos authentication is an example of this)
We are happy with what we've achieved so far, but we would like to see the following things happen:
  • Improvements to the auto-visualization framework. Today we use ipywidgets and plotly for visualization, which has led to visualizations not being preserved in documents. We would like to move away from ipywidgets and go with a mimetype-based approach where everyone can converge.
  • Progress bars/Spark application status/cancel buttons. We see these features as ways for users to monitor cell progress and act on it. Today, users get a "fire off code and hope everything is going well" experience; looking at job status requires several clicks, a different tab, and correlating what your cell is doing with what the Spark UI says.
  • Cluster information. We've seen plenty of errors when clusters run out of resources, and users do not know that the cluster was out of resources, who is using it, or whether they can clean up. We would love a cluster status pane that lets users understand the resource utilization of a cluster (or pick another cluster if its status/characteristics are better) and probably do some admin tasks on their clusters.
Our team is concerned with Big Data support in Jupyter, so we have few opinions on a "Small Data" Scala kernel. I agree that it would be nice to separate languages from backends from an architectural standpoint. Having Jupyter libraries for JVM-based kernels would be a step in the right direction. Adding Spark and other backends as add-ons to kernels could also be a nice idea, provided we are wary of how these add-ons' installation and configuration experience ends up looking for end users. Spark, and I imagine other backends, requires network access to all worker nodes from the driver. I'm wary of the experience we'll create if we make kernels the driver and require kernels to be in the cluster. Livy solves a lot of that by making Livy the driver, colocated in the cluster, with Jupyter simply managing connection strings via sparkmagic. In the add-ons-to-kernels way of the world, how would a data scientist target different clusters or backends? What kind of setup work does she have to do?

On the visualizations front, I saw an effort to create a mimetype-based visualization library here: https://github.com/gnestor/jupyterlab_table
If all kernels, regardless of language (e.g. Python, R, Scala), were to output that mimetype, users would get a standard visualization library, and we devs could converge on it.

Best,
Alejandro



Kyle Kelley

Mar 6, 2017, 7:38:06 PM
to jup...@googlegroups.com, Holden Karau, Ryan Blue
Alejandro,

Thanks for responding. I did a poor job of maintaining the list of emails of everyone I was reaching out to; hopefully everyone interested is on the Jupyter mailing list now and we can hold regular meetings. I'm really happy to see all of your feedback.

This does blend into 

Responses inline.


On Mon, Mar 6, 2017 at 2:25 PM, Alejandro Guerrero <agg....@gmail.com> wrote:
> Thank you all for your thoughts and to Kyle for organizing!
>
> Sorry I didn't attend the call, but I didn't receive an invite. I'd be happy to join further calls.
>
> I am the co-creator of sparkmagic, which relies on Livy as the connection layer to Spark clusters. Sparkmagic provides Jupyter users with Python, Scala, and R kernels. All kernels have the same features:
>   • SparkSQL magic
>   • Automatic visualizations
>   • Ability to capture Spark dataframes into Pandas dataframes to be visualized with any of Python's visualization libraries
> I agree that it's important for all of us to try to build a consistent experience for all Jupyter users. We started sparkmagic because we wanted a platform that would:
>   • Provide multiple language support for Spark at the same level

What difficulties did you run into where support needed to be in Apache Spark itself? There are at least two committers on the list and involved now that are interested in improving the support of the libraries themselves.
 
>   • Provide a standardized visualization framework across kernels
>   • Allow for users to change the Spark cluster that is being targeted from the same Jupyter installation, without complicated network setups
>   • Have the installation be as straightforward as possible
<3
>   • Add a layer that could handle different authentication methods to clusters (Joy's work on Kerberos authentication is an example of this)
> We are happy with what we've achieved so far, but we would like to see the following things happen:
>   • Improvements to the auto-visualization framework. Today we use ipywidgets and plotly for visualization, which has led to visualizations not being preserved in documents. We would like to move away from ipywidgets and go with a mimetype-based approach where everyone can converge.

+1

I'll respond to this a bit more in the section below on the new table/data resource mimetype.
 
>   • Progress bars/Spark application status/cancel buttons. We see these features as ways for users to monitor cell progress and act on it. Today, users get a "fire off code and hope everything is going well" experience; looking at job status requires several clicks, a different tab, and correlating what your cell is doing with what the Spark UI says.
Ryan Blue can chime in on this one more; he ended up writing some custom reprs for the Spark context and jobs for our use at Netflix.

On the Jupyter side, we have a little bit of this outlined in the Spark roadmap for Jupyter: https://github.com/jupyter/roadmap/blob/master/spark.md. It would be great to have more outlined there for us to iterate on.
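
For a sense of the underlying mechanism (a rough sketch, not Ryan's actual code): Spark's listener API already exposes the job and task events a kernel would need to render live progress instead of the fire-and-hope experience:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart, SparkListenerTaskEnd}

    // Sketch: count task completions against the tasks each job schedules.
    // A kernel would surface this as an updating display, not println.
    class ProgressListener extends SparkListener {
      @volatile private var total = 0
      @volatile private var done = 0

      override def onJobStart(jobStart: SparkListenerJobStart): Unit =
        total += jobStart.stageInfos.map(_.numTasks).sum

      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        done += 1
        println(s"progress: $done/$total tasks")
      }
    }
    // registered via sc.addSparkListener(new ProgressListener)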
>   • Cluster information. We've seen plenty of errors when clusters run out of resources, and users do not know that the cluster was out of resources, who is using it, or whether they can clean up. We would love a cluster status pane that lets users understand the resource utilization of a cluster (or pick another cluster if its status/characteristics are better) and probably do some admin tasks on their clusters.
This one is so important.
 
> Our team is concerned with Big Data support in Jupyter, so we have few opinions on a "Small Data" Scala kernel. I agree that it would be nice to separate languages from backends from an architectural standpoint. Having Jupyter libraries for JVM-based kernels would be a step in the right direction. Adding Spark and other backends as add-ons to kernels could also be a nice idea, provided we are wary of how these add-ons' installation and configuration experience ends up looking for end users. Spark, and I imagine other backends, requires network access to all worker nodes from the driver.

For some organizations (mine included), we provide the necessary network access and the Spark binaries. We will likely never support Livy in our environment. There should be plenty of room for people to use Livy, though, and we can focus efforts on deployment-agnostic components supporting Spark.
 
> I'm wary of the experience we'll create if we make kernels the driver and require kernels to be in the cluster. Livy solves a lot of that by making Livy the driver, colocated in the cluster, with Jupyter simply managing connection strings via sparkmagic. In the add-ons-to-kernels way of the world, how would a data scientist target different clusters or backends? What kind of setup work does she have to do?

> On the visualizations front, I saw an effort to create a mimetype-based visualization library here: https://github.com/gnestor/jupyterlab_table
> If all kernels, regardless of language (e.g. Python, R, Scala), were to output that mimetype, users would get a standard visualization library, and we devs could converge on it.

Over the weekend, https://github.com/pandas-dev/pandas/pull/14904 was merged, which exports application/vnd.dataresource+json. It's exactly what jupyterlab_table relies on, as well as https://github.com/nteract/nteract/pull/1534. I'm greatly looking forward to this!
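
For anyone who hasn't seen it, the payload is Table Schema plus rows, roughly (values here invented):

    {
      "schema": {
        "fields": [
          {"name": "language", "type": "string"},
          {"name": "kernels",  "type": "integer"}
        ]
      },
      "data": [
        {"language": "scala",  "kernels": 3},
        {"language": "python", "kernels": 1}
      ]
    }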

I'd like to see some visualization built into the component that uses that mimetype, possibly something Polestar- or Lyra-like (from the Vega folks), as well as some auto-visualization. :D
 

hadim

Mar 14, 2017, 4:18:00 PM
to Project Jupyter
Hello,

I am glad to see some people working on this.

I am the creator of a very-early-stage kernel for ImageJ (widely used Java software for scientific image analysis) called scijava-jupyter-kernel. ImageJ has a script editor that allows the user to interact with it easily. A lot of languages are supported, but we are mainly focusing on Groovy and Python.

Having been a Python developer and Jupyter user for a long time now, I am really excited to be able to interact with ImageJ via Jupyter. Also, the lead developer of ImageJ (Curtis Rueden) has started to make some notebooks using Beaker and Groovy.

In short, scijava-jupyter-kernel is a Java library to communicate with a Jupyter server. All code execution relies on the various language-specific packages; available languages are discovered at execution time.

Like probably everyone here, we are very interested in features such as nice output formatting (images, tables, plots, etc.) and code completion. And I think a generic Jupyter library for JVM-based kernels (as already said before) is definitely a step in the right direction.

Best,

tris...@cloudera.com

Mar 20, 2017, 7:45:20 PM
to Project Jupyter, hol...@pigscanfly.ca, rb...@netflix.com
Kyle,

Many thanks for organizing this call, and I apologize for the delay in responding to this thread.  I agree with the summary and key action items.  I'm particularly interested in features that all frontends can benefit from, e.g. well-supported kernels and associated display protocols.  There are a number of things we can do to improve the Spark experience, but with the right composable pieces this can be done entirely in user space.

I look forward to contributing to some of these initiatives.  As some may have seen, Cloudera just announced an upcoming Data Science Workbench product (https://www.cloudera.com/products/data-science-and-engineering/data-science-workbench.html).  It leverages Jupyter kernels at the core.  Some things Kyle mentions, like the lack of clean HTML isolation, do make things more difficult than they should be.  But I think nteract, JupyterLab, and Data Science Workbench show how flexible Jupyter is when building on top of the core primitives.

By the way, if anybody is interested in working on these sorts of things full time, we just posted a related software engineering position at Cloudera.  Feel free to email me directly.

Tristan

Fernando Perez

Mar 20, 2017, 10:43:46 PM
to Project Jupyter, hol...@pigscanfly.ca, rb...@netflix.com
Hi Tristan,

On Mon, Mar 20, 2017 at 4:45 PM, <tris...@cloudera.com> wrote:
> I look forward to contributing to some of these initiatives.  As some may have seen, Cloudera just announced an upcoming Data Science Workbench product (https://www.cloudera.com/products/data-science-and-engineering/data-science-workbench.html).  It leverages Jupyter kernels at the core.  Some things Kyle mentions, like the lack of clean HTML isolation, do make things more difficult than they should be.  But I think nteract, JupyterLab, and Data Science Workbench show how flexible Jupyter is when building on top of the core primitives.

This is excellent, congrats on the release! Is there any public mention of the fact that under the hood it actually leverages the Jupyter protocols and primitives? I couldn't find any info about that in the docs that (admittedly, rather hastily) I scanned.

Very best,

f


--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail

Tristan Zajonc

Mar 21, 2017, 1:19:43 PM
to jup...@googlegroups.com, Holden Karau, rb...@netflix.com
We haven't publicly released quite yet -- just a public announcement -- so the information is still sparse.  Our official docs will include more information on the architecture, including the use of Jupyter kernels.

Tristan


Ryan Blue

Mar 21, 2017, 1:28:55 PM
to Tristan Zajonc, Project Jupyter, Holden Karau
Sorry for sending twice, but I think the original was rejected since I wasn't subscribed.

On the subject of pieces that all front-ends can benefit from, I've posted a repository with what we're currently using for a display/repr API here:


This is the API we are currently using to inspect JVM objects in our Scala kernel. I'd like something like this to be supported and published by the Jupyter community, so that JVM library authors can register code that displays library-specific objects and that can be used by any JVM-based kernel.
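
To illustrate the intended use (with hypothetical names, not the repository's actual API): a library author registers a displayer for a library type once, and every kernel that honors the spec can render it:

    // Hypothetical registration under an API of this shape; MyTable and
    // its methods are invented for illustration.
    class MyTable { def toHtml: String = "<table>...</table>" }

    Displayers.register(classOf[MyTable], new Displayer[MyTable] {
      def display(obj: MyTable): Map[String, String] = Map(
        "text/html"  -> obj.toHtml,
        "text/plain" -> "MyTable"
      )
    })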

rb
--
Ryan Blue
Software Engineer
Netflix

Kyle Kelley

Mar 21, 2017, 1:53:00 PM
to jup...@googlegroups.com, Tristan Zajonc, Holden Karau
It would be great to move the repr API into the Jupyter project as a standard we can evolve. Since I work with Ryan, though, it shouldn't be me pushing for it, so I'd love to hear from others in the Jupyter community. My primary interest is making sure we're moving toward standards we can all collaborate on and improve across the ecosystem of libraries.




--
Kyle Kelley (@rgbkrk; lambdaops.com)

Fernando Perez

Mar 21, 2017, 5:59:07 PM
to Project Jupyter, Holden Karau, rb...@netflix.com
On Tue, Mar 21, 2017 at 10:19 AM, Tristan Zajonc <tris...@cloudera.com> wrote:
> We haven't publicly released quite yet -- just a public announcement -- so the information is still sparse.  Our official docs will include more information on the architecture, including the use of Jupyter kernels.

Got it, thanks! Please do let us know when you make this info publicly visible; I'm sure many would find it interesting. Congrats again! Very happy to see others build upon the infrastructure.

Cheers,

Fernando Perez

Mar 21, 2017, 6:00:33 PM
to Project Jupyter

On Tue, Mar 21, 2017 at 10:52 AM, Kyle Kelley <rgb...@gmail.com> wrote:
> My primary interest is making sure we're moving toward standards we can all collaborate on and improve across the ecosystem of libraries.

+lots! I want to emphasize how much, as a project, we want this to be the case.  All our primitives were built openly precisely so we could evolve them, with community input, to maximize interoperability and the growth of a healthy ecosystem.

Scott Draves

Mar 22, 2017, 11:28:03 AM
to Project Jupyter
I am really happy to hear that Spark support is getting serious attention.

We have been working for some time in this area, and have some code to share: https://github.com/twosigma/beaker-notebook-private (despite the name this repository is open).

This includes a Scala kernel (implemented on a base JVM kernel, alongside other languages such as Java and Groovy, with more coming).  The base kernel (implemented in Java) has classes for comms and widgets.  In its previous incarnation in Beaker Notebook, we had a nice UI for integration with Spark: https://github.com/twosigma/beaker-notebook/issues/4943, and we are looking to build equivalent functionality in Jupyter.

There is a lot more to say about this, and questions to ask, but I just want to join the conversation sooner rather than later since it is moving so fast.

Best, -Scott

Kyle Kelley

Mar 24, 2017, 4:24:58 PM
to jup...@googlegroups.com
Scott,

I really like the base Java package for building kernels on the JVM, since it isn't tied to Scala or Groovy -- people can build on top of it. I'm especially happy you licensed it all as Apache 2.

All,

What do people think is next for integrating amongst the various projects and kernels?




Scott Draves

Mar 24, 2017, 8:14:14 PM
to Project Jupyter
Thanks!  We have kernels for Clojure, SQL, C++, and R also derived from this base kernel (I know the last two sound weird, but they use JNI and Rserve to good effect).

I understand more kernels aren't exactly what you are looking for :) But they were all pretty much working already, so we might as well get them out there before comparing their features and properties with the rest and working towards standard kernels.

Next for me on this front is definitely to learn more about all the options; I am a newcomer and it will take some time.  I hope you will be forgiving in the meantime.

Best, -Scott