What is the relationship between the Jupyter notebook and kernels?


mangecoeur

Jun 15, 2015, 10:44:27 AM
to jup...@googlegroups.com
Some time ago I started work on an "ipython desktop" app, which basically wrapped the notebook interface in a chromeless browser (originally node-webkit, later Atom/Electron) and did some subprocess-fu to start/stop kernels.

I basically put this project on hold while the whole IPython->Jupyter transition was underway (and while Atom became more stable).

One thing I'd like more clarity on to be able to continue is the relationship between the Notebook interface and the kernels: how integrated are they? For instance, does the same Python interpreter have to be used to run both the notebook server and the IPython kernels?

My thinking is, to build a desktop app, would it make more sense to bundle the notebook server + Python interpreter in the app itself and allow it to connect to (remote?) kernels installed using a separate Python interpreter? If you have to use the same interpreter, then the notebook server would have to ship with every dependency someone might want - this was always something I wanted to avoid, because I think people need to be free to set up their computation environment however they like.

The alternative is for the desktop app to be simply a "cruft-less" browser window that displays the notebook running within the kernel. The downside to this is it's harder to customise the UI/UX to take advantage of the fact that you aren't limited by running in a browser. In the experiments I have done so far, I found you end up with very fragile monkey-patching of the notebook JS, with the risk of it breaking if someone changes anything. A particular issue is the conflict between the browser-side "requirejs" usage and the server-side Node.js "require" system.

If I understood the architecture correctly, the notebook server and IPython kernel servers shouldn't need to share the same interpreter - but I'd like to know more about this and how it might work to start a kernel in a different interpreter from within the notebook UI.

-

Matthias Bussonnier

Jun 15, 2015, 11:00:03 AM
to jup...@googlegroups.com
Hi Jon. 

> One thing I'd like more clarity on to be able to continue is the relationship between the Notebook interface and the kernels: how integrated are they?

As little as possible.

> For instance, does the same Python interpreter have to be used to run both the notebook server and the IPython kernels?

Not at all. The kernel can even be pure Ruby, or R, or Go, or Julia, or Haskell… The frontend does not care.
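To illustrate the decoupling: each kernel is registered via a small kernel.json spec that only tells the frontend how to launch it, whatever the language. A hypothetical entry for an R kernel might look like this (the argv here is an example, not a guaranteed install path):

```json
{
  "argv": ["R", "--slave", "-e", "IRkernel::main()",
           "--args", "{connection_file}"],
  "display_name": "R",
  "language": "R"
}
```

The server substitutes {connection_file} with the path to a JSON file describing the kernel's ZMQ ports and then just launches the command; the notebook server's own Python interpreter is never involved.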



> My thinking is, to build a desktop app, would it make more sense to bundle the notebook server + Python interpreter in the app itself and allow it to connect to (remote?) kernels installed using a separate Python interpreter? If you have to use the same interpreter, then the notebook server would have to ship with every dependency someone might want - this was always something I wanted to avoid, because I think people need to be free to set up their computation environment however they like.

> The alternative is for the desktop app to be simply a "cruft-less" browser window that displays the notebook running within the kernel.

I’m confused about this sentence. I don’t get the “notebook within the kernel”.

> The downside to this is it's harder to customise the UI/UX to take advantage of the fact that you aren't limited by running in a browser. In the experiments I have done so far, I found you end up with very fragile monkey-patching of the notebook JS, with the risk of it breaking if someone changes anything. A particular issue is the conflict between the browser-side "requirejs" usage and the server-side Node.js "require" system.
> If I understood the architecture correctly, the notebook server and IPython kernel servers shouldn't need to share the same interpreter - but I'd like to know more about this and how it might work to start a kernel in a different interpreter from within the notebook UI.


So let me dive a bit into the infrastructure itself, simplified for the notebook only. 
When you are editing a notebook document there are roughly three pieces:

 - The kernel
 - The web server
 - The browser

The web server is a bit confusing here as it has 2 roles:
  - being a ZMQ-WebSocket bridge
  - serving as a backend for a web app in the browser.

Nothing prevents you from having (especially in node-webkit) a fully JS-based frontend that speaks directly to the kernel, bypassing the notebook bridge.
(That's what the Atom Hydrogen plugin does.)
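As a rough sketch of what "speaking directly to the kernel" involves: a frontend reads the kernel's connection file and dials its ZMQ sockets itself, no notebook server required. The values below are made up for illustration; real connection files live in the Jupyter runtime directory.

```python
import json

# A kernel advertises its ZMQ sockets via a small JSON "connection file"
# (values here are made-up examples; real files are named kernel-<id>.json
# and live in the Jupyter runtime directory).
connection_info = {
    "transport": "tcp",
    "ip": "127.0.0.1",
    "shell_port": 53001,
    "iopub_port": 53002,
    "stdin_port": 53003,
    "control_port": 53004,
    "hb_port": 53005,
    "key": "a0436f6c-1916-498b-8eb9-e81ab9368e84",
    "signature_scheme": "hmac-sha256",
}

def zmq_endpoints(info):
    """Build the ZMQ connection URLs a frontend would dial directly."""
    base = "{transport}://{ip}".format(**info)
    return {
        name: "%s:%d" % (base, info["%s_port" % name])
        for name in ("shell", "iopub", "stdin", "control", "hb")
    }

endpoints = zmq_endpoints(connection_info)
print(endpoints["shell"])   # tcp://127.0.0.1:53001
```

A desktop frontend would then open one ZMQ socket per endpoint (DEALER for shell/stdin/control, SUB for iopub, REQ for the heartbeat) and exchange signed protocol messages over them.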

If you've tried the normal notebook UI, starting another kernel (a non-Python one, or switching from Python 2 to Python 3) is just a click in the menu.
Under the hood it will just kill the currently running kernel, spawn a new subprocess, and ask it to bind to the previous ports.
The exact details you are interested in will depend on what you want to do. Will you still use the Tornado server? Or write things purely with Node?
I'm a bit confused about the monkey-patching thing. Without more details it's hard to discuss what we can do about that.

Hope that clarifies things a bit.
— 
M





Thomas Kluyver

Jun 15, 2015, 12:40:28 PM
to jup...@googlegroups.com
On 15 June 2015 at 07:44, mangecoeur <jon.cham...@gmail.com> wrote:
> One thing I'd like more clarity on to be able to continue is the relationship between the Notebook interface and the kernels: how integrated are they? For instance, does the same Python interpreter have to be used to run both the notebook server and the IPython kernels?

As Matthias has said, no, it doesn't. However, the server will look for a kernel installed with the same Python interpreter to make available to the user. It should always have at least one kernel for normal use cases, and that's the simplest way to ensure that one is available.

Thomas

mangecoeur

Jun 15, 2015, 1:33:42 PM
to jup...@googlegroups.com
Thanks for the feedback. Some details:


> the server will look for a kernel installed with the same Python interpreter to make available to the user. It should always have at least one kernel for normal use cases, and that's the simplest way to ensure that one is available.

So the current notebook implementation does assume that an IPython kernel is installed for the current interpreter? Where does the Notebook get information on what kernels are installed, and where to find them?



>> The alternative is for the desktop app to be simply a "cruft-less" browser window that displays the notebook running within the kernel.
> I’m confused about this sentence. I don’t get the “notebook within the kernel”.

This was because I had understood that when you started an IPython notebook instance, the notebook was served from the IPython kernel process, rather than from the notebook server process. What actually happens (correct me if I'm wrong) is that the notebook lives in the Tornado server process; when you run a given cell, it sends those commands to the kernel, which does the processing and sends back a response, and the notebook system (combining server-side Python and client-side JS) converts that response into content to display in the cell. The Hydrogen plugin for Atom does something similar, sending single lines of code or highlighted chunks and displaying the response.
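To make that round trip concrete, here's a minimal sketch of the execute_request message the server sends over ZMQ when you run a cell, following the Jupyter messaging spec's wire format of HMAC-signed JSON frames (the signing key and username here are made-up examples):

```python
import hashlib
import hmac
import json
import uuid

# The HMAC key comes from the kernel's connection file; this one is a
# made-up example for illustration.
KEY = b"a0436f6c-1916-498b-8eb9-e81ab9368e84"

def execute_request(code):
    """Build the signed frame list for a cell-execution message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "msg_type": "execute_request",
        "session": uuid.uuid4().hex,
        "username": "demo",
        "version": "5.0",
    }
    # header, parent_header, metadata, content
    parts = [header, {}, {}, {"code": code, "silent": False}]
    frames = [json.dumps(p).encode() for p in parts]
    sig = hmac.new(KEY, b"".join(frames), hashlib.sha256).hexdigest()
    return [sig.encode()] + frames

msg = execute_request("1 + 1")
print(len(msg))  # 5 frames: signature, header, parent_header, metadata, content
```

The kernel verifies the signature, runs the code, and publishes results back as execute_reply (on the shell channel) and output messages (on iopub), which the server relays to the browser over a WebSocket.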

My idea, more concretely, was to run the Python Tornado notebook server as a local process and tell it where to find remote kernels, which the user can set up however they like. The reasoning is that a lot of work has gone into building the Jupyter UI, and this work is a mix of Tornado web server and client-side JS. Re-implementing all of it in Node seems like a lot of work, plus every new feature that's added to the Tornado web UI would have to be ported to Node.

So I would have a "loose clone" of the web notebook but converted into (most likely) an Atom plugin with a limited set of changes - such as replacing the editor areas with Atom TextEditors which would give you the full power of Atom's editor. I haven't looked into how to set up the project yet, but ideally I could make it possible to merge changes from the web UI relatively painlessly.

That said, I haven't really dived into the python part of the notebook server - maybe porting it to Node/Atom is not that hard (there are already Jinja-like template libraries for instance) in which case it might be simpler to just copy the UI bits. I could imagine something that includes both Hydrogen and Notebook UIs to give you a pretty neat live coding experience.

Thomas Kluyver

Jun 15, 2015, 1:41:11 PM
to jup...@googlegroups.com
On 15 June 2015 at 10:33, mangecoeur <jon.cham...@gmail.com> wrote:
> So the current notebook implementation does assume that an IPython kernel is installed for the current interpreter?

It doesn't assume this - it will work correctly if that kernel is not found - but if it's there (importable), it does give it a little bit of special treatment, so that there is at least one kernel available out of the box if you 'pip install notebook'.
 
> Where does the Notebook get information on what kernels are installed, and where to find them?

If there isn't a kernel spec for the Python version the server is running on, it tries to import ipykernel to check if it's available.
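A toy version of that discovery step, assuming a single kernels directory (real Jupyter checks several locations, e.g. under sys.prefix and the user's data dir, and the directory layout here is simplified):

```python
import json
import os
import tempfile

def find_kernelspecs(kernels_dir):
    """Scan a kernels directory for subfolders containing a kernel.json."""
    specs = {}
    for name in os.listdir(kernels_dir):
        spec_file = os.path.join(kernels_dir, name, "kernel.json")
        if os.path.isfile(spec_file):
            with open(spec_file) as f:
                specs[name] = json.load(f)
    return specs

# Demo: register a fake "python3" kernel and rediscover it.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "python3"))
with open(os.path.join(root, "python3", "kernel.json"), "w") as f:
    json.dump({"argv": ["python3", "-m", "ipykernel",
                        "-f", "{connection_file}"],
               "display_name": "Python 3",
               "language": "python"}, f)

print(sorted(find_kernelspecs(root)))  # ['python3']
```

Because the spec just names a command line to launch, the discovered kernel can live in any interpreter or language, independent of what the server runs on.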
 
> This was because I had understood that when you started an IPython notebook instance, the notebook was served from the IPython kernel process, rather than from the notebook server process. What actually happens (correct me if I'm wrong) is that the notebook lives in the Tornado server process; when you run a given cell, it sends those commands to the kernel, which does the processing and sends back a response, and the notebook system (combining server-side Python and client-side JS) converts that response into content to display in the cell.

Your understanding is correct. :-)

At present, the server side Python mostly passes through messages from the kernel to the browser, but we plan to build more intelligence into it in the future.

Thomas

mangecoeur

Jun 16, 2015, 5:51:13 AM
to jup...@googlegroups.com

> At present, the server side Python mostly passes through messages from the kernel to the browser, but we plan to build more intelligence into it in the future.

What kind of intelligence? This would heavily impact whether it makes more sense to embed the Tornado notebook vs port to Node. What about notebook security? Is this implemented in the server or the kernel (I see a lot of auth stuff in the server)? If you wanted to provide a kernel cluster that you could connect to from a desktop app, I guess you would either need security in the kernel itself or have some kind of security middleware on your server that the desktop app would be able to talk to.

Thomas Kluyver

Jun 16, 2015, 1:07:27 PM
to jup...@googlegroups.com
On 16 June 2015 at 02:51, mangecoeur <jon.cham...@gmail.com> wrote:
> What kind of intelligence? This would heavily impact whether it makes more sense to embed the Tornado notebook vs port to Node. What about notebook security? Is this implemented in the server or the kernel (I see a lot of auth stuff in the server)? If you wanted to provide a kernel cluster that you could connect to from a desktop app, I guess you would either need security in the kernel itself or have some kind of security middleware on your server that the desktop app would be able to talk to.

I definitely wouldn't recommend porting it to Node. It's not a trivial amount of code even now.

To elaborate on the 'more intelligence' point: currently the state of the notebook document is maintained in the browser, and when you save, it sends the whole thing to the server as a JSON blob. We plan to move to a model where the server tracks the state of the document and sends updates to the client. This should have a few benefits:

- If there's a long-running task and you close the notebook, output can still be captured (currently anything that arrives when you don't have the notebook open is lost).
- More efficient saves, because the whole content doesn't need to be sent over HTTP each time.
- It will ultimately enable real-time collaborative editing.

Thomas