Connecting IPython kernel with Jupyter notebook via ssh pipe


Bob F

Nov 30, 2015, 9:43:37 PM
to Project Jupyter
I'd like to use IPython/Jupyter to analyze data sitting on a supercomputer cluster.  The supercomputer is a shared resource sitting behind a bastion host firewall.  It is not easy to get to, and opening ports on it is probably a bad idea.  Consider ssh as the only good way to talk to things on the supercomputer; assume that password-less ssh is possible.

Based on the docs, I see that IPython/Jupyter involves three processes communicating with each other:

 a) The kernel --- where the actual work gets done
 b) The notebook --- talks to the kernel on one side, and to a web browser or Qt client on the other.
 c) The user's client (web browser)

I would like to run (a) on the server and (c) on the user's desktop.  (b) needs to run on the server too, since getting an http proxy through to the server is just about impossible / impractical / against policy in this case.

I see that IPython offers a way to run part (a) on the server, while (b) is running on your local machine:


The problem here is the connection between (a) and (b) is via a port on the server, which is forwarded via SSH to the notebook (running in this case on the user's desktop machine).  Port forwarding and running processes that open server ports are probably both against supercomputer policy.

Instead, I'd like to connect the notebook and kernel via an ssh pipe.  The notebook would launch a kernel by invoking ssh with a command line to run the kernel on the remote machine, something like:
        ssh server kernel

The notebook would then talk to the kernel through a pipe to the ssh process; and the kernel talks to the notebook via STDIN/STDOUT.  I've built such a system for another project, and it really works nicely.  I understand and am OK with the "downsides" of this approach:
  a) The lifetime of the kernel depends on keeping an open ssh connection between client and server machines.
  b) A kernel could only serve one notebook.
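
The pipe idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: `open_pipe` and `roundtrip` are hypothetical helper names, the one-line-per-message framing is invented for the example, and a local Python child stands in for `ssh server kernel` so the sketch runs without a remote host.

```python
import subprocess
import sys


def open_pipe(argv):
    """Start a child process and keep pipes to its STDIN/STDOUT.

    For the remote case, argv would be something like
    ['ssh', 'server', 'kernel'] (a hypothetical command line).
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on the parent side
    )


def roundtrip(proc, line):
    """Send one line to the child and read one line back."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


# Stand-in for the remote kernel: a child that upper-cases each input line.
ECHO_CHILD = [
    sys.executable, "-u", "-c",
    "import sys\nfor l in sys.stdin:\n    print(l.strip().upper(), flush=True)",
]

if __name__ == "__main__":
    proc = open_pipe(ECHO_CHILD)
    print(roundtrip(proc, "hello"))
    proc.stdin.close()  # closing the pipe ends the child, as in downside (a)
    proc.wait()
```

Closing the pipe ends the child process, which mirrors downside (a): the kernel lives only as long as the connection does.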

Questions:
 a) Assuming the infrastructure for this KIND of networking has already been built (it has), how difficult would it be to use it in IPython/Jupyter?
 b) Would anyone on the Jupyter team be available to do this, or to advise/assist in doing it?

If we could get this going, it could be a BIG win for our lab: all the data are sitting on the supercomputer cluster, and we don't have good ways to access them.  People are resorting to manually rsync'ing a bunch of files to their desktop for analysis, or even just to plot it!  But there's really too much data to do that easily.  Jupyter would be much better.

Thank you,
-- Bob

Carl Waldbieser

Nov 30, 2015, 9:57:36 PM
to Project Jupyter
Bob,

You should check out Andrea Zonca's blog post "Run Jupyterhub on a Supercomputer". Or, if you are not interested in the hub, he has posts on plain notebook-over-SSH-tunnel setups for two other supercomputers:

  IPython/Jupyter notebook setup on SDSC Comet
  IPython/Jupyter notebook setup on NERSC Edison

Thanks,
Carl Waldbieser

Bob F

Nov 30, 2015, 10:13:56 PM
to jup...@googlegroups.com
Carl,

Thank you for your reply.  However, I don't think this is what we need.  All three of Zonca's links involve running the notebook on the supercomputer and tunneling an HTTP port to the client's desktop.  This will NOT be looked upon kindly by our sysadmins: we are working in a government installation with tight security.

I want to run ONLY the kernel on the supercomputer, not the kernel + notebook.  AND, I want to run the kernel WITHOUT opening up any ports.  This is the setup that I KNOW will not cause problems at our site.

-- Bob
 

--
You received this message because you are subscribed to a topic in the Google Groups "Project Jupyter" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/jupyter/sIT6g1NG_x8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to jupyter+u...@googlegroups.com.
To post to this group, send email to jup...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jupyter/6be7fe4a-c5fc-4971-89e6-939fc012acb5%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Sam Moskwa

Dec 1, 2015, 12:36:29 AM
to jup...@googlegroups.com
You just need to write a suitable kernelspec; there are a couple of examples here:

http://stackoverflow.com/questions/29037211/how-do-i-add-a-kernel-on-a-remote-machine-in-ipython-jupyter-notebook

-Sam



Andrea Zonca

Dec 1, 2015, 2:45:20 AM
to jup...@googlegroups.com
hi,
there are also a couple of plugins on github:


I tested rk once and it was working fine.
Andrea



Michael Milligan

Dec 1, 2015, 9:50:01 AM
to Project Jupyter, zo...@sdsc.edu
Unless I'm misunderstanding some magic in the kernels implementation, both of the projects you cite expect to be able to open sockets between the client host and the remote host where the kernel runs. Bob is interested in running a Jupyter notebook without any listening sockets, using just the ssh console channel. I doubt this is possible without writing a custom ZMQ implementation, since you would need to multiplex the various ZMQ sockets over the SSH channel somehow. Of course, SSH provides that capability already -- it's called port forwarding.
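
To make the multiplexing point concrete, here is a rough sketch of carrying several logical channels over one byte stream. It is an assumption-laden illustration, not something Jupyter or ZMQ provides: the frame layout (1-byte channel index plus 4-byte length) and the helper names `pack_frame` and `unpack_frames` are invented here. Only the channel names come from the Jupyter messaging protocol. It does not speak ZMQ; a real bridge would still have to map these frames to and from the kernel's sockets on both ends.

```python
import struct

# Channel names used by the Jupyter messaging protocol.
CHANNELS = ["shell", "iopub", "stdin", "control", "hb"]

# Hypothetical frame header: 1-byte channel index + 4-byte big-endian length.
HEADER = struct.Struct("!BI")


def pack_frame(channel, payload):
    """Prefix a payload with its channel index and length."""
    return HEADER.pack(CHANNELS.index(channel), len(payload)) + payload


def unpack_frames(buffer):
    """Split a byte buffer into (channel, payload) pairs.

    Returns the decoded frames plus any trailing bytes that do not
    yet form a complete frame (to be kept until more data arrives).
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        index, length = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((CHANNELS[index], buffer[start:end]))
        offset = end
    return frames, buffer[offset:]
```

The reader side keeps the leftover bytes and prepends them to the next chunk read from the pipe, since a single read may end in the middle of a frame.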

But Bob, I think you're overthinking this. My approach would be to simply run the notebook on the HPC host and use SSH port forwarding correctly. Done properly, you will never open a socket that listens to anything but the local loopback interface, so you are only vulnerable to a hostile party that already has hostile code running on one of the machines you are using.

For instance, with client=your local machine, login1=bastion host, login2=second bastion host inside hpc center, hpc=hpc compute host:

hpc> jupyter notebook --ip=localhost --port=9999
    (opens a port on the HPC host that listens ONLY to connections coming from that host)
client> ssh -L localhost:10022:login2:22 user@login1
    (opens port 10022 on your system that listens ONLY to local connections, that forwards to port 22 on login2)
client> ssh -L localhost:10023:hpc:22 -p 10022 user@localhost
    (ssh to login2 via your tunnel on login1, open a second tunnel, again restricted to localhost)
client> ssh -L localhost:9999:localhost:9999 -p 10023 user@localhost
    (ssh to hpc via your nested tunnel, open tunnel to notebook server)
client> my-web-browser http://localhost:9999

Conveniently, you can automate most of this using ProxyCommand directives in your .ssh/config file.
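
For anyone who would rather script the three client-side tunnels, the command lines above can be generated as sketched below. The helper names `forward_cmd` and `nested_tunnels` are made up for this example; the hosts, ports and `user` are the placeholders from the commands above. The sketch only builds argument lists and runs nothing.

```python
def forward_cmd(local_port, target, target_port, ssh_host, ssh_port=22, user="user"):
    """Build one `ssh -L` command that binds the forward to localhost only."""
    cmd = ["ssh", "-L", f"localhost:{local_port}:{target}:{target_port}"]
    if ssh_port != 22:
        cmd += ["-p", str(ssh_port)]
    cmd.append(f"{user}@{ssh_host}")
    return cmd


def nested_tunnels():
    """Return the three client-side commands in the order they must be started."""
    return [
        # client -> login1, forwarding to sshd on login2
        forward_cmd(10022, "login2", 22, "login1"),
        # through the first tunnel to login2, forwarding to sshd on hpc
        forward_cmd(10023, "hpc", 22, "localhost", ssh_port=10022),
        # through the second tunnel to hpc, forwarding to the notebook server
        forward_cmd(9999, "localhost", 9999, "localhost", ssh_port=10023),
    ]


if __name__ == "__main__":
    for cmd in nested_tunnels():
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.Popen` in turn, waiting for one tunnel to come up before starting the next.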

Michael

Bob F

Dec 1, 2015, 8:17:43 PM
to jup...@googlegroups.com
>
> But Bob, I think you're overthinking this. My approach would be to simply run the notebook on the HPC host and use SSH port forwarding correctly. Done properly, you will never open a socket that listens to anything but the local loopback interface,


Sorry, this is a no-go at our center.  I checked.

 
>
> so you are only vulnerable to a hostile party that already has hostile code running on one of the machines you are using.

 
Yes, that's what they're afraid of.


Hmm... digging into the docs and the code, I don't think it will be so difficult.  The main idea is that the existing IPython kernel already does a Popen to a raw Python session, and talks to that session through a pipe.  So... instead of doing Popen of 'python...', do a Popen of 'ssh my-server python...'

1. It looks like Jupyter has a well-defined interface to kernels:
   http://jupyter-client.readthedocs.org/en/latest/api/kernelspec.html
   http://jupyter-client.readthedocs.org/en/latest/kernels.html

2. The IPython kernel has an implementation of the kernelspec interface at:

https://github.com/ipython/ipykernel/blob/8c8a9f82e0556a41630aec76f7c05b1ff1a5440e/ipykernel/kernelspec.py

3. Inside ipykernel/kernelspec.py, there's a function to build a Popen command:

def make_ipkernel_cmd(mod='ipykernel', executable=None, extra_arguments=[], **kw):
    """Build Popen command list for launching an IPython kernel...."""
    if executable is None:
        executable = sys.executable
    arguments = [ executable, '-m', mod, '-f', '{connection_file}' ]
    ...

I could just change that to something like:

    arguments = [ 'ssh', 'my-server', 'python', '-m', mod, '-f', '{connection_file}' ]

So... it looks like technically, the kernel, notebook and web browser will all run on the desktop, where they can open as many ports as they like.  And the kernel

Is there something else I should know before I try to implement this approach?  Minor issues to resolve would be:
 a) How to best get ssh connection details to my modified IPython kernel
 b) How to make my "ssh IPython kernel" without copying the entire ipykernel codebase --- or else, how to add it as an option to the existing kernel.
 c) How to configure Jupyter to recognize and use my new "ssh IPython kernel."
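
On (c), one possible route that avoids touching ipykernel is a separate kernelspec directory whose kernel.json carries the ssh command line. The sketch below is an assumption-heavy illustration: `write_ssh_kernelspec` is a hypothetical helper, and `my-server` and `python` are the placeholders from the argument list above. Note that Jupyter expands `{connection_file}` to a path on the local machine, so the remote side would still need that file's contents and some way to reach the ports named in it; the sketch does not address that.

```python
import json
import os


def write_ssh_kernelspec(directory, host="my-server", remote_python="python"):
    """Write a kernel.json whose argv starts the kernel through ssh.

    Returns the path of the file that was written.
    """
    spec = {
        # Jupyter substitutes {connection_file} when it launches the kernel.
        "argv": ["ssh", host, remote_python, "-m", "ipykernel",
                 "-f", "{connection_file}"],
        "display_name": "Python over ssh (%s)" % host,
        "language": "python",
    }
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "kernel.json")
    with open(path, "w") as f:
        json.dump(spec, f, indent=2)
    return path
```

The resulting directory could then be registered with `jupyter kernelspec install <directory> --user`, after which the new kernel should show up in the notebook's kernel list.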

Thank you,
-- Bob


Matthias Bussonnier

Dec 2, 2015, 4:43:06 AM
to jup...@googlegroups.com
On Dec 2, 2015, at 02:17, Bob F <cit...@citibob.net> wrote:

> 2. The iPython kernel has an implementation of the kernelspec interface at:
> ...
>
>  a) How to best get ssh connection details to my modified iPython kernel
>  b) How to make my "ssh iPython kernel" without copying the entire ipykernel codebase --- or else, how to add it as an option to the existing kernel.
>  c) How to configure Jupyter to recognize and use my new "ssh iPython kernel."


That's pretty much exactly what remote kernel does:


which was proposed by Andrea two mails ago.
-- 
M

