Best practices for turning reusable notebook cells into importable Python modules?


Alex W

Feb 4, 2016, 4:55:50 PM
to Project Jupyter
Hello all,
Long-time notebook user. After many years of using notebooks to do data analysis in my PhD, and now handing the project over to others to continue the work, it's been pointed out quite rightly and clearly to me that much of what I've written is redundantly spread across many notebooks, and that some code which should be in importable modules is instead in a "copy and paste and reuse" state. 

Is there any ongoing work on making this "modularization" process easier? Especially when working remotely, this is a pretty high-friction process -- find where in the file system the notebook is, ssh in, open up a new module file in a remote editor, selectively copy/paste code, try to import, fix import errors, try to import, succeed, try to use functions, fix errors, etc.

For instance, having %edit open up a text web-based editor would greatly speed up this process. One cell has %edit mymodule.py, the cell below is used for testing and debugging the contents of mymodule.py. I understand this would be very difficult to implement well, but I just wanted to check if there's anything like this on the horizon. 

Although it is ultimately my responsibility to write reusable and maintainable code, the lure of notebooks for prototyping has fostered a copy/paste mentality, which has been a source of frustration and a pain point for the continued use of notebooks in our laboratory.

Best,
Alex

Jan Schulz

Feb 4, 2016, 5:09:50 PM
to jup...@googlegroups.com
Hi,

I'm using the writeandexecute IPython extension for such tasks:
https://ipython-extensions.readthedocs.org/en/latest/magics.html

The magic "Writes the content of the cell to a file and then executes
it." In other places you can then import the function. I usually have
one place where I define such functions (sometimes an extra notebook,
sometimes in the normal nb flow) and the rest of the notebooks then
import the stuff.

[note: I wrote the writeandexecute extension...]
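
For readers who have not used such a magic, the idea can be sketched in plain Python. This is only an illustration of "write the cell to a file, then execute it", not the code or API of the writeandexecute extension; the helper name `write_and_execute` and the file name `mymodule.py` are made up for the example.

```python
import os
import tempfile


def write_and_execute(source, path, namespace):
    """Write `source` to `path`, then execute it in `namespace`.

    Hypothetical helper for illustration only; it is not the API of the
    writeandexecute extension.
    """
    # Compile first, so a syntax error is raised before the file is touched.
    code = compile(source, path, "exec")
    with open(path, "w") as f:
        f.write(source)
    exec(code, namespace)


cell = "def double(x):\n    return 2 * x\n"
module_path = os.path.join(tempfile.mkdtemp(), "mymodule.py")
session = {}  # stands in for the notebook's interactive namespace

write_and_execute(cell, module_path, session)
print(session["double"](21))  # the function is now defined in the session
```

A second notebook whose import path includes that directory could then use `from mymodule import double`.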

There is also some stuff in
https://github.com/jupyter-incubator/contentmanagement ("IPython
kernel extension to make Python notebooks reusable as modules and
cookbooks"). But I haven't looked into that.

Regards,

Jan
--
Jan Schulz
mail: ja...@gmx.net
web: http://www.katzien.de
> --
> You received this message because you are subscribed to the Google Groups
> "Project Jupyter" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jupyter+u...@googlegroups.com.
> To post to this group, send email to jup...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jupyter/e14a31f8-3a26-4129-9fc8-0bbacd5bc8ff%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Alex Wiltschko

Feb 4, 2016, 5:11:03 PM
to jup...@googlegroups.com
Awesome, did not know about this library.


Thomas Kluyver

Feb 4, 2016, 5:32:06 PM
to Project Jupyter
On 4 February 2016 at 21:55, Alex W <ale...@gmail.com> wrote:
> For instance, having %edit open up a text web-based editor would greatly speed up this process. One cell has %edit mymodule.py, the cell below is used for testing and debugging the contents of mymodule.py. I understand this would be very difficult to implement well, but I just wanted to check if there's anything like this on the horizon.

Actually, I think most of the machinery is in place so that it *could* be done. We have a web-based text editor in the notebook interface, and %edit already sends a message with the path to the file to be edited. The missing piece, I think, is that something somewhere along the line needs to translate the absolute filesystem path to a relative path from the notebook server's starting directory. That could be the server itself, though it's a bit awkward since it's currently passing all messages on unchanged, or the frontend, if it knows the filesystem path the notebook started in.
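
The path translation described here could look roughly like the sketch below. It is not taken from the notebook server; the function name `to_server_relative` and the parameter `server_root` are assumptions for illustration, and the example paths are invented.

```python
import os


def to_server_relative(abs_path, server_root):
    """Return abs_path relative to server_root, or None if it lies outside.

    Hypothetical helper; `server_root` stands for the directory the
    notebook server was started in.
    """
    abs_path = os.path.abspath(abs_path)
    server_root = os.path.abspath(server_root)
    rel = os.path.relpath(abs_path, server_root)
    # A result that climbs out of the root cannot be served by the editor.
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        return None
    return rel


root = os.path.join(os.sep, "srv", "notebooks")
inside = os.path.join(root, "analysis", "mymodule.py")
outside = os.path.join(os.sep, "etc", "passwd")

print(to_server_relative(inside, root))
print(to_server_relative(outside, root))
```

Files outside the server's directory have no relative URL, so a real implementation would need to decide how to report that case. Symbolic links are not resolved in this sketch.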

Thomas

Alex Wiltschko

Feb 5, 2016, 8:40:59 AM
to Project Jupyter
That would be awesome. How can I get involved with planning? I think a feature like this is crucial to project cleanliness.

Thomas Kluyver

Feb 5, 2016, 9:12:47 AM
to Project Jupyter
On 5 February 2016 at 13:40, Alex Wiltschko <ale...@gmail.com> wrote:
> That would be awesome. How can I get involved with planning? I think a feature like this is crucial to project cleanliness.

Open an issue on the jupyter/notebook repo: https://github.com/jupyter/notebook

Hopefully someone there will have a more specific idea than me how we can do it.

Yuvi Panda

Feb 7, 2016, 5:39:53 PM
to Project Jupyter
A slightly different approach to this (solving a different problem, maybe?) would be to write a Python import hook that allows it to read .ipynb files. Then you can just treat .ipynb files as regular .py files and use general python principles to modularize them. Not sure how well that'll work in practice, but definitely worth a shot.

Anyone know if this kind of import hook already exists?
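
To make the idea concrete, here is a minimal sketch of such an import hook built on the standard `importlib` machinery. It is an independent illustration, not an existing library; the class names `NotebookFinder` and `NotebookLoader` and the notebook name `nb_helpers` are invented, and cells containing IPython magics would fail to compile in this simple version.

```python
import importlib.abc
import importlib.util
import json
import os
import sys
import tempfile


class NotebookLoader(importlib.abc.Loader):
    """Execute the code cells of an .ipynb file as a module body."""

    def __init__(self, path):
        self.path = path

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        with open(self.path, encoding="utf-8") as f:
            nb = json.load(f)
        for cell in nb.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = cell["source"]
            if isinstance(source, list):  # source may be stored as a list of lines
                source = "".join(source)
            exec(compile(source, self.path, "exec"), module.__dict__)


class NotebookFinder(importlib.abc.MetaPathFinder):
    """Look for <name>.ipynb on the import path."""

    def find_spec(self, fullname, path=None, target=None):
        name = fullname.rpartition(".")[2]
        for directory in (path or sys.path):
            candidate = os.path.join(directory or os.getcwd(), name + ".ipynb")
            if os.path.isfile(candidate):
                return importlib.util.spec_from_loader(
                    fullname, NotebookLoader(candidate), origin=candidate)
        return None


# Build a tiny notebook on disk so the example is self-contained.
nb_dir = tempfile.mkdtemp()
notebook = {
    "nbformat": 4, "nbformat_minor": 0, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Helpers"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["def square(x):\n", "    return x * x\n"]},
    ],
}
with open(os.path.join(nb_dir, "nb_helpers.ipynb"), "w", encoding="utf-8") as f:
    json.dump(notebook, f)

sys.path.insert(0, nb_dir)
sys.meta_path.append(NotebookFinder())  # consulted after the regular finders

import nb_helpers

print(nb_helpers.square(7))
```

Because the finder is appended to `sys.meta_path`, ordinary `.py` modules of the same name still take precedence.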

Fernando Perez

Feb 7, 2016, 11:39:10 PM
to Project Jupyter
On Sun, Feb 7, 2016 at 2:39 PM, Yuvi Panda <yuvi...@gmail.com> wrote:
> A slightly different approach to this (solving a different problem, maybe?) would be to write a Python import hook that allows it to read .ipynb files. Then you can just treat .ipynb files as regular .py files and use general python principles to modularize them. Not sure how well that'll work in practice, but definitely worth a shot.
>
> Anyone know if this kind of import hook already exists?



--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail

MinRK

Feb 8, 2016, 3:33:57 AM
to jup...@googlegroups.com
On Mon, Feb 8, 2016 at 5:38 AM, Fernando Perez <fpere...@gmail.com> wrote:
> On Sun, Feb 7, 2016 at 2:39 PM, Yuvi Panda <yuvi...@gmail.com> wrote:
>> A slightly different approach to this (solving a different problem, maybe?) would be to write a Python import hook that allows it to read .ipynb files. Then you can just treat .ipynb files as regular .py files and use general python principles to modularize them. Not sure how well that'll work in practice, but definitely worth a shot.
>>
>> Anyone know if this kind of import hook already exists?
>
> Hey, we now even have docs to answer that question!! :)


Follow up to "not sure how well that'll work in practice": After writing that import hook some time ago, I looked around for notebooks to import, and never found one that made sense to import. People don't seem to write notebooks that are amenable to importing, but if you knew you were planning on it, you could write accordingly. The same `if __name__ == '__main__':` logic works in notebooks, but people don't use it since the "script == module" pattern doesn't occur to folks in the interactive environment. You could use a notebook-specific way to apply equivalent marks (e.g. in cell metadata) that the import hook would respect.

-MinRK
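
As a rough sketch of both ideas in this message: the snippet below skips code cells carrying a marker in their metadata and shows that an `if __name__ == '__main__':` block does not run when the code is executed under a module name. The tag name `main-only` and the function `importable_source` are hypothetical, chosen only for this example; no existing import hook is known to honor such a tag.

```python
def importable_source(nb, skip_tag="main-only"):
    """Join the code cells of a notebook dict, skipping tagged cells.

    `skip_tag` is a hypothetical convention, not an established one.
    """
    parts = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        if skip_tag in cell.get("metadata", {}).get("tags", []):
            continue
        source = cell["source"]
        parts.append("".join(source) if isinstance(source, list) else source)
    return "\n".join(parts)


nb = {"cells": [
    {"cell_type": "code", "metadata": {},
     "source": "def mean(xs):\n    return sum(xs) / len(xs)\n"},
    {"cell_type": "code", "metadata": {"tags": ["main-only"]},
     "source": "print('exploratory plot goes here')\n"},
    {"cell_type": "code", "metadata": {},
     "source": "if __name__ == '__main__':\n    ran_as_script = True\n"},
]}

# An import sets __name__ to the module name rather than '__main__'.
namespace = {"__name__": "analysis"}
exec(importable_source(nb), namespace)

print(namespace["mean"]([1, 2, 3]))
print("ran_as_script" in namespace)
```

The reusable definition survives, while both the tagged cell and the guarded block are left out of the imported namespace.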




Alex Wiltschko

Feb 8, 2016, 11:52:24 AM
to Project Jupyter
MinRK, that's my experience as well. Notebooks, among all the scientists I work with, are scratchpads that slowly congeal and settle into something semi-reusable. Less experienced folks in the lab copy and paste entire notebooks to redo analyses on different data sets. Having a mechanism to inline or import notebook code wouldn't solve the reuse problem, just make it easier to get into trouble, faster.

There is currently no recommended or documented way I've found to do proper code re-use inside the notebook ecosystem, and it's causing a ton of trouble for less experienced coders.


Thomas Kluyver

Feb 8, 2016, 12:28:42 PM
to Project Jupyter
On 8 February 2016 at 16:52, Alex Wiltschko <ale...@gmail.com> wrote:
> There is currently no recommended or documented way I've found to do proper code re-use inside the notebook ecosystem,

We don't intend notebooks to be a separate ecosystem from regular Python modules (or their equivalents in other languages). Most of our discussion about improving this situation focuses on making it easier to move between notebooks and modules, rather than trying to recreate something module-like inside notebooks.

Alex Wiltschko

Feb 8, 2016, 1:32:30 PM
to Project Jupyter
Right, I meant that there's no current way to create modules easily from notebooks. 


Yuvi Panda

Feb 8, 2016, 3:37:23 PM
to jup...@googlegroups.com
I wonder if advertising and producing some model notebooks that can be
reused as modules would be a fun direction to explore :) Maybe I can
structure one of my upcoming projects as a purely ipynb setup (instead
of .py files) and see how that goes!

Has the import hook been librarized and made available on pypi / other
distro setups?

Peter Parente

Feb 8, 2016, 9:22:49 PM
to Project Jupyter, yuvi...@gmail.com
The jupyter_cms extension in the incubator offers two flavors of reuse, for Python only, as a proof of concept:

1. importing notebooks as modules
2. injecting snippets from notebooks as cookbooks


Here's a tutorial notebook showing both off.


My team and I have relied on this simple form of reuse in many engagements where our data munging, modeling, evaluation prototyping, doodling, etc. has spanned many notebooks.

It's far from perfect, but it has proven to be better than nothing.

Cheers,
Pete

Michael Milligan

Feb 10, 2016, 4:13:25 PM
to Project Jupyter
Agreed, and I do think that tooling notebooks to be importable *as* modules would be moving in the wrong direction. If the problem is inexperienced programmers getting into trouble because they're only coding in the notebook, I'd be much more interested in seeing some in-notebook tooling to better automate moving their logic out into a module. If I was going to dig into this from scratch my development plan would look something like:

* magic command to set up a skeleton module file
* magic command to push the definition for a named function or class out into a module file, with some checks to see that it still works there (e.g. doesn't depend on globals in the notebook session)
* notebook extension to take that logic and push a function/class out to a module, replacing usage in the notebook with corresponding import/call from module
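
The check mentioned in the second bullet (does the function depend on globals in the notebook session?) could be approximated with the standard `ast` module. This is a deliberately simple sketch that ignores many scoping subtleties; the function name `free_names` and the example cell are made up for illustration.

```python
import ast
import builtins


def free_names(source, func_name):
    """List names a function reads but neither defines nor gets as arguments.

    Hypothetical helper: anything returned here would have to be moved
    into the module along with the function, or passed in as a parameter.
    """
    tree = ast.parse(source)
    func = next(node for node in tree.body
                if isinstance(node, ast.FunctionDef) and node.name == func_name)
    bound, loaded = set(), set()
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name):
            # Names that are read go to `loaded`, assigned ones to `bound`.
            (loaded if isinstance(node.ctx, ast.Load) else bound).add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
    return sorted(loaded - bound - set(dir(builtins)))


cell = "def scale(x):\n    return x * FACTOR + len(DATA)\n"
print(free_names(cell, "scale"))  # → ['DATA', 'FACTOR']
```

An empty list suggests the function can be written to a module as is; a non-empty one lists what it still expects from the notebook session.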

Maybe all the parts for this exist and I just haven't run across them? If so, that's awesome and please tell me about them so I can include them in the next workshop I teach!

Michael

Jan Schulz

Feb 11, 2016, 6:46:47 AM
to jup...@googlegroups.com
Hi,

On 10 February 2016 at 22:13, Michael Milligan <mill...@umn.edu> wrote:
> * magic command to set up a skeleton module file
> * magic command to push the definition for a named function or class out
> into a module file, with some checks to see that it still works there (e.g.
> doesn't depend on globals in the notebook session)

That's more or less what the writeandexecute magic does:

https://ipython-extensions.readthedocs.org/en/latest/magics.html

Write a cell to a module file (and create it if it does not exist)
each time the cell is executed.

What it does not do is check that the code works from the module, so
you have to make sure that you also write all dependencies of that
function to the module. Due to the execution part of this magic, you
know the code compiles.

> * notebook extension to take that logic and push a function/class out to a
> module, replacing usage in the notebook with corresponding import/call from
> module

That you have to do manually.

My current workflow is this:

* Write code; when it works, turn it into a function which gets its
own cell (no call to the function in the same cell!)
* if the function is reusable, add the cell magic to write it to a
module (but keep the code here, so that I can change it and rewrite it
to the module)
* in other notebooks: import from the module

If I need a change to the code, I go back to the original notebook,
change the code and run the cell (which only defines the function and
overwrites the code in the module, but does not execute the function)

> Maybe all the parts for this exist and I just haven't run across them? If
> so, that's awesome and please tell me about them so I can include them in
> the next workshop I teach!

Not sure if this helps, but it seems a blogpost on how to make
reusable code with the notebook would be a nice thing :-)

Jan