Jupyter-based CMS


Doug Blank

Nov 19, 2014, 6:24:07 PM
to jup...@googlegroups.com
So, a few weeks back a discussion was started on the idea of a Jupyter-based Content Management System (CMS) centered on notebooks:

http://thread.gmane.org/gmane.comp.python.ipython.devel/14049/

There are a few developers and users who are interested in the idea that additional functionality could be easily added to a Jupyterhub-based server. Below, I've tried to summarize the ideas mentioned so far, and added a few new ones. I've tried to list those things that are beyond what is imagined for IPython 3.0. But it may be the case that some of these could be incorporated into the base Jupyter at some point.

Some of the ideas for additional features:

* search and indexing (advanced search code-, library-specific)
* site maps
* collaboration tools
* git API connection
* public interface for sharing notebooks
* tag clouds
* favorites
* social media connections (twitter, google+, facebook, tumblr, etc)
* app store for kernels/languages, libraries, and CMS addons
* comments (commenting on cells, or on notebooks)
* trending notebooks/libraries/tags/users/kernels
* classroom management tools (nbgrader, export to moodle/blackboard, etc)
* CSS themes

These features could be put together to make notebook-based apps, like:

* wikis
* blogs
* on-line classroom
* books
* journals
* reproducible research collections
* topic/language collections
* mooc-oriented app
* collaborative workgroup

Writing such a CMS could be done at a few levels, using various frameworks, such as:

* tornado + Python + jinja2 (what jupyterhub is written in)
* flask
* allura + turbogears
* django
* drupal
* wordpress
* YOUR FAVORITE FRAMEWORK in FAVORITE LANGUAGE

It was noted that one could easily have different parts running on different frameworks, and different servers. Using standard RESTful web APIs and data representations, one could easily mix and match services. However, thinking about a project suggests that some coherent tools and infrastructure would be useful.

My initial feeling is that building at the tornado/Python/jinja2 level would be the best way to start, because of the common codebase with jupyterhub, and ease of installation/maintenance. Of course, if that level provided data in a RESTful manner, then other frameworks could easily consume such output.

Did I miss major features? Frameworks? Pros and cons?

-Doug

Wes Turner

Nov 21, 2014, 3:06:28 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 4:26:30 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 4:43:07 PM
to jup...@googlegroups.com
Commenting and annotations:

* Stable IPython notebook URIs for changing notebooks without commit hashes may be a challenge (github + nbviewer) [*]
  could the notebook checksums help with this? (ot.js, share.js) ... https://en.wikipedia.org/wiki/Operational_transformation#OT_software
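
One way to read the checksum idea is to derive an identifier from a notebook's content rather than from a commit hash. The sketch below is only an illustration using the standard library; the function name `notebook_checksum` and the choice of SHA-256 over canonicalized JSON are assumptions, not something nbviewer or IPython is known to do.

```python
import hashlib
import json


def notebook_checksum(notebook: dict) -> str:
    """Return a SHA-256 hex digest of a notebook's canonical JSON form.

    Keys are sorted and whitespace is fixed so that two files with the
    same content but different key order or indentation hash identically.
    """
    canonical = json.dumps(notebook, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two serializations of the same (hypothetical) notebook content,
# plus a third where one cell has been edited.
nb_a = json.loads('{"nbformat": 4, "cells": [{"cell_type": "markdown", "source": "Hi"}]}')
nb_b = json.loads('{"cells": [{"source": "Hi", "cell_type": "markdown"}], "nbformat": 4}')
nb_c = json.loads('{"nbformat": 4, "cells": [{"cell_type": "markdown", "source": "Hi!"}]}')

same = notebook_checksum(nb_a) == notebook_checksum(nb_b)       # same content
different = notebook_checksum(nb_a) != notebook_checksum(nb_c)  # edited cell
```

A content checksum like this identifies one specific version. A stable URI for a notebook that keeps changing would still need a separate name that maps to the latest checksum.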


... I don't know how tornado works with any of the git bindings.

There are lots of static CMS tools that generate static HTML that lives in a gh-pages branch.




Wes Turner

Nov 21, 2014, 4:44:11 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 4:56:14 PM
to jup...@googlegroups.com
* gh-pages branch: https://github.com/ipython/ipython-website
* versioned releases of documentation in a gh-pages branch: https://github.com/ipython/ipython-doc/tree/gh-pages (Sphinx)



Wes Turner

Nov 21, 2014, 5:31:40 PM
to jup...@googlegroups.com

Reproducible Sphinx and IPython Projects

- [ ] Host a GitHub repository with IPython notebooks:


  (a template which reflows to an 8.5x11 two column PDF may or may not be as helpful as a responsive sphinx theme with #fragment links that work with browser back/forward).

- [ ] Build documentation with .. ipython:: directives:


- [ ] Write documentation in ReStructuredText which references docstrings built by sphinx-apidoc


.. index:: Changes
.. index:: What's New
.. _whatsnew:

What's New
============
This links to the :ref:`What's New <whatsnew>` section.

Here's a glossary link to the :term:`IPython` glossary term,
and a link to the :py:mod:`IPython` module documentation
(if it were included by sphinx-apidoc or generated by hand).

.. glossary::

   IPython
      Description of IPython

      | Wikipedia: `<https://en.wikipedia.org/wiki/IPython>`__
      | Homepage: `<https://ipython.org/>`__

Per http://sphinx-doc.org/markup/para.html#glossary, glossary terms create index entries but, unlike headers, no #fragment link.

Wes Turner

Nov 21, 2014, 5:41:40 PM
to jup...@googlegroups.com
TL;DR: "A drop-in, closed-loop provisioned #opensource lab setup with a pkg mirror, central file storage, and CI would be sweet" #edupython


Wes Turner

Nov 21, 2014, 5:44:38 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 5:45:36 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 5:59:50 PM
to jup...@googlegroups.com

Wes Turner

Nov 21, 2014, 6:11:44 PM
to jup...@googlegroups.com
Basically trying to index the JSON fulltext of an ipnb v3 (-> v4) document,

Django Haystack:

Wes Turner

Nov 22, 2014, 8:54:08 PM
to jup...@googlegroups.com
I do apologize if this incremental research question feedback conversation log has appeared to be link spam.

To summarize without a traceback:

* Tornado supports WSGI
* Things block
* Django is well supported
* Django Pinax is a collection of social media content management apps
* python-social-auth connects to most social media and social web authentication providers

* There is already much time spent on
  *optimizing the software development build chain*
  e.g. DevOps automation:
  * GitHub Pages (a CDN-hosted git branch named 'gh-pages')
    * ipython/ipython-website
    * ipython/ipython-docs (versioned build releases in a directory hierarchy)
  * Lightweight Markup Languages:
    * Markdown
    * reST (reStructuredText)
  * Testing input for quality characteristics
    * Source code metrics
    * Documentation syntax validation
  * Communications / Project Management
    * Wikis, Issues
    * Comments
      * Comment hosting services
  * HTTP POST webhooks on_git_repository.commit()
    * ReadTheDocs
    * TravisCI
* the edX wikis cover much of the development process
* OpenStack is mostly Python (Horizon is a Django dashboard)
* code.edx.org is mostly Python
* there are many great examples of shooting for reproducible science with IPython
  * scientific-python-lectures, scipy-lecture-notes, gallery of interesting notebooks

Generating as much static HTML and JS as possible is generally a good strategy
both for resource utilization and for version control.

As a core syntax for Sphinx, Jinja2, and docutils templates,
reStructuredText is a good fit for generating static HTML
with and for Python projects.

The Python EDU-sig .. ( #edupython ?) has
many resources for such an endeavour.
https://www.python.org/community/sigs/current/edu-sig

There are also various reddit forums for learning python,
which many people often go to.

AFAIU, the most relevant reddit sub for finding resources
for this type of project would still be /r/ipython .

If the real problem is

   "how do we search, annotate, comment on changing
    things that are described as JSON
    which have various versions, URIs, and URLs"

then indexing and searching JSON-LD with Haystack (Django)
and any of those backends would be a great way
to find what it is that is being looked for.
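
To make the indexing idea concrete, here is a minimal standard-library sketch of the first step: pulling the text out of notebook JSON so that a search backend has something to index. It is not Haystack code and does not cover JSON-LD; the helper names (`cell_text`, `iter_cells`, `build_index`) and the tiny in-memory inverted index are assumptions standing in for a real backend. It relies on the v3 layout nesting cells under `worksheets` with code in `input`, and the v4 layout using a top-level `cells` list with `source`.

```python
from collections import defaultdict
import re


def cell_text(cell: dict) -> str:
    """Return a cell's text; v3 code cells use 'input', others use 'source'."""
    text = cell.get("source", cell.get("input", ""))
    # Either field may be a single string or a list of lines.
    return "".join(text) if isinstance(text, list) else text


def iter_cells(notebook: dict):
    """Yield cells from a v4 notebook ('cells') or a v3 one ('worksheets')."""
    if "cells" in notebook:
        yield from notebook["cells"]
    for worksheet in notebook.get("worksheets", []):
        yield from worksheet.get("cells", [])


def build_index(notebooks: dict) -> dict:
    """Map each lowercase word to the set of notebook names containing it."""
    index = defaultdict(set)
    for name, notebook in notebooks.items():
        for cell in iter_cells(notebook):
            for word in re.findall(r"\w+", cell_text(cell).lower()):
                index[word].add(name)
    return index


# Hypothetical notebooks, one per format version.
notebooks = {
    "v3.ipynb": {"nbformat": 3, "worksheets": [{"cells": [
        {"cell_type": "code", "input": ["import numpy\n", "numpy.zeros(3)"]}]}]},
    "v4.ipynb": {"nbformat": 4, "cells": [
        {"cell_type": "markdown", "source": "Plotting with numpy and matplotlib"}]},
}
index = build_index(notebooks)
hits = sorted(index["numpy"])  # names of notebooks mentioning "numpy"
```

In a real deployment the `build_index` step would presumably be replaced by whatever the chosen search backend provides; only the extraction of cell text is specific to notebooks.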

Realtime collaboration:
https://github.com/jupyter/colaboratory (Local IPython kernel // Chrome NaCl) 
* OT: operational transformation (ot.js, share.js)
* OA: OpenAnnotation core schema (annotator, hypothesis)
* "Why can't I store this all in my drive?" (git + cloud drive)
* Why is this conversation so fragmented?
* When there is a new version, where do the comments go? (OT)
  What are they anchored to? (OA)
* Is there value in having a chronological perspective
  into the research process?
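
For readers unfamiliar with operational transformation, the core idea can be sketched for the simplest case, two concurrent text insertions. This is an illustration only, not the API of ot.js or share.js; the `Insert` class, the `site` tie-breaker, and the function names are hypothetical, and deletions are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where text is inserted
    text: str   # text to insert
    site: int   # id of the editing site, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insertion to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `other` has been applied.

    If `other` inserted before `op` (or at the same position with a lower
    site id), `op` shifts right by the length of the inserted text.
    """
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


doc = "abc"
a = Insert(1, "X", site=1)  # site 1 inserts "X" after "a"
b = Insert(1, "Y", site=2)  # site 2 concurrently inserts "Y" at the same spot

# Each site applies its own edit first, then the transformed remote edit.
at_site_1 = apply(apply(doc, a), transform(b, a))
at_site_2 = apply(apply(doc, b), transform(a, b))
```

Both sites end up with the same document even though they applied the edits in different orders, which is the property that lets comments or cursors anchored to positions survive concurrent changes.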

--
You received this message because you are subscribed to a topic in the Google Groups "Project Jupyter" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/jupyter/oe52-F0Bnuc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to jupyter+u...@googlegroups.com.
To post to this group, send email to jup...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jupyter/e70b9dc7-4bff-4ad0-83b8-3bf9b8ca5d23%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--

Brian Granger

Nov 23, 2014, 11:47:25 PM
to jup...@googlegroups.com
Doug,

Thanks for getting this going. After teaching with the notebook, I am
acutely aware of the pain points associated with using the notebook in
that context - the exact pain points that a full CMS addresses in
various ways.

Here are some thoughts about how this might be approached:

* 80/20 rule. Can we get 80% of what we need in the CMS without
actually building it, or by building something that is only 20% of
the complexity and scope? Maybe we just need to invest in working with
other existing solutions, such as Moodle...
* Scope limitation. It may seem completely hypocritical for me to
bring up the issue of scope given how vast ipython/jupyter has become,
but the scope of what you are describing is super huge - waaay bigger
than jupyter/ipython even. Which brings me to...
* Identify and build only the MVP (minimum viable product). This is
how we do every new thing in IPython (including the notebook and
jupyterhub). We figure out the absolutely minimal set of features that
need to be there, then cut that in half and build that. For example,
the notebook didn't have directory navigation for years. Then we get
people using the MVP and see where it goes.
* Any web app of any complexity and scalability needs to have a robust
non-blocking server. Don't even consider using WSGI, flask, Django for
something like this...you will just have to rewrite it later.
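
The non-blocking point can be illustrated with a small sketch. It uses the standard library's `asyncio` rather than tornado, purely so that it runs without extra dependencies; the handler name and the delays are made up, and `asyncio.sleep` stands in for slow I/O such as a database or kernel request.

```python
import asyncio
import time


async def handle_request(name: str, delay: float) -> str:
    """Stand-in for a request handler that waits on slow I/O."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking it
    return f"{name} done"


async def serve_all() -> tuple:
    """Run three 'requests' concurrently on a single thread."""
    start = time.perf_counter()
    results = await asyncio.gather(
        handle_request("a", 0.1),
        handle_request("b", 0.1),
        handle_request("c", 0.1),
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(serve_all())
```

Because each handler yields while it waits, the three requests overlap and finish in roughly the time of one. With a blocking call such as `time.sleep` in a single-threaded server, the waits would add up instead.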

Cheers,

Brian



--
Brian E. Granger
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgra...@calpoly.edu and elli...@gmail.com

Doug Blank

Nov 24, 2014, 6:44:52 AM
to jup...@googlegroups.com, elli...@gmail.com
Brian,

Thanks for the advice... that largely fits with what I was thinking, but from a different perspective. The following seems like a reasonable approach:

1) Build on the most abstract existing levels (tornado, jinja2, jupyterhub, nbviewer).

2) Pick a couple of components and build/integrate them. I'm thinking of Kyle's ElasticSearch-based indexing in nbviewer. That should be able to be turned into a component that could be reused in Jupyterhub. That might also form the basis for other components (such as "latest public notebooks") as it is a queryable database.

3) Attempt to build a configurable app with templates and additional components. Django has a settings.py which allows configuring urls, middleware, etc. Perhaps this is just an IPython config file. Build on the new template search dirs for Jupyterhub. Then we can create a Jupyterhub-based app with some additional "Search" features in the right places.

4) May require some additional hooks into existing templates. For example, it would be very nice to be able to create new Tabs on the "dashboard". Is that something one could just do with templates? What would be the easiest way to add new elements to the menus? Thinking here about how to put customizations into a single location, loosely coupled---without having to have too much knowledge of the IPython internals.
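
The "template search dirs" idea in step 3 amounts to a first-match lookup over an ordered list of directories, so that a site can override individual templates and fall back to the defaults for the rest. The sketch below shows that lookup in plain Python; `find_template`, the directory names, and the file names are hypothetical, and a real app would more likely hand the same ordered list to its template engine.

```python
import os
import tempfile


def find_template(name: str, search_dirs: list) -> str:
    """Return the first existing path for `name`, searching dirs in order."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(name)


# Build a throwaway layout: a "base" dir and a site-specific override dir.
root = tempfile.mkdtemp()
base_dir = os.path.join(root, "base")
site_dir = os.path.join(root, "site")
for directory, files in ((base_dir, ["home.html", "login.html"]),
                         (site_dir, ["home.html"])):
    os.makedirs(directory)
    for filename in files:
        with open(os.path.join(directory, filename), "w") as handle:
            handle.write(f"<!-- {filename} from {os.path.basename(directory)} -->")

search_dirs = [site_dir, base_dir]  # site templates win over the defaults
home = find_template("home.html", search_dirs)    # overridden by the site
login = find_template("login.html", search_dirs)  # falls back to the base
```

Keeping all overrides in one directory at the front of the list is one way to get the "single location, loosely coupled" customization described in step 4.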

Some parts of these may even be sprint-worthy. In some cases, perhaps an example, or more documentation can help put existing pieces together.

-Doug

Doug Blank

Nov 24, 2014, 7:12:43 AM
to jup...@googlegroups.com
Thanks, Wes, for the links! These might be more useful for someone building *on top* of what I am thinking. But I did discover just how far Kyle has come with the elasticsearch in nbviewer by following your links. nbviewer isn't a repo that I pay too much attention to, but this component should be quite useful.

If you are interested, it could be useful to find other tornado + jinja2 components that could be adapted to work. For example, perhaps someone has already made a "widget-based component" system for tornado/jinja2. Or something like these chat and blog demos:


I am thinking about being able to take such components and compose them into a coherent Notebook-based site. 

-Doug

Wes Turner

Nov 24, 2014, 11:51:51 AM
to jup...@googlegroups.com
On Mon, Nov 24, 2014 at 6:12 AM, Doug Blank <doug....@gmail.com> wrote:
> Thanks, Wes, for the links! These might be more useful for someone building *on top* of what I am thinking. But I did discover just how far Kyle has come with the elasticsearch in nbviewer by following your links. nbviewer isn't a repo that I pay too much attention to, but this component should be quite useful.

I haven't looked into it too much; but I'd imagine that it's actually easier to present search snippets from the rendered HTML version.
 

> If you are interested, it could be useful to find other tornado + jinja2 components that could be adapted to work. For example, perhaps someone has already made a "widget-based component" system for tornado/jinja2. Or something like these chat and blog demos:
 

I haven't much experience with tornado.
 
> I am thinking about being able to take such components and compose them into a coherent Notebook-based site.

In terms of separation of concerns, that's an interesting prospect.

From a CMS standpoint:
* CMIS
* JSR Portlets
* Plone Portlets
* <x> widgets
* jinja2 {% extends %} {% includes %}
* IPython _repr_html

From a RESTful data standpoint, AFAIU, there are versioned resources with URIs,
JSON[-LD!] metadata, and WebSocket channel events.

And then there are scaffolding and content workflow UIs.

Brian Granger

Nov 24, 2014, 11:56:27 AM
to Doug Blank, jup...@googlegroups.com
One thing to keep in mind is that after 3.0 we will start exploring a
more powerful app framework that allows for more general page layouts
with different content areas, tabs, etc.:

https://github.com/phosphorjs

If we go this direction all of our existing components (notebook,
dashboard, terminal, text editor) will be services that can be plugged
into any phosphor page...

Nikolas Tezak

Jan 5, 2015, 1:05:41 PM
to jup...@googlegroups.com, doug....@gmail.com, elli...@gmail.com
Hi Brian, this looks _very_ exciting! Is there a page with phosphorjs examples? Is it related to this: http://www.divergentmedia.com/phosphor ?
Nik

Brian Granger

Jan 5, 2015, 5:12:20 PM
to Nikolas Tezak, jup...@googlegroups.com, Doug Blank
That is a completely different project. I don't know of any public page that has examples, but the repo has some.

Cheers,

Brian