Report from Sage Days 77

Robert Lehmann

Apr 17, 2016, 3:42:50 PM
to sphin...@googlegroups.com
[bcc: Sage Days participants]

I visited Sage Days 77 in Cernay, France, this week to help them with their Sphinx setup.

tl;dr: Sage is looking at adopting vanilla Sphinx; upstreaming their autodoc changes will be a big task for them, and memory might be a show stopper.

Sage is a distribution of mathematical Python libraries (or "mathematics software system"), currently bundling Sphinx 1.2.x. Their documentation consists of ~200 plain text files and ~2000 automodule stubs. This brings a number of interesting challenges.

First off, they have their own copy of autodoc to support a number of use cases around Cython and wrapped functions.  Since it diverged back around Sphinx 1.0, it is unclear what's required from our side.  They have started work to merge forward, and will then approach us with whatever extension points / improvements they'd need.  Most of these fixes should be useful in core Sphinx as well.

They also have their own poor man's parallelization (splitting up the documentation into multiple Sphinx projects and then building through make -j) based on sphinx-multidoc. We have experimented with Sphinx's parallel build support, and it seems to give similar speedups.
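
For readers who have not used the built-in parallel build: it is switched on with the -j option of sphinx-build. The following is a minimal sketch that only assembles such a command line. The directory names and the job count are placeholders, not Sage's actual layout.

```python
import shlex


def sphinx_build_command(source_dir, output_dir, jobs, builder="html"):
    """Assemble a sphinx-build command line that uses N parallel jobs.

    source_dir and output_dir are placeholder paths for illustration.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return ["sphinx-build", "-b", builder, "-j", str(jobs), source_dir, output_dir]


cmd = sphinx_build_command("doc", "doc/_build/html", jobs=4)
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to subprocess.run(cmd, check=True) would start the build, assuming sphinx-build is on the PATH.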

Last but not least, they also have their own copy of apidoc.  We agreed that plain sphinx-apidoc covers them well, with the caveat that it'd be a big advantage for them (and other projects, I believe) if they could spare the stub automodule files.  They suggested putting "automodule:" statements directly into the toctree, which I have a rough draft implementation of.
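
To illustrate what such a stub file is: each one is little more than a title and a single automodule directive. The sketch below generates a simplified stub. It is not the exact output of sphinx-apidoc, and the module name is only an example.

```python
def automodule_stub(module_name):
    """Return the text of a minimal reST stub for one module.

    Simplified for illustration; real sphinx-apidoc output carries
    more options than the single :members: flag used here.
    """
    lines = [
        module_name,
        "=" * len(module_name),
        "",
        ".. automodule:: " + module_name,
        "   :members:",
        "",
    ]
    return "\n".join(lines)


stub = automodule_stub("sage.rings.integer")  # example module name
print(stub)
```

With ~2000 modules this means ~2000 nearly identical files, which is why being able to skip them would be attractive.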

Building their apidoc consumes 1.8G in memory for the doctrees alone.  Their resource footprint peaks at 2.6G (or at 3+G with extensions enabled);  doctrees are 200M in pickled format.
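
The gap between in-memory and pickled size can be explored on a toy example. The sketch below builds a small tree of plain Python objects and compares the memory traced by tracemalloc with the length of its pickle. The Node class is a hypothetical stand-in, not a docutils node, so the numbers say nothing about real doctrees.

```python
import pickle
import tracemalloc


class Node:
    """Toy stand-in for a doctree node (hypothetical, not a docutils class)."""

    def __init__(self, text, children=None):
        self.text = text
        self.children = children or []


def build_tree(depth, fanout):
    """Build a tree with the given depth and number of children per node."""
    if depth == 0:
        return Node("leaf")
    return Node("branch", [build_tree(depth - 1, fanout) for _ in range(fanout)])


def measure(depth=6, fanout=4):
    """Return (bytes traced while building the tree, bytes of its pickle)."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    tree = build_tree(depth, fanout)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    pickled = pickle.dumps(tree, protocol=pickle.HIGHEST_PROTOCOL)
    return after - before, len(pickled)


in_memory, on_disk = measure()
print("in memory: %d bytes, pickled: %d bytes" % (in_memory, on_disk))
```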

I have filed some bugs with the results of our memory forensics. There are a couple of caches that unnecessarily blow up the memory consumption (ModuleAnalyzer, sys.modules, and probably linecache) and quite a few opportunities to reduce the doctree in-memory blowup (see #2426 for Text nodes).
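
As a small illustration of how one of these caches holds on to data, the sketch below shows that the standard library's linecache keeps the lines of every file it has read until the cache is cleared explicitly. It uses a temporary file and does not touch Sphinx at all.

```python
import linecache
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny source file to read through linecache.
    path = os.path.join(tmp, "example.py")
    with open(path, "w") as handle:
        handle.write("x = 1\ny = 2\n")

    lines = linecache.getlines(path)
    # The file contents now live in the module-level cache dict.
    cached_after_read = path in linecache.cache

    # clearcache() drops all cached file contents.
    linecache.clearcache()
    cached_after_clear = path in linecache.cache

print(cached_after_read, cached_after_clear)
```

Whether clearing this cache makes a measurable difference in a real build would have to be checked against actual measurements.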

I can only recommend other Sphinx developers (to be) to visit user projects like this.  It is a valuable learning opportunity for us.

Cheers,
Robert

Georg Brandl

Apr 18, 2016, 10:20:09 AM
to sphin...@googlegroups.com, sage-...@listes.math.cnrs.fr
Thanks for the illuminating report! And also thanks for visiting Sage Days,
which I unfortunately couldn't.

Georg

On 04/17/2016 09:42 PM, Robert Lehmann wrote:
> [bcc: Sage Days participants]
>
> I have visited the Sage Days 77 <https://wiki.sagemath.org/days77> in Cernay,
> France, this week to help them with their Sphinx setup.
>
> tl;dr: Sage is looking at adopting vanilla Sphinx; autodoc will be a big
> upstreamer for them, memory might be a show stopper.
>
> Sage <http://www.sagemath.org/> is a distribution of mathematical Python
> libraries (or "mathematics software system"), currently bundling Sphinx 1.2.x.
> Their documentation <http://doc.sagemath.org/html/en/index.html> consists of
> ~200 plain text files and ~2000 automodule stubs
> <http://doc.sagemath.org/html/en/reference/index.html>. This brings a number of
> interesting challenges <http://trac.sagemath.org/ticket/20080>.
>
> First off, they have their own copy of autodoc
> <https://github.com/sagemath/sagelib/blob/master/doc/common/sage_autodoc.py> to
> support a number of use cases around Cython and wrapped functions. Since it
> diverged back around Sphinx 1.0, it is unclear what's required from our side.
> They have started work to merge forward <http://trac.sagemath.org/ticket/20359>,
> and will then approach us with whatever extension points / improvements they'd
> need. Most of these fixes should be useful in core Sphinx as well.
>
> They also have their own poor man's parallelization
> <https://github.com/sagemath/sage/blob/master/src/sage_setup/docbuild/ext/multidocs.py>
> (splitting up the documentation into multiple Sphinx projects and then building
> through make -j) based on sphinx-multidoc. We have experimented with Sphinx's
> parallel build support, and it seems to give similar speedups.
>
> Last but not least, they also have their own copy of apidoc
> <https://github.com/sagemath/sagelib/blob/master/doc/common/builder.py>. We
> agreed that plain sphinx-apidoc covers them well, with the caveat that it'd be a
> big advantage for them (and other projects, I believe) if they could spare the
> stub automodule files. They suggested putting "automodule:" statements directly
> into the toctree, which I have a rough draft implementation of.
>
> Building their apidoc consumes 1.8G in memory for the doctrees alone. Their
> resource footprint peaks at 2.6G (or at 3+G with extensions enabled); doctrees
> are 200M in pickled format.
>
> I have filed some bugs with the results of our memory forensics. There are a
> couple of caches that unnecessarily blow up the memory consumption
> (ModuleAnalyzer <https://github.com/sphinx-doc/sphinx/issues/2422>, sys.modules
> <https://github.com/sphinx-doc/sphinx/issues/2423>, and probably linecache.) and
> quite a few opportunities to reduce the doctree in-memory blowup (see #2426
> <https://github.com/sphinx-doc/sphinx/issues/2426> for Text nodes.)
>
> I can only recommend other Sphinx developers (to be) to visit user projects like
> this. It is a valuable learning opportunity for us.
>
> Cheers,
> Robert