citations in sphinx/rest

Alex Ghitza

Nov 14, 2009, 12:53:07 AM
to sage-...@googlegroups.com

Hi,

Martin Albrecht pointed out to me that there is a way to mark up
citations and references in ReST (and that I should be using that in
my patches). I was so happy to see this that I immediately fixed my
patch and added a few words to the developer guide to this effect (see
http://trac.sagemath.org/sage_trac/ticket/7456 for this).

It would seem that it's a fairly simple habit to get into: instead of

    def do_this():
        """
        Return really great thing, based on [ABC].

        REFERENCES:

        - [ABC] The ABC of great things

        """

you would write

    def do_this():
        """
        Return really great thing, based on [ABC]_.

        REFERENCES:

        .. [ABC] The ABC of great things

        """

However, there are some issues in doing this on the large scale that
is the Sage library.

1. These references used to be local to the docstring they appear in.
As soon as we ReST-ify them, they become global in the reference
manual. Therefore if there is already a reference labeled [ABC],
Sphinx will rightfully complain. That's easy to fix, just use a
different label.

2. What happens if we cite the same reference from two different
docstrings? We have to pick which one docstring will contain the
"definition" of the reference, otherwise Sphinx will complain about
duplicates. The effect of this is:

a) in the html output, the citation itself is a link to the reference,
so it's easy to get to it

b) in the pdf output, all the references are *only* listed in the
bibliography section toward the end of the document. This means that
there are a bunch of lines that just say "REFERENCES" and nothing else
in the docstrings. Not very useful or pretty, so it would be great if
we could somehow remove them.

c) in the docstring itself (obtained by introspection or by reading
the code directly), the citation will be there saying "see [ABC]" but
the actual reference could be nontrivial to find. In fact, the only
way to find it that's sure to work is to do search_src(".. [ABC]"),
because the definition could be in a different file, for instance.


I'm not sure I like c), but maybe that's the price that we have to pay
for having beautiful and consistent-looking documentation?


Alex


--
Alex Ghitza -- Lecturer in Mathematics -- The University of Melbourne
-- Australia -- http://www.ms.unimelb.edu.au/~aghitza/

Robert Bradshaw

Nov 14, 2009, 2:18:13 AM
to sage-...@googlegroups.com

This is one of the many, many cases where I think it would be easier
to fix the tool rather than change people's behavior. Why not have our
cake and eat it too? [ABC] can automatically be turned into [ABC]_
(perhaps only if it matches something below). Global name conflicts
can automatically be resolved, either by picking the first (if they're
"close enough") or making them unique. The REFERENCES section can be
stripped if it's empty.
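
A minimal sketch of what such a preprocessing step might look like, assuming
it runs on the raw docstring before Sphinx sees it and that definitions are
already written in the `.. [ABC]` form. The function names `link_citations`
and `strip_empty_references` are hypothetical; nothing like them is claimed
to exist in Sage or Sphinx.

```python
import re

# A citation definition line, e.g. ".. [ABC] The ABC of great things".
DEFINITION = re.compile(r"^[ \t]*\.\. \[([A-Za-z0-9]+)\]", re.MULTILINE)

# A plain [ABC] that is neither a definition (preceded by ".. ")
# nor already linked (followed by "_").
PLAIN_CITATION = re.compile(r"(?<!\.\. )\[([A-Za-z0-9]+)\](?!_)")


def link_citations(docstring):
    """Turn [ABC] into [ABC]_ only when the docstring defines ABC."""
    labels = set(DEFINITION.findall(docstring))

    def add_underscore(match):
        if match.group(1) in labels:
            return match.group(0) + "_"
        return match.group(0)

    return PLAIN_CITATION.sub(add_underscore, docstring)


def strip_empty_references(docstring):
    """Drop a trailing REFERENCES: heading that has nothing after it."""
    return re.sub(r"\n\s*REFERENCES:\s*\Z", "\n", docstring)
```

Labels without a matching definition (and ordinary bracketed text such as
Python lists) are left alone, which is the "only if it matches something
below" behaviour. Resolving global name conflicts is not attempted here.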

I really like ReST, and it's a huge step ahead of what we used to
have, but (and I've said this before) in many ways using raw ReST
seems like a step backwards (double ::'s for doctests, ` for math
mode, double-spacing function argument specs, etc.)

- Robert

Florent Hivert

Nov 14, 2009, 5:36:00 AM
to sage-...@googlegroups.com
Hi,

> 1. These references used to be local to the docstring they appear in.
> As soon as we ReST-ify them, they become global in the reference
> manual. Therefore if there is already a reference labeled [ABC],
> Sphinx will rightfully complain. That's easy to fix, just use a
> different label.
>
> 2. What happens if we cite the same reference from two different
> docstrings? We have to pick which one docstring will contain the
> "definition" of the reference, otherwise Sphinx will complain about
> duplicates. The effect of this is:

[...]

One good habit, which will probably solve a large part of the duplicate
problem, is to put the reference not in the docstring of the method or
function but in that of the module (= file). It is very likely that several
methods or functions related to the same paper will appear, and just as
likely that they will appear in the same class or file. Also, it's coherent
to put them close to the AUTHOR: part, since both are kinds of
acknowledgment (one for the mathematical idea, one for the implementation).
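
As a rough illustration of that layout, assuming a file whose functions all
rely on the same paper (the module description, the author entry and
`do_that` are hypothetical; `do_this` and [ABC] are the placeholders used
earlier in the thread):

```python
"""
Great things (hypothetical module docstring).

AUTHORS:

- A. N. Author (hypothetical)

REFERENCES:

.. [ABC] The ABC of great things
"""


def do_this():
    """
    Return really great thing, based on [ABC]_.
    """
    return "great thing"


def do_that():
    """
    Return another great thing, also based on [ABC]_.
    """
    return "another great thing"
```

The definition appears once per file, so two functions citing it do not
produce a duplicate. The function docstrings themselves carry only the
citation, not the reference text.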

[...]

Florent

William Stein

Nov 14, 2009, 4:55:29 PM
to sage-...@googlegroups.com, Jarrod Millman

I strongly agree with this. In fact, last week when I was hanging out
with Jarrod Millman (of scipy/numpy), he baited me -- "so, are you
guys using Sphinx a lot for Sage yet?" When I said "yes!", he
responded, "so how did you guys solve the references problem, since
references are a total mess in Sphinx?" And I said that we hadn't yet.
Of course, it was inevitable that we would run head first into this
problem eventually, and it's amusing that we did this week.

Anyway, I agree with Robert -- to solve this particular problem we
will have to go creatively beyond what Sphinx offers "out of the box".

William

Robert Bradshaw

Nov 14, 2009, 4:58:23 PM
to sage-...@googlegroups.com

I'm not sure this will be a good thing--it might make a cleaner
reference manual but I think docstrings are most used via the powerful
introspection that's available in Sage, and we want to put as much as
possible (e.g. references) right at the user's fingertips.

- Robert

Alex Ghitza

Nov 14, 2009, 5:08:33 PM
to sage-...@googlegroups.com
On Sat, Nov 14, 2009 at 01:58:23PM -0800, Robert Bradshaw wrote:

> > One good habit, which will probably solve a large part of the
> > duplicate problem, is to put the reference not in the docstring of
> > the method or function but in that of the module (= file). It is
> > very likely that several methods or functions related to the same
> > paper will appear, and just as likely that they will appear in the
> > same class or file. Also, it's coherent to put them close to the
> > AUTHOR: part, since both are kinds of acknowledgment (one for the
> > mathematical idea, one for the implementation).
>
> I'm not sure this will be a good thing--it might make a cleaner
> reference manual but I think docstrings are most used via the powerful
> introspection that's available in Sage, and we want to put as much as
> possible (e.g. references) right at the user's fingertips.
>

I strongly agree with Robert here, I wouldn't want to break
introspection. Moreover, this doesn't solve the problem of the same
reference being cited from different files (and this is not just a
hypothetical situation, I've run into it with steenrod_algebra.py and
steenrod_algebra_element.py, and I think it also comes up in the
graphs code).


Best,

Alex Ghitza

Nov 14, 2009, 5:14:59 PM
to sage-...@googlegroups.com, Jarrod Millman
On Sat, Nov 14, 2009 at 01:55:29PM -0800, William Stein wrote:
>
> I strongly agree with this. In fact, last week when I was hanging out
> with Jarrod Millman (of scipy/numpy), he baited me -- "so, are you
> guys using Sphinx a lot for Sage yet?" When I said "yes!", he
> responded, "so how did you guys solve the references problem, since
> references are a total mess in Sphinx?" And I said that we hadn't yet.
> Of course, it was inevitable that we would run head first into this
> problem eventually, and it's amusing that we did this week.
>
> Anyway, I agree with Robert -- to solve this particular problem we
> will have to go creatively beyond what Sphinx offers "out of the box".
>

Since we're not the only ones having this sort of problem, I wonder if
we can convince the Sphinx developers to do something about it. We
would probably have a better chance of doing this if we had a good
idea of how we would like references to work in an ideal world.

(A quick look at Sphinx' trac didn't yield any relevant tickets. I'll
try to have a look at their mailing list as well to see if anything
was brought up.)


Best,

John H Palmieri

Nov 14, 2009, 5:20:43 PM
to sage-devel
On Nov 14, 1:55 pm, William Stein <wst...@gmail.com> wrote:
> On Fri, Nov 13, 2009 at 11:18 PM, Robert Bradshaw
> <rober...@math.washington.edu> wrote:
>
> > This is one of the many, many cases where I think it would be easier
> > to fix the tool rather than change people's behavior. Why not have our
> > cake and eat it too? [ABC] can automatically be turned into [ABC]_
> > (perhaps only if it matches something below). Global name conflicts
> > can automatically be resolved, either by picking the first (if they're
> > "close enough") or making them unique. The REFERENCES section can be
> > stripped if it's empty.
>
> > I really like ReST, and it's a huge step ahead of what we used to
> > have, but (and I've said this before) in many ways using raw ReST
> > seems like a step backwards (double ::'s for doctests, ` for math
> > mode, double-spacing function argument specs, etc.)

(a) You need double ::'s for verbatim blocks to be processed correctly
for the documentation output. For locating doctests, as long as it's
indented, I don't think it matters whether there are any colons at
all. You can check that doctests from old files which haven't been
sphinxified (and so which have single colons) are still being run.

(b) For math mode, you can use $ now.

(c) I don't know what you mean about double-spacing function argument
specs. You can write

INPUT:

- n - an integer
- m - another integer
- k - a third integer, with a
  longer description

Or you can write

INPUT:

:param n: an integer
:param m: another integer
:param k: a third integer, with a
  longer description


> I strongly agree with this.  In fact, last week when I was hanging out
> with Jarrod Millman (of scipy/numpy), he baited me -- "so, are you
> guys using Sphinx a lot for Sage yet?"  When I said "yes!", he
> responded, "so how did you guys solve the references problem, since
> references are a total mess in Sphinx?" And I said that we hadn't yet.

In what way are they a total mess? I'm curious.

>    Of course, it was inevitable that we would run head first into this
> problem eventually, and it's amusing that we did this week.
>
> Anyway, I agree with Robert -- to solve this particular problem we
> will have to go creatively beyond what Sphinx offers "out of the box".

Which particular problem? Alex just pointed out that we haven't been
using standard Sphinx/reST for citations, and we should. That's easy
to solve "in the box".

More generally, we don't use pure Python in Sage -- we've changed it
so that ^ means exponentiation -- but it's pretty close to pure, and it
seems that we're reluctant to change it much. I feel the same way
about Sphinx or reST: it seems like a bad idea to just change it when
we happen to not like the syntax. Sage is a computer program, and it
is reasonable to ask people to learn Python syntax to write Sage
programs, and I think it is also reasonable to ask people to learn
reST to write docstrings. It's not that hard, and it makes our
docstrings portable.

John

William Stein

Nov 14, 2009, 5:38:06 PM
to sage-...@googlegroups.com, Mike Hansen

(1) Almost all of the docstrings in Sage look like this:

INPUT:

- `n` - an integer

- `m` - another integer

- `k` - a third integer, with a longer description


with spaces. I'm not sure why, but I think Mike Hansen argued that he
had to do things this way when he was transitioning from our old
non-Sphinx docs. Evidently Sphinx has changed since then (or he was
wrong).

(2) Regarding your example:
"""
INPUT:

- n - an integer
- m - another integer
- k - a third integer, with a
  longer description
"""

I just want to point out that the space before the "longer" above is
*critical*. Moreover, it has to be at least *two* spaces, i.e., the
start of the word "longer" has to line up with the "k" above (your
example maybe didn't).

>
> Or you can write
>
> INPUT:
>
> :param n: an integer
> :param m: another integer
> :param k: a third integer, with a
>   longer description
>
>
>> I strongly agree with this.  In fact, last week when I was hanging out
>> with Jarrod Millman (of scipy/numpy), he baited me -- "so, are you
>> guys using Sphinx a lot for Sage yet?"  When I said "yes!", he
>> responded, "so how did you guys solve the references problem, since
>> references are a total mess in Sphinx?" And I said that we hadn't yet.
>
> In what way are they a total mess?  I'm curious.

My impression is that it was in exactly the way that is being
discussed in this thread. The way Jarrod put it was "we need a Bibtex
for Sphinx", i.e., a way to have a database of references that is
separate from the individual docstrings. That's basically the same
problem being discussed in this thread.

>
>>    Of course, it was inevitable that we would run head first into this
>> problem eventually, and it's amusing that we did this week.
>>
>> Anyway, I agree with Robert -- to solve this particular problem we
>> will have to go creatively beyond what Sphinx offers "out of the box".
>
> Which particular problem?

Problem = the lack of a "Bibtex for Sphinx".

> Alex just pointed out that we haven't been
> using standard Sphinx/reST for citations, and we should.  That's easy
> to solve "in the box".
>
> More generally, we don't use pure Python in Sage -- we've changed it
> so that ^ means exponentiation -- but it's pretty close to pure, and it
> seems that we're reluctant to change it much.  I feel the same way
> about Sphinx or reST: it seems like a bad idea to just change it when
> we happen to not like the syntax.  Sage is a computer program, and it
> is reasonable to ask people to learn Python syntax to write Sage
> programs, and I think it is also reasonable to ask people to learn
> reST to write docstrings. It's not that hard, and it makes our
> docstrings portable.

My understanding is that the problem is that nobody has yet written
something like Bibtex for Sphinx. That's a bit of a different problem
than just changing syntax. How would you write large latex
papers/books without Bibtex? Ick.

-- William

Alex Ghitza

Nov 14, 2009, 5:46:01 PM
to sage-...@googlegroups.com
On Sat, Nov 14, 2009 at 02:20:43PM -0800, John H Palmieri wrote:
>
>
> In what way are they a total mess? I'm curious.
>
> >    Of course, it was inevitable that we would run head first into this
> > problem eventually, and it's amusing that we did this week.
> >
> > Anyway, I agree with Robert -- to solve this particular problem we
> > will have to go creatively beyond what Sphinx offers "out of the box".
>
> Which particular problem? Alex just pointed out that we haven't been
> using standard Sphinx/reST for citations, and we should. That's easy
> to solve "in the box".
>

Here's an example (I'm not picking on you, John, this was just the
first instance I ran into this after doing search_src("REFERENCE") to
look for un-Sphinx-ified references.)

In the module docstring of steenrod_algebra.py, there is a reference
to Milnor's Annals paper [Mil]. I eagerly put this in proper Sphinx
syntax. Moving on to steenrod_algebra_element.py, I had to remove the
same reference from the module docstring because Sphinx complained.
So you can have something like "see [Mil]_" but then the actual
reference is nowhere to be found in that file. This means that someone
reading only this docstring does not have this information at hand any
more. Then we get to steenrod_algebra_bases.py, where the Monks and
Wood papers are now [M] and [W], whereas in steenrod_algebra_element.py
they were [Mon] and [Woo]. Sphinxify these and you get duplicate
references with different names and labels.
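
Both kinds of clash could in principle be caught mechanically. The following
is only a sketch under simple assumptions: the sources are passed in as a
dict of file name to contents, references count as "the same" only if their
text is identical, and all function names are hypothetical. The file names
and labels come from the example above; the file contents are made up.

```python
import re
from collections import defaultdict

# A citation definition line, e.g. ".. [Mil] Milnor's Annals paper ..."
DEFINITION = re.compile(r"^[ \t]*\.\. \[([^\]]+)\][ \t]+(.+)$", re.MULTILINE)


def collect_definitions(sources):
    """Yield (file name, label, reference text) for every definition found."""
    for name, text in sources.items():
        for label, reference in DEFINITION.findall(text):
            yield name, label, reference.strip()


def duplicate_labels(sources):
    """Return {label: [files]} for labels defined more than once."""
    files_by_label = defaultdict(list)
    for name, label, _ in collect_definitions(sources):
        files_by_label[label].append(name)
    return {k: v for k, v in files_by_label.items() if len(v) > 1}


def relabeled_references(sources):
    """Return {reference text: [labels]} for texts defined under several labels."""
    labels_by_text = defaultdict(set)
    for _, label, reference in collect_definitions(sources):
        labels_by_text[reference].add(label)
    return {k: sorted(v) for k, v in labels_by_text.items() if len(v) > 1}


# Hypothetical file contents for illustration only.
sources = {
    "steenrod_algebra.py": ".. [Mil] Milnor's Annals paper ...\n",
    "steenrod_algebra_element.py": (
        ".. [Mil] Milnor's Annals paper ...\n"
        ".. [Mon] Monks paper ...\n"
    ),
    "steenrod_algebra_bases.py": ".. [M] Monks paper ...\n",
}
print(duplicate_labels(sources))
print(relabeled_references(sources))
```

Here `duplicate_labels` flags [Mil] (the case Sphinx complains about) and
`relabeled_references` flags the Monks paper appearing as both [M] and
[Mon]. Exact text matching is crude; real references would rarely be typed
identically twice.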

Once again, I'm not picking on you or anybody else. But I think this
shows that it's hard to keep consistency alive even in cases where a
single author wrote a bunch of files, never mind what happens when ten
different people work on the same code. It adds to the author's
workload and to the reviewer's workload. And as we pointed out
already, it cripples docstring introspection.


Best,

Alex Ghitza

Nov 14, 2009, 6:02:56 PM
to sage-...@googlegroups.com, Mike Hansen


There is an enhancement ticket for making Sphinx read Bibtex files:

http://bitbucket.org/birkenfeld/sphinx/issue/63/make-sphinx-read-bibtex-files-for

There seems to have been some activity on it, but the latest was 10
months ago.

William Stein

Nov 14, 2009, 6:10:24 PM
to sage-...@googlegroups.com, Mike Hansen
On Sat, Nov 14, 2009 at 3:02 PM, Alex Ghitza <agh...@gmail.com> wrote:
> There is an enhancement ticket for making Sphinx read Bibtex files:
>
> http://bitbucket.org/birkenfeld/sphinx/issue/63/make-sphinx-read-bibtex-files-for
>
> There seems to have been some activity on it, but the latest was 10
> months ago.


Interesting. I imagined more that one would create an _analogue_ of
Bibtex for Sphinx, rather than have Sphinx actually read Bibtex.
After all, Bibtex entries can contain fairly complicated LaTeX, and
there are subtle rules (e.g., incollection...). Of course, those
rules are all important for actual applications.

-- William

John H Palmieri

Nov 14, 2009, 6:13:28 PM
to sage-devel
On Nov 14, 2:38 pm, William Stein <wst...@gmail.com> wrote:
>
> (1) Almost all of the docstrings in Sage look like this:
>
> INPUT:
>
>   - `n` - an integer
>
>   - `m` - another integer
>
>   - `k` - a third integer, with a longer description
>
> with spaces.  I'm not sure why, but I think Mike Hansen argued that he
> had to do things this way when he was transitioning from our old
> non-Sphinx docs.   Evidently Sphinx has changed since then (or he was
> wrong).

It may not have changed, maybe double-spacing was just the best way to
do his automatic conversion of all of the old docstrings to the new
format without causing too many problems. I don't know.

> (2) Regarding your example:
> """
> INPUT:
>
> - n - an integer
> - m - another integer
> - k - a third integer, with a
>   longer description
> """
> I just want to point out that the space before the "longer" above is
> *critical*.  Moreover, it has to be at least *two* spaces, i.e., the
> start of the word "longer" has to line up with the "k" above (your
> example maybe didn't).

Right, this is what I get for typing in a non-fixed-width font.

> My impression is that it was in exactly the way that is being
> discussed in this thread.  The way Jarrod put it was "we need a Bibtex
> for Sphinx", i.e., a way to have a database of references that is
> separate from the individual docstrings.   That's basically the same
> problem being discussed in this thread.

> My understanding is that the problem is that nobody has yet written
> something like Bibtex for Sphinx.  That's a bit of a different problem
> than just changing syntax.   How would you write large latex
> papers/books without Bibtex?   Ick.

Well, maybe what we need is this: in each module, class, whatever, we
keep doing what we're doing:

{{{
See the paper [Mil] for more details.

REFERENCES:

- [Mil] John W. Milnor, "The Steenrod algebra and its dual" ...
}}}

and then we also append to this a Sphinx/reST citation like "[Mil58]_"
which points to a master bibliography file. Then local citations will
still be valid and will point to local references, and there will in
addition be a single place where all of the references are listed.
The local citations ([Mil] or [M] or ...) can also vary (as they do in
some of my code -- sorry, Alex), as long as [Mil] or [M] or whatever
all have the same master citation "[Mil58]_" appended.

If we want to do this, then we need a style guide for what goes in
these master citations; imitating the Bibtex style amsalpha seems like
a fine plan to me.
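
To make the proposal concrete, here is a small sketch. It assumes the master
bibliography can be represented as a mapping from master key to reference
text; the names `MASTER`, `milnor_basis` and `missing_master_keys` are
hypothetical, and the reference text is left elided as in the example above.

```python
import re

# Hypothetical master bibliography: master key -> full reference.
MASTER = {
    "Mil58": 'John W. Milnor, "The Steenrod algebra and its dual" ...',
}


def milnor_basis():
    """
    See the paper [Mil] for more details. [Mil58]_

    REFERENCES:

    - [Mil] John W. Milnor, "The Steenrod algebra and its dual" ...
    """


def missing_master_keys(docstring, master):
    """Return master citations ([Key]_) that the master bibliography lacks."""
    cited = set(re.findall(r"\[([A-Za-z0-9]+)\]_", docstring))
    return sorted(cited - set(master))


print(missing_master_keys(milnor_basis.__doc__, MASTER))
```

The local [Mil] stays visible under introspection, while only [Mil58]_ is a
real reST citation, so the local label is free to differ from file to file.
A check like `missing_master_keys` could be run by a reviewer to confirm that
every master citation has an entry.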

John

John H Palmieri

Nov 14, 2009, 6:17:18 PM
to sage-devel
On Nov 14, 2:46 pm, Alex Ghitza <aghi...@gmail.com> wrote:
> On Sat, Nov 14, 2009 at 02:20:43PM -0800, John H Palmieri wrote:
>
> > In what way are they a total mess?  I'm curious.
>
> > >    Of course, it was inevitable that we would run head first into this
> > > problem eventually, and it's amusing that we did this week.
>
> > > Anyway, I agree with Robert -- to solve this particular problem we
> > > will have to go creatively beyond what Sphinx offers "out of the box".
>
> > Which particular problem?  Alex just pointed out that we haven't been
> > using standard Sphinx/reST for citations, and we should.  That's easy
> > to solve "in the box".
>
> Here's an example (I'm not picking on you, John, this was just the
> first instance I ran into this after doing search_src("REFERENCE") to
> look for un-Sphinx-ified references.)

The curse of alphabetical order and working on something in the
"algebras" directory, I suppose...

> In the module docstring of steenrod_algebra.py, there is a reference
> to Milnor's Annals paper [Mil].  I eagerly put this in proper Sphinx
> syntax.  Moving on to steenrod_algebra_element.py, I had to remove the
> same reference from the module docstring because Sphinx complained.
> So you can have something like "see [Mil]_" but then the actual
> reference is nowhere to be found in that file.
> This means that someone reading only this docstring does not have this
> information at hand any more.  Then we get to
> steenrod_algebra_bases.py, where the Monks and Wood papers are now [M]
> and [W], whereas in steenrod_algebra_element.py they were [Mon] and
> [Woo].  Sphinxify these and you get duplicate references
> with different names and labels.

Oops, sorry about that. See my response to William's message about a
possible solution.

> Once again, I'm not picking on you or anybody else.  

It's fine. (I would feel worse if we had a policy about consistency
in references/citations which I had violated in those files.)

> But I think this
> shows that it's hard to keep consistency alive even in cases where a
> single author wrote a bunch of files, never mind what happens when ten
> different people work on the same code.  It adds to the author's
> workload and to the reviewer's workload.  And as we pointed out
> already, it cripples docstring introspection.

Right, I completely agree with Robert and with you that we need
introspection to work right, with all of the references included
"locally".

John