Sage policy question: require docstrings and doctests?

John H Palmieri

unread,

May 9, 2017, 3:14:51 PM5/9/17

to sage-devel

Sage's developer's guide says: "Every function must have a docstring" which must include "An EXAMPLES block for examples. This is not optional." What is meant by "function"? Here is an example, taken from ticket #21399:

import sys

if sys.platform == X:
    def auxiliary_function(...):
        ...
elif sys.platform == Y:
    def auxiliary_function(...):
        ...
else:
    def auxiliary_function(...):
        ...

def main_function(...):
    return auxiliary_function(...)

What needs a docstring here? What needs doctests?

- each instance of auxiliary_function?
- main_function?

Several comments:

- Perhaps in this case the doctests for the different functions would end up being redundant. The docstrings shouldn't be, since there are specific reasons for treating the platforms differently.
- The "sage --coverage ..." command expects all of them to have both docstrings and doctests, for what that's worth.
- This particular style of having cases depending on the platform is based on some code in the main Python library (ctypes/util.py). To satisfy Sage's coverage requirement, the whole thing could be rewritten as

def main_function(...):
    """
    docstring, doctests
    """
    if sys.platform == X:
        ...
    elif ...

Is it worth breaking with the Python library style to satisfy Sage's docstring and doctest requirements? (I think the earlier questions are worth answering to clarify Sage's policy, so please do not just focus on this one.)

--
John

Jori Mäntysalo

unread,

May 9, 2017, 11:28:25 PM5/9/17

to sage-devel

On Tue, 9 May 2017, John H Palmieri wrote:

> Sage's developer's guide says: "Every function must have a docstring" which
> must include "An EXAMPLES block for examples. This is not optional."

Could we have that as a formal requirement, but allow something like

EXAMPLES::

sage: pass # Actually tested in foobar()

? Then we could automatically check the code and see that every function
has the examples-block.

--
Jori Mäntysalo

Erik Bray

unread,

May 10, 2017, 5:25:33 AM5/10/17

to sage-devel

On Tue, May 9, 2017 at 9:14 PM, John H Palmieri <jhpalm...@gmail.com> wrote:
> Sage's developer's guide says: "Every function must have a docstring" which
> must include "An EXAMPLES block for examples. This is not optional." What is
> meant by "function"? Here is an example, taken from ticket #21399:
>
> import sys
>
> if sys.platform == X:
> def auxiliary_function(...):
> ...
> elif sys.platform == Y:
> def auxiliary_function(...):
> ...
> else:
> def auxiliary_function(...):
> ...
>
> def main_function(...):
> return auxiliary_function(...)
>
> What needs a docstring here? What needs doctests?
>
> - each instance of auxiliary_function?
> - main_function?
>
> Several comments:
>
> - Perhaps in this case the doctests for the different functions would end up
> being redundant. The docstrings shouldn't be, since there are specific
> reasons for treating the platforms differently.

I don't agree with this because I don't think that's necessarily a
proper use of docstrings.

I see docstrings as primarily information for users of a function. It
*may* provide some information about why / how the function is
implemented if relevant, but not necessarily. It may also contain
doctests, but again these serve as useful examples to the reader.

However, information that is purely a development note like "we
override this function here because of this bug" is purely a
development note and what comments were made for. Especially for an
internal function definition that a user would never see--you're just
wasting (and admittedly trivial amount of) memory for a comment that
has no use at runtime.

Michael Orlitzky

unread,

May 10, 2017, 7:42:37 AM5/10/17

to sage-...@googlegroups.com

On 05/09/2017 03:14 PM, John H Palmieri wrote:
> Sage's developer's guide says: "Every function must have a docstring"
> which must include "An EXAMPLES block for examples. This is not
> optional." What is meant by "function"? Here is an example, taken from
> ticket #21399:
>
> |
> importsys
>
> ifsys.platform ==X:
> defauxiliary_function(...):
> ...
> elifsys.platform ==Y:
> defauxiliary_function(...):
> ...
> else:
> defauxiliary_function(...):
> ...
>
> defmain_function(...):
> returnauxiliary_function(...)
> |
>
> What needs a docstring here? What needs doctests?
>

You can write this in a way that lets you doctest everything...

def auxiliary_function_X:
...

def auxiliary_function_Y:
...

def auxiliary_function_default:
...

def main_function(platform = None):
if platform is None:
platform = sys.platform

if platform == X:
return auxiliary_function_X()
elif platform == Y:
return auxiliary_function_Y()
else:
return auxiliary_function_default()

Now you can test/comment both the auxiliary functions and the main
function easily.

John H Palmieri

unread,

May 10, 2017, 2:07:06 PM5/10/17

to sage-devel

This distinction between docstrings vs. comments has never been the Sage practice: Sage practice and official policy has always been to use docstrings. Part of the point is that Sage has intentionally tried to blur the line between user and developer, so any documentation should be made as widely available, and as easily available, as possible.

--
John

Kwankyu Lee

unread,

May 11, 2017, 4:37:48 AM5/11/17

to sage-devel

I agree with Erik.

This distinction between docstrings vs. comments has never been the Sage practice: Sage practice and official policy has always been to use docstrings. Part of the point is that Sage has intentionally tried to blur the line between user and developer, so any documentation should be made as widely available, and as easily available, as possible.

Perhaps this was a practice and reality when Sage was not mature. But for future and for better software, the right direction should be to make the line clear. I (We are) am not a user and a developer at the same time: When I am a user, I would see docstrings and help on a terminal or in a web browser. When I am a developer, I would read the comments in the file as well as the docstrings. When I am a user, I hate to see irrelevant implementation or development details.

For original post and that particular example, I would add docstrings for all the four functions.

Erik Bray

unread,

May 15, 2017, 9:56:46 AM5/15/17

to sage-devel

This is still pretty needless refactoring. It's pointless to define
each function when only one will ever be relevant on the current
platform. If you think in terms of C, it might not even be possible
to *compile* each platform-specific implementation. In Python that's
far less likely to be an issue, albeit not unheard of either (for
example, a Windows-specific implementation might import the "msvcrt"
module, not present on other platforms--you could move the import to
inside the function call but this is just implementing bad practice
for no reason).

In PEP-8, Python's style guide for Python code, there's an admonition
right at the beginning titled "A Foolish Consistency is the Hobgoblin
of Little Minds" (the wording is tongue-in-cheek; don't take it
literally!): https://www.python.org/dev/peps/pep-0008/#id15

The point is that no matter how well motivated a coding guideline is
it's just that--a guideline (maybe even a very important one!). But
one never bend over backwards to write non-idiomatic, convoluted, or
not well-motivated code just to satisfy a guideline. The guideline
should absolute be *considered* strictly when reviewing some code,
especially if it's critical mathematical code (in the case of Sage).
But that doesn't mean there will never be exceptions where one can't
use their common sense.

William Stein

unread,

May 15, 2017, 11:27:16 AM5/15/17

to sage-devel, Craig Citro

On Mon, May 15, 2017 at 6:56 AM, Erik Bray <erik....@gmail.com> wrote:
[...]

> In PEP-8, Python's style guide for Python code, there's an admonition
> right at the beginning titled "A Foolish Consistency is the Hobgoblin
> of Little Minds" (the wording is tongue-in-cheek; don't take it
> literally!): https://www.python.org/dev/peps/pep-0008/#id15
>
> The point is that no matter how well motivated a coding guideline is
> it's just that--a guideline (maybe even a very important one!). But
> one never bend over backwards to write non-idiomatic, convoluted, or
> not well-motivated code just to satisfy a guideline. The guideline
> should absolute be *considered* strictly when reviewing some code,
> especially if it's critical mathematical code (in the case of Sage).
> But that doesn't mean there will never be exceptions where one can't
> use their common sense.

+1, but...

<ancient history>

The reason for the "100% doctest policy" is because in 2007 during a
discussion at Sage Days 5 (see [1]) Craig Citro proposed "We should
require 100% doctesting in new code". Many other people in the room
and I thought this sounded like a good idea and interesting challenge,
since Sage was starting to rapidly grow at the time. I wrote a
coverage script to see where we stood and it was fun to include the
stats with each new release. The stats are *very* impressive to me
today:

~/sage/src/sage$ ../../sage -coverage --summary .
Global score: 96.1% (42697 of 44428)

We quickly started realizing how incredibly useful Craig's suggestion
was in practice, and that it was clearly going to be absolutely
critical to growing our developer base beyond a handful of people.
Also, being able to take absolutely any function in the code, and
paste something into the sage: prompt that interactively exercised
that function was also very useful. Moreover, it was shocking the
extent to which anybody touching since matrices over the rationals
could easily break something else "far away" in modular forms (heh --
it happened today! [2])...

The only serious challenge to this policy that I recall was when
Michael Abshoff merged a massive amount of non-doctested quadratic
forms code from Jon Hanke in 2008. I was pretty annoyed and ensured
that the code was properly tested in a timely manner.

</ancient history>

- William

[1] https://wiki.sagemath.org/days5

[2] https://groups.google.com/forum/#!topic/sage-devel/7VpAxhycBe0

--
William (http://wstein.org)

Erik Bray

unread,

May 15, 2017, 12:39:53 PM5/15/17

to sage-devel, Craig Citro

Oh, neat! Though it would be nice to actually be able to use existing
runtime code coverage tools get a real view of how much the tests are
covering. As I said before there are of course limitations, but it
helps to find major use cases that aren't being tested. It's possible
using something like coverage.py will just work, but I haven't tried
it yet.

> We quickly started realizing how incredibly useful Craig's suggestion
> was in practice, and that it was clearly going to be absolutely
> critical to growing our developer base beyond a handful of people.
> Also, being able to take absolutely any function in the code, and
> paste something into the sage: prompt that interactively exercised
> that function was also very useful. Moreover, it was shocking the
> extent to which anybody touching since matrices over the rationals
> could easily break something else "far away" in modular forms (heh --
> it happened today! [2])...
>
> The only serious challenge to this policy that I recall was when
> Michael Abshoff merged a massive amount of non-doctested quadratic
> forms code from Jon Hanke in 2008. I was pretty annoyed and ensured
> that the code was properly tested in a timely manner.
>
> </ancient history>

Thanks for the background. And of course, I think it's a great policy
by and large. It's just important to be clear what we mean when we
ask whether or not a function is "tested". If you wrap a function
immediately in a wrapper function, the purpose of which is to supply
documentation/tests in one place, then the wrapped function is being
tested--you don't need a formal proof to see that. It's just a simple
application of the DRY principle :)

Jeroen Demeyer

unread,

May 17, 2017, 4:06:56 AM5/17/17

to sage-...@googlegroups.com

On 2017-05-15 15:56, Erik Bray wrote:
> The point is that no matter how well motivated a coding guideline is
> it's just that--a guideline (maybe even a very important one!). But
> one never bend over backwards to write non-idiomatic, convoluted, or
> not well-motivated code just to satisfy a guideline.

Totally +1

Too bad that too many people take the coverage script too literal.

For functions which are meant to be called directly by end users,
doctests are essential because they should show examples of how the
function should actually be used. However, for internal functions or
things like __init__, it is often not easy to write meaningful
docstrings. Something I regularly see is that people literally write the
same doctest twice in two different places (say, an end-user function
and an internal function) just to satisfy a stupid script.

Jeroen Demeyer

unread,

May 17, 2017, 4:11:17 AM5/17/17

to sage-...@googlegroups.com

On 2017-05-15 18:39, Erik Bray wrote:
> It's possible
> using something like coverage.py will just work, but I haven't tried
> it yet.

For Cython, see
http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html#enabling-coverage-analysis

Kwankyu Lee

unread,

May 17, 2017, 5:08:50 AM5/17/17

to sage-devel

For functions which are meant to be called directly by end users,
doctests are essential because they should show examples of how the
function should actually be used.

For the case that a function is meant to be internal but can be accessed by an end user via say tab completion (which I thought is the case for the original example), it should carry a docstring to guide the user away.

If the name of the function starts with an underline, this should be enough to frighten away the user.

Vincent Delecroix

unread,

May 17, 2017, 6:20:26 AM5/17/17

to sage-...@googlegroups.com

I do definitely support the idea of doctests for illustrative purposes.
However, as already mentioned, relying on them for the whole testing
framework is bad IMHO. Here are several reasons:
- random seeds are always the same
- impossible to have complicated test code
- impossible to tune the tests to make it more intensive (let say I
want to run the tests for 1000 inputs rather than 10)
- TESTS part are often unreadable
- `TestSuite(my_object).run()` is not illustrative

As underlined by Jeroen, our 100% coverage policy made some people do
copy-paste. As a consequence, some quite long doctests get executed
twice and makes `sage -tp --all` much slower.

I would suggest:

1) Change the guildelines to become "have 100% doctest coverage for
*public* methods/functions". We could even make it stronger by adding
"Test and document each function with each possible range of options.
Each code path (including errors) must be tested".

2) Introduce an alternative way of testing Sage code in the flavour of
unit tests (eg with python files in SAGE_ROOT/tests/) and move all the
content of the TESTS blocks into this framework.

Vincent

Jeroen Demeyer

unread,

May 17, 2017, 7:53:03 AM5/17/17

to sage-...@googlegroups.com

On 2017-05-17 12:19, Vincent Delecroix wrote:
> 1) Each code path (including errors) must be tested".

You can easily translate this requirement as "require 100% *line*
coverage" instead of *function* coverage which is what is current
sage-coverage script tests.

> 2) Introduce an alternative way of testing Sage code in the flavour of
> unit tests

We have had this discussion before... which problem would that solve?

Doctesting is just a tool. You can run the same tests as doctest or as
py.test test: a test is just some code which is run. There is nothing
magical about "unit testing" that it cannot be done using doctests.

Before we add a new testing framework, I would like to see at least one
compelling advantage of it.

Jeroen Demeyer

unread,

May 17, 2017, 8:00:08 AM5/17/17

to sage-...@googlegroups.com

On 2017-05-17 12:19, Vincent Delecroix wrote:

> Here are several reasons:
> - random seeds are always the same

That can easily be fixed by explicitly changing the random seed in the
doctest (probably some helper context should be provided for this)

> - impossible to have complicated test code

First of all, doctests can contain arbitrary code, so it can be as
complicated as you want.

Second, you could write a dedicated test function like

def complicated_test_function(many, arguments, here):
"""
TESTS:

sage: complicated_test_function(1, 2, 3)
"""

This is like a unit test without requiring a new testing framework.

> - impossible to tune the tests to make it more intensive (let say I
> want to run the tests for 1000 inputs rather than 10)

You can run many iterations of the doctests.

> - TESTS part are often unreadable

If the test code is complicated, using a different test framework will
not help. You would just be running the same unreadable code in a
different way.

> - `TestSuite(my_object).run()` is not illustrative

I don't understand what you mean with this.

Jori Mäntysalo

unread,

May 17, 2017, 8:05:20 AM5/17/17

to sage-...@googlegroups.com

On Wed, 17 May 2017, Jeroen Demeyer wrote:

>> Here are several reasons:
>> - random seeds are always the same
>
> That can easily be fixed by explicitly changing the random seed in the
> doctest (probably some helper context should be provided for this)

Currently sage -t --randorder <somenumber> will give quite massive output
of failing tests...

What kind of testing we are talking now? As an example we can create a
random graph g and check that g.is_planar() == (g.genus() == 0) etc.
Should we have that kind of randomized test? (I do those, but have not
included them to Sage code.)

--
Jori Mäntysalo

Michael Orlitzky

unread,

May 17, 2017, 8:06:50 AM5/17/17

to sage-...@googlegroups.com

On 05/17/2017 04:07 AM, Jeroen Demeyer wrote:
>
> Totally +1
>
> Too bad that too many people take the coverage script too literal.
>
> For functions which are meant to be called directly by end users,
> doctests are essential because they should show examples of how the
> function should actually be used. However, for internal functions or
> things like __init__, it is often not easy to write meaningful
> docstrings. Something I regularly see is that people literally write the
> same doctest twice in two different places (say, an end-user function
> and an internal function) just to satisfy a stupid script.
>

I like that we err on the side of safety. Even internal functions need
to be documented:

* What is the function for? What is it supposed to do?

* What parameters does it take? What are their default values?

* What does it return?

* Does it throw any exceptions?

* Can you show me an example of how I might call it?

All of those are just as useful to me as a developer as they would be
for the user of the e.g. matrix() function. Enforcing the doctest rule
ensures that the necessary documentation is there.

You might wind up with some extra verbosity, but if you get rid of the
rule, people just won't write docs. As an example, consider pretty much
any project on Github that doesn't have a "you must document your
functions" rule. The Sage source code is much nicer to work with than
most projects, largely due to that rule.

Marc Mezzarobba

unread,

May 17, 2017, 8:16:54 AM5/17/17

to sage-...@googlegroups.com

Jeroen Demeyer wrote:

>>- random seeds are always the same
>
> That can easily be fixed by explicitly changing the random seed in the
> doctest (probably some helper context should be provided for this)

Or, in the case of complicated tests done by a dedicated function, by
using sage.misc.random_testing.

--
Marc

Marc Mezzarobba

unread,

May 17, 2017, 8:26:01 AM5/17/17

to sage-...@googlegroups.com

Jeroen Demeyer wrote:
> For functions which are meant to be called directly by end users,
> doctests are essential because they should show examples of how the
> function should actually be used. However, for internal functions or
> things like __init__, it is often not easy to write meaningful
> docstrings.

I suspect the policy of doctesting every function also discourages
people from breaking their code into simpler functions.

And it is not clear that it improves the documentation (or testing) even
of the helper functions themselves, because pretty often it would make
more sense to describe and test, say, a number of private methods of a
class at once, in the class docstring.

--
Marc

Vincent Delecroix

unread,

May 17, 2017, 9:27:24 AM5/17/17

to sage-...@googlegroups.com

On 17/05/2017 14:00, Jeroen Demeyer wrote:
> On 2017-05-17 12:19, Vincent Delecroix wrote:

Let me tell that my wish to have another test framework is mostly
organizational. First of all, the code in the docstrings are intended to
be read by users. I think that we would benefit from a clearer
distinction between "illustrative examples" and "tests". This is
currently achieved by EXAMPLES vs TESTS. Secondly, as I started to say
the Sage doctest framework is quite limited.

>> Here are several reasons:
>> - random seeds are always the same
>
> That can easily be fixed by explicitly changing the random seed in the
> doctest (probably some helper context should be provided for this)
>
>> - impossible to have complicated test code
>
> First of all, doctests can contain arbitrary code, so it can be as
> complicated as you want.
>
> Second, you could write a dedicated test function like
>
> def complicated_test_function(many, arguments, here):
> """
> TESTS:
>
> sage: complicated_test_function(1, 2, 3)
> """
>
> This is like a unit test without requiring a new testing framework.

And you want this to appear in the reference manual? Add a useless
function to the Sage library?

>> - impossible to tune the tests to make it more intensive (let say I
>> want to run the tests for 1000 inputs rather than 10)
>
> You can run many iterations of the doctests.

How do I run a specific doctest (let say line 135 of
sage/my_module/my_file.py) where "for i in range(10)" is replace by "for
i in range(100)"?

>> - TESTS part are often unreadable
>
> If the test code is complicated, using a different test framework will
> not help. You would just be running the same unreadable code in a
> different way.

It could be more readable since it can be orgainzed within classes

class MyTestClass:
parameter1 = xxx
parameter2 = yyy

def _test1(self):
...

def _test2(self):
...

Nobody would write such class inside TESTS blocks.

>> - `TestSuite(my_object).run()` is not illustrative
>
> I don't understand what you mean with this.

I meant "It does not help a user to read this in the documentation of
the class corresponding to "my_object"."

Erik Bray

unread,

May 18, 2017, 8:10:48 AM5/18/17

to sage-devel

I agree in part with both of you. I think some kinds of code is
difficult to test without significant test harnessing that is hard to
set up in the doctest format. This especially tends to be the case
with tests of code that touch external resources, such as networks or
databases, that you might prefer to mock for the purposes of unit
testing. Fortunately I haven't personally run into too many examples
like that in Sage, but I'm sure they exist. Doctests can also make
testing some numerical results tricky, but Sage's doctest checker has
already implemented the usual workarounds for that (parsing floats,
etc.)

As to whether or not private functions / methods should have their own
tests, I'm split. I think the "EXAMPLES" section is genuinely useful,
even for code that only programmers will see. But I have also seen
examples like Jeroen has mentioned of literally copy/pasting entire
test sequences just to satisfy this, which is completely pointless and
not useful for example purposes either. I can't find a specific
example right now but it will be things like:

def public_function():
"""
<Tests for public function>
"""

with some_context_manager_that_was_probably_added_later():
_private_implementation_of_public_function()

def _private_implementation_of_public_function():
"""
<Exact copy of tests for public_function, because hey those tests
cover this function too>
"""

and that's just silly.

Personally, when I'm writing unit tests, I will *rarely* write unit
tests specifically for private functions/methods if that code is
covered by the tests for public functions. Although I'm not a TDD
evangelist, tests for the public functions tend to be a more valuable
use of one's time, because they ensure that the public interface works
as intended, and public interfaces are less subject to change.
Whereas private functions are usually implementation details that are
highly subject to change, and there are diminishing returns in writing
specific tests for them when they are covered by the public function
tests.

That said, there are cases where if a private function is complicated
enough on its own, or is used in a complicated way by functions that
are complicated on their own, sometimes it's worth having tests for
it. One has to use their judgement, along with *line* profiling to
see if the most important bits are being covered.

If you're really concerned about keeping a number looking good, most
linting and profiling tools have a way to exclude lines (in Python
this is usually a comment like "# noqa") for those rare exceptions to
whatever rules you're otherwise trying to stick to.

Reply all

Reply to author

Forward