Unittests vs doctests

Tobia...@gmx.de

Sep 4, 2020, 10:02:27 AM
to sage-devel

Hi everybody,

I'm currently in the process of cleaning up my code implementing symplectic structures in Sage. While doing so, I noticed that a lot of doctests in the existing code test rather elementary things. These are often not particularly important for a user of the method; they are really unit tests that verify correct behavior in some edge case. For this reason, I wanted to move these doctests to unit tests, and then realized that the `tests` folder is almost empty. So obviously I'm missing something here.

Finding almost no unit tests in Sage made me a bit uncertain, so I did a bit of research. The general opinion (echoed, for example, in https://stackoverflow.com/questions/361675/python-doctest-vs-unittest) seems to be that doctests verify that the documentation is correct (in sync with the implementation), while unit tests make sure that the code is correct. This also makes sense, since you usually don't want to bloat the documentation with edge cases, and IDEs have limited support for doctests, while unit tests get all the support of normal Python code, including debugging.

So what's the Sage convention concerning unit tests vs doctests?

John H Palmieri

Sep 4, 2020, 11:35:45 AM
to sage-devel
The Sage convention is to use doctests:

EXAMPLES::

    sage: test1
    output
    ...

For tests that exercise the code but should not (by default) appear in the documentation, use "TESTS" blocks:

TESTS::

    sage: test2
    output
    ...
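
As a rough illustration of the two block types, here is a minimal sketch in plain Python. It assumes the standard-library doctest module and the `>>>` prompt instead of Sage's `sage:` prompt, so it runs without Sage; the function `halve` and the helper `run_halve_doctests` are hypothetical names made up for this example.

```python
import doctest


def halve(n):
    """
    Return half of the even integer ``n``.

    EXAMPLES::

        >>> halve(10)
        5

    TESTS:

    Edge cases that would clutter the user-facing examples::

        >>> halve(0)
        0
        >>> halve(-4)
        -2
        >>> halve(3)
        Traceback (most recent call last):
        ...
        ValueError: 3 is not even
    """
    if n % 2:
        raise ValueError(f"{n} is not even")
    return n // 2


def run_halve_doctests():
    """Run the docstring examples of ``halve`` and return the doctest results."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Plain doctest treats EXAMPLES and TESTS alike: both are just prompts.
    for test in finder.find(halve, name="halve", globs={"halve": halve}):
        runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(run_halve_doctests())
```

Both blocks are executed as tests here; the split only signals which examples are meant for readers and which are edge-case checks.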


--
John

Matthias Koeppe

Sep 4, 2020, 1:30:53 PM
to sage-devel
On Friday, September 4, 2020 at 7:02:27 AM UTC-7, Tobia...@gmx.de wrote:
> I noticed that there are a lot of doctests in the existing code that test rather elementary things. These are often not utterly important for a user of the method, but are rather unit tests that verify the correct behavior in some edge case. [...]

Just a quick note that in addition to the doctests, Sage also uses _test... methods, which are defined in abstract base classes to test that subclasses implement the protocol correctly. 
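
The idea of generic `_test...` methods living in an abstract base class can be sketched in plain Python roughly as follows. This is not Sage's actual implementation: `FiniteMagma`, `run_protocol_tests` and the two subclasses are hypothetical names invented for the illustration, and the runner is a simplified stand-in for whatever Sage uses.

```python
from abc import ABC, abstractmethod


class FiniteMagma(ABC):
    """Hypothetical abstract base class: a finite set with a binary operation."""

    @abstractmethod
    def elements(self):
        """Return the list of elements."""

    @abstractmethod
    def op(self, a, b):
        """Return the product of ``a`` and ``b``."""

    # Generic checks, written once here and inherited by every subclass.
    def _test_closure(self):
        elems = self.elements()
        for a in elems:
            for b in elems:
                assert self.op(a, b) in elems, f"{a}*{b} leaves the set"

    def _test_associativity(self):
        elems = self.elements()
        for a in elems:
            for b in elems:
                for c in elems:
                    assert self.op(self.op(a, b), c) == self.op(a, self.op(b, c))


def run_protocol_tests(obj):
    """Call every ``_test_*`` method of ``obj``; return the names that failed."""
    failures = []
    for name in sorted(dir(obj)):
        if name.startswith("_test_"):
            try:
                getattr(obj, name)()
            except AssertionError:
                failures.append(name)
    return failures


class AdditionMod3(FiniteMagma):
    def elements(self):
        return [0, 1, 2]

    def op(self, a, b):
        return (a + b) % 3


class SubtractionMod3(FiniteMagma):
    """Deliberately not associative, so one generic test should fail."""

    def elements(self):
        return [0, 1, 2]

    def op(self, a, b):
        return (a - b) % 3


if __name__ == "__main__":
    print(run_protocol_tests(AdditionMod3()))
    print(run_protocol_tests(SubtractionMod3()))
```

A subclass only supplies `elements` and `op`; the inherited checks then verify that it honours the contract of the base class.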


Tobia...@gmx.de

Sep 4, 2020, 2:18:18 PM
to sage-devel
Thanks for the quick answers. It's good to know that sage does have a distinction between classical doctests and unit tests. Is there a deeper reason than tradition that the latter is implemented as doc tests using TESTS, instead of more conventional approaches using pytest or nose? Refs https://trac.sagemath.org/ticket/28936

kcrisman

Sep 5, 2020, 9:47:35 AM
to sage-devel
On Friday, September 4, 2020 at 2:18:18 PM UTC-4 Tobia...@gmx.de wrote:
> Thanks for the quick answers. It's good to know that sage does have a distinction between classical doctests and unit tests. Is there a deeper reason than tradition that the latter is implemented as doc tests using TESTS, instead of more conventional approaches using pytest or nose? Refs https://trac.sagemath.org/ticket/28936


One quick answer to this is that it is often helpful for developers (and even users) to be able to see these in the documentation via ? very easily. A lot of the so-called unit tests also serve to show why we coded things that way in the first place. And many test previously broken behavior and link to Trac, so there is more of a "one-stop shop" for all of this. Naturally, there could be philosophical disagreements about whether this is best, but remember also that Sage is very much a project where the developers are largely the same people as the users.

Sébastien Labbé

Sep 5, 2020, 10:52:27 AM
to sage-devel
Very early in Sage's development, it became a requirement that all new code getting into Sage be 100% doctested (see the command sage -coverage <file/folder>). Also, a goal in the first years was to increase the doctest coverage of the Sage library, which went from a low 60% to above 90% (I don't know where we are now, but I would guess around 95%). One explanation is that requiring 100% doctested code made the community focus on that (it already takes a lot of energy), which left unit tests little used and little known (I have never used them myself, so I don't know what a unit test can do that a doctest can't).

I do remember some discussion on sage-devel about it but can't find it. Possibly the opinion of the early developers was important in that orientation. Personally, I find it very practical that the triangle (documentation + tests + code, which must always be in sync) lives in one place, in a single file, when I want to write or read such code.

Sébastien

Sébastien Labbé

Sep 5, 2020, 11:00:41 AM
to sage-devel
> (I don't know where we are now, but I would guess it must be 95%).

I was curious to check where we are now:

$ sage -coverage --summary src/sage

Global score: 96.5% (49627 of 51409)

480 files with wrong documentation
1292 functions with no doc
490 functions with no test
473 doctest are potentially wrong

Files with wrong documentation:
-------------------------------
...
 
As you can see, `sage -coverage` does not check whether each of the 51409 functions in sage has a unit test somewhere else making sure it works. But it says that 49627 of them have a doctest.
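
To make concrete what this kind of coverage figure measures, here is a toy sketch that counts functions whose docstring contains a doctest prompt. It is only an assumption-laden imitation and says nothing about how `sage -coverage` is actually implemented: `doctest_coverage` and `SAMPLE` are hypothetical names, and the default prompt is Python's `>>> ` rather than `sage: `.

```python
import ast


def doctest_coverage(source, prompt=">>> "):
    """Return (tested, total, missing_names) for the functions defined in ``source``."""
    tree = ast.parse(source)
    total = 0
    tested = 0
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            total += 1
            doc = ast.get_docstring(node) or ""
            # A function counts as "doctested" if its docstring has a prompt.
            if prompt in doc:
                tested += 1
            else:
                missing.append(node.name)
    return tested, total, sorted(missing)


SAMPLE = '''
def documented(x):
    """
    Double ``x``.

        >>> documented(2)
        4
    """
    return 2 * x

def no_test(x):
    """Docstring without any example."""
    return x

def no_doc(x):
    return x
'''

if __name__ == "__main__":
    tested, total, missing = doctest_coverage(SAMPLE)
    print(f"Score: {100 * tested / total:.1f}% ({tested} of {total})")
    print("Missing:", missing)
```

Like the real report, this only records that a doctest exists; it cannot tell whether the doctest exercises the function in any meaningful way.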

Nils Bruin

Sep 5, 2020, 1:45:53 PM
to sage-devel
On Friday, September 4, 2020 at 11:18:18 AM UTC-7, Tobia...@gmx.de wrote:
> Thanks for the quick answers. It's good to know that sage does have a distinction between classical doctests and unit tests. Is there a deeper reason than tradition that the latter is implemented as doc tests using TESTS, instead of more conventional approaches using pytest or nose? Refs https://trac.sagemath.org/ticket/28936

I think an important reason is that a lot of authors don't know what unit tests are or what is desirable in writing them, but everybody can easily understand what doctests are. It's easy to see examples of them, because they're right there with the code. So, the threshold for writing doctests is much lower. Some test coverage is much better than none, so requiring people to contribute tests with the lowest-threshold method probably gives us the most benefit (in getting tests while not scaring away contributors).

Tobia...@gmx.de

Sep 7, 2020, 10:13:28 AM
to sage-devel
I wasn't aware of the fact that many contributors to sage are users in the first place, and contributors second. So they are probably more familiar with sage's doctests than with unittests. That's good input, thanks!

For me, it's actually the converse: I didn't know what doctests are, but was familiar with the concept of unit tests. What do you think about recommending both ways to write unit tests (as doctest-like TESTS, and unittest python files in src/sage/tests) in the developer documentation, to help onboarding of developers from the broader python community? (Keeping the requirement that methods need to have doctests using the EXAMPLES tag.)

As for the advantages of unittests over doctests (given my limited experience with the latter):
- Can easily run and debug single tests
- Get full intellisense and linting support for writing tests
- Easily share common initialization / teardown code between tests
- Stronger assertions not only relying on string-comparisons
- Supports generation of tests based on (external) data, e.g. if you want to test a method against a range of input -> output pairs
- Ability to mock external objects and services (web resources, libraries etc) and use dependency injection
- Test code can rely on additional libraries and other code, that doesn't need to be shipped
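
Several of the points in this list can be seen in a short sketch using the standard-library unittest module: shared setup, data-driven sub-tests, a non-string assertion, and a mocked external service. Everything named here (`fetch_rate`, `ExampleTests`, `run_example_tests`) is hypothetical and unrelated to Sage's code base.

```python
import unittest
from unittest import mock


def fetch_rate(client):
    """Hypothetical function under test: read a number from an external service."""
    return float(client.get("rate"))


class ExampleTests(unittest.TestCase):
    def setUp(self):
        # Shared initialisation, run before each test method.
        self.cases = [(0, 0), (1, 1), (2, 4), (-3, 9)]

    def test_square_table(self):
        # Data-driven: one sub-test per (input, expected) pair.
        for x, expected in self.cases:
            with self.subTest(x=x):
                self.assertEqual(x * x, expected)

    def test_numeric_assertion(self):
        # Compares values with a tolerance rather than printed strings.
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=12)

    def test_mocked_service(self):
        # The external client is replaced by a mock object.
        client = mock.Mock()
        client.get.return_value = "1.25"
        self.assertEqual(fetch_rate(client), 1.25)
        client.get.assert_called_once_with("rate")


def run_example_tests():
    """Load and run the test case above; return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_example_tests()
```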

TB

Sep 7, 2020, 1:53:09 PM
to sage-...@googlegroups.com
On 07/09/2020 17:13, Tobia...@gmx.de wrote:
> I wasn't aware of the fact that many contributors to sage are users in
> the first place, and contributors second. So they are probably more
> familiar with sage's doctests than with unittests. That's good input,
> thanks!
>
> For me, it's actually the converse: I didn't know what doctests are, but
> was familiar with the concept of unit tests. What do you think about
> recommending both ways to write unit tests (as doctest-like TESTS, and
> unittest python files in src/sage/tests) in the developer documentation,
> to help onboarding of developers from the broader python community?
> (Keeping the requirement that methods need to have doctests using the
> EXAMPLES tag.)
>
> As for the advantages of unittests over doctests (given my limited
> experience with the latter):
> - Can easily run and debug single tests
> - Get full intellisense and linting support for writing tests
> - Easily share common initialization / teardown code between tests
> - Stronger assertions not only relying on string-comparisons
> - Supports generation of tests based on (external) data, e.g. if you
> want to test a method against a range of input -> output pairs
> - Ability to mock external objects and services (web resources,
> libraries etc) and use dependency injection
> - Test code can rely on additional libraries and other code, that
> doesn't need to be shipped
For the first point you mention, please see the thread "doctest a single
function or class" at
https://groups.google.com/forum/#!msg/sage-devel/mlcfDY-Hur0/Ws8sIP3SAwAJ
that ended with an astute observation by embray...
I would find it useful. Better yet, is having an editor keybinding to
run the doctest of the current method or block (by cursor position).

Assertions in a doctest can be essentially arbitrary, like foo(x,y) ==
bar(y,x) with the output True or False.
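
A small sketch of both remarks, using only Python's standard doctest module: the examples assert a relation and expect `True` or `False`, and a helper runs the doctests of one function only. The functions `add`, `sub` and `run_doctests_of` are hypothetical, and the helper is not the Sage doctest runner.

```python
import doctest


def add(x, y):
    """
    >>> add(2, 3) == add(3, 2)
    True
    >>> all(add(x, 0) == x for x in range(5))
    True
    """
    return x + y


def sub(x, y):
    """
    >>> sub(2, 3) == sub(3, 2)
    False
    """
    return x - y


def run_doctests_of(func):
    """Run only the docstring examples of ``func``; return (failures, tries)."""
    runner = doctest.DocTestRunner(verbose=False)
    globs = {func.__name__: func}
    for test in doctest.DocTestFinder().find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.failures, runner.tries


if __name__ == "__main__":
    print(run_doctests_of(add))
    print(run_doctests_of(sub))
```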

Regards,
TB

Nils Bruin

Sep 7, 2020, 5:27:25 PM
to sage-devel
On Monday, September 7, 2020 at 7:13:28 AM UTC-7, Tobia...@gmx.de wrote:
> As for the advantages of unittests over doctests (given my limited experience with the latter):
> - Can easily run and debug single tests

For the most part, cut/paste achieves that for doctests too.

> - Get full intellisense and linting support for writing tests

I don't know what intellisense is, but I guess you're worried about code in strings not being recognized as code by your development environment. Usually doctests start out life in an interactive session and are then copied over.

> - Easily share common initialization / teardown code between tests
> - Stronger assertions not only relying on string-comparisons

That's not really an issue, since you can always just test:

    sage: print(assertion)
    True

> - Supports generation of tests based on (external) data, e.g. if you want to test a method against a range of input -> output pairs

I think you could do that in a doctest; it wouldn't really fit the general setting, though.

> - Ability to mock external objects and services (web resources, libraries etc) and use dependency injection

???

> - Test code can rely on additional libraries and other code, that doesn't need to be shipped

That just makes the tests less reproducible and more fragile.

I think it's possible to write unit tests already, so if you prefer you should be able to do that. I know the category framework has some standardized testing. It may be able to be triggered from doctests, even! I don't think it would be wise to allow unit tests *instead of* doctests, though. There's real value in having almost complete (if flimsy) coverage through one platform.

I also don't think it's wise to require *both*. I already find it cumbersome to provide doctests when I contribute something. Having to learn yet another platform and fill in more boilerplate would reduce the attractiveness of contributing considerably.

Matthias Koeppe

Sep 7, 2020, 5:42:06 PM
to sage-devel
On Monday, September 7, 2020 at 2:27:25 PM UTC-7, Nils Bruin wrote:
> I think it's possible to write unit tests already, so if you prefer you should be able to do that. I know the category framework has some standardized testing. It may be able to be triggered from doctests, even!

That's right, the _test... methods are invoked by explicit calls to "TestSuite(object).run()" in doctests; see for example https://github.com/sagemath/sage/blob/develop/src/sage/categories/examples/facade_sets.py#L33

In addition to _test... methods provided by the category framework, several ABCs use them to enforce a protocol: 



