
doctest random output?


Leam Hall

Aug 28, 2017, 5:41:28 AM
Is this a good way to test random numeric output? It seems to work
under Python 2.6 and 3.6, but that doesn't make it 'good'.

### Code
import random

def my_thing():
    """Return a random number from 1-6.

    >>> 0 < my_thing() <= 6
    True
    >>> 6 < my_thing()
    False
    """
    return random.randint(1, 6)


if __name__ == "__main__":
    import doctest
    doctest.testmod()


### Results
python3 test_doctest.py -v
Trying:
    0 < my_thing() <= 6
Expecting:
    True
ok
Trying:
    6 < my_thing()
Expecting:
    False
ok
1 items had no tests:
__main__
1 items passed all tests:
2 tests in __main__.my_thing
2 tests in 2 items.
2 passed and 0 failed.
Test passed.

Peter Otten

Aug 28, 2017, 6:08:21 AM
Leam Hall wrote:

> Is this a good way to test random numeric output? It seems to work
> under Python 2.6 and 3.6, but that doesn't make it 'good'.
>
> ### Code
> import random
>
> def my_thing():
>     """Return a random number from 1-6.
>     >>> 0 < my_thing() <= 6
>     True
>     >>> 6 < my_thing()
>     False
>     """

These are fine as illustrative tests that demonstrate how my_thing() is
used.

If you want to test the "randomness" -- that's hard. You could run it
many times

    all(1 <= my_thing() <= 6 for _ in range(1000))

but that doesn't guarantee that the 1001st attempt won't fall outside the
specified range. You could have a look at the distribution

>>> from collections import Counter
>>> c = Counter(my_thing() for _ in range(1000))
>>> set(c) == set(range(1, 7))
True

but that *should* occasionally fail even though in practice

>>> dict(c) == {3: 1000}
True

would be a strong indication that something is broken rather than that you
are really lucky...
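Peter's distribution check, made self-contained (my_thing() is the function from the original post; the exact thresholds here are judgment calls, not guarantees -- a fair die can, in principle, fail them):

```python
import random
from collections import Counter

def my_thing():
    """Return a random number from 1-6."""
    return random.randint(1, 6)

# Sample many rolls and check that every face appears and that
# no single face accounts for the entire sample.
counts = Counter(my_thing() for _ in range(1000))
assert set(counts) == set(range(1, 7))  # all six faces seen
assert max(counts.values()) < 1000      # not a constant "die"
```

With 1000 rolls, the odds of a fair die missing a face are astronomically small, so these assertions are safe in practice even though they are not logically guaranteed.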


Leam Hall

Aug 28, 2017, 3:17:22 PM
On 08/28/2017 11:40 AM, Dennis Lee Bieber wrote:

... a bunch of good stuff ...

I'm (re-)learning Python and just trying to make sure my function works.
Not at the statistical or cryptographic level. :)

Thanks!

Leam

Steve D'Aprano

Aug 28, 2017, 9:53:28 PM
On Mon, 28 Aug 2017 07:41 pm, Leam Hall wrote:

> Is this a good way to test random numeric output? It seems to work
> under Python 2.6 and 3.6, but that doesn't make it 'good'.

That depends on what you are actually testing. If you are intending to test the
statistical properties of random, google for the Diehard tests and start by
porting them to Python.

But if you're just hoping to test your library's APIs, that's trickier than it
seems. Unfortunately, Python doesn't guarantee that the exact output of the
random module is stable across bug fix releases, except for random.random
itself. So this is safe:

def my_thing():
    """blah blah blah

    >>> random.seed(45)
    >>> my_thing()  # calls random.random
    0.738270225794931
    """

But this is not:

def my_thing():
    """blah blah blah

    >>> random.seed(45)
    >>> my_thing()  # calls random.randint
    4
    """


That makes doctesting anything related to random a PITA. Here are some
suggestions, none of them are really great:


(1) Disable doctesting for that example, and treat it as just documentation:

def my_thing():
    """blah blah blah

    >>> my_thing()  # doctest: +SKIP
    4
    """


(2) Monkey-patch the random module for testing. This is probably the worst idea
ever, but it's an idea :-)


def my_thing():
    """blah blah blah

    >>> import random
    >>> save = random.randint
    >>> try:
    ...     random.randint = lambda a, b: 4
    ...     my_thing()
    ... finally:
    ...     random.randint = save
    4
    """

That makes for a fragile test and poor documentation.


(3) Write your functions to take an optional source of randomness, and then in
your doctests set them:

def my_thing(randint=None):
    """blah blah blah

    >>> my_thing(randint=lambda a, b: 4)
    4
    """
    if randint is None:
        from random import randint
    ...
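For concreteness, option 3 fleshed out into a runnable sketch (the function body and parameter name are illustrative; the elided "..." above is filled in here with a plain die roll):

```python
import random

def my_thing(randint=None):
    """Return a random number from 1 to 6.

    The doctest injects a fake randint, so the expected
    output is exact and stable:

    >>> my_thing(randint=lambda a, b: 4)
    4
    """
    if randint is None:
        randint = random.randint
    return randint(1, 6)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

The injected lambda doubles as documentation of the contract: my_thing passes its bounds to a randint-shaped callable.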



(4) Write your doctests to test the most general properties of the returned
results:


def my_thing(randint=None):
    """blah blah blah

    >>> num = my_thing()
    >>> isinstance(num, int) and 1 <= num <= 6
    True
    """




--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

Chris Angelico

Aug 28, 2017, 10:26:10 PM
On Tue, Aug 29, 2017 at 11:53 AM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
> (1) Disable doctesting for that example, and treat it as just documentation:
>
> def my_thing():
>     """blah blah blah
>
>     >>> my_thing()  # doctest: +SKIP
>     4
>     """

For a lot of functions, this completely destroys the value of doctesting.

> (2) Monkey-patch the random module for testing. This is probably the worst idea
> ever, but it's an idea :-)
>
> That makes for a fragile test and poor documentation.

This highlights the inherent weakness of doctests. For proper unit
testing, I would definitely recommend this. Maybe a hybrid of 1 and 2
could be organized... hmm.

> (3) Write your functions to take an optional source of randomness, and then in
> your doctests set them:
>
> def my_thing(randint=None):
>     """blah blah blah
>
>     >>> my_thing(randint=lambda a, b: 4)
>     4
>     """
>     if randint is None:
>         from random import randint
>     ...

Unless that would be useful for other reasons, not something I like
doing. Having code in your core that exists solely (or even primarily)
to make testing easier seems like doing things backwards.

> (4) Write your doctests to test the most general properties of the returned
> results:
>
>
> def my_thing(randint=None):
>     """blah blah blah
>
>     >>> num = my_thing()
>     >>> isinstance(num, int) and 1 <= num <= 6
>     True
>     """

This is what I'd probably do, tbh.

None of the options really appeal though. Personally, I'd probably
either go with #4, or maybe something like this:

def roll(sequence):
    """Roll a set of dice

    >>> from test_mymodule import *  # ensure stable RNG
    >>> roll("d12 + 2d6 + 3")
    You roll d12: 8
    You roll 2d6: 1, 6, totalling 7.
    You add a bonus of 3
    For d12 + 2d6 + 3, you total: 18
    """

and bury all the monkey-patching into test_mymodule. It can have its
own implementations of randint and whatever else you use. That way, at
least there's only one line that does the messing around. I still
don't like it though - so quite honestly, I'm most likely to go the
route of "don't actually use doctests".

ChrisA

Steven D'Aprano

Aug 29, 2017, 3:39:31 AM
On Tue, 29 Aug 2017 12:25:45 +1000, Chris Angelico wrote:

> On Tue, Aug 29, 2017 at 11:53 AM, Steve D'Aprano
> <steve+...@pearwood.info> wrote:
>> (1) Disable doctesting for that example, and treat it as just
>> documentation:
>>
>> def my_thing():
>>     """blah blah blah
>>
>>     >>> my_thing()  # doctest: +SKIP
>>     4
>>     """
>
> For a lot of functions, this completely destroys the value of
> doctesting.


"The" value? Doc tests have two values: documentation (as examples of
use) and as tests. Disabling the test aspect leaves the value as
documentation untouched, and arguably is the least-worst result. You can
always write a unit test suite to perform more detailed, complicated
tests. Doc tests are rarely exhaustive, so you need unit tests as well.



>> (2) Monkey-patch the random module for testing. This is probably the
>> worst idea ever, but it's an idea :-)
>>
>> That makes for a fragile test and poor documentation.
>
> This highlights the inherent weakness of doctests. For proper unit
> testing, I would definitely recommend this. Maybe a hybrid of 1 and 2
> could be organized... hmm.

Doc tests should be seen as *documentation first* and tests second. The
main role of the tests is to prove that the documented examples still do
what you say they do.

It makes for a horrible and uninformative help() experience to have
detailed, complex, exhaustive doc tests exercising every little corner
case of your function. That should go in your unit tests.

Possibly relevant: the unittest module has functionality to automatically
extract and run your library's doctests, treating them as unit tests. So
you can already do both.
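A sketch of the unittest/doctest integration mentioned here: the load_tests protocol lets unittest collect a module's doctests alongside its regular tests (the double function is a made-up example just to carry a doctest):

```python
import doctest
import sys
import unittest

def double(x):
    """Double x.

    >>> double(21)
    42
    """
    return 2 * x

def load_tests(loader, tests, ignore):
    # unittest calls this hook when loading the module; appending a
    # DocTestSuite makes the doctests run as ordinary test cases.
    tests.addTests(doctest.DocTestSuite(sys.modules[__name__]))
    return tests

# run with: python -m unittest <this module>
```

So the docstring stays short and readable, while the exhaustive cases live in regular TestCase classes in the same file.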



>> (3) Write your functions to take an optional source of randomness, and
>> then in your doctests set them:
>>
>> def my_thing(randint=None):
>>     """blah blah blah
>>
>>     >>> my_thing(randint=lambda a, b: 4)
>>     4
>>     """
>>     if randint is None:
>>         from random import randint
>>     ...
>
> Unless that would be useful for other reasons, not something I like
> doing. Having code in your core that exists solely (or even primarily)
> to make testing easier seems like doing things backwards.

I see your point, and I don't completely disagree. I'm on the fence about
this one. But testing is important, and we often write code to make
testing easier, e.g. pulling out a complex but short bit of code into its
own function so we can test it, using dependency injection, etc. Why
shouldn't we add hooks to enable testing? Not every function needs such a
hook, but some do.

See, for example, "Enemies of Test Driven Development":

https://jasonmbaker.wordpress.com/2009/01/08/enemies-of-test-driven-development-part-i-encapsulation/


In Python, we have the best of both worlds: we can flag a method as
private, and *still* test it! So in a sense, Python's very design has
been created specifically to allow testing.


For a dissenting view, "Are Private Methods a Code Smell?":

http://carlosschults.net/en/are-private-methods-a-code-smell/



>> (4) Write your doctests to test the most general properties of the
>> returned results:
>>
>>
>> def my_thing(randint=None):
>>     """blah blah blah
>>
>>     >>> num = my_thing()
>>     >>> isinstance(num, int) and 1 <= num <= 6
>>     True
>>     """
>
> This is what I'd probably do, tbh.

Sometimes that's sufficient. Sometimes it's not. It depends on the
function.

For example, imagine a function that returns a randomly selected prime
number. The larger the prime, the less likely it is to be selected, but
there's no upper limit. So you write:

>>> num = my_thing()
>>> isinstance(num, int) and 2 <= num
True


Not very informative as documentation, and a lousy test too.


> None of the options really appeal though. Personally, I'd probably
> either go with #4, or maybe something like this:
>
> def roll(sequence):
>     """Roll a set of dice
>
>     >>> from test_mymodule import *  # ensure stable RNG
>     >>> roll("d12 + 2d6 + 3")
>     You roll d12: 8
>     You roll 2d6: 1, 6, totalling 7.
>     You add a bonus of 3
>     For d12 + 2d6 + 3, you total: 18
>     """
>
> and bury all the monkey-patching into test_mymodule.


Wait... are you saying that importing test_mymodule monkey-patches the
current library? And doesn't un-patch it afterwards? That's horrible.

Or are you saying that test_module has its own version of roll(), and so
you're using *that* version instead of the one in the library?

That's horrible too.

I think that once you are talking about monkey-patching things in order
to test them, you should give up on doc tests and use unittest instead.
At least then you get nice setUp and tearDown methods that you can use.


> It can have its own
> implementations of randint and whatever else you use. That way, at least
> there's only one line that does the messing around. I still don't like
> it though - so quite honestly, I'm most likely to go the route of "don't
> actually use doctests".

Are you saying don't use doctests for *this* problem, or don't use them
*at all*?




--
Steven D'Aprano
“You are deluded if you think software engineers who can't write
operating systems or applications without security holes, can write
virtualization layers without security holes.” —Theo de Raadt

Chris Angelico

Aug 29, 2017, 4:41:05 AM
On Tue, Aug 29, 2017 at 5:39 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> On Tue, 29 Aug 2017 12:25:45 +1000, Chris Angelico wrote:
>
>> For a lot of functions, this completely destroys the value of
>> doctesting.
>
>
> "The" value? Doc tests have two values: documentation (as examples of
> use) and as tests. Disabling the test aspect leaves the value as
> documentation untouched, and arguably is the least-worst result. You can
> always write a unit test suite to perform more detailed, complicated
> tests. Doc tests are rarely exhaustive, so you need unit tests as well.

You can have a docstring that isn't crafted to be runnable tests. The
point about doc tests is that they're executable documentation - the
point of them is to be tests, not just docs. You can always write your
unit tests separately, and let your docstrings merely be
documentation, and then none of this matters.

> For example, imagine a function that returns a randomly selected prime
> number. The larger the prime, the less likely it is to be selected, but
> there's no upper limit. So you write:
>
> >>> num = my_thing()
> >>> isinstance(num, int) and 2 <= num
> True
>
>
> Not very informative as documentation, and a lousy test too.

Yes, but you could have some sort of primality test on it.

>>> is_prime(my_thing())
True

Even if all you have is a "probably prime" test, that would still make
for better documentation AND better testing than no test at all.

> Wait... are you saying that importing test_mymodule monkey-patches the
> current library? And doesn't un-patch it afterwards? That's horrible.
>
> Or are you saying that test_module has its own version of roll(), and so
> you're using *that* version instead of the one in the library?
>
> That's horrible too.

My original plan was to have *a function in* that module that does the
monkey-patching, but I seem to have not actually typed that part in...
mea culpa. I agree that merely importing your test helpers shouldn't
do the changes! Even with "import test_mymodule;
test_mymodule.stable()" it's still just one extra line (better than
the full version).

Not un-patching it afterwards? Yes. Since part of its purpose is to
seed the RNG with a fixed value, it's not really possible or practical
to "undo" that, and so I wouldn't worry too much about an "afterwards"
- after testing, you exit the interpreter, if you want to get back to
normality.
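The helper described here might be as small as this (the module name test_mymodule and the seed are hypothetical -- the point is that one explicit call makes rolls repeatable):

```python
# test_mymodule.py -- hypothetical doctest helper: an explicit
# stable() call seeds the global RNG so subsequent rolls repeat.
import random

def stable(seed=12345):
    """Seed the global RNG so doctest examples see fixed rolls."""
    random.seed(seed)
```

A doctest would then begin with `>>> import test_mymodule; test_mymodule.stable()` before showing exact roll output, and, as noted, there is no meaningful "undo" for reseeding.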

>> It can have its own
>> implementations of randint and whatever else you use. That way, at least
>> there's only one line that does the messing around. I still don't like
>> it though - so quite honestly, I'm most likely to go the route of "don't
>> actually use doctests".
>
> Are you saying don't use doctests for *this* problem, or don't use them
> *at all*?

For this and any other problem where doctesting is impractical.
Because let's face it, laziness is a big thing. If it's too much
hassle to make a docstring executable, I'm just not going to make it
executable. Which has the unfortunate downside of allowing the
docstrings to get out of sync with the code, but that's the cost.

ChrisA

Peter Otten

Aug 29, 2017, 5:12:59 AM
Steven D'Aprano wrote:

> Wait... are you saying that importing test_mymodule monkey-patches the
> current library? And doesn't un-patch it afterwards? That's horrible.

There's something in the library, unittest.mock, that makes this relatively
safe -- if not painless:

    with mock.patch("random.randint", side_effect=[42]) as randint:
        self.assertEqual(my_module.my_thing(), 42)
    randint.assert_called_once_with(1, 6)

and sometimes monkey-patching may be a necessary evil to verify that a
portion of the code that is buried a bit too deep is called as expected.

However, in this case it tests that the code is what it is rather than what
it does. Good tests would allow for replacing random.randint() with
random.randrange() or random.SystemRandom().randrange() and still succeed.
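Peter's fragment expanded into a self-contained test (my_thing here stands in for the OP's function; the patch undoes itself when the with-block exits, which addresses the un-patching worry above):

```python
import random
import unittest
from unittest import mock

def my_thing():
    """Return a random number from 1 to 6."""
    return random.randint(1, 6)

class MyThingTest(unittest.TestCase):
    def test_uses_randint(self):
        # While patched, random.randint returns the canned value.
        with mock.patch("random.randint", side_effect=[42]) as randint:
            self.assertEqual(my_thing(), 42)
        randint.assert_called_once_with(1, 6)
        # After the with-block, the real randint is restored.
        self.assertTrue(1 <= my_thing() <= 6)

# run with: python -m unittest <this file>
```

Note that this is exactly the kind of test Peter warns about: it pins the implementation to random.randint(1, 6), so switching to randrange() would break it even though the behaviour is unchanged.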

Pavol Lisy

Aug 29, 2017, 9:15:49 AM

I am not sure what best practice is, but I would use sys.exit to
propagate failure (for better scriptability). For example:

if __name__ == "__main__":
    import doctest
    import sys
    sys.exit(doctest.testmod()[0])

But maybe I am wrong and a non-zero exit status is just for errors in code?

---

If you don't need something at a scientific level (which is hard, see:
https://www.random.org/analysis/ ) you could probably use the fact that
random sequences are "hard" to compress. For example, something like
this could help:

>>> import zlib
>>> A = [my_thing() for i in range(100)]
>>> 50 < len(zlib.compress(bytes(A))) < 70
True

But be careful!! Other randint parameters would need other limit values!

# for randint(1, 6) you have a distribution of compressed lengths like this
collections.Counter(
    len(zlib.compress(bytes(random.randint(1, 6) for i in range(100))))
    for j in range(100000))
Counter({55: 1, 56: 46, 57: 834, 58: 7349, 59: 31035, 60: 42884,
         61: 16434, 62: 1397, 63: 20})

# but for randint(1, 16) you have a distribution like this!
collections.Counter(
    len(zlib.compress(bytes(random.randint(1, 16) for i in range(100))))
    for j in range(100000))
Counter({71: 4, 72: 412, 73: 11291, 74: 27392, 75: 28293, 76: 29103,
         77: 3296, 78: 209})

So maybe it helps you, maybe not :)

Chris Angelico

Aug 29, 2017, 2:04:49 PM
On Wed, Aug 30, 2017 at 1:39 AM, Stefan Ram <r...@zedat.fu-berlin.de> wrote:
> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
>>Testing randomness itself requires statistical tests...
>
> A perfectly random coin /can/ yield "heads" a thousand times
> in sequence (which is very unlikely, but possible).
>
> This behavior should fail nearly all statistical tests for
> randomness. Yet the generator was perfectly random.
>
> So the tests for randomness give correct answers only with
> a certain probability ("confidence"). Insofar the concept of
> randomness is "fuzzy" when defined as an observable
> property of an otherwise "black box".
>
> The tests in the OP test only what one can test with
> certainty, which might be reasonable.
>
> To gain confidence in a function providing sufficiently
> "random" results other measures might be added, such as
> a code review (view the generator as a "white box").

The point of unit testing (of which doctests are a form) is generally
that you test THIS function, without needing to test everything else.
Testing whether random.random() is "sufficiently random" is not the
point of the doctest. For a non-trivial example, consider my dice
roller; I don't have a Python function for it, but it's a feature of
my D&D MUD. You pass it a string that details the dice you want to
roll, and it rolls them:

>>> roll d20
You roll d20: 3
>>> roll d20 + 5
You roll d20: 14
You add a bonus of 5
For d20 + 5, you total: 19
>>> roll 3d6+ d8 -2
You roll 3d6: 1, 5, 5, totalling 11.
You roll d8: 2
You add a bonus of -2
For 3d6+ d8 -2, you total: 11

This is fine as documentation. The trouble is that, for testing, we
have to logically accept any integer from 1 to 20 as "correct", and
doctest doesn't support that. I don't care, in this test, whether the
dice roller is "fair" (that it has equal probability of returning each
value) - what I care about is whether, when you enter a particular
string of dice descriptions, you get back a proper pattern of rolls.

And I don't think doctest is flexible enough to handle this without
some sort of monkeypatching - unless you code your function to use
NOTHING other than random.random(), and then you can reliably just
seed the RNG.

ChrisA

Man with No Name

Apr 17, 2019, 12:12:48 AM
On Monday, August 28, 2017 at 4:41:28 AM UTC-5, Leam Hall wrote:
> Is this a good way to test random numeric output? It seems to work
> under Python 2.6 and 3.6, but that doesn't make it 'good'.

There is no good way to doctest or unittest random output. If doctest had scoping awareness and programmers could re-use variables defined in outer docstring scopes, then you could put a seed value in the module-level scope.

Marcos

duncan smith

Apr 17, 2019, 12:10:54 PM
If it's supposed to generate values that follow a particular
distribution, and they don't, then it doesn't work. I had a bunch of
functions for generating values from various distributions. My line
manager told me to just set the seed so that the outputs were
deterministic. Worse than no test at all. It relied on my original
implementation (that generated the values for comparison) working, and
only tested if the implementation (of random.random() or my code) had
changed. So I ignored my boss and simply generated samples of values and
tested using a KS goodness of fit test. The tests should fail 5% of the
time. Run them a few times and check that no individual test is failing
consistently. I don't see how you can do much better than that. Of
course, this doesn't relate directly to doctest.
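Duncan's sample-and-test approach, sketched with a Pearson chi-square statistic instead of KS (pure stdlib; KS suits continuous distributions, chi-square a discrete die; the 11.07 cutoff is the 5% critical value for 5 degrees of freedom):

```python
import random
from collections import Counter

def chi_square_die(rolls, sides=6):
    """Pearson chi-square statistic of rolls against a fair die."""
    expected = len(rolls) / sides
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))

rolls = [random.randint(1, 6) for _ in range(6000)]
stat = chi_square_die(rolls)
# A fair die falls below the 5% critical value about 95% of the
# time, so, as Duncan says, rerun a few times before concluding
# anything from a single failure.
print(stat, stat < 11.07)
```

A loaded die fails loudly: chi_square_die([3] * 6000) comes out at 30000.0, far above any sensible cutoff.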

Duncan
