+ Examples are priceless.
+ Examples that don't work are worse than worthless.
+ Examples that work eventually turn into examples that don't.
+ Docstrings too often don't get written.
+ Docstrings that do get written rarely contain those priceless examples.
+ The rare written docstrings that do contain priceless examples eventually
turn into rare docstrings with examples that don't work. I think this one
may follow from the above ...
+ Module unit tests too often don't get written.
+ The best Python testing gets done in interactive mode, esp. trying
endcases that almost never make it into a test suite because they're so
tedious to code up.
+ The endcases that were tested interactively-- but never coded up --also
fail to work after time.
About a month ago, I tried something new: take those priceless interactive
testing sessions, paste them into docstrings, and write a module to do all
the rest by magic (find the examples, execute them, and verify they still
work exactly as advertised).
Wow -- it turned out to be the only scheme I've ever really liked, and I
like it a lot! With almost no extra work beyond what I was doing before,
tests and docstrings get written now, and I'm certain the docstring examples
are accurate. It's also caught an amazing number of formerly-insidious
buglets in my modules, from accidental changes in endcase behavior, to hasty
but inconsistent renamings.
doctest.py is attached, and it's the whole banana. Give it a try, if you
like. After another month or so of ignoring your groundless complaints,
I'll upload it to the python.org FTP contrib site. Note that it serves as
an example of its own use, albeit an artificially strained example.
winklessly y'rs - tim
# Module doctest.
# Released to the public domain 06-Mar-1999,
# by Tim Peters (tim...@email.msn.com).
# Provided as-is; use at your own risk; no warranty; no promises; enjoy!
"""Module doctest -- a framework for running examples in docstrings.
NORMAL USAGE
In normal use, end each module M with:
def _test():
import M, doctest # replace M with your module's name
return doctest.testmod(M) # ditto
if __name__ == "__main__":
_test()
Then running the module as a script will cause the examples in the
docstrings to get executed and verified:
python M.py
This won't display anything unless an example fails, in which case
the failing example(s) and the cause of the failure(s) are printed
to stdout (why not stderr? because stderr is a lame hack <0.2 wink>),
and the final line of output is "Test failed.".
Run it with the -v switch instead:
python M.py -v
and a detailed report of all examples tried is printed to stdout, along
with assorted summaries at the end.
You can force verbose mode by passing "verbose=1" to testmod, or prohibit
it by passing "verbose=0". In either of those cases, sys.argv is not
examined by testmod.
In any case, testmod returns a 2-tuple of ints, (f, t), where f is the
number of docstrings examples that failed and t is the total number of
docstring examples attempted.
WHICH DOCSTRINGS ARE EXAMINED?
In any case, M.__doc__ is searched for examples.
By default, the following are also searched:
+ All functions in M.__dict__.values(), except those whose names
begin with an underscore.
+ All classes in M.__dict__.values(), except those whose names
begin with an underscore.
By default, any classes found are recursively searched similarly, to
test docstrings in their contained methods and nested classes. Pass
"deep=0" to testmod and *only* M.__doc__ is searched.
Warning: imports can cause trouble; e.g., if you do
from XYZ import XYZclass
then XYZclass is a name in M.__dict__ too, and doctest has no way to
know that XYZclass wasn't *defined* in M. So it may try to execute the
examples in XYZclass's docstring, and those in turn may require a
different set of globals to work correctly. I prefer to do "import *"-
friendly imports, a la
import XYY
_XYZclass = XYZ.XYZclass
del XYZ
and then the leading underscore stops testmod from going nuts. You may
prefer the method in the next section.
WHAT'S THE EXECUTION CONTEXT?
By default, each time testmod finds a docstring to test, it uses a
*copy* of M's globals (so that running tests on a module doesn't change
the module's real globals). This means examples can freely use any
names defined at top-level in M. It also means that sloppy imports (see
above) can cause examples in external docstrings to use globals
inappropriate for them.
You can force use of your own dict as the execution context by passing
"globs=your_dict" to testmod instead. Presumably this would be a copy
of M.__dict__ merged with the globals from other imported modules.
WHAT IF I WANT TO TEST A WHOLE PACKAGE?
Piece o' cake, provided the modules do their testing from docstrings.
Here's the test.py I use for the world's most elaborate Rational/
floating-base-conversion pkg (which I'll distribute some day):
from Rational import Cvt
from Rational import Format
from Rational import machprec
from Rational import Rat
from Rational import Round
from Rational import utils
modules = (Cvt,
Format,
machprec,
Rat,
Round,
utils)
def _test():
import doctest
import sys
verbose = "-v" in sys.argv
for mod in modules:
doctest.testmod(mod, verbose=verbose, report=0)
doctest.master.summarize()
if __name__ == "__main__":
_test()
IOW, it just runs testmod on all the pkg modules. testmod remembers the
names and outcomes (# of failures, # of tries) for each item it's seen,
and passing "report=0" prevents it from printing a summary in verbose
mode. Instead, the summary is delayed until all modules have been
tested, and then "doctest.master.summarize()" forces the summary at the
end.
So this is very nice in practice: each module can be tested individually
with almost no work beyond writing up docstring examples, and collections
of modules can be tested too as a unit with no more work than the above.
SO WHAT DOES A DOCSTRING EXAMPLE LOOK LIKE ALREADY!?
Oh ya. It's easy! In most cases a slightly fiddled copy-and-paste of an
interactive console session works fine -- just make sure there aren't any
tab characters in it, and that you eliminate stray whitespace-only lines.
>>> # comments are harmless
>>> x = 12
>>> x
12
>>> if x == 13:
... print "yes"
... else:
... print "no"
... print "NO"
... print "NO!!!"
no
NO
NO!!!
>>>
Any expected output must immediately follow the final ">>>" or "..."
line containing the code, and the expected output (if any) extends
to the next ">>>" or all-whitespace line. That's it.
Bummer: only non-exceptional examples can be used. If anything raises
an uncaught exception, doctest will report it as a failure.
The starting column doesn't matter:
>>> assert "Easy!"
>>> import math
>>> math.floor(1.9)
1.0
and as many leading blanks are stripped from the expected output as
appeared in the code lines that triggered it.
If you execute this very file, the examples above will be found and
executed, leading to this output in verbose mode:
Running doctest.__doc__
Trying: # comments are harmless
Expecting: nothing
ok
Trying: x = 12
Expecting: nothing
ok
Trying: x
Expecting: 12
ok
Trying:
if x == 13:
print "yes"
else:
print "no"
print "NO"
print "NO!!!"
Expecting:
no
NO
NO!!!
ok
... and a bunch more like that, with this summary at the end:
2 items had no tests:
doctest.run_docstring_examples
doctest.testmod
4 items passed all tests:
7 tests in doctest
2 tests in doctest.TestClass
2 tests in doctest.TestClass.get
1 tests in doctest.TestClass.square
12 tests in 6 items.
12 passed and 0 failed.
Test passed.
"""
__version__ = 0, 0, 1
import types
_FunctionType = types.FunctionType
_ClassType = types.ClassType
_ModuleType = types.ModuleType
del types
import string
_string_find = string.find
del string
import re
PS1 = re.compile(r" *>>>").match
PS2 = "... "
del re
# Extract interactive examples from a string. Return a list of string
# pairs, (source, outcome). "source" is the source code, and ends
# with a newline iff the source spans more than one line. "outcome" is
# the expected output if any, else None. If not None, outcome always
# ends with a newline.
def _extract_examples(s):
import string
examples = []
lines = string.split(s, "\n")
i, n = 0, len(lines)
while i < n:
line = lines[i]
i = i + 1
m = PS1(line)
if m is None:
continue
j = m.end(0) # beyond the prompt
if string.strip(line[j:]) == "":
# a bare prompt -- not interesting
continue
assert line[j] == " "
j = j + 1
nblanks = j - 4 # 4 = len(">>> ")
blanks = " " * nblanks
# suck up this and following PS2 lines
source = []
while 1:
source.append(line[j:])
line = lines[i]
if line[:j] == blanks + PS2:
i = i + 1
else:
break
if len(source) == 1:
source = source[0]
else:
source = string.join(source, "\n") + "\n"
# suck up response
if PS1(line) or string.strip(line) == "":
expect = None
else:
expect = []
while 1:
assert line[:nblanks] == blanks
expect.append(line[nblanks:])
i = i + 1
line = lines[i]
if PS1(line) or string.strip(line) == "":
break
expect = string.join(expect, "\n") + "\n"
examples.append( (source, expect) )
return examples
# Capture stdout when running examples.
class _SpoofOut:
def __init__(self):
self.clear()
def write(self, s):
self.buf = self.buf + s
def get(self):
return self.buf
def clear(self):
self.buf = ""
# Display some tag-and-msg pairs nicely, keeping the tag and its msg
# on the same line when that makes sense.
def _tag_out(printer, *tag_msg_pairs):
for tag, msg in tag_msg_pairs:
printer(tag + ":")
msg_has_nl = msg[-1:] == "\n"
msg_has_two_nl = msg_has_nl and \
_string_find(msg, "\n") < len(msg) - 1
if len(tag) + len(msg) < 76 and not msg_has_two_nl:
printer(" ")
else:
printer("\n")
printer(msg)
if not msg_has_nl:
printer("\n")
# Run list of examples, in context globs. "out" can be used to display
# stuff to "the real" stdout, and fakeout is an instance of _SpoofOut
# that captures the examples' std output. Return (#failures, #tries).
def _run_examples_inner(out, fakeout, examples, globs, verbose):
import sys
TRYING, BOOM, OK, FAIL = range(4)
tries = failures = 0
for source, want in examples:
if verbose:
_tag_out(out, ("Trying", source),
("Expecting",
want is None and "nothing" or want))
fakeout.clear()
state = TRYING
tries = tries + 1
if want is None:
# this must be an exec
want = ""
try:
exec source in globs
got = fakeout.get() # expect this to be empty
state = OK
except:
etype, evalue = sys.exc_info()[:2]
state = BOOM
else:
# can't tell whether to eval or exec without trying
try:
result = eval(source, globs)
# interactive console applies repr to result, so us too
got = fakeout.get() + repr(result) + "\n"
state = OK
except SyntaxError:
# must need an exec
pass
except:
etype, evalue = sys.exc_info()[:2]
state = BOOM
if state == TRYING:
try:
exec source in globs
got = fakeout.get()
state = OK
except:
etype, evalue = sys.exc_info()[:2]
state = BOOM
assert state in (OK, BOOM)
if state == OK:
if got == want:
if verbose:
out("ok\n")
continue
state = FAIL
assert state in (FAIL, BOOM)
failures = failures + 1
out("*" * 65 + "\n")
_tag_out(out, ("Failure in example", source))
if state == FAIL:
_tag_out(out, ("Expected", want), ("Got", got))
else:
assert state == BOOM
_tag_out(out, ("Exception raised",
str(etype) + ": " + str(evalue)))
return failures, tries
# Run list of examples, in context globs. Return (#failures, #tries).
def _run_examples(examples, globs, verbose):
import sys
saveout = sys.stdout
try:
sys.stdout = fakeout = _SpoofOut()
x = _run_examples_inner(saveout.write, fakeout, examples,
globs, verbose)
finally:
sys.stdout = saveout
return x
def run_docstring_examples(f, globs, verbose=0):
"""f, globs [,verbose] -> run examples from f.__doc__.
Use globs as the globals for execution.
Return (#failures, #tries).
If optional arg verbose is true, print stuff even if there are no
failures.
"""
try:
doc = f.__doc__
except AttributeError:
return 0, 0
if not doc:
# docstring empty or None
return 0, 0
e = _extract_examples(doc)
if not e:
return 0, 0
return _run_examples(e, globs, verbose)
class _Tester:
def __init__(self):
self.globs = {} # globals for execution
self.deep = 1 # recurse into classes?
self.verbose = 0 # print lots of stuff?
self.name2ft = {} # map name to (#failures, #trials) pairs
def runone(self, target, name):
if _string_find(name, "._") >= 0:
return 0, 0
if self.verbose:
print "Running", name + ".__doc__"
f, t = run_docstring_examples(target, self.globs.copy(),
self.verbose)
if self.verbose:
print f, "of", t, "examples failed in", name + ".__doc__"
self.name2ft[name] = f, t
if self.deep and type(target) is _ClassType:
f2, t2 = self.rundict(target.__dict__, name)
f = f + f2
t = t + t2
return f, t
def rundict(self, d, name):
f = t = 0
for thisname, value in d.items():
if type(value) in (_FunctionType, _ClassType):
f2, t2 = self.runone(value, name + "." + thisname)
f = f + f2
t = t + t2
return f, t
def summarize(self):
notests = []
passed = []
failed = []
totalt = totalf = 0
for x in self.name2ft.items():
name, (f, t) = x
assert f <= t
totalt = totalt + t
totalf = totalf + f
if t == 0:
notests.append(name)
elif f == 0:
passed.append( (name, t) )
else:
failed.append(x)
if self.verbose:
if notests:
print len(notests), "items had no tests:"
notests.sort()
for thing in notests:
print " ", thing
if passed:
print len(passed), "items passed all tests:"
passed.sort()
for thing, count in passed:
print " %3d tests in %s" % (count, thing)
if failed:
print len(failed), "items had failures:"
failed.sort()
for thing, (f, t) in failed:
print " %3d of %3d in %s" % (f, t, thing)
if self.verbose:
print totalt + totalf, "tests in", len(self.name2ft), "items."
print totalt, "passed and", totalf, "failed."
if totalf:
print "***Test Failed***", totalf, "failures."
elif self.verbose:
print "Test passed."
master = _Tester()
def testmod(m, name=None, globs=None, verbose=None, deep=1, report=1):
"""m, name=None, globs=None, verbose=None, deep=1, report=1
Test examples in docstrings in functions and classes reachable from
module m, starting with m.__doc__. Names beginning with a leading
underscore are skipped.
See doctest.__doc__ for an overview.
Optional keyword arg "name" gives the name of the module; by default
use m.__name__.
Optional keyword arg "globs" gives a dict to be used as the globals
when executing examples; by default, use m.__dict__. A copy of this
dict is actually used for each docstring.
Optional keyword arg "verbose" prints lots of stuff if true, prints
only failures if false; by default, it's true iff "-v" is in sys.argv.
Optional keyword arg "deep" tests *only* m.__doc__ when false.
Optional keyword arg "report" prints a summary at the end when true,
else prints nothing at the end. In verbose mode, the summary is
detailed, else very brief.
"""
if type(m) is not _ModuleType:
raise TypeError("testmod: module required; " + `m`)
if name is None:
name = m.__name__
if globs is None:
globs = m.__dict__
if verbose is None:
import sys
verbose = "-v" in sys.argv
master.globs = globs
master.verbose = verbose
master.deep = deep
failures, tries = master.runone(m, name)
if deep:
f, t = master.rundict(m.__dict__, name)
failures = failures + f
tries = tries + t
if report:
master.summarize()
return failures, tries
class TestClass:
"""
A pointless class, for sanity-checking of docstring testing.
Methods:
square()
get()
>>> TestClass(13).get() + TestClass(-12).get()
1
>>> hex(TestClass(13).square().get())
'0xa9'
"""
def __init__(self, val):
self.val = val
def square(self):
"""square() -> square TestClass's associated value
>>> TestClass(13).square().get()
169
"""
self.val = self.val ** 2
return self
def get(self):
"""get() -> return TestClass's associated value.
>>> x = TestClass(-42)
>>> print x.get()
-42
"""
return self.val
def _test():
import doctest
return doctest.testmod(doctest)
if __name__ == "__main__":
_test()
Well, I've never really thought of being like Tim, but since you mention
it... No. I've been using Python for a lot shorter.
And I'm writing a paper on testing software using scripting languages. This
technique is VERY similar to the one I'm recommending, and I'm going to
plagarize shamelessly. It's more complicated, and doesn't apply to most
other languages, but has a lot of side benefits.
>doctest.py is attached, and it's the whole banana. Give it a try, if you
>like. After another month or so of ignoring your groundless complaints,
>I'll upload it to the python.org FTP contrib site. Note that it serves as
>an example of its own use, albeit an artificially strained example.
Very impressive. Thank you.
>winklessly y'rs - tim
Even more impressive.
>PS1 = re.compile(r" *>>>").match
I'm no RE expert, but shouldn't this allow tabs?
--
-William "Billy" Tanksley
"But you shall not escape my iambics."
-- Gaius Valerius Catullus
Great post, but I question your timing. I can't help noticing
that we're only a few weeks away from the next Pythonic Award.
all-the-best-episodes-occur-during-sweeps-week-ly yr's,
Best regards,
Jeff Bauer
Rubicon, Inc.
> About a month ago, I tried something new: take those priceless
> interactive testing sessions, paste them into docstrings, and write
> a module to do all the rest by magic
when i was writing an optics code last year i tried a similar but
simpler version of this: when i first constructed the basics of a
module, i ran it and collected the results and pasted it in at the
bottom of the module file as comments. subsequent usage of the module
could use a checker() function which compared subsequent test output
with what was in the saved string of results, showing only changes
between one and the other.
i like your idea better for the reason that each example is located
near its code, so a failure could be detected and the user sent to
where the problem __may__ lie.
as for interactive testing being the best mode, i agree for short
pieces of code i go to the interpreter, hack around, and then paste a
goodie into my module.
but for long winded code with many dependent pieces (a ray tracer
firing and bouncing rays around reflecting and refracting surfaces) i
tend to test only some larger subset -- for example, the initial ray
and the final ray through a refracting prism -- by running a script
and examining output.
but i guess there is a whole art to figuring out at what granularity
one should be testing (and re-testing) pieces of code (functions, a
few lines, classes, etc). i am still learning about this after coding
for 26 years, and still feel i have too much to learn.
<rant> finally, what i like best about your idea and design is that it
further encourages documentation, visible examples, and testing right
within the code base. i feel so stringly taht in the age of
internetwide development of code, all these things must be integrated
INTO THE SOURCE FILE to be really useable.
sometimes i tire so of clicking over to the browser to read the
doc.html, clicking to the README to look at the example, clicking to
xemacs to look at the source, clicking to the xterm to look at a
python interpreter. etc. etc. just as structs (classes) were created
by divine inspiration (intervention?) to collect different but
related values (and functions), so too should open source code gather
all its parts together. </rant>
nice work tim.....
--
____ Les Schaffer ___| --->> Engineering R&D <<---
Theoretical & Applied Mechanics | Designspring, Inc.
Center for Radiophysics & Space Research | http://www.designspring.com/
Cornell Univ. scha...@tam.cornell.edu | l...@designspring.com
Tim Peters wrote:
[agreed that all other methods sucked]
> About a month ago, I tried something new: take those priceless interactive
> testing sessions, paste them into docstrings, and write a module to do all
> the rest by magic (find the examples, execute them, and verify they still
> work exactly as advertised).
A wonderful, phantastic idea. I love you.
I could imagine one improvement:
Docstrings might get a little too much used for all
kinds of stuff.
Zope uses the existance of docstrings to decide wether a thing
will be published.
Others (was it J. Asbahr?) use docstrings to store regexen
for little languages compilers.
I'm working on type library support for Python Servers, and
docstrings will show up as popups form VB and colleagues.
This means to me that we need some way to distinguish
docstring meanings:
For typeinfo support, it would be bad to have the test code in
a multi-lined docstring.
For Zope, we need a way to split the meaning of the
docstring existence. Zope first look if there is an
object.__doc__ attribute and then for entry_name+"__doc__".
My point is to extend Zope's idea to __test__.
Kinda proposal:
If an object has an __test__ attribute, it is preferred to
__doc__. Currently this doesn't help with functions, so I'd
say we could use name mangling like function_name+"__test__",
with the same semantics as Zope has.
a) try to find a __test__ attribute. If not found,
b) try to find a global object_name+"__test__" attribute. I.n.f.,
c) try to find an object.__doc__ attribute. If not found,
d) try to find a global object_name+"__doc__" attribute.
I think, with this scheme, we can have tested Zope code which
is not published, also a short doc string for type libraries,
a longer alternative by object_name+"__doc__", and so on.
cheers - chris
--
Christian Tismer :^) <mailto:tis...@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Christian Tismer wrote:
>
> Tim Peters wrote:
>
> [agreed that all other methods sucked]
>
[snip]
> Others (was it J. Asbahr?) use docstrings to store regexen
> for little languages compilers.
>
That was John Aycock: ``Compiling Little Languages in Python,'' at the
7th Conference. Great idea.
<a-glyph-in-the-compiler-is-worth-two-on-the-stela>-ly y'rs,
Ivan
----------------------------------------------
Ivan Van Laningham
Callware Technologies, Inc.
iva...@callware.com
http://www.pauahtun.org
See also:
http://www.foretec.com/python/workshops/1998-11/proceedings.html
----------------------------------------------
Ack, on my keyboard I rebound the Tab key to something less annoying: it
deletes a random DLL from c:\Windows\SYSTEM\. OK, OK, version 0.0.2 caters
to stinking tabs. Patch follows.
if-tabs-were-called-hemorrhoids-as-god-intended-people-would-
see-them-for-the-bloody-mess-they-are-ly y'rs - tim
*** Old/doctest.py Sat Mar 06 14:35:36 1999
--- New/doctest.py Sat Mar 06 14:38:58 1999
***************
*** 1,4 ****
! # Module doctest.
# Released to the public domain 06-Mar-1999,
# by Tim Peters (tim...@email.msn.com).
--- 1,4 ----
! # Module doctest version 0.0.2
# Released to the public domain 06-Mar-1999,
# by Tim Peters (tim...@email.msn.com).
***************
*** 81,90 ****
By default, each time testmod finds a docstring to test, it uses a
*copy* of M's globals (so that running tests on a module doesn't change
! the module's real globals). This means examples can freely use any
! names defined at top-level in M. It also means that sloppy imports (see
! above) can cause examples in external docstrings to use globals
! inappropriate for them.
You can force use of your own dict as the execution context by passing
"globs=your_dict" to testmod instead. Presumably this would be a copy
--- 81,91 ----
By default, each time testmod finds a docstring to test, it uses a
*copy* of M's globals (so that running tests on a module doesn't change
! the module's real globals, and so that one test in M can't leave behind
! crumbs that accidentally allow another test to work). This means
! examples can freely use any names defined at top-level in M. It also
! means that sloppy imports (see above) can cause examples in external
! docstrings to use globals inappropriate for them.
You can force use of your own dict as the execution context by passing
"globs=your_dict" to testmod instead. Presumably this would be a copy
***************
*** 136,144 ****
SO WHAT DOES A DOCSTRING EXAMPLE LOOK LIKE ALREADY!?
! Oh ya. It's easy! In most cases a slightly fiddled copy-and-paste of an
! interactive console session works fine -- just make sure there aren't any
! tab characters in it, and that you eliminate stray whitespace-only lines.
>>> # comments are harmless
>>> x = 12
--- 137,147 ----
SO WHAT DOES A DOCSTRING EXAMPLE LOOK LIKE ALREADY!?
! Oh ya. It's easy! In most cases a copy-and-paste of an interactive
! console session works fine -- just make sure the leading whitespace
! is rigidly consistent (you can mix tabs and spaces if you're too lazy
! to do it right, but doctest is not in the business of guessing what
! you think a tab means).
>>> # comments are harmless
>>> x = 12
***************
*** 150,155 ****
--- 153,159 ----
... print "no"
... print "NO"
... print "NO!!!"
+ ...
no
NO
NO!!!
***************
*** 169,176 ****
>>> math.floor(1.9)
1.0
! and as many leading blanks are stripped from the expected output as
! appeared in the code lines that triggered it.
If you execute this very file, the examples above will be found and
executed, leading to this output in verbose mode:
--- 173,180 ----
>>> math.floor(1.9)
1.0
! and as many leading whitespace characters are stripped from the expected
! output as appeared in the initial ">>>" line that triggered it.
If you execute this very file, the examples above will be found and
executed, leading to this output in verbose mode:
***************
*** 212,218 ****
Test passed.
"""
! __version__ = 0, 0, 1
import types
_FunctionType = types.FunctionType
--- 216,232 ----
Test passed.
"""
! # 0,0,1 06-Mar-1999
! # initial version posted
! # 0,0,2 06-Mar-1999
! # loosened parsing:
! # cater to stinkin' tabs
! # don't insist on a blank after PS2 prefix
! # so trailing "... " line from a compound stmt no longer
! # breaks if the file gets whitespace-trimmed
! # better error msgs for inconsistent leading whitespace
!
! __version__ = 0, 0, 2
import types
_FunctionType = types.FunctionType
***************
*** 225,232 ****
del string
import re
! PS1 = re.compile(r" *>>>").match
! PS2 = "... "
del re
# Extract interactive examples from a string. Return a list of string
--- 239,249 ----
del string
import re
! PS1 = ">>>"
! PS2 = "..."
! _isPS1 = re.compile(r"(\s*)" + re.escape(PS1)).match
! _isPS2 = re.compile(r"(\s*)" + re.escape(PS2)).match
! _isEmpty = re.compile(r"\s*$").match
del re
# Extract interactive examples from a string. Return a list of string
***************
*** 237,283 ****
def _extract_examples(s):
import string
examples = []
lines = string.split(s, "\n")
i, n = 0, len(lines)
while i < n:
line = lines[i]
i = i + 1
! m = PS1(line)
if m is None:
continue
j = m.end(0) # beyond the prompt
! if string.strip(line[j:]) == "":
# a bare prompt -- not interesting
continue
! assert line[j] == " "
j = j + 1
! nblanks = j - 4 # 4 = len(">>> ")
! blanks = " " * nblanks
# suck up this and following PS2 lines
source = []
while 1:
source.append(line[j:])
line = lines[i]
! if line[:j] == blanks + PS2:
i = i + 1
else:
break
if len(source) == 1:
source = source[0]
else:
source = string.join(source, "\n") + "\n"
# suck up response
! if PS1(line) or string.strip(line) == "":
expect = None
else:
expect = []
while 1:
! assert line[:nblanks] == blanks
expect.append(line[nblanks:])
i = i + 1
line = lines[i]
! if PS1(line) or string.strip(line) == "":
break
expect = string.join(expect, "\n") + "\n"
examples.append( (source, expect) )
--- 254,312 ----
def _extract_examples(s):
import string
+ isPS1, isPS2, isEmpty = _isPS1, _isPS2, _isEmpty
examples = []
lines = string.split(s, "\n")
i, n = 0, len(lines)
while i < n:
line = lines[i]
i = i + 1
! m = isPS1(line)
if m is None:
continue
j = m.end(0) # beyond the prompt
! if isEmpty(line, j):
# a bare prompt -- not interesting
continue
! if line[j] != " ":
! raise ValueError("line " + `i-1` + " of docstring lacks "
! "blank after " + PS1 + ": " + line)
j = j + 1
! blanks = m.group(1)
! nblanks = len(blanks)
# suck up this and following PS2 lines
source = []
while 1:
source.append(line[j:])
line = lines[i]
! m = isPS2(line)
! if m:
! if m.group(1) != blanks:
! raise ValueError("inconsistent leading whitespace "
! "in line " + `i` + " of docstring: " + line)
i = i + 1
else:
break
if len(source) == 1:
source = source[0]
else:
+ # get rid of useless null line from trailing empty "..."
+ if source[-1] == "":
+ del source[-1]
source = string.join(source, "\n") + "\n"
# suck up response
! if isPS1(line) or isEmpty(line):
expect = None
else:
expect = []
while 1:
! if line[:nblanks] != blanks:
! raise ValueError("inconsistent leading whitespace "
! "in line " + `i` + " of docstring: " + line)
expect.append(line[nblanks:])
i = i + 1
line = lines[i]
! if isPS1(line) or isEmpty(line):
break
expect = string.join(expect, "\n") + "\n"
examples.append( (source, expect) )
# end of patch
That really wasn't my intent, and doctest clearly had no shot because it had
nothing at all to do with tabs. Hmm. Maybe I shouldn't be saying this.
Ah, what the hell, my reign is almost over, so I can now best serve as an
object lesson for my unfortunate successor:
It's well-known Python Corporate Policy never to admit that tabs can cause
problems. So the Steering Committee never mentioned this in public: Guido
was so despondent over all the problems caused by leading whitespace that he
had typed up a resignation letter, aiming to start a new life as a ferret
breeder (an avocation to which the Dutch psyche is particularly
well-attuned; from whence the charming English expression "like feeding
tulips to a ferret"; etc).
I had schemed to usurp Guido by solving this problem once & for all, and
with Gordon's goading produced the triumphant tabnanny.py (available in a
Tools/Scripts/ directory near you!). But I hadn't sufficiently anticipated
Guido's fabled Dutch ferocity: seeing that the problem was now well & truly
solved, the clever bastard used his time machine to introduce the
serviceable workalike tabpolice.py, some weeks *before* tabnanny.py. Thus
my pioneering effort was made to appear as if the work of a madman,
laboriously elaborating a minor refinement beyond all reason, and--
indeed! --beyond even the bounds of decency.
The Steering Committee, quite familiar with Guido's mind-clouding tricks,
honored me with the Pythonic Award anyway, over Guido's objection that
tabnanny.py was clearly the work of a madman, laboriously etc. This epic
battle for power continued, unseen by the public, for several more months,
until I discovered that Guido doesn't actually have any groupies! Python
really isn't Perl. So I figured "why bother?", and have been at peace ever
since.
Guido is still at it, though. I suppose none of you remember my
introduction late last summer of GUINANNY, the Tk-based Python development
environment featuring rigorous treatment of leading whitespace in all cases.
This time Guido retroactively undid the postings and long praiseful
discussions, replacing them with his serviceable workalike IDLE instead.
Am I bitter? You'd think so -- but, no. Not at all! You see, Pythonic
Award Winners *do* have groupies.
money-for-nothing-&-chicks-for-free-ly y'rs - tim
[Christian Tismer]
> A wonderful, phantastic idea. I love you.
I love you too. Don't tell anyone, though -- the Python community is so
intolerant of whitespace-based email latent homosexuality.
> I could imagine one improvement:
> Docstrings might get a little too much used for all kinds of stuff.
I thought about that, but decided it was like my worrying that if Guido
didn't put a password on the Python ToDo list, it would be overwhelmed with
ill-considered ideas. That is, docstrings are barely used at all now, so
they're a scarce resource only in theory.
Besides, some well-chosen examples *belong* in docstrings, and this scheme
keeps them honest -- you can't object to putting documentation in docstrings
<wink>. For hordes of unilluminating examples, I'm doing stuff like:
def for_testing():
"""
... hordes of lines deleted ...
>>> from Rational.Rat import Rat
>>> from Rational.Format import Format
# multiply 1/2 * 2/3 * 3/4 * ... * N/(N+1) in random order; should
# get 1/(N+1) in the end.
>>> N = 150
>>> base = []
>>> for i in range(1, N+1):
... base.append(Rat(i, i+1))
...
>>> p = Rat(1)
>>> from random import randrange
>>> xs = base[:]
>>> while xs:
... p = p * xs.pop(randrange(len(xs)))
...
>>> assert p == Rat(1)/(N+1), "cancellation in times"
# check approximation to pi
>>> import math
>>> papprox = Rat(math.pi).approx(1e6)[-1][1]
>>> print papprox.str(mode=Format.RATIO)
355/113
... hordes of lines deleted ...
"""
The user never sees that unless they look at the source, but the framework
runs it along with everything else, and it is *so pleasant* to write tests
this way: just paste a session into the string & add a few comments.
> Zope uses the existance of docstrings to decide wether a thing
> will be published.
Ya? Well that's an abuse <wink>.
> Others (was it J. Asbahr?) use docstrings to store regexen
> for little languages compilers.
That's OK for John (Aycock): his parsing functions have no use for
docstrings for documentation or testing purposes; they're automatically
called as part of his framework, and make no sense on their own.
Nevertheless, doctest.py would simply search his docstrings and find no
examples in them to run -- no problem.
> I'm working on type library support for Python Servers, and
> docstrings will show up as popups form VB and colleagues.
So, does anyone use docstrings for documentation <0.8 wink>?
> This means to me that we need some way to distinguish
> docstring meanings:
> For typeinfo support, it would be bad to have the test code in
> a multi-lined docstring.
> For Zope, we need a way to split the meaning of the
> docstring existence. Zope first look if there is an
> object.__doc__ attribute and then for entry_name+"__doc__".
> My point is to extend Zope's idea to __test__.
>
> Kinda proposal:
> If an object has an __test__ attribute, it is preferred to
> __doc__.
As above, well-chosen examples belong in docstrings, assuming docstrings are
primarily meant for documentation. I have no objection to putting
*additional*
tests elsewhere (& already am), except that, as you say next:
> Currently this doesn't help with functions, so I'd
> say we could use name mangling like function_name+"__test__",
> with the same semantics as Zope has.
This doesn't work for me -- I've tried dozens of schemes over the years, and
even one layer of indirection is irksome enough to kill a scheme in
practice. I stop using it, & that's that. I waited a month before posting
this one, to make sure it was one I still used. It is, and I'm not giving
it up even if Guido denounces it <wink>.
Python2 should allow setting attributes on any object (remind me to remind
Guido about that ...), and when I can set a __test__ attribute on functions
and class methods I'll be happy to do that. As is, there's no (pleasant)
way to associate information with functions or class methods other than
docstrings, so of course they're going to get warped toward conflicting
purposes.
> a) try to find a __test__ attribute. If not found,
> b) try to find a global object_name+"__test__" attribute. I.n.f.,
> c) try to find an object.__doc__ attribute. If not found,
> d) try to find a global object_name+"__doc__" attribute.
>
> I think, with this scheme, we can have tested Zope code which
> is not published, also a short doc string for type libraries,
> a longer alternative by object_name+"__doc__", and so on.
Ya, except I don't want this for the same reasons Zope doesn't want to force
people to run around creating artificial objects with "__publish__" in their
names.
It would be very easy to add an optional __test__=0 arg to testmod, though,
which if true would look *only* in module.__test__ for something to test.
I don't want something fancier than that because I can't guess what people
might need; e.g., I'm already feeling a need to distinguish among "short"
tests and "long" tests and "overnight" tests. This suggests making __test__
a flexible structure, like a dict mapping names to strings to be tested.
But then there's no builtin *policy* left, so insisting on calling it
__test__ and making it a dict just impose restrictions. So I'd rather
expose the _Tester class, add new
runstring(self, string, name, ...)
and
runobject(self, obj [, name], ...) # run obj.__doc__
methods, and let the user build whatever policies they want out of those.
testmod's policy of running everything in sight will stay, though: I'll be
damned if I change this into something I don't want to use myself <0.5
wink>.
had-enough-of-that-with-tabnanny<wink>-ly y'rs - tim
Folks, doc strings are meant for documentation, not to carry
information that may be vital to the program itself. Use
explicitly named attributes for those cases, please !
Not doing so will confuse not only auto doc tools but also
programmers who are not aware of the special meaning of those
particular doc strings.
BTW: In 1.5.1 there's a new optimization -OO that will cause the
byte code compiler to not only perform the usual -O optimizations
but also have it ignore any doc strings found in the source code.
Programs relying on doc strings for whatever purpose will of course
break...
--
Marc-Andre Lemburg Y2000: 299 days left
---------------------------------------------------------------------
: Python Pages >>> http://starship.skyport.net/~lemburg/ :
---------------------------------------------------------
Tim replied:
> Ya? Well that's an abuse <wink>.
Hmmm .. I think you're right. It's an abuse. No wink about it.
Not that I've actually written any (more than 100 lines)
of Python, nor used Zope (more than 10 minutes), but I
am following both, in anticipation of greater fluency.
Seems like Zope shouldn't be telling me I can't docstring
what I don't want published.
On the other hand, I'm not convinced that your use of
docstrings for testing fits my biases either. I would
have hoped (pure imagination here) that docstrings would
be a suitable place for a sort of plain text man page,
telling how to use a method or class. Tests would
quickly obscure such usage.
Maybe its time for a little xml-style markup.
As in surrounding your tests with <test> ... </test>.
And Zope using a <zope/> tag to mean publish.
And my imagined man page with <man> ... </man>
You could even imagine nested test tags, such as
<test> ... <overnight> ... </overnight> ... </test>
All in the docstring, along with some other untagged
or otherwise tagged stuff.
Oh well -- just fantasizing here, in the easy style
only available to the unwashed and uninitiated.
You're right, Tim, that if it doesn't lead to actual
usage, to a practical increase in testing, then it
doesn't matter how nice it sounds in theory.
--
=======================================================================
I won't rest till it's the best ... Software Production Engineer
Paul Jackson (p...@sgi.com; p...@usa.net) 3x1373 http://sam.engr.sgi.com/pj
>[Christian Tismer]
>> Zope uses the existance of docstrings to decide wether a thing
>> will be published.
>Tim replied:
>> Ya? Well that's an abuse <wink>.
>Hmmm .. I think you're right. It's an abuse. No wink about it.
True. One can work around that by prefixing with an underscore (following
Python convention), but it's still an abuse.
>On the other hand, I'm not convinced that your use of
>docstrings for testing fits my biases either. I would
>have hoped (pure imagination here) that docstrings would
>be a suitable place for a sort of plain text man page,
>telling how to use a method or class. Tests would
>quickly obscure such usage.
As a matter of fact, they don't. The tests are VITAL documentation, telling
what assumptions the author made about the code he wrote (what he wanted to
use it for) and what assumptions he's making about the code he imports.
The tests in the doc string should be somewhat illustrative, of course; but
they should cover the entire range of inputs (the way you're supposed to
write tests). If a test gets too much in the way of clarity, then change it
into a function and call it from the doc.
>Maybe its time for a little xml-style markup.
For Zope, yes. But Tim's tests already have markup -- the triple
greater-thans. And no need for XML even with Zope, just some kind of
markup.
Has anyone ever accidentally published Zope stuff? I don't really think
it's that easy. The __doc__ attribute is only part of what you need.
>Paul Jackson (p...@sgi.com; p...@usa.net) 3x1373 http://sam.engr.sgi.com/pj
--
As a practical matter, in Zope the problem is usually reversed,
i.e. missing docstrings on methods that need to be published.
I would wager that you'll see a lot of Zopist's code that
looks like similar to the method below, with docstring comments
almost as meaningful.
def investigate_all_known_Y2K_problems(self):
"""This is a method."""
If you want add docstrings to non-publishable methods, you could
always declare them with leading underscores. Another alternative
is to use comments for you non-publishable Zope methods. For me,
this is sometimes even useful as a documentation aid.
def charge_outrageous_Y2K_consulting_fee(self):
# no docstring, this is not a publishable method.
requesting-a-'private'-keyword-is-not-an-option-ly yr's,
Jeff Bauer
Rubicon, Inc.
>Tim Peters wrote:
>[agreed that all other methods sucked]
>> About a month ago, I tried something new: take those priceless interactive
>> testing sessions, paste them into docstrings, and write a module to do all
>> the rest by magic (find the examples, execute them, and verify they still
>> work exactly as advertised).
>I could imagine one improvement:
>Docstrings might get a little too much used for all
>kinds of stuff.
Would be terrible if someone actually USED those things. Dammit Jim, I'm a
programmer, not a field engineer.
>Others (was it J. Asbahr?) use docstrings to store regexen
>for little languages compilers.
>I'm working on type library support for Python Servers, and
>docstrings will show up as popups form VB and colleagues.
Both mistakes. Docstrings should be used for brief programming docs,
nothing else. Tests are one of the most vital documentation you can have,
in addition to their obvious uses.
>cheers - chris
> On Sat, 6 Mar 1999 17:26:05 GMT, Christian Tismer wrote:
[Tim embeds test in doc strings]
> >I could imagine one improvement:
> >Docstrings might get a little too much used for all
> >kinds of stuff.
>
> Would be terrible if someone actually USED those things. Dammit
> Jim, I'm a programmer, not a field engineer.
>
> >Others (was it J. Asbahr?) use docstrings to store regexen
> >for little languages compilers.
>
> >I'm working on type library support for Python Servers, and
> >docstrings will show up as popups form VB and colleagues.
>
> Both mistakes. Docstrings should be used for brief programming
> docs, nothing else. Tests are one of the most vital documentation
> you can have, in addition to their obvious uses.
Well Bones, I suggest you adjust your tricoder. It sounds like that's
exactly what Christian's using them for (pop-up help text for VBers
using COM servers written in Python).
As for John Aycock's parsing framework, you subclass his Lexer /
Parser / Generator, and embed the rule in the docstring and the
action in the code of the methods you add. Both the rule and action
tend to be very simple, most of the behavior is in the framework. It
might sound perverse, but it's very natural in practice. With the
possible (and trivial) exception of the regexes in the Lexer, you
wouldn't be testing the individual methods, but the behavior of the
Lexer / Parser / Generator as a whole.
In all these cases, the major problem is that methods / functions
have only one user-assignable attribute. Christian, John, Tim and the
Zope folks all want to associate information with methods /
functions. Well, Tim has the additional problem that his usability
tester is (to be charitable) somewhat less than reasonable...
- Gordon
|> As a matter of fact, they don't. The tests are VITAL documentation,
melarkey. I write tests for (some of) what I code,
and I write man-page like interface description and
how to use commentary for (more of) what I code.
They are not the same thing. Tests are not documentation.
One is a tool to find bugs, the other is a means to explain
to fellow humans something about the code in question.
|> But Tim's tests already have markup - ... ">>>".
true - good point.
--
=======================================================================
I won't rest till it's the best ... Software Production Engineer
Are you going to retire? Or will you start a (new) job that doesn't allow
you to be a full-time c.l.p author?
And who will be the "unfortunate successor"? Can there be any successor?
I mean Tim++ oh sorry: Tim=Tim+1 :-)
Michael
--
''''\ Michael Scharf
` c-@@ TakeFive Software
` > http://www.TakeFive.com
\_ V mailto:Michael...@TakeFive.co.at
>> On Sat, 6 Mar 1999 17:26:05 GMT, Christian Tismer wrote:
>[Tim embeds test in doc strings]
>> >I could imagine one improvement:
>> >Docstrings might get a little too much used for all
>> >kinds of stuff.
>> Would be terrible if someone actually USED those things. Dammit
>> Jim, I'm a programmer, not a field engineer.
Blush. Sorry, my admittedly sarcastic humor came out a little too cutting.
(Only a little?)
>> >Others (was it J. Asbahr?) use docstrings to store regexen
>> >for little languages compilers.
>> >I'm working on type library support for Python Servers, and
>> >docstrings will show up as popups form VB and colleagues.
>> Both mistakes. Docstrings should be used for brief programming
>> docs, nothing else. Tests are one of the most vital documentation
>> you can have, in addition to their obvious uses.
>Well Bones, I suggest you adjust your tricoder.
It's worse than that -- it's physics, Jim! :)
>It sounds like that's
>exactly what Christian's using them for (pop-up help text for VBers
>using COM servers written in Python).
Great! Then he'll have no complaint with mmore documentation. At the same
time, I have to note that he probably won't be very happy, because he's
expecting the docstrings to be brief descriptions (thumbnails) describing
the thing's actions.
Well, they've been used that way before, but that's by no means what they're
for. By and large, in the few timmes I've seen docstrings used, they've
contained fairly extensive documentation, usually of an entire module
(function docstrings get used less, and when they are used it is indeed more
often short).
>As for John Aycock's parsing framework, you subclass his Lexer /
>Parser / Generator, and embed the rule in the docstring and the
>action in the code of the methods you add. Both the rule and action
>tend to be very simple, most of the behavior is in the framework. It
>might sound perverse, but it's very natural in practice. With the
>possible (and trivial) exception of the regexes in the Lexer, you
>wouldn't be testing the individual methods, but the behavior of the
>Lexer / Parser / Generator as a whole.
It sounds very pleasant, actually. I have no real problem with this --
Tim's testing use would get in their way, but they can't get in Tim's way
(well for them that they don't, or he'd mangle their whitespace). If I were
building one of these and also using doctests, I would put the doctests in
at the module level.
>In all these cases, the major problem is that methods / functions
>have only one user-assignable attribute. Christian, John, Tim and the
>Zope folks all want to associate information with methods /
>functions. Well, Tim has the additional problem that his usability
>tester is (to be charitable) somewhat less than reasonable...
I understand you clearly right up to the last sentance, and then we go way
off the dial. "To be charitable?" It sounds like you're understating an
obviously crippling problem, but I assure you that I do not understate when
I say that I don't understand what the problem is.
In doctests, IF you put a docstring in AND it has the appropriate
interactive-like syntax in it at some point AND you call doctest on its
object, THEN the stuff will be executed. All other stuff is ignored.
How is this -- what was your word -- "less than reasonable"? Seems almost
boolean.
>- Gordon
The output is also adjacent to the test code that produces it; as much
clarifying commentary as you like can be interspersed at will; and if actual
output differs from expected output, the exact stmt that produced the rogue
result is displayed by the driver. Each of these addresses something I
didn't like in previous schemes.
> as for interactive testing being the best mode, i agree for short
> pieces of code i go to the interpreter, hack around, and then paste a
> goodie into my module.
>
> but for long winded code with many dependent pieces (a ray tracer
> firing and bouncing rays around reflecting and refracting surfaces) i
> tend to test only some larger subset -- for example, the initial ray
> and the final ray through a refracting prism -- by running a script
> and examining output.
Floating-point output is a puzzle, since output rounding can (& does) vary
even across IEEE-754 platforms. But if you merely have canned output from a
long canned script, why not check that too? The doctest framework doesn't
cater to that directly, and never will (there are too many ways you may
*want* to structure it for me to outguess), but my working version now
contains methods for running tests on arbitrary strings too. If you don't
want to put all the code inline in a test string, stuff it in a module
function instead and call that from within the string; a la
def _fat_test:
"""A really long test script"""
whatever
_fattest = r"""\
>>> _fat_test()
expected output goes here
"""
A call to doctest.master.runstring(_fattest) will do the rest, merging the
_fat_test results into the rest of the module's test results. I think I'll
graft in a simple version of Chris's __test__ idea too, so you can do e.g.
__test__ = {"fat_test": _fattest}
and have testmod run it by magic.
> but i guess there is a whole art to figuring out at what granularity
> one should be testing (and re-testing) pieces of code (functions, a
> few lines, classes, etc). i am still learning about this after coding
> for 26 years, and still feel i have too much to learn.
Same here, although I picked a bad Subject for this thread: doctest was
really only aiming at verifying docstring *examples*, and wasn't meant to be
a general testing framework. It's turning into one, though <wink>.
> <rant> finally, what i like best about your idea and design is that it
> further encourages documentation, visible examples, and testing right
> within the code base. i feel so stringly taht in the age of
> internetwide development of code, all these things must be integrated
> INTO THE SOURCE FILE to be really useable.
>
> sometimes i tire so of clicking over to the browser to read the
> doc.html, clicking to the README to look at the example, clicking to
> xemacs to look at the source, clicking to the xterm to look at a
> python interpreter. etc. etc. just as structs (classes) were created
> by divine inspiration (intervention?) to collect different but
> related values (and functions), so too should open source code gather
> all its parts together. </rant>
I agree; I like stuff in one place too, and the ease of copying example
input/output from a console window is making it hard to resist creating
useful docstrings. *Something* is very right in all this -- although I'm
not sure doctest.py is it <wink>.
wake-up-time-to-die-ly y'rs - leon
[Tim]
> Ya? Well that's an abuse <wink>.
[Paul Jackson]
> Hmmm .. I think you're right. It's an abuse. No wink about it.
It's a wink of implied consent; that is, while it's abuse, it's hard to be
unsympathetic.
> ...
> On the other hand, I'm not convinced that your use of
> docstrings for testing fits my biases either. I would
> have hoped (pure imagination here) that docstrings would
> be a suitable place for a sort of plain text man page,
> telling how to use a method or class. Tests would
> quickly obscure such usage.
Actually, other people started talking about using this approach for general
testing. I didn't (at first), apart from the poorly chosen thread Subject.
If you read the original rant, I hoped it was clear that the thrust was
toward making *examples* part of docstrings for their *documentation* value,
and using this framework to verify the examples work exactly as advertised.
Here's the docstring for a gcd function:
"""a, b -> greatest common divisor.
This is > 0 unless a == b == 0, in which case return 0.
>>> print gcd(3, 8), gcd(8, 3), gcd(2L**500, 2L**500 +1)
1 1 1L
>>> print gcd(6, 15), gcd(15, 6), gcd(15, -6), gcd(-6, -15)
3 3 3 3
>>> print gcd(-6, -6), gcd(0, 6), gcd(-6, 0), gcd(6, -12)
6 6 6 6
>>> gcd(0, 0)
0
"""
I think the console session pasted in there is much clearer about endcase
behavior than would be an equally large pile of obscure technical prose. I
don't consider those to be gcd tests so much as a vital part of the docs.
The strenuous but unilluminating gcd *tests* are hiding elsewhere in the
module from which this was taken.
BTW, while speeding the gcd function, on at least two occasions some of
those examples stopped working as advertised!
This is valuable in a different way in "tutorial mode", where actual
input/output illustrates points made in the prose. For example, here's part
of a module's tutorial docstring:
So what does this rounded guy really look like? Here it is in binary:
>>> print rounded.str(mode=Format.FIXED, base=2, prec=55)
0.0001100110011001100110011001100110011001100110011001101_2
>>>
Since the machine value isn't exactly 1/10, it shouldn't be surprising
that if we print out enough digits they won't be exactly "0.100..."
either (.cvt_exact(base) attempts exact conversion to the given base,
but returns None if that's impossible):
>>> print rounded.cvt_exact(10)
0.1000000000000000055511151231257827021181583404541015625
>>>
And that's the exact decimal value of the best possible IEEE double
approximation to 1/10!
IOW, I'm aiming for illumination, not distraction. And, BTW, one of those
examples stopped working during development too (change in keyword argument
name -- a million things can cause examples to break, and now I no longer
have to worry about letting one slip thru the cracks).
> Maybe its time for a little xml-style markup.
>
> As in surrounding your tests with <test> ... </test>.
Agreed here with Billy that Python's console conventions are already markup
enough for *this* purpose. That's the other thing I like about this: every
Python programmer knows what they're seeing above instantly.
Besides, I won't like XML-style markup of docstrings until a std tool is
distributed for displaying docstrings with that crap weeded out. I'll stick
to the "plain ASCII manpage" idea until then, and think this scheme isn't
stretching that even a teensy bit.
> ...
> You're right, Tim, that if it doesn't lead to actual
> usage, to a practical increase in testing, then it
> doesn't matter how nice it sounds in theory.
I'm more concerned about getting accurate examples into docstrings, and
keeping them accurate. Try it! You'll like it. I'll upload version 0.0.3
to the python.org ftp site sometime this week.
the-patch-is-now-bigger-than-the-original-code-ly y'rs - tim
>|> > Tests would quickly obscure such usage.
>|> As a matter of fact, they don't. The tests are VITAL documentation,
>melarkey.
Sir! I'll have you know that my posts are composed of the finest hand
picked mAlarkey. None of this second-class stuff.
>I write tests for (some of) what I code,
>and I write man-page like interface description and
>how to use commentary for (more of) what I code.
That's good!
>They are not the same thing. Tests are not documentation.
This is sheer opinion, contradicted by the vast weight of actual experience
and studies. Let me quote one scholarly dissertation -- "It's not what you
expect, it's what you inspect" [Mom, 1988]. In other words, if you don't
state why you wrote your library and the exact limits you place on its
behavior, someone will try to use your library for something beyond its
design, and only when you fix an unrelated bug and their app stops working
will the realize their subtle misreading.
I will shortly be posting a short discussion (about 10-20pp, double spaced)
about testing software written in "scripting" languages. Tim's method is,
of all of the things I've reasearched, by FAR the best by any number of
criteria. I could earn a PhD with this writeup... And Tim won't get a
THING!
>One is a tool to find bugs, the other is a means to explain
>to fellow humans something about the code in question.
Wrong! If a test was merely a tool to find bugs, interactive testing would
be sufficient -- because once the software has passed once, we know that the
bug is not there.
In fact, the test has two purposes after that first passed test.
First of all, it tells the customer what the coder was planning when he
wrote the code. If the tests don't match the requirements, the customer
should complain QUICK. (If the customer is a coder, he can also use the
tests as a sample of use, but that's secondary.)
Second, it tells the maintainer briefly what the original coder intended the
limits of his code to be, and provides him with a quick way to affirm that
his code meets the original requirements.
Both of these jobs are primarily issues of human communication.
Guido, could the next version of IDLE support doctests in a graphical way,
perhaps tree-format, with percentage tested summaries and a way to activate
and view a single test with a click or two?
>Paul Jackson (p...@sgi.com; p...@usa.net) 3x1373 http://sam.engr.sgi.com/pj
--
William Tanksley wrote:
...
> >Others (was it J. Asbahr?) use docstrings to store regexen
> >for little languages compilers.
>
> >I'm working on type library support for Python Servers, and
> >docstrings will show up as popups form VB and colleagues.
>
> Both mistakes. Docstrings should be used for brief programming docs,
> nothing else. Tests are one of the most vital documentation you can have,
> in addition to their obvious uses.
Sorry, I might have been clearer about what I'm doing.
If a COM server is written in Python, with a generated
type lib, it will show the doc strings of published objects
as ballon help in VB/whatever. This is *exactly* what a
docstring's primary purpose should be in the first place:
A brief description of an object's interface.
Neither a line like "I'm a method" nor a three page
description including several test sessions makes sense
in this context.
ciao - chris
Gordon McMillan wrote:
>
> William Tanksley wrote:
>
> > On Sat, 6 Mar 1999 17:26:05 GMT, Christian Tismer wrote:
>
> [Tim embeds test in doc strings]
...
> > [Chris is warning about __doc__ overloading]
> > [and also misspelling John Aycock :-)
...
> Well Bones, I suggest you adjust your tricoder. It sounds like that's
> exactly what Christian's using them for (pop-up help text for VBers
> using COM servers written in Python).
Thanks Gordon for the clarification. I was quite astonished
about the other reactions, since the docstring idea seemed
to be the natural thing which should show up from VB.
...
> In all these cases, the major problem is that methods / functions
> have only one user-assignable attribute. Christian, John, Tim and the
> Zope folks all want to associate information with methods /
> functions. Well, Tim has the additional problem that his usability
> tester is (to be charitable) somewhat less than reasonable...
Folks, I see a way out.
In Python 1.5, all docstrings of modules, classes and methods
can be modified at runtime.
(for methods, assign to klass.meth.im_func.__doc__)
Without giving a syntax proposal, it should be noted that
an initialization function of a module could do with the
doc strings whatever it wants to do. An initializer could
modify docstrings arbitrarily, depending of the actual purpose.
In the case of my COM stuff, I could cut the doc down to a
brief description, if the module is published by COM, otherwise
leave the longer docs intact.
Tim's code could stay as it is (since it is indeed tagged), or it
could be modified. Leaving some brief examples which are helpful
for the user, but removing more extensive testing stuff. This
would enable Tim to do even more in place, with his decision
wether to expose it to the user or not.
Another (less related) benefit of docstring modification code
could be this: People tend to write doc strings this way:
def func(someargs):
"""sometimes this is line 1
and this is line 2
"""
or something similar. While this notation is readable in the
source code, the printout isn't perfect. A modification at
runtime which adjusts extra line feeds and indentation
can make sense.
At the risk that I might be called perverse (some of you
don't do so already :), I want to note that the __doc__
attribute can take any Python object, not just strings.
One *could* decide to execute a docstring and assign the
result back, or even replace it with an arbitrary object,
which might show runtime configurable doc text.
I just played with that. Wrote a doc class with a "brief" flag,
and wrapped instances into a module's object around all doc
strings. One can extend this idea and teach __doc__ to do a lot
more. For instance, and instance can show example code with
instance specific data filled in.
This was just to inject an idea. Doc strings are by no means
restrictive and can be extended as seems fit. The possibility
to assign arbitrary objects is very new to me and needs
some thought, sure.
cheers - chris
Many months ago we offered to drop that as a requirement if someone
could come up with a better solution. The crux of the problem is that
docstrings are the *only* metadata that one can associate with functions
(and methods).
(In a clever diversionary tactic, Paul redirects the burden of proof
elsewhere...)
The *real* problem is that you can't assign information to functions.
If I could assign "publish_me=1" to my functions, then we wouldn't have
hijacked the docstring. Also, this policy in Zope existed prior to
Python 1.5's _ policy.
Perhaps the truth is out there, looming in the interfaces
proposal...anyway, go start an uprising on the Zope list and demand the
dropping of the docstring requirement. I know I'm in favor of dropping
it.
--Paul
Paul Everitt wrote:
>
> Paul Jackson wrote:
> >
> > [Christian Tismer]
> > > Zope uses the existance of docstrings to decide wether a thing
> > > will be published.
> >
> > Tim replied:
> > > Ya? Well that's an abuse <wink>.
> >
> > Hmmm .. I think you're right. It's an abuse. No wink about it.
>
> Many months ago we offered to drop that as a requirement if someone
> could come up with a better solution. The crux of the problem is that
> docstrings are the *only* metadata that one can associate with functions
> (and methods).
>
> (In a clever diversionary tactic, Paul redirects the burden of proof
> elsewhere...)
>
> The *real* problem is that you can't assign information to functions.
> If I could assign "publish_me=1" to my functions, then we wouldn't have
> hijacked the docstring. Also, this policy in Zope existed prior to
> Python 1.5's _ policy.
Now you can.
Replace the doc string with an object at run time,
and you can put all kinds of information into the __doc__
object.
> Perhaps the truth is out there, looming in the interfaces
> proposal...anyway, go start an uprising on the Zope list and demand the
> dropping of the docstring requirement. I know I'm in favor of dropping
> it.
What about this (there are myriads of other possibilities, just
one idea here): If the __doc__ attribute has a certain format,
say it is a syntactically correct tuple, execute it, store the
first element as the __doc__ attribute and use the rest for Zope.
This would allow for a doc string like
'''("""this one has a doc string but isn't published""", None)'''
Not a too serious proposal, but there are some ways...
ciao - chris
> On Mon, 8 Mar 1999 00:21:24 GMT, Gordon McMillan wrote:
[snip]
> >In all these cases, the major problem is that methods / functions
> >have only one user-assignable attribute. Christian, John, Tim and the
> >Zope folks all want to associate information with methods /
> >functions. Well, Tim has the additional problem that his usability
> >tester is (to be charitable) somewhat less than reasonable...
[Bones scowls while Jim and Scotty snicker]
> I understand you clearly right up to the last sentance, and then we
> go way off the dial. "To be charitable?" It sounds like you're
> understating an obviously crippling problem, but I assure you that I
> do not understate when I say that I don't understand what the
> problem is.
Alas, none of us understand the problem, Bones - only its magnitude
and its source:
[Exhibit A]
> This doesn't work for me -- I've tried dozens of schemes over the
> years, and even one layer of indirection is irksome enough to kill
> a scheme in practice. I stop using it, & that's that. I waited a
> month before posting this one, to make sure it was one I still
> used. It is, and I'm not giving it up even if Guido denounces it
> <wink>.
dreading-the-Python2-__wink__-string-ly y'rs
- Gordon
Ah -- I'd missed that. Thanks for the refocusing.
And yes, the ">>>" is sufficient markup for this purpose.
My "XML-like" proposal was just brainstorming.
--
=======================================================================
I won't rest till it's the best ... Software Production Engineer
|> In other words, if you don't state why you wrote your library
|> and the exact limits you place on its behavior, someone will
|> try to use your library for something beyond ...
Documentation is not an exact science, rather a form of human
communication. I'd rather that 9 of 10 readers understood what
they needed, than that I could self righteously proclaim "I told
you so" to the 10th reader who exceeded some obscure limit.
|> Tim's method is, of all of the things I've reasearched,
|> by FAR the best by any number of criteria.
Yes - Tim - this might well survive the ravages of time
as a "good thing". Heck - someone might actually _use_
the darn thing.
|> >One is a tool to find bugs, the other is a means to explain
|> >to fellow humans something about the code in question.
|>
|> Wrong! If a test was merely a tool to find bugs, interactive
|> testing would be sufficient -- because once the software has
|> passed once, we know that the bug is not there.
Well - putting aside the issue that just because the "bug is not
there" today, doesn't mean it hasn't crept in tomorrow, after I
change another 1000 lines of code - tests are a poor means of
telling either the customer or the maintainer what the intended
limits are.
Which isn't to deny that some key tests, for nice well defined
units, can't server both roles, documenting and testing. Just
don't let such fortuitous congruences lull you into complacent
devotion to the current fads^H^H^H^Hideals of our industry.
>> >I'm working on type library support for Python Servers, and
>> >docstrings will show up as popups form VB and colleagues.
>> Both mistakes. Docstrings should be used for brief programming docs,
>> nothing else. Tests are one of the most vital documentation you can have,
>> in addition to their obvious uses.
>Sorry, I might have been clearer about what I'm doing.
>If a COM server is written in Python, with a generated
>type lib, it will show the doc strings of published objects
>as ballon help in VB/whatever. This is *exactly* what a
>docstring's primary purpose should be in the first place:
>A brief description of an object's interface.
The problem is that you're gambling on YOUR interpretation of what a
docstring SHOULD be used for. Many that I've seen actually _document_ the
function or module. This does make sense, really; it's the best place to do
that, aside from the lack of a pager.
You _should_ create your own attribute, which by default contains the
docstring (or possibly a function which can return the docstring unless it's
too long for a popup).
You can't deny that popup "tool tips" are not what docstrings are specified
for either; tool tips have fairly specific requirements.
>ciao - chris
>> About a month ago, I tried something new: take those priceless
>> interactive testing sessions, paste them into docstrings, and write
>> a module to do all the rest by magic
>i like your idea better for the reason that each example is located
>near its code, so a failure could be detected and the user sent to
>where the problem __may__ lie.
Interestingly enough, the tests can also be taken out of where they're
stored and used for other purposes. This way you don't absolutely _have_ to
run the tests.
>as for interactive testing being the best mode, i agree for short
>pieces of code i go to the interpreter, hack around, and then paste a
>goodie into my module.
>but for long winded code with many dependent pieces (a ray tracer
>firing and bouncing rays around reflecting and refracting surfaces) i
>tend to test only some larger subset -- for example, the initial ray
>and the final ray through a refracting prism -- by running a script
>and examining output.
There's always a need for human examination :). However, I would be tempted
to toss a call to a special test function into the docstring, just to make
sure that the needed test gets hit.
Logic demands that such a function would be private to the module, since
it's not part of the API. Tim, can _private functions be called from a
docstring-test?
>but i guess there is a whole art to figuring out at what granularity
>one should be testing (and re-testing) pieces of code (functions, a
>few lines, classes, etc). i am still learning about this after coding
>for 26 years, and still feel i have too much to learn.
Absolutely.
><rant> finally, what i like best about your idea and design is that it
>further encourages documentation, visible examples, and testing right
>within the code base. i feel so stringly taht in the age of
>internetwide development of code, all these things must be integrated
>INTO THE SOURCE FILE to be really useable.
Applause! I'm glad someone else feels the same way. I'm also glad that
we're developing some things that take some of the burden of at least some
of the tests off the programmer.
>sometimes i tire so of clicking over to the browser to read the
>doc.html, clicking to the README to look at the example, clicking to
>xemacs to look at the source, clicking to the xterm to look at a
>python interpreter. etc. etc. just as structs (classes) were created
>by divine inspiration (intervention?) to collect different but
>related values (and functions), so too should open source code gather
>all its parts together. </rant>
Have you played with "noweb", or any of the other systems derived from
Knuth's concept of Literate Programming? Come to think of it, there are two
such systemms written in Python: the only one I remember (and a good one it
is) is "Maxtal Interscript". Check it out!
>nice work tim.....
Agreed.
>____ Les Schaffer ___| --->> Engineering R&D <<---
Though with judicious planning you can have both. For example,
from one of our functions:
def cansmiles(self, iso = 0):
"""(iso) -> canonical SMILES string
The parameter "iso" specifies whether "the SMILES
string should contain isomeric labelings (isotopic and chiral
information)." (from the Daylight documentation) The default
value is 0.
"""
The blank line seperates the short/bubble help information from
the more verbose description; and I suppose from any embedded self
tests. This seems to be the common form for most of the Python
source.
Andrew
da...@bioreason.com
Andrew Dalke wrote:
>
> Starship Captain Tismer <tis...@appliedbiometrics.com> said:
> > If a COM server is written in Python, with a generated
> > type lib, it will show the doc strings of published objects
> > as ballon help in VB/whatever. This is *exactly* what a
> > docstring's primary purpose should be in the first place:
> >
> > A brief description of an object's interface.
Maybe I should release this a bit...
> > Neither a line like "I'm a method" nor a three page
> > description including several test sessions makes sense
> > in this context.
>
> Though with judicious planning you can have both. For example,
> from one of our functions:
>
> def cansmiles(self, iso = 0):
> """(iso) -> canonical SMILES string
>
> The parameter "iso" specifies whether "the SMILES
> string should contain isomeric labelings (isotopic and chiral
> information)." (from the Daylight documentation) The default
> value is 0.
>
> """
>
> The blank line seperates the short/bubble help information from
> the more verbose description; and I suppose from any embedded self
> tests. This seems to be the common form for most of the Python
> source.
Good point. This could make a useful default for bubble popup
help strings.
Maybe it is even easier, since I tend to build COM classes
only as thin wrapper classes. They would go with the brief
doc strings.
The real work is sitting in another class, which has the
real doco, the real code, and so on.
I never felt comfortable with mixing the COM interface with
the real actions, maybe since they already have to look quite
weird with all their _published_methods_ and a number of other
things.
This might change, since I'm thinking to get rid of most of that,
in a more Zope-like way, publishing everything which has
no underscore in front, has a doc string, and is no module.
One of the reasons why I was insisting in discussing this
doc stuff: Is Zope doing right, or should I use a different
way? Still undecided, a little enlightened by Paul,
perhaps this should be a user option.
It would be very convenient for lazy dummies like me, just
to "publish" a class to COM and writing nothing else
but the doc strings :-)
I'll surely be using doctest. Thanks, Tim!
Tim Peters wrote in message <000201be6933$79e2ae20$d29e2299@tim>...
[snip]
> Guido
> was so despondent over all the problems caused by leading whitespace that he
> had typed up a resignation letter, aiming to start a new life as a ferret
> breeder (an avocation to which the Dutch psyche is particularly
> well-attuned; from whence the charming English expression "like feeding
> tulips to a ferret"; etc).
[snip]
> But I hadn't sufficiently anticipated
> Guido's fabled Dutch ferocity:
What's all this about the Dutch psyche? Anyway, I already explained the
Dutch psyche to everyone's satisfaction in my 'Sinterklaas' postings
back in December. More recently I uncovered the Dutch Cosmic Conspiracy;
many the people who speak Python also speak Dutch (also those who do not
appear to be of Dutch nationality!). Far more than chance would
indicate.
Anyway, the Python newsgroup is serious; we all read it in this
introduction on Python a while ago. So I must ferociously suggest you
stop making your statements about the Dutch psyche when I've done this
so thoroughly already.
Back-to-breeding-ferrets-ly yours,
Martijn
[most people seperate the brief description from the in-depth one by a blank
line]
>Good point. This could make a useful default for bubble popup
>help strings.
Better than any I've made. I like it -- it's natural and easy.
>Maybe it is even easier, since I tend to build COM classes
>only as thin wrapper classes. They would go with the brief
>doc strings.
>The real work is sitting in another class, which has the
>real doco, the real code, and so on.
Good point.
>This might change, since I'm thinking to get rid of most of that,
>in a more Zope-like way, publishing everything which has
>no underscore in front, has a doc string, and is no module.
>One of the reasons why I was insisting in discussing this
>doc stuff: Is Zope doing right, or should I use a different
>way? Still undecided, a little enlightened by Paul,
>perhaps this should be a user option.
I would rather not do simply as Zope has done -- why not only publish the
things which have a specific tag somewhere in (or if you want to be
speedier, at the beginning or end of) their docstring.
def boshuda(yadda, yaddo):
"""COM Export: Do something I'll regret.
But maybe not too much!
"""
>It would be very convenient for lazy dummies like me, just
>to "publish" a class to COM and writing nothing else
>but the doc strings :-)
That'd be great. Zope sure is convenient.
>ciao - chris
Or alternately something like the following might work
>>> class DocString:
... pass
...
>>> def z():
... x = DocString
... x.ManPage = "This is a manpage"
... x.Tests = "These are some tests"
... x.ShouldZopePublishMe = "Nope"
... z.__doc__ = x
... return 1
...
>>> z()
1
>>> dir (z)
['__doc__', '__name__', 'func_code', 'func_defaults', 'func_doc',
'func_globals', 'func_name']
>>> z.__doc__
<class __main__.DocString at 80dd238>
>>> z.__doc__.ManPage
'this is a manpage'
Now, I'll agree that it's not as easy as Tim's proposal, but it is a lot
more flexible, and far more extensible, and I imagine that with a better
DocString class, it might become almost as easy.
x = DocString( """Some data which the DocString can parse.
>>> Lines like these go into the Tests member.
Output goes into the Tests, or perhaps the Results member.
Other lines go into the man page
ShouldZopePublishMe = No
gets put into the right place too.
"""
Of course, getting the class to do intelligent things would be hard.
>If the __doc__ attribute has a certain format,
>say it is a syntactically correct tuple, execute it, store the
>first element as the __doc__ attribute and use the rest for Zope.
>This would allow for a doc string like
>'''("""this one has a doc string but isn't published""", None)'''
Hey, perhaps that's the better way to initialize the class...
Later,
Blake.
[Les]
> ...
> but for long winded code with many dependent pieces (a ray tracer
> firing and bouncing rays around reflecting and refracting surfaces) i
> tend to test only some larger subset -- for example, the initial ray
> and the final ray through a refracting prism -- by running a script
> and examining output.
[Billy]
> There's always a need for human examination :). However, I would
> be tempted to toss a call to a special test function into the
> docstring, just to make sure that the needed test gets hit.
>
> Logic demands that such a function would be private to the module, since
> it's not part of the API. Tim, can _private functions be called from a
> docstring-test?
Oh sure -- every name visible at top level in the module is visible in
docstring code, and docstring code can use "import" (or anything else it
likes) to make more names visible. It would be wrong, though! testmod only
finds docstrings in objects with public names, so they're public docstrings,
and public docs shouldn't refer to private names at all. Right? I thought
you'd agree <wink>.
Besides, what can calling "_private()" do? Stuffing a ton of expected
output from _private() into a public docstring too is doubly unattractive,
and if _private just returns 0/1 ("success"/"fail", normal_return/exception,
...) then all the tracking advantages of verbose mode interleaving stmts w/
expected output-- and displaying the exact stmts that fail regardless of
mode --are lost.
So the 0.9.1 doctest.py (see earlier post) offers (but does not require)
several alternatives, by exposing the machinery that testmod previously kept
to itself.
One way is to create your own Tester instance, and run
Tester.rundoc(_private)
by hand, where _private's docstring contains "the usual" input/output pairs.
Another is to pass "pvt=1" to a Tester constructor, in which case you get a
Tester object that scours docstrings in private objects too.
Simpler is to invoke the "master" Tester instance directly:
doctest.master.rundoc(_private)
but then you also have to assume responsibility for getting the summary
displayed at the right time.
After playing with those and disliking them (I don't want to keep
reimplementing policy from the ground up), I went back to Chris Tismer's
idea of having a __test__ attribute too. Setting
__test__ = {__name__ + "._private": _private}
in the module is now enough to get _private.__doc__ run by magic, without
anything private polluting the public docstrings, or requiring any new code
beyond that line. __test__ is a dict and can contain any number of similar
name -> object_to_scour mappings.
__test__ will also run raw string values (as if they were docstrings), and
what I've liked best so far is to exploit that:
_bigtest_string = r"""
>>> r = Rat
>>> a = r(5, 6)
>>> b = r(5, 12)
>>> a > 0 and b < a and not a < b
1
>>> str(a) == "5/6"
1
>>> `a` == "Rat(5, 6)"
1
>>> a + b == b + a == r(5, 4)
1
>>> a - b == b
1
... many lines deleted ...
"""
__test__ = {"Rat.privateTest" : _bigtest_string}
BTW, there's a good reason not to use the terser "assert" in docstring
tests: just as in "real code", asserts vanish when Python is run with -O.
So a docstring test like
>>> assert a > 0 and b < a and not a < b
>>> assert str(a) == "5/6"
cannot fail under -O.
> ...
> Knuth's concept of Literate Programming? Come to think of it,
> there are two such systems written in Python: the only one I
> remember (and a good one it is) is "Maxtal Interscript". Check it out!
Literate programming is a great idea -- but I never use it because nobody
else does <0.7 wink>. An analogy with literate programming is appropriate
here all the same. A big difference is that this "literate docstring
testing" requires almost nothing new be learned <wink>.
laziness-is-a-virtue-even-if-larry-wall-agrees-ly y'rs - tim