Just thought I'd mention the NLTK Jenkins instance and its
documentation. It's up and running, testing NLTK using the doctests in
python2.5. It also builds the nltk.github.com page to HTML and pushes it
automatically.
Jenkins instance:
https://jenkins.shiningpanda.com/nltk/
Some doc:
http://nltk.github.com/dev/jenkins.html
A note on the builds it produces. The workspace on our ShiningPanda is
not available from the web interface. This means we are now building
nightly builds with no way of reading them. I can push them anywhere;
any good ideas? This is how the built files look: http://www.8d.no/nltk/
Cheers,
--
Morten Minde Neergaard
Thanks =)
> If I understand correctly, Jenkins runs tests using "nosetests
> --with-doctest" command. I'm not able to run it locally (mac os X 10.7.2,
> python 2.7) because python process memory consumption skyrockets (I killed
> the process when it reached 10 GB). Any ideas?
This is the command I run:
nosetests -e inference.doctest --with-xunit --xunit-file=$WORKSPACE/nosetests.xml --with-doctest --doctest-extension=.doctest --doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE nltk/
Description of each non-obvious option (Hey, I'll add this to the
Jenkins node! =)):
-e inference.doctest
One of the inference tests did what you described earlier. Not sure if
it always spirals out of control, but I disabled the test.
--with-xunit
Write a report in XML format for publishing in Jenkins.
--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE
This makes a lot more tests pass :)
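To illustrate why those two flags make more tests pass, here is a minimal sketch using only the standard library's doctest module. The check() helper and the sample test string are made up for this example and are not part of NLTK or nose. The same expected output fails under strict comparison and passes once ELLIPSIS and NORMALIZE_WHITESPACE are enabled:

```python
import doctest


def check(source, optionflags=0):
    """Run one doctest-formatted string and return the number of failures."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(source, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    failed, attempted = runner.run(test, out=lambda text: None)[:2]
    return failed


# The expected output elides most of the list and wraps across two lines.
SOURCE = '''
>>> print(list(range(20)))
[0, 1, ...,
 19]
'''

flags = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE
strict_failures = check(SOURCE)          # exact string comparison
relaxed_failures = check(SOURCE, flags)  # "..." and line wraps tolerated
print(strict_failures, relaxed_failures)
```

Long outputs that are abbreviated or re-wrapped in the .doctest files are the kind of case this helps with.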
> So we have 2 different ways of running tests now: Jenkins and tox.
> Jenkins runs tests automatically and provides a web interface, tox can run
> tests locally on demand for several python interpreters (python 2.5-2.7,
> pypy) but requires some work from the developer to get it running (numpy,
> etc). Any thoughts on how to make them work together? It seems logical to
> integrate tox with Jenkins (
> http://tox.testrun.org/latest/example/jenkins.html ) but I don't know how
> ShiningPanda supports this + tests in tox are now executed with 2 commands
> and this may be a problem.
ShiningPanda have their own multi-virtualenv setup. I tested it, but
ended up just running one virtualenv in py2.5 for testing. Cuts down on
time and complexity. Installing all dependencies in a py2.5 environment
was actually kind of time-consuming.
I'm using nose mostly because it's zero configuration and supports
junit-compatible XML output.
Mikhail,
Thanks for your thoughts on testing -- our test framework is on the
critical path...
I agree with your stated requirements, except that I'm not sure about
the importance of xUnit XML output.
A couple of comments:
> Sphinx can only run doctests that are included in docs. Current regression
> doctests are not included in docs (I personally think it'll be good to
> include them).
Do you mean putting regression doctests into module docstrings, and
eliminating the nltk/test directory? My concern is to avoid
cluttering up the module code and the API documentation.
How about putting regression doctests in files called package/test.py
(e.g. tokenize/test.py). Perhaps these test files would be
conditionally imported (and not imported when building API
documentation or distributions)?
Then Sphinx could take care of these.
> The proposed solution:
>
> Unify the test running and use an improved version of Morten's solution:
I like the incremental nature of this proposal, which means it's more
likely to happen.
> 1) find a way to exclude some doctests from python files;
Can we use the doctest SKIP flag?
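For reference, a small sketch of what the SKIP directive does, using only the standard library. The function name run_expensive_inference is hypothetical and deliberately undefined: with "# doctest: +SKIP" the example is never evaluated, so it cannot fail or run away with memory, while the same example without the directive is reported as a failure.

```python
import doctest


def count_failures(source):
    """Run a doctest-formatted string and return how many examples failed."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(source, {}, "skip_demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the failure count is needed.
    return runner.run(test, out=lambda text: None)[0]


# run_expensive_inference is a hypothetical name that is never defined.
WITH_SKIP = '''
>>> 1 + 1
2
>>> run_expensive_inference()  # doctest: +SKIP
'never evaluated'
'''

WITHOUT_SKIP = WITH_SKIP.replace("  # doctest: +SKIP", "")

with_skip = count_failures(WITH_SKIP)
without_skip = count_failures(WITHOUT_SKIP)
print(with_skip, without_skip)
```

The directive works per example, so a single problematic doctest can be disabled without excluding the whole file on the command line.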
> 2) remove custom doctest runner (test/doctest_driver.py,
> test/doctest_builder.py, test/testrunner.py) - it is 1000+ lines of code to
> support!;
> 3) bundle improved nose doctest plugin;
> 4) switch tox config to use nosetests command instead of sphinx+custom
> testrunner.
These are fine with me.
-Steven
Hopefully this patch can be included in nose dev ASAP. I'll prod the
nose devs.
> b) maybe somebody more experienced with nose can explain how to exclude
> the nltk.align.IBMModel1 doctest.
nosetests -e IBMModel1 should work. Not that I've tested it. I'm already
skipping all tests in inference.doctest using -e inference.doctest.
We can make a script that imports nose and runs it with the correct
parameters – currently I'm running it with a billion or so command line
arguments. Or we can make a shell script, or document the entire command
line that needs to be run. +SKIP on the tests or an explicit skip in
the test script/command line, either floats my boat.
Maybe we should remove most or all .py files under test/ and make _one_
that uses nose and a minimal config?
[…]
> The proposed solution:
>
> Unify the test running and use an improved version of Morten's solution:
>
> 1) find a way to exclude some doctests from python files;
I'm going to bed now, but this should be the easiest way of making a
nose test script based on the command line options it takes:
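import os
import nose

# $WORKSPACE is not expanded inside a Python string, so read it from the
# environment (Jenkins sets it); fall back to the current directory.
workspace = os.environ.get('WORKSPACE', '.')
nose.main(argv=[
    'nosetests',  # argv[0] is treated as the program name and skipped
    '--exclude=inference.doctest|IBMModel1',
    '--with-xunit', '--xunit-file=' + os.path.join(workspace, 'nosetests.xml'),
    '--with-doctest', '--doctest-extension=.doctest',
    '--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE',
    'nltk/'  # This needs to be a relative path to the nltk/ folder,
             # or we can try to autodetect it in the script.
])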
I didn't test this, as now is nite nite. If no one comments on it in the
next few days I'll try to run this on the ShiningPanda server. Currently
coverage is running outside of nose: "coverage run …" – I'll move the
coverage config into the nose config as well when I have the time.
> 2) remove custom doctest runner (test/doctest_driver.py,
> test/doctest_builder.py, test/testrunner.py) - it is 1000+ lines of code to
> support!;
Yes, please =)
> 3) bundle improved nose doctest plugin;
… or give it a few days and document that people need at least $version
of nose (==dev to begin with…)
No strong voice against bundling it until it's in stable (as there is no
guarantee that will happen soon or at all) :)
> 4) switch tox config to use nosetests command instead of sphinx+custom
> testrunner.
I concur with this general idea.
At 13:06, Fri 2012-01-27, Mikhail Korobov wrote:
[…]
> * nose.
>
> It does almost all from the list above. Configuration issues may be solved
> with shell script or a config file. Issues with nose:
>
> a) patched version is necessary in order to pass extra arguments to doctest
> runner (we may bundle the patched plugin; I think the requirement to
> download and patch nose is a show-stopper);
Hi!
> Could you please try the new test setup
> (see https://github.com/nltk/nltk/pull/209 )?
Sorry for the delay. Sadly, it fails if you have a patched version of
nose, like me. And this will make it fail for anyone running nose==dev
if and when they merge the patch. For now I'll just install an unpatched
nose on the ShiningPanda machine.
I'll add passing of sys.argv as well.
Smiles,
--
Morten
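A rough sketch of what passing sys.argv through could look like. The helper name build_nose_argv and the default of testing nltk/ when no arguments are given are assumptions made for this illustration, not the contents of the actual pull request. The fixed options come first and whatever the caller typed is appended, so extra flags or a narrower test path can be supplied without editing the script:

```python
import os
import sys


def build_nose_argv(extra_args, workspace="."):
    """Return an argv list for nose: fixed defaults first, caller's extras last."""
    return [
        "nosetests",  # placeholder program name, mirroring sys.argv[0]
        "--exclude=inference.doctest|IBMModel1",
        "--with-xunit",
        "--xunit-file=" + os.path.join(workspace, "nosetests.xml"),
        "--with-doctest",
        "--doctest-extension=.doctest",
        "--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE",
    ] + list(extra_args)


# Assumption: with no extra arguments, test the whole nltk/ folder.
extra = sys.argv[1:] or ["nltk/"]
argv = build_nose_argv(extra, os.environ.get("WORKSPACE", "."))
print(argv)
# A runner script would then hand the list over with: nose.main(argv=argv)
```

Keeping the list construction in its own function means it can be checked without nose installed.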
If you have a version that already accepts --doctest-options:
RuntimeWarning: Plugin <nltk.test.doctest_nose_plugin.DoctestFix object
at 0x231d310> has conflicting option string: option --doctest-options:
conflicting option string(s): --doctest-options and will be disabled
Thus disabling all the tests. But it's not really a problem until/unless
the patch is accepted into mainline.
--
Morten
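The wording of that warning matches the error optparse raises when the same option string is registered twice. Below is a minimal sketch of the conflict using a plain OptionParser rather than nose itself, together with one possible guard. The add_doctest_options helper is hypothetical and is not taken from nltk.test.doctest_nose_plugin:

```python
from optparse import OptionConflictError, OptionParser


def add_doctest_options(parser):
    """Register --doctest-options unless the parser already has it."""
    if parser.has_option("--doctest-options"):
        return False  # someone else (e.g. a patched nose) already added it
    parser.add_option("--doctest-options", action="store",
                      dest="doctest_options")
    return True


parser = OptionParser()
first = add_doctest_options(parser)   # option is new, gets registered
second = add_doctest_options(parser)  # already present, skipped quietly

# Registering it again without the guard reproduces the conflict.
try:
    parser.add_option("--doctest-options", action="store")
    conflict = None
except OptionConflictError as exc:
    conflict = str(exc)
print(first, second, conflict)
```

Whether a guard like this fits the plugin depends on how nose hands the parser to plugins, so treat it as a starting point only.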