https://jenkins.shiningpanda.com/nltk/


Morten Minde Neergaard

Jan 13, 2012, 9:05:13 AM
to nltk...@googlegroups.com
Hi!

Just thought I'd mention the NLTK Jenkins instance and its
documentation. It's up and running, testing NLTK using the doctests in
python2.5. It also builds the nltk.github.com page to HTML and pushes it
automatically.

Jenkins instance:
https://jenkins.shiningpanda.com/nltk/

Some doc:
http://nltk.github.com/dev/jenkins.html


A note on the builds it produces. The workspace on our ShiningPanda is
not available from the web interface. This means we are now producing
nightly builds with no way of reading them. I can push them anywhere;
any good ideas? This is how the built files look: http://www.8d.no/nltk/


Cheers,
--
Morten Minde Neergaard

Mikhail Korobov

Jan 13, 2012, 12:18:24 PM
to nltk...@googlegroups.com
Hi Morten,

Great job!

If I understand correctly, Jenkins runs tests using the "nosetests --with-doctest" command. I'm not able to run it locally (Mac OS X 10.7.2, Python 2.7) because the Python process's memory consumption skyrockets (I killed the process when it reached 10 GB). Any ideas?

So we have 2 different ways of running tests now: Jenkins and tox. Jenkins runs tests automatically and provides a web interface; tox can run tests locally on demand for several Python interpreters (Python 2.5-2.7, PyPy) but requires some work from the developer to get it running (numpy, etc). Any thoughts on how to make them work together? It seems logical to integrate tox with Jenkins ( http://tox.testrun.org/latest/example/jenkins.html ), but I don't know how ShiningPanda supports this, and tests in tox are now executed with 2 commands, which may be a problem.

Morten Minde Neergaard

Jan 13, 2012, 1:07:05 PM
to nltk...@googlegroups.com
At 09:18, Fri 2012-01-13, Mikhail Korobov wrote:
> Hi Morten,
>
> Great job!

Thanks =)

> If I understand correctly, Jenkins runs tests using the "nosetests
> --with-doctest" command. I'm not able to run it locally (Mac OS X 10.7.2,
> Python 2.7) because the Python process's memory consumption skyrockets (I
> killed the process when it reached 10 GB). Any ideas?

This is the command I run:
nosetests -e inference.doctest --with-xunit --xunit-file=$WORKSPACE/nosetests.xml --with-doctest --doctest-extension=.doctest --doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE nltk/

Description of each non-obvious option (Hey, I'll add this to the
Jenkins node! =)):

-e inference.doctest
One of the inference tests did what you described earlier. Not sure if
it always spirals out of control, but I disabled the test.

--with-xunit
Write a report in XML format for publishing in Jenkins.

--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE
This makes a lot more tests pass :)
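
For readers who have not used these two flags, here is a minimal sketch of why they make more tests pass. It uses only the standard-library doctest module, not nose itself; SAMPLE and run are hypothetical names made up for this illustration.

```python
import doctest

# Hypothetical doctest text: the expected output abbreviates a long list
# with "..." and uses looser spacing than the real output.
SAMPLE = """
>>> print(list(range(20)))
[0, 1, ..., 19]
>>> print("a   b")
a b
"""


def run(optionflags=0):
    """Run SAMPLE with the given doctest flags; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(SAMPLE, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure reports; only the counts matter here.
    result = runner.run(test, out=lambda text: None)
    return (result.failed, result.attempted)


strict = run()
relaxed = run(doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE)
print("no flags:  ", strict)
print("with flags:", relaxed)
```

With no flags both examples fail. With the two flags both pass, because "..." in the expected output matches any substring and runs of whitespace compare as equal.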

> So we have 2 different ways of running tests now: Jenkins and tox.
> Jenkins runs tests automatically and provides a web interface; tox can run
> tests locally on demand for several Python interpreters (Python 2.5-2.7,
> PyPy) but requires some work from the developer to get it running (numpy,
> etc). Any thoughts on how to make them work together? It seems logical to
> integrate tox with Jenkins (
> http://tox.testrun.org/latest/example/jenkins.html ), but I don't know how
> ShiningPanda supports this, and tests in tox are now executed with 2
> commands, which may be a problem.

ShiningPanda have their own multi-virtualenv setup. I tested it, but
ended up just running one virtualenv in py2.5 for testing. Cuts down on
time and complexity. Installing all dependencies in a py2.5 environment
was actually kind of time consuming.

I'm using nose mostly because it's zero configuration and supports
junit-compatible XML output.
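
As an illustration of what the JUnit-style XML makes possible outside Jenkins, here is a rough sketch that reads such a report and lists the slowest and the failing tests. The sample report, its test names and timings, and the summarize helper are all invented for this example; the attribute names (classname, name, time) are assumed to follow the common JUnit layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical report in the common JUnit-style layout; every name and
# timing below is made up for illustration.
SAMPLE_REPORT = """\
<testsuite name="nosetests" tests="3" errors="0" failures="1" skip="0">
  <testcase classname="example.tokenize" name="test_a" time="0.12"/>
  <testcase classname="example.align" name="test_slow" time="1320.5"/>
  <testcase classname="example.parse" name="test_b" time="2.0">
    <failure type="AssertionError" message="boom"/>
  </testcase>
</testsuite>
"""


def summarize(xml_text, limit=2):
    """Return (slowest tests as (name, seconds) pairs, names of failed tests)."""
    root = ET.fromstring(xml_text)
    cases = []
    failed = []
    for case in root.iter('testcase'):
        name = '%s.%s' % (case.get('classname'), case.get('name'))
        cases.append((name, float(case.get('time', 0))))
        # A <failure> or <error> child marks an unsuccessful test case.
        if case.find('failure') is not None or case.find('error') is not None:
            failed.append(name)
    cases.sort(key=lambda item: item[1], reverse=True)
    return cases[:limit], failed


slowest, failed = summarize(SAMPLE_REPORT)
print(slowest)
print(failed)
```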

Mikhail Korobov

Jan 27, 2012, 4:06:07 PM
to nltk...@googlegroups.com
Hi again,

I still don't know what a good way to run the tests is.
It should be possible to run all the tests with a single command, using a reasonable amount of RAM and in a reasonable time.

So the requirements are:

- run doctests from *.doctest files as well as python files;
- it should be possible to skip some tests (in my case the nltk.align.IBMModel1 test consumes a huge amount of RAM; please also note it runs for 22 minutes according to https://jenkins.shiningpanda.com/nltk/job/NLTK-py2.5/lastBuild/testReport/ );
- it should be possible to pass options to the doctest runner;
- test discovery is nice to have;
- xUnit XML output;
- tests should be easy to run for all supported python interpreters;
- it would be good to run unittests as well, if there are any;
- ...?

Options:

* nose.

It does almost everything on the list above. Configuration issues may be solved with a shell script or a config file. Issues with nose:

a) patched version is necessary in order to pass extra arguments to doctest runner (we may bundle the patched plugin; I think the requirement to download and patch nose is a show-stopper);
b) maybe somebody more experienced with nose can explain how to exclude the nltk.align.IBMModel1 doctest. 

* sphinx

Sphinx can only run doctests that are included in docs. Current regression doctests are not included in docs (I personally think it'll be good to include them). Sphinx is limited and can't run unittests or produce xUnit XML. It is easy to exclude individual tests or pass custom options to doctest runner with sphinx.

* unittest2

It is not zero-configuration; using unittest2 will require writing a custom test runner and customizing the test discovery code. No XML output out of the box.

* nose2

Lacks some nose features and plugins. In heavy development. Does not support python 2.5 because of full python 3.2 support. Much smaller than nose1.

* py.test

Features are similar to nose; no way to pass options to doctest runner.

The current state:

ShiningPanda runs tests using a patched nose with a single interpreter, using a huge amount of memory and taking a lot of time.

tox runs tests for multiple interpreters with 2 hacky commands: a custom doctest runner (that is a lot of code) for *.doctest tests + sphinx for *.py doctests. There is no way to integrate these 2 commands and produce a combined report, and there is no way to make jUnit XML from the results.

The proposed solution:

Unify the test running and use an improved version of Morten's solution:

1) find a way to exclude some doctests from python files;
2) remove custom doctest runner (test/doctest_driver.py, test/doctest_builder.py, test/testrunner.py) - it is 1000+ lines of code to support!; 
3) bundle improved nose doctest plugin;
4) switch tox config to use nosetests command instead of sphinx+custom testrunner.

I can do (2), (3) and (4) if somebody does (1).

Steven Bird

Jan 29, 2012, 5:31:47 PM
to nltk...@googlegroups.com
Mikhail,

Thanks for your thoughts on testing -- our test framework is on the
critical path...

I agree with your stated requirements, except that I'm not sure about
the importance of xUnit XML output.

A couple of comments:

> Sphinx can only run doctests that are included in docs. Current regression
> doctests are not included in docs (I personally think it'll be good to
> include them).

Do you mean putting regression doctests into module docstrings, and
eliminating the nltk/test directory? My concern is to avoid
cluttering up the module code and the API documentation.

How about putting regression doctests in files called package/test.py
(e.g. tokenize/test.py). Perhaps these test files would be
conditionally imported (and not imported when building API
documentation or distributions)?

Then Sphinx could take care of these.

> The proposed solution:
>
> Unify the test running and use an improved version of Morten's solution:

I like the incremental nature of this proposal, which means it's more
likely to happen.

> 1) find a way to exclude some doctests from python files;

Can we use the doctest SKIP flag?

> 2) remove custom doctest runner (test/doctest_driver.py,
> test/doctest_builder.py, test/testrunner.py) - it is 1000+ lines of code to
> support!;
> 3) bundle improved nose doctest plugin;
> 4) switch tox config to use nosetests command instead of sphinx+custom
> testrunner.

These are fine with me.

-Steven

Mikhail Korobov

Jan 30, 2012, 4:26:02 PM
to nltk...@googlegroups.com
Steven,

On Monday, January 30, 2012 at 4:31:47 UTC+6, Steven Bird wrote:
> Mikhail,
>
> Thanks for your thoughts on testing -- our test framework is on the
> critical path...
>
> I agree with your stated requirements, except that I'm not sure about
> the importance of xUnit XML output.

It is necessary for Jenkins. I'm fine without Jenkins if tests are easy to run locally. But if xUnit output comes for free, there is no need to avoid it.
 

> A couple of comments:
>
> > Sphinx can only run doctests that are included in docs. Current regression
> > doctests are not included in docs (I personally think it'll be good to
> > include them).
>
> Do you mean putting regression doctests into module docstrings, and
> eliminating the nltk/test directory?  My concern is to avoid
> cluttering up the module code and the API documentation.

I mean finding a way to add the tests in the nltk/test directory to the existing docs (including these whole files in the docs). These tests may act as usage examples, and it is good to have usage examples online. This doesn't necessarily mean moving them out of the nltk/test folder. [unittester hat on] The tests that can't be used as usage examples would probably be easier to support in unittest.TestCase format. I'm afraid doctests can cause trouble while porting because of e.g. this:

python 2.x:
>>> line = u'Вася'
>>> line
u'\u0412\u0430\u0441\u044f'
>>> print line
Вася
>>> print [line]
[u'\u0412\u0430\u0441\u044f']
>>> 

python 3.x:
>>> line = 'Вася'
>>> line
'Вася'
>>> print(line)
Вася
>>> print([line])
['Вася']
>>> 

While the code ( u'string' vs 'string' and "print foo" vs "print(foo)" ) can be unified using "__future__" imports, the last "print([line])" output cannot. So supporting such tests (there are a lot of them) will require custom output functions, manual partial comparisons, or some crazy doctest module hacking, which defeats the purpose of doctests, namely being readable. That is one of the reasons why I think only a limited number of doctests should be maintained (and if these doctests are readable and provide usage examples, then put them in the docs), and other doctests (whose primary goal is to prevent regressions) should be rewritten as unittests. Doctests are a great idea and (please correct me if I'm wrong) the doctest module author is even a co-author of the NLTK book, but with the current state of the tools it may be easier to support unittests instead of doctests. [unittester hat off]
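
To make the comparison concrete, here is a minimal sketch of the same kind of check written as a unittest. The function first_word is a hypothetical stand-in for any NLTK function that returns text; it is not part of NLTK. Because assertEqual compares values rather than printed representations, the differing reprs shown above never enter the picture.

```python
import unittest


def first_word(text):
    """Hypothetical function under test: return the first whitespace-separated token."""
    return text.split()[0]


class FirstWordTest(unittest.TestCase):
    def test_non_ascii(self):
        # Values are compared directly, so it does not matter how the
        # interpreter would print the string or a list containing it.
        self.assertEqual(first_word(u'Вася пришёл'), u'Вася')
        self.assertEqual([first_word(u'Вася пришёл')], [u'Вася'])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FirstWordTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```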

> How about putting regression doctests in files called package/test.py
> (e.g. tokenize/test.py).  Perhaps these test files would be
> conditionally imported (and not imported when building API
> documentation or distributions)?
>
> Then Sphinx could take care of these.
>
> > The proposed solution:
> >
> > Unify the test running and use an improved version of Morten's solution:
>
> I like the incremental nature of this proposal, which means it's more
> likely to happen.
>
> > 1) find a way to exclude some doctests from python files;
>
> Can we use the doctest SKIP flag?

Yes, but this flag has to be on every line of a doctest, and this makes them less readable (esp. when viewed via __doc__). We may go this way if no better solution exists. But in the case of IBMModel1 this option may be viable, because there are only a couple of doctest lines for IBMModel1, so I think this is a solution we can live with.
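
For reference, a small sketch of what the per-line directive looks like, using only the standard-library doctest module. The function slow_model_demo and the name train_expensive_model are hypothetical; nothing here is taken from the actual IBMModel1 docstring.

```python
import doctest


def slow_model_demo():
    """
    Hypothetical docstring. train_expensive_model does not exist, which is
    harmless here because skipped examples are never executed.

    >>> model = train_expensive_model()   # doctest: +SKIP
    >>> model.score()                     # doctest: +SKIP
    0.5
    >>> 1 + 1
    2
    """


parser = doctest.DocTestParser()
test = parser.get_doctest(slow_model_demo.__doc__, {}, "slow_model_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False)
result = runner.run(test)
print("failed:", result.failed)
```

The directive has to be repeated on each example that should be skipped, which is the readability cost mentioned above.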

Morten Minde Neergaard

Jan 30, 2012, 5:25:20 PM
to nltk...@googlegroups.com
At 13:06, Fri 2012-01-27, Mikhail Korobov wrote:
[…]

> * nose.
>
> It does almost all from the list above. Configuration issues may be solved
> with shell script or a config file. Issues with nose:
>
> a) patched version is necessary in order to pass extra arguments to doctest
> runner (we may bundle the patched plugin; I think the requirement to
> download and patch nose is a show-stopper);

Hopefully this patch can be included in nose dev ASAP. I'll prod the
nose devs.

> b) maybe somebody more experienced with nose can explain how to exclude
> the nltk.align.IBMModel1 doctest.

nosetests -e IBMModel1 should work. Not that I've tested it. I'm already
skipping all tests in inference.doctest using -e inference.doctest.

We can make a script that imports nose and runs it with the correct
parameters – currently I'm running it with a billion or so command line
arguments. Or we can make a shell script, or document the entire command
line that needs to be run. +SKIP on the tests or an explicit skip in
the test script/command line, either floats my boat.

Maybe we should remove most or all .py files under test/ and make _one_
that uses nose and a minimal config?

[…]


> The proposed solution:
>
> Unify the test running and use an improved version of Morten's solution:
>
> 1) find a way to exclude some doctests from python files;

I'm going to bed now, but this should be the easiest way of making a
nose test script based on the command line options it takes:

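import os
import nose

nose.main(argv=[
    'nosetests', # nose skips argv[0] (the program name), so pass a placeholder
    '--exclude=inference.doctest|IBMModel1',
    '--with-xunit',
    # No shell is involved here, so $WORKSPACE must be expanded explicitly.
    os.path.expandvars('--xunit-file=$WORKSPACE/nosetests.xml'),
    '--with-doctest', '--doctest-extension=.doctest',
    '--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE',
    'nltk/' # This needs to be a relative path to the nltk/ folder –
            # or we can try to autodetect it in the script.
])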

I didn't test this, as now is nite nite. If no one comments on it in the
next few days I'll try to run this on the ShiningPanda server. Currently
coverage is running outside of nose: "coverage run …" – I'll move the
coverage config into the nose config as well, when I have the time.
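
One way the path autodetection could look, as an untested sketch rather than what was actually committed: build the argument list in a helper that derives the nltk/ folder from the script's own location and lets the caller append extra arguments such as sys.argv[1:]. The helper name build_nose_argv, the script name runtests.py, and the assumption that the script lives in nltk/test/ are all made up for this example; the resulting list would be handed to nose.main(argv=...).

```python
import os


def build_nose_argv(script_path, extra_args=(), environ=None):
    """Return an argv list for nose, with paths derived from script_path."""
    environ = os.environ if environ is None else environ
    # Assumption: the script lives in nltk/test/, so nltk/ is its parent's parent.
    nltk_dir = os.path.dirname(os.path.dirname(os.path.abspath(script_path)))
    # Fall back to the current directory when not running under Jenkins.
    workspace = environ.get('WORKSPACE', os.getcwd())
    argv = [
        'nosetests',  # placeholder program name; nose skips argv[0]
        '--exclude=inference.doctest',
        '--with-xunit',
        '--xunit-file=' + os.path.join(workspace, 'nosetests.xml'),
        '--with-doctest',
        '--doctest-extension=.doctest',
        '--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE',
    ]
    argv.extend(extra_args)  # e.g. sys.argv[1:] from the calling script
    argv.append(nltk_dir)
    return argv


example = build_nose_argv('/repo/nltk/test/runtests.py', ['-v'],
                          {'WORKSPACE': '/tmp/ws'})
print(example)
```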

> 2) remove custom doctest runner (test/doctest_driver.py,
> test/doctest_builder.py, test/testrunner.py) - it is 1000+ lines of code to
> support!;

Yes, please =)

> 3) bundle improved nose doctest plugin;

… or give it a few days and document that people need at least $version
of nose (==dev to begin with…)

No strong voice against bundling it until it's in stable (as there is no
guarantee that will happen soon or at all) :)

> 4) switch tox config to use nosetests command instead of sphinx+custom
> testrunner.

I concur with this general idea.

Mikhail Korobov

Jan 31, 2012, 1:13:59 PM
to nltk...@googlegroups.com
Hi Morten,

On Tuesday, January 31, 2012 at 4:25:20 UTC+6, Morten M Neergaard wrote:
> At 13:06, Fri 2012-01-27, Mikhail Korobov wrote:
> […]
> > * nose.
> >
> > It does almost all from the list above. Configuration issues may be solved
> > with shell script or a config file. Issues with nose:
> >
> > a) patched version is necessary in order to pass extra arguments to doctest
> > runner (we may bundle the patched plugin; I think the requirement to
> > download and patch nose is a show-stopper);
>
> Hopefully this patch can be included in nose dev ASAP. I'll prod the
> nose devs.

That would be great!
 

> > b) maybe somebody more experienced with nose can explain how to exclude
> > the nltk.align.IBMModel1 doctest.
>
> nosetests -e IBMModel1 should work. Not that I've tested it. I'm already
> skipping all tests in inference.doctest using -e inference.doctest.

No, it doesn't work.
 

> We can make a script that imports nose and runs it with the correct
> parameters – currently I'm running it with a billion or so command line
> arguments. Or we can make a shell script, or document the entire command
> line that needs to be run. +SKIP on the tests or an explicit skip in
> the test script/command line, either floats my boat.

Another option is nose.cfg file, e.g.:

[nosetests]
verbosity=3
with-doctest=1
doctest-extension=.doctest
where=nltk
;collect-only=1
exclude=inference.doctest

and then run `nosetests -c nose.cfg`
 

> Maybe we should remove most or all .py files under test/ and make _one_
> that uses nose and a minimal config?

Even this _one_ may be unnecessary; I think a nose config file may be enough.
 

> […]
> > The proposed solution:
> >
> > Unify the test running and use an improved version of Morten's solution:
> >
> > 1) find a way to exclude some doctests from python files;
>
> I'm going to bed now, but this should be the easiest way of making a
> nose test script based on the command line options it takes:
>
> import os
> import nose
>
> nose.main(argv=[
>     'nosetests', # nose skips argv[0] (the program name), so pass a placeholder
>     '--exclude=inference.doctest|IBMModel1',
>     '--with-xunit',
>     # No shell is involved here, so $WORKSPACE must be expanded explicitly.
>     os.path.expandvars('--xunit-file=$WORKSPACE/nosetests.xml'),
>     '--with-doctest', '--doctest-extension=.doctest',
>     '--doctest-options=+ELLIPSIS,+NORMALIZE_WHITESPACE',
>     'nltk/' # This needs to be a relative path to the nltk/ folder –
>             # or we can try to autodetect it in the script.
> ])
>
> I didn't test this, as now is nite nite. If no one comments on it in the
> next few days I'll try to run this on the ShiningPanda server. Currently
> coverage is running outside of nose: "coverage run …" – I'll move the
> coverage config into the nose config as well, when I have the time.


As for coverage, I think we shouldn't bundle an outdated coverage.py version; pip works fine these days.

Mikhail Korobov

Feb 6, 2012, 8:38:48 PM
to nltk...@googlegroups.com
Hi Morten,

Could you please try the new test setup (see https://github.com/nltk/nltk/pull/209 )?

Morten Minde Neergaard

Feb 8, 2012, 3:16:49 PM
to Mikhail Korobov, nltk...@googlegroups.com
At 17:38, Mon 2012-02-06, Mikhail Korobov wrote:
> Hi Morten,

Hi!

> Could you please try the new test setup
> (see https://github.com/nltk/nltk/pull/209 )?

Sorry for the delay. Sadly, it fails if you have a patched version of
nose, like me. And this will make it fail for anyone running nose==dev
when and if they merge. For now I'll just install a virgin nose on the
shining panda machine.

I'll add passing of sys.argv as well.

Smiles,
--
Morten

Mikhail Korobov

Feb 8, 2012, 3:21:02 PM
to nltk...@googlegroups.com, Mikhail Korobov
Thanks!

Why does it fail with nose==dev or a patched nose? It shouldn't :) 

Morten Minde Neergaard

Feb 8, 2012, 3:25:24 PM
to nltk...@googlegroups.com, Mikhail Korobov
At 12:21, Wed 2012-02-08, Mikhail Korobov wrote:
> Why does it fail with nose==dev or a patched nose? It shouldn't :)

If you have a version that already accepts --doctest-options:

RuntimeWarning: Plugin <nltk.test.doctest_nose_plugin.DoctestFix object
at 0x231d310> has conflicting option string: option --doctest-options:
conflicting option string(s): --doctest-options and will be disabled

Thus disabling all the tests. But it's not really a problem until/unless
the patch is accepted into mainline.
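
The thread does not say how this was eventually fixed. One possible guard, sketched with the standard-library optparse module on the assumption that a nose plugin receives an optparse parser when it registers its options, is to check whether the option already exists before adding it. The helper name add_doctest_options_flag and the dest names are made up for this example.

```python
import optparse


def add_doctest_options_flag(parser):
    """Register --doctest-options unless the parser already defines it.

    Returns True if this call added the option, False if it was already there.
    """
    if parser.has_option('--doctest-options'):
        return False
    parser.add_option('--doctest-options', action='store',
                      dest='doctest_options_fix',
                      help='comma-separated doctest option flags')
    return True


# Simulate an unpatched parser and one that already ships the option.
plain = optparse.OptionParser()
patched = optparse.OptionParser()
patched.add_option('--doctest-options', dest='builtin')

added_plain = add_doctest_options_flag(plain)
added_patched = add_doctest_options_flag(patched)

# Without the guard, registering the option a second time raises an error.
try:
    patched.add_option('--doctest-options', dest='duplicate')
    conflict = False
except optparse.OptionConflictError:
    conflict = True

print(added_plain, added_patched, conflict)
```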

--
Morten

Mikhail Korobov

Feb 8, 2012, 3:30:44 PM
to nltk...@googlegroups.com, Mikhail Korobov
I'll fix that.

Mikhail Korobov

Feb 8, 2012, 3:57:43 PM
to nltk...@googlegroups.com, Mikhail Korobov