testing takes time

josef...@gmail.com

Oct 15, 2012, 12:30:29 PM

to pystatsmodels

Running the test suite now takes about 5 minutes (single process;
faster with multiple processes).

The time it takes will get another jump when we merge the nonparametric branch.
Last week I started to do some profiling with nosetests, mainly to see
the impact of the L1 regularization merge.

Do we need to get more liberal in marking tests as "slow"?

I don't have too much of a problem with the time right now. For
development or checking specific parts we can run selected tests or
test files with nosetests.
This is still pretty fast as long as the test files don't get too large.

For keeping master clean, we will have to run the full test suite
anyway, and it's much better if the test suite is large.

I don't know if there is much to gain streamlining the tests, but it
might be worth looking at the main "offenders".

Josef

>>> stats.sort_stats('cumulative', 'calls')
<pstats.Stats instance at 0x038C4378>
>>> stats.print_stats('\(test_')
2812915 function calls (2805910 primitive calls) in 53.924 CPU seconds

Ordered by: cumulative time, call count
List reduced from 798 to 82 due to restriction <'\\(test_'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   16.857   16.857 test_discrete.py:793(test_margeff_dummy)
        1    0.000    0.000   15.025   15.025 test_discrete.py:450(test_cvxopt_versus_slsqp)
        1    0.000    0.000    8.100    8.100 test_discrete.py:783(test_margeff_overall)
        1    0.001    0.001    1.675    1.675 test_discrete.py:900(test_poisson_predict)
        1    0.000    0.000    1.588    1.588 test_discrete.py:763(test_margeff_dummy_overall)
        1    0.000    0.000    1.417    1.417 test_discrete.py:756(test_margeff_overall)
        1    0.033    0.033    0.613    0.613 test_discrete.py:927(test_issue_339)
        1    0.010    0.010    0.561    0.561 test_discrete.py:916(test_poisson_newton)
        8    0.000    0.000    0.446    0.056 test_discrete.py:61(test_llnull)
        1    0.000    0.000    0.144    0.144 test_discrete.py:942(test_issue_341)
        1    0.000    0.000    0.133    0.133 test_discrete.py:883(test_perfect_prediction)
        1    0.000    0.000    0.102    0.102 test_discrete.py:788(test_margeff_mean)
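
For anyone who wants to produce a similar listing, here is a minimal sketch using only the standard library's cProfile and pstats. The functions test_fast, test_slow, helper_not_a_test and run_all are hypothetical stand-ins, not statsmodels tests, and this is not necessarily how the profile above was collected. The part that carries over is the sort_stats / print_stats call with the '\(test_' restriction, which keeps only entries whose function name starts with "test_".

```python
import cProfile
import io
import pstats
import time


def test_fast():
    # Hypothetical stand-in for a cheap test.
    return sum(range(100))


def test_slow():
    # Hypothetical stand-in for an expensive test; the sleep simulates work.
    time.sleep(0.05)


def helper_not_a_test():
    # Called during the run, but filtered out of the report below.
    return 0


def run_all():
    test_fast()
    test_slow()
    helper_not_a_test()


def profile_tests(func):
    """Profile func and return a report restricted to test_* functions."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Same ordering and restriction as in the session above.
    stats.sort_stats('cumulative', 'calls')
    stats.print_stats(r'\(test_')
    return buffer.getvalue()


report = profile_tests(run_all)
print(report)
```

The restriction string is a regular expression matched against the "filename:lineno(function)" column, so the escaped parenthesis anchors the match to the start of the function name.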

Skipper Seabold

Oct 15, 2012, 1:37:16 PM

to pystat...@googlegroups.com

On Mon, Oct 15, 2012 at 12:30 PM, <josef...@gmail.com> wrote:

Running the test suite now takes about 5 minutes (single process;
faster with multiple processes).

The time it takes will get another jump when we merge the nonparametric branch.
Last week I started to do some profiling with nosetests, mainly to see
the impact of the L1 regularization merge.

Do we need to get more liberal in marking tests as "slow"?

I don't have too much of a problem with the time right now. For
development or checking specific parts we can run selected tests or
test files with nosetests.

This is still pretty fast as long as the test files don't get too large.

For keeping master clean, we will have to run the full test suite
anyway, and it's much better if the test suite is large.

I don't know if there is much to gain streamlining the tests, but it
might be worth looking at the main "offenders".

I think empirical likelihood increased the test run time by a considerable fraction. We should mark them slow and not run slow tests by default. I think the way `test` is now, we're still running the slow tests.
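
As a rough illustration of what "marking slow and skipping by default" amounts to, here is a minimal, self-contained sketch. The `slow` decorator and `select_tests` helper below are hypothetical stand-ins written for this example; as far as I know the numpy.testing `slow` decorator works along the same lines, by setting an attribute on the test that nose's attribute selection can then filter on, but treat that as an assumption rather than a description of our current setup.

```python
def slow(test_func):
    """Tag a test as slow by setting an attribute on it (hypothetical helper)."""
    test_func.slow = True
    return test_func


def test_quick_check():
    # Hypothetical cheap test.
    assert 1 + 1 == 2


@slow
def test_nested_optimization():
    # Hypothetical stand-in for an expensive test.
    assert sum(range(10 ** 6)) > 0


def select_tests(tests, include_slow=False):
    """Return the tests to run, dropping slow ones unless requested."""
    return [t for t in tests if include_slow or not getattr(t, "slow", False)]


all_tests = [test_quick_check, test_nested_optimization]
default_run = select_tests(all_tests)                   # skips slow tests
full_run = select_tests(all_tests, include_slow=True)   # runs everything

for test in default_run:
    test()
print([t.__name__ for t in default_run])
```

The point of the design is that the default entry point excludes slow tests and the full run has to be requested explicitly, which is the opposite of what `test` appears to do now.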

As you can see from the recent stata-writer merge, I'm almost never running tests from the interpreter anymore.

We also need to enable Travis builds for merges. This would greatly reduce the chance that master gets into a state with test errors. Unfortunately, I have very little bandwidth to devote to this now and likely won't until toward the end of the semester.

Ralf Gommers

Oct 15, 2012, 2:50:45 PM

to pystat...@googlegroups.com

On Mon, Oct 15, 2012 at 7:37 PM, Skipper Seabold <jsse...@gmail.com> wrote:

On Mon, Oct 15, 2012 at 12:30 PM, <josef...@gmail.com> wrote:
Running the test suite now takes about 5 minutes (single process;
faster with multiple processes).

The time it takes will get another jump when we merge the nonparametric branch.

Working on this; right now the time it takes isn't really acceptable.

Last week I started to do some profiling with nosetests, mainly to see
the impact of the L1 regularization merge.

Do we need to get more liberal in marking tests as "slow"?

+1

I don't have too much of a problem with the time right now. For
development or checking specific parts we can run selected tests or
test files with nosetests.
This is still pretty fast as long as the test files don't get too large.

For keeping master clean, we will have to run the full test suite
anyway, and it's much better if the test suite is large.

I don't know if there is much to gain streamlining the tests, but it
might be worth looking at the main "offenders".

I think empirical likelihood increased the test run time by a considerable fraction. We should mark them slow and not run slow tests by default. I think the way `test` is now, we're still running the slow tests.

As you can see from the recent stata-writer merge, I'm almost never running tests from the interpreter anymore.

We also need to enable travis builds for merges.

Would be nice, but this is kind of tricky with scipy as a dependency. IIRC the default Travis behavior is to build all dependencies, and the scipy build will time out most of the time.

Ralf

Skipper Seabold

Oct 15, 2012, 3:06:09 PM

to pystat...@googlegroups.com

On Mon, Oct 15, 2012 at 2:50 PM, Ralf Gommers <ralf.g...@gmail.com> wrote:

On Mon, Oct 15, 2012 at 7:37 PM, Skipper Seabold <jsse...@gmail.com> wrote:

<snip>

We also need to enable travis builds for merges.

Would be nice, but this is kind of tricky with scipy as a dependency. IIRC the default Travis behavior is to build all dependencies, and the scipy build will time out most of the time.

Hmm, that seems a bit silly for runtime dependencies. I haven't looked into this obviously.

Skipper

Nathaniel Smith

Oct 16, 2012, 3:34:02 AM

to pystat...@googlegroups.com

On 15 Oct 2012 19:50, "Ralf Gommers" <ralf.g...@gmail.com> wrote:
>
>
>
> On Mon, Oct 15, 2012 at 7:37 PM, Skipper Seabold <jsse...@gmail.com> wrote:
>>
>> On Mon, Oct 15, 2012 at 12:30 PM, <josef...@gmail.com> wrote:
>>>
>>> Running the test suite now takes about 5 minutes (single process;
>>> faster with multiple processes).
>>>
>>> The time it takes will get another jump when we merge the nonparametric branch.
>
>
> Working on this; right now the time it takes isn't really acceptable.
>
>>>
>>> Last week I started to do some profiling with nosetests, mainly to see
>>> the impact of the L1 regularization merge.
>>>
>>> Do we need to get more liberal in marking tests as "slow"?
>
>
> +1
>
>>>
>>>
>>> I don't have too much of a problem with the time right now. For
>>> development or checking specific parts we can run selected tests or
>>> test files with nosetests.
>>>
>>> This is still pretty fast as long as the test files don't get too large.
>>>
>>> For keeping master clean, we will have to run the full test suite
>>> anyway, and it's much better if the test suite is large.
>>>
>>> I don't know if there is much to gain streamlining the tests, but it
>>> might be worth looking at the main "offenders".
>>>
>>
>> I think empirical likelihood increased the test run time by a considerable fraction. We should mark them slow and not run slow tests by default. I think the way `test` is now, we're still running the slow tests.
>>
>> As you can see from the recent stata-writer merge, I'm almost never running tests from the interpreter anymore.
>>
>> We also need to enable travis builds for merges.
>
>
> Would be nice, but this is kind of tricky with scipy as a dependency. IIRC the default Travis behavior is to build all dependencies, and the scipy build will time out most of the time.

If you don't need to control exactly which version of scipy you're testing against (or to test against multiple versions), then you can just run `sudo apt-get install python-scipy` in your Travis test script. Really, as Ralf says, it's the only option right now, but it's still much better than nothing.

-n

VincentAB

Oct 16, 2012, 11:35:40 AM

to pystat...@googlegroups.com, n...@pobox.com

Not sure apt-get will work.

From the Travis docs http://about.travis-ci.org/docs/user/languages/python/:

"Travis CI Uses Isolated virtualenvs. CI Environment uses separate virtualenv instances for each Python version. System Python is not used and should not be relied on. If you need to install Python packages, do it via pip and not apt."

I tried a few things (e.g. installing from apt-get first, then pip), but the scipy build usually broke with a BlasNotFound error. I had never tried Travis before, so I'm not an authority by any means.

Vincent

Christoph Deil

Oct 16, 2012, 12:32:54 PM

to pystat...@googlegroups.com, n...@pobox.com

Matthew Brett has figured out a great hack to test projects that have numpy / scipy / matplotlib as a dependency:

https://github.com/nipy/nipy/blob/master/.travis.yml

scipy on travis-ci was discussed here:

https://groups.google.com/d/topic/travis-ci/uJgu35XKdmI/discussion

Christoph

VincentAB

Oct 24, 2012, 6:11:28 PM

to pystat...@googlegroups.com, n...@pobox.com, deil.ch...@googlemail.com

Thanks, Christoph! The Brett hack seems to work:

https://github.com/statsmodels/statsmodels/pull/543

josef...@gmail.com

Jan 10, 2013, 8:20:32 PM

to pystat...@googlegroups.com

just an update here:

running the full tests "nosetests statsmodels" now takes about 9
minutes, after the George/Ralph nonparametric merge, using one
processor.
sm.test() is currently at 6 minutes and skips the "slow" tests. There
should be more tests marked as slow in future.

I haven't done any timing of individual tests across statsmodels.

TravisCI works pretty well (with occasional hiccups) and tests Python
2.7 and Python 3.2 with whatever default packages of numpy, scipy,
pandas and patsy are installed by TravisCI.
We try to keep those tests for master green at (almost) all times.

There are occasionally compatibility problems in master for older
versions or unreleased versions of our dependencies. Those are not
covered by continuous testing and are fixed at random intervals.

For developers
==============

Use nosetests!

Running nosetests on the command line lets you select which
subpackage, subdirectory, or individual test is run:

> nosetests statsmodels.nonparametric
> nosetests <path_of_test_file>
> nosetests <path_of_test_file>:<name_of_test>

All three forms make it much faster to repeatedly test specific
parts during development.

This has been my favorite feature of nosetests since I managed to get
the scipy.stats tests to take 6 to 8 minutes.

Josef

Ralf Gommers

Jan 14, 2013, 5:53:23 PM

to pystat...@googlegroups.com

On Fri, Jan 11, 2013 at 2:20 AM, <josef...@gmail.com> wrote:

just an update here:

running the full tests "nosetests statsmodels" now takes about 9
minutes, after the George/Ralph nonparametric merge, using one
processor.
sm.test() is currently at 6 minutes and skips the "slow" tests. There
should be more tests marked as slow in future.

This should be straightforward to improve. For nonparametric I tried quite hard to reduce it, the default (non-'full') tests take 8 seconds on my box.

The largest offender seems to be emplike/tests/test_regression.py. There's only a handful of tests which look almost identical, and those take well over one third of total runtime. So question to whoever is familiar with that module: can you mark some as slow and/or delete some of those tests?

Ralf

josef...@gmail.com

Jan 14, 2013, 7:15:12 PM

to pystat...@googlegroups.com

On Mon, Jan 14, 2013 at 5:53 PM, Ralf Gommers <ralf.g...@gmail.com> wrote:
>
>
>
> On Fri, Jan 11, 2013 at 2:20 AM, <josef...@gmail.com> wrote:
>>
>> just an update here:
>>
>> running the full tests "nosetests statsmodels" now takes about 9
>> minutes, after the George/Ralph nonparametric merge, using one
>> processor.
>> sm.test() is currently at 6 minutes and skips the "slow" tests. There
>> should be more tests marked as slow in future.
>
>
> This should be straightforward to improve. For nonparametric I tried quite
> hard to reduce it, the default (non-'full') tests take 8 seconds on my box.
>
> The largest offender seems to be emplike/tests/test_regression.py. There's
> only a handful of tests which look almost identical, and those take well
> over one third of total runtime. So question to whoever is familiar with
> that module: can you mark some as slow and/or delete some of those tests?

test_regression looks like a good candidate.

I will prepare a PR if Justin doesn't have time.

https://github.com/statsmodels/statsmodels/issues/622

Josef

justin

Jan 14, 2013, 7:26:35 PM

to pystat...@googlegroups.com

On 01/14/2013 07:15 PM, josef...@gmail.com wrote:
> On Mon, Jan 14, 2013 at 5:53 PM, Ralf Gommers <ralf.g...@gmail.com> wrote:
>>
>>
>> On Fri, Jan 11, 2013 at 2:20 AM, <josef...@gmail.com> wrote:
>>> just an update here:
>>>
>>> running the full tests "nosetests statsmodels" now takes about 9
>>> minutes, after the George/Ralph nonparametric merge, using one
>>> processor.
>>> sm.test() is currently at 6 minutes and skips the "slow" tests. There
>>> should be more tests marked as slow in future.
>>
>> This should be straightforward to improve. For nonparametric I tried quite
>> hard to reduce it, the default (non-'full') tests take 8 seconds on my box.
>>
>> The largest offender seems to be emplike/tests/test_regression.py. There's
>> only a handful of tests which look almost identical, and those take well
>> over one third of total runtime. So question to whoever is familiar with
>> that module: can you mark some as slow and/or delete some of those tests?
> test_regression looks like a good candidate.
>
> I will prepare a PR if Justin doesn't have time.
>
> https://github.com/statsmodels/statsmodels/issues/622
>
> Josef

I can do it. The tests look identical because I need to run a nested
optimization for each parameter being tested. If testing only 1 or 2
of the parameters is sufficient, I can take out some of the tests.

Another alternative would be to use a different (generated) data set. I
simply used a dataset already in sm, but when I was using generated data
the tests ran much faster. So, if a test with a generated dataset is OK,
I can rewrite the tests with that data and they will be much faster.
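
To make the generated-data idea concrete, here is a small sketch of the pattern using only the standard library. It fits a plain least-squares line rather than running any empirical likelihood code, and make_data, fit_line and the parameter values are all hypothetical names and numbers chosen for illustration. What it shows is a small dataset built from a fixed seed, so the test is fast and reproducible and the true parameters are known.

```python
import random


def make_data(n=200, intercept=1.0, slope=2.0, noise=0.05, seed=0):
    """Generate a small, reproducible regression dataset (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed keeps the test deterministic
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, noise) for xi in x]
    return x, y


def fit_line(x, y):
    """Ordinary least squares for one regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def test_fit_recovers_parameters():
    # The true values are known because we generated the data ourselves.
    x, y = make_data()
    intercept, slope = fit_line(x, y)
    assert abs(intercept - 1.0) < 0.1
    assert abs(slope - 2.0) < 0.1


test_fit_recovers_parameters()
```

The trade-off is that a generated dataset checks internal consistency only; it does not compare against results from another package the way a test on a shipped dataset can.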

josef...@gmail.com

Jan 14, 2013, 7:47:55 PM

to pystat...@googlegroups.com

I added my questions and comments to issue 622.

Thanks,

Josef
