Deferred Plans

Ovid

Nov 19, 2007, 6:04:40 AM

to per...@perl.org

Hi all,

Trying to speed up some tests at work. We have a bunch of test files like this:

t/foo/bar/baz.yml
t/foo/bar/baz.t

So each .yml file has a corresponding .t file. However, each .t file is (more or less) identical. It looks like I can get the test time down from 10 minutes to about three by using one .t file which looks like this:

use Test::More 'no_plan';
foreach my $test ( find_tests() ) {
runtest($test);
}

That avoids the overhead of reloading perl and the modules multiple times. However, each .yml file defines its own test count and I don't want 'no_plan'. What I really want to do is this:

use Test::More 'deferred_plan';
my $plan = 0;

foreach my $test ( find_tests() ) {

$plan += runtest($test); # returns count

}
plan $plan;

And if my plan doesn't match tests run, I get an error.

I could get around this by loading all of the YAML files and checking their count, but then I'd have to load them *again* when I run the tests and that defeats the purpose of speeding up the test suite.
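
A minimal sketch of the behaviour asked for above, assuming a Test::More release newer than this thread (0.88 or later), where done_testing() accepts an expected test count. The bodies of find_tests() and runtest() are hypothetical stand-ins, not the real helpers.

```perl
use strict;
use warnings;
use Test::More;    # no plan declared up front

# Hypothetical stand-ins for the find_tests()/runtest() helpers above.
sub find_tests { return ( 't/foo/bar/baz.yml', 't/foo/bar/quux.yml' ) }

sub runtest {
    my ($test) = @_;
    pass("$test: first check");
    pass("$test: second check");
    return 2;    # the count the .yml file would declare
}

my $plan = 0;
$plan += runtest($_) for find_tests();

# The programmer supplies the trailing plan. If $plan differs from the
# number of tests actually run, the script fails.
done_testing($plan);
```

If the script dies before reaching done_testing(), no plan is ever printed, so the harness can report the run as incomplete.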

Cheers,
Ovid
--
Buy the book - http://www.oreilly.com/catalog/perlhks/
Perl and CGI - http://users.easystreet.com/ovid/cgi_course/
Personal blog - http://publius-ovidius.livejournal.com/
Tech blog - http://use.perl.org/~Ovid/journal/

Adrian Howard

Nov 19, 2007, 7:16:39 AM

to Perl QA

On 19 Nov 2007, at 11:04, Ovid wrote:
[snip]

> That avoids the overhead of reloading perl and the modules multiple
> times. However, each .yml file defines its own test count and I
> don't want 'no_plan'. What I really want to do is this:
>
> use Test::More 'deferred_plan'
> my $plan = 0;
>
> foreach my $test ( find_tests() ) {
>
> $plan += runtest($test); # returns count
>
> }
> plan $plan;

[snip]

For this particular case I would just do:

use Test::More 'no_plan';
my $builder = Test::More->builder;

foreach my $test ( find_tests() ) {

my $initial_count = $builder->current_test;
my $expected_num_tests = runtest($test);
is $builder->current_test,
$initial_count + $expected_num_tests,
"expected $expected_num_tests tests in $test";
}

:-)

Cheers,

Adrian

Andy Armstrong

Nov 19, 2007, 8:06:38 AM

to Ovid, per...@perl.org

On 19 Nov 2007, at 11:04, Ovid wrote:

> I could get around this by loading all of the YAML files and
> checking their count, but then I'd have to load them *again* when I
> run the tests and that defeats the purpose of speeding up the test
> suite.

I think we really need to reach a decision on

http://testanything.org/wiki/index.php/Test_Groups versus
http://testanything.org/wiki/index.php/Test_Blocks

Once we have structured TAP a lot of these problems go away.

--
Andy Armstrong, Hexten

Eric Wilhelm

Nov 19, 2007, 12:50:29 PM

to per...@perl.org

# from Ovid
# on Monday 19 November 2007 03:04:

>That avoids the overhead of reloading perl and the modules multiple
> times. However, each .yml file defines its own test count and I
> don't want 'no_plan'.

So, you write a custom 'exec' program which starts a daemon (the first
time (e.g. checks a lockfile)) and passes it the name of the test file
(on a socket/fifo.) The daemon has preloaded all of the modules, then
forks-off a new process to run each test. The harness sees it as
multiple tests, so the plans are fine.
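
A rough sketch of the preload-then-fork part of that idea, leaving out the daemon, lockfile and socket/fifo entirely. The function name run_preloaded() is hypothetical, List::Util stands in for whatever expensive modules would really be preloaded, and the sketch does nothing about test-library state inherited across the fork.

```perl
use strict;
use warnings;
use File::Spec;
use List::Util ();    # stand-in for the expensive modules loaded once

$| = 1;    # flush before forking so children don't repeat buffered output

# Run each test file in a child forked from this already-loaded process.
# Returns a hash of file name => child exit code.
sub run_preloaded {
    my @files = @_;
    my %exit;
    for my $file (@files) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: modules are already in memory, only the test itself runs.
            my $path = File::Spec->rel2abs($file);
            exit 2 unless -r $path;
            do $path;
            exit( $@ ? 1 : 0 );    # 1 if the test file died
        }
        waitpid( $pid, 0 );
        $exit{$file} = $? >> 8;
    }
    return %exit;
}

my %exit = run_preloaded(@ARGV);
print "$_ exited with $exit{$_}\n" for sort keys %exit;
```

Because every file runs in its own process, each one can still print its own plan, which is the property the harness needs.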

Now you can get it down to 1.5min if you have a second CPU.

Some sort of preload support in TAP::Harness would be nice.

Also note, with SGI::FAM, a persistent daemon could conceivably have
already loaded the new code before you can reach for the hotkey to
start the tests.

--Eric
--
To a database person, every nail looks like a thumb.
--Jamie Zawinski
---------------------------------------------------
http://scratchcomputing.com
---------------------------------------------------

Jonathan Rockway

Nov 19, 2007, 5:10:20 PM

to per...@perl.org

On Mon, 2007-11-19 at 13:06 +0000, Andy Armstrong wrote:
> On 19 Nov 2007, at 11:04, Ovid wrote:
> > I could get around this by loading all of the YAML files and
> > checking their count, but then I'd have to load them *again* when I
> > run the tests and that defeats the purpose of speeding up the test
> > suite.
>
>
> I think we really need to reach a decision on
>
> http://testanything.org/wiki/index.php/Test_Groups versus
> http://testanything.org/wiki/index.php/Test_Blocks

It looks like the con on both of these proposals is lack of backcompat.

My idea is to encode as much of the block information in the current
dialect of TAP as possible, and then add extra info in the comments that
new harnesses can process (and old harnesses can ignore).

How about:

<no current-style plan>
# PLAN 4 BLOCKS
# {BLOCK 1} 1..2
ok 1 - BLOCK{1} TEST{1} - and the usual comment
ok 2 - BLOCK{1} TEST{2}
# {BLOCK 2} PLAN NO_PLAN
ok 3 - BLOCK{2} TEST{1}
# {BLOCK 2} 1..1
# {BLOCK 3} 1..1
...
# {BLOCK 4} 1..2
<a total of 6 tests run over 4 blocks>
1..6

This is fully-backwards compatible with current harnesses, and even
provides most of the safety of the above proposals (a bit better than
no_plan, since the number at the bottom of the TAP is calculated based
on block declarations, not on number of tests run). Blocks can also
nest, if you want.
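
To make the "calculated based on block declarations" point concrete, here is a small sketch of how a new-style consumer might check the trailing plan against the block comments. The function names are hypothetical, the comment syntax is taken literally from the example above, and the elided "..." portion of the stream is simply left out.

```perl
use strict;
use warnings;

# Sum the per-block plans announced in "# {BLOCK n} 1..m" comments.
# A later declaration for the same block (e.g. after NO_PLAN) replaces
# an earlier one.
sub planned_from_blocks {
    my ($tap) = @_;
    my %plan;
    for my $line ( split /\n/, $tap ) {
        $plan{$1} = $2
            if $line =~ /^#\s*\{BLOCK (\d+)\}\s+1\.\.(\d+)\s*$/;
    }
    my $total = 0;
    $total += $_ for values %plan;
    return $total;
}

# Read the old-style plan line ("1..N"), which old harnesses also see.
sub trailing_plan {
    my ($tap) = @_;
    my ($n) = $tap =~ /^1\.\.(\d+)\s*$/m;
    return $n;
}

my $tap = <<'END_TAP';
# PLAN 4 BLOCKS
# {BLOCK 1} 1..2
ok 1 - BLOCK{1} TEST{1} - and the usual comment
ok 2 - BLOCK{1} TEST{2}
# {BLOCK 2} PLAN NO_PLAN
ok 3 - BLOCK{2} TEST{1}
# {BLOCK 2} 1..1
# {BLOCK 3} 1..1
# {BLOCK 4} 1..2
1..6
END_TAP

print planned_from_blocks($tap) == trailing_plan($tap)
    ? "plans agree\n"
    : "plan mismatch\n";
```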

One thing I might add is a symbol after the # like:

#@ this is a new-style TAP command

If the @ after the # (without a space separating them) is legal in TAP
1.0, then even Test::More::diag('@ BLOCK{1} 1..2') would still be
old-style TAP as far as the new parser is concerned (since it would
print "# @ BLOCK..."). Nifty.

Thoughts?

Regards,
Jonathan Rockway

Michael Peters

Nov 19, 2007, 5:17:34 PM

to Jonathan Rockway, per...@perl.org

Jonathan Rockway wrote:
> On Mon, 2007-11-19 at 13:06 +0000, Andy Armstrong wrote:
>> I think we really need to reach a decision on
>>
>> http://testanything.org/wiki/index.php/Test_Groups versus
>> http://testanything.org/wiki/index.php/Test_Blocks
>
> It looks like the con on both of these proposals is lack of backcompat.

Backcompat shouldn't be a problem now that we have T::H 3. That was one of the
goals. TAP is now a versioned protocol and anything emitting a specific version
of TAP needs to declare the version. Older TAP processors were supposed to
ignore anything they didn't understand so none of these should really be a problem.

--
Michael Peters
Developer
Plus Three, LP

Andy Lester

Nov 19, 2007, 5:11:35 PM

to Perl QA

I guess I'm not seeing why a deferred plan is better than no plan at
all. Seems to me the whole point of a plan is that you know up front
how many there are gonna be.

--
Andy Lester => an...@petdance.com => www.petdance.com => AIM:petdance

Chromatic

Nov 19, 2007, 6:04:56 PM

to per...@perl.org, Andy Lester

On Monday 19 November 2007 14:11:35 Andy Lester wrote:

> I guess I'm not seeing why a deferred plan is better than no plan at
> all. Seems to me the whole point of a plan is that you know up front
> how many they're gonna be.

There's that, and there's that Ovid's tests take too long to run when you time
all of the startup costs.

I'm having trouble convincing myself that the right solution to that is to
wedge more stuff into TAP though.

-- c

Andy Lester

Nov 19, 2007, 6:08:05 PM

to A. Pagaltzis, per...@perl.org

On Nov 19, 2007, at 5:04 PM, A. Pagaltzis wrote:

> A deferred plan is clearly not as good as a predeclared plan,
> but is definitely much safer than no plan at all.

But what if something blows up before getting to the deferred plan?
Then you don't know. You've bypassed having a plan.

I would never use a deferred plan.

A. Pagaltzis

Nov 19, 2007, 6:04:54 PM

to per...@perl.org

* Andy Lester <an...@petdance.com> [2007-11-19 23:17]:

> I guess I'm not seeing why a deferred plan is better than no
> plan at all.

At a minimum, because the harness expects a plan. If you exit
prematurely, it can at least detect that no plan was given,
whereas if you test without a plan, it knows nothing at all.

And beyond that, you still declare intent and the harness can
compare with actual behaviour. A buggy set of tests is more
likely to align with the count than it might with an up-front
plan, but not with complete certainty – whereas if you test
without a plan, the harness, once again, knows nothing at all.

A deferred plan is clearly not as good as a predeclared plan,
but is definitely much safer than no plan at all.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>

A. Pagaltzis

Nov 19, 2007, 6:14:47 PM

to per...@perl.org

* Andy Lester <an...@petdance.com> [2007-11-20 00:10]:

> But what if something blows up before getting to the deferred
> plan? Then you don't know.

How could you *not* know? The TAP stream says “I’m gonna supply
a plan at the end, I just don’t know how many tests I’m going to
run yet.” How would the harness miss the fact that the promised
plan never materialised?

Jonathan Rockway

Nov 19, 2007, 7:21:30 PM

to perl-qa

On Mon, 2007-11-19 at 17:08 -0600, Andy Lester wrote:
> On Nov 19, 2007, at 5:04 PM, A. Pagaltzis wrote:
>
> > A deferred plan is clearly not as good as a predeclared plan,
> > but is definitely much safer than no plan at all.
>
> But what if something blows up before getting to the deferred plan?
> Then you don't know. You've bypassed having a plan.

More information is better than less information.

Consider the case where you want to run n + 10 tests. With blocks in a
deferred plan, you can't be entirely sure that n is correct, but you can
be sure that the other 10 tests did run. Not perfect, but better than
just saying "1..63" at the end and not knowing if the "+ 10" is included
in that 63.

Secondly, perhaps it's possible to refactor the test to turn an entire
"block" of TAP into a single test. Compare "files_are_valid(@FILES)" to
"file_is_valid($_) for @FILES". Same effect, but with the first one you
can declare the plan in advance. (OK, bad example because you know how
many elements are in @FILES. But the concept still applies.)
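
A sketch of that refactoring, for illustration only. The validity rule (a file is valid if it is non-empty) and the helper name file_is_valid_p() are made up here; the point is just that the whole list collapses into one test result, so the plan can be declared in advance.

```perl
use strict;
use warnings;
use Test::More tests => 1;    # plan known up front: one aggregate test
use File::Temp qw(tempfile);

# Hypothetical validity check: a file is "valid" if it is non-empty.
sub file_is_valid_p { return -s $_[0] ? 1 : 0 }

# One test result for the whole list; failures are reported as diagnostics.
sub files_are_valid {
    my @files = @_;
    my @bad   = grep { !file_is_valid_p($_) } @files;
    my $ok    = ok( !@bad, 'all files are valid' );
    diag("invalid: @bad") unless $ok;
    return $ok;
}

# Two throwaway sample files so the example is self-contained.
my @FILES;
for my $content ( 'a: 1', 'b: 2' ) {
    my ( $fh, $name ) = tempfile( UNLINK => 1 );
    print {$fh} $content;
    close $fh;
    push @FILES, $name;
}

files_are_valid(@FILES);
```

The trade-off is granularity: the harness sees a single pass or fail, and the per-file detail only shows up in the diagnostics.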

Regards,
Jonathan Rockway

Ovid

Nov 20, 2007, 3:19:45 AM

to Perl QA

----- Original Message ----
> From: Andy Lester <an...@petdance.com>

> I guess I'm not seeing why a deferred plan is better than no plan at
> all. Seems to me the whole point of a plan is that you know up front
> how many they're gonna be.

I've not explained myself well. Sorry about that.

The reason you have to know up front is because that's how the entire system was designed. We currently have two cases when we should have at least three:

1. We know the test count up front, in which case we declare a leading plan.
2. We don't know the test count up front, in which case Test::Builder supplies a trailing plan which merely tells me how many tests I've run.

This misses the obvious case where I don't know the expected count up front, but I do know the expected count at the end (not guaranteed to be the same as the actual count). It should be trivial to fix Test::Builder and co., so that the programmer supplies the trailing plan instead of Test::Builder.

That being said, I like Adrian's code for this and I'll be stealing it.

Ovid

Nov 20, 2007, 3:10:23 AM

to per...@perl.org

----- Original Message ----
> From: chromatic <chro...@wgz.org>

> There's that, and there's that Ovid's tests take too long to run
> when you time all of the startup costs.

The runtime of the tests is completely orthogonal to this problem.

> I'm having trouble convincing myself that the right solution to that
> is to wedge more stuff into TAP though.

There's nothing else being wedged into TAP. This is about the programmer supplying the trailing plan instead of Test::Builder.

Ovid

Nov 20, 2007, 3:30:32 AM

to per...@perl.org

----- Original Message ----
> From: Jonathan Rockway <j...@jrock.us>

> > I think we really need to reach a decision on
> >
> > http://testanything.org/wiki/index.php/Test_Groups versus
> > http://testanything.org/wiki/index.php/Test_Blocks
>
> It looks like the con on both of these proposals is lack of backcompat.

No. They're actually both completely backwards compatible. Consider test groups:

1..3
ok 1
1..2 2 a block
1..3 2.1 another block
ok 2.1.1
ok 2.1.2
ok 2.1.3
ok 2.1 # end of another block
ok 2.2
ok 2 # end of a block
1..3 3 a third block
ok 3.1
ok 3.2
not ok 3 # end of a third block, planned for 3 but only ran 2 tests

Since older TAP parsers are required to ignore lines that don't match the grammar they recognize, here's what the parser should see:

1..3
ok 1
ok 2 # end of a block
not ok 3 # end of a third block, planned for 3 but only ran 2 tests

And with test blocks (the version on the wiki is different and incorrect; I've fixed it below, but not yet on the wiki):

TAP version 14
1..4
ok 1 - testing
begin 1 Object creation
    1..2
    ok 1 Object created OK
    ok 2 Object isa Flunge::Twizzler
end 1 Object creation
ok 2 Clone OK
begin 3 Methods
    1..4
    ok 1 has twizzle method
    ok 2 has burnish method
    ok 3 has spangle method
    not ok 4 has frob method
end 3 Methods
ok 3 another test
ok 4 Resources released

Here's what an older TAP parser will see:

1..4
ok 1 - testing
ok 2 Clone OK
ok 3 another test
ok 4 Resources released

So if your current TAP parser is correct, you shouldn't have a problem. The "breaks backwards compatibility" arguments on the wiki don't seem correct.
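
As a quick illustration of the "ignore what you don't understand" rule, here is a sketch of a filter that keeps only what a pre-block parser would act on. It assumes the lines inside a begin/end block are indented, which is what keeps them out of the old parser's view; the function name old_parser_view() is hypothetical and the two regexes are a simplification of a real TAP grammar.

```perl
use strict;
use warnings;

# Keep only the lines an old parser would act on: an unindented plan
# ("1..N") or an unindented test line ("ok" / "not ok").
sub old_parser_view {
    my ($tap) = @_;
    return grep { /^1\.\.\d+\s*$/ || /^(?:not )?ok\b/ } split /\n/, $tap;
}

my $tap = <<'END_TAP';
TAP version 14
1..4
ok 1 - testing
begin 1 Object creation
    1..2
    ok 1 Object created OK
    ok 2 Object isa Flunge::Twizzler
end 1 Object creation
ok 2 Clone OK
begin 3 Methods
    1..4
    ok 1 has twizzle method
    ok 2 has burnish method
    ok 3 has spangle method
    not ok 4 has frob method
end 3 Methods
ok 3 another test
ok 4 Resources released
END_TAP

print "$_\n" for old_parser_view($tap);
```

Run on the test-blocks stream, this prints the plan and the four top-level test lines and nothing else.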

> # PLAN 4 BLOCKS
> # {BLOCK 1} 1..2
> ok 1 - BLOCK{1} TEST{1} - and the usual comment
> ok 2 - BLOCK{1} TEST{2}
> # {BLOCK 2} PLAN NO_PLAN
> ok 3 - BLOCK{2} TEST{1}
> # {BLOCK 2} 1..1
> # {BLOCK 3} 1..1

This has much of the same problem as the current 'test groups' proposal: it's ugly and hard to read. However, it seems even harder to read than test groups. TAP should be as terse as possible, and no terser, in order to unequivocally represent intent. Otherwise, why not just switch to XML?

Sorry to be so blunt :)

Andy Armstrong

Nov 20, 2007, 4:35:51 AM

to chromatic, per...@perl.org, Andy Lester

On 19 Nov 2007, at 23:04, chromatic wrote:
>> I guess I'm not seeing why a deferred plan is better than no plan at
>> all. Seems to me the whole point of a plan is that you know up front
>> how many they're gonna be.
>
> There's that, and there's that Ovid's tests take too long to run
> when you time
> all of the startup costs.
>
> I'm having trouble convincing myself that the right solution to that
> is to
> wedge more stuff into TAP though.

I hope the idea of structured TAP is quite generic. It's not just
about supporting this case. Is that the proposal you're sceptical about?

--
Andy Armstrong, Hexten

Adrian Howard

Nov 20, 2007, 10:19:01 AM

to Perl QA

On 19 Nov 2007, at 23:08, Andy Lester wrote:

>
> On Nov 19, 2007, at 5:04 PM, A. Pagaltzis wrote:
>
>> A deferred plan is clearly not as good as a predeclared plan,
>> but is definitely much safer than no plan at all.
>
>
> But what if something blows up before getting to the deferred
> plan? Then you don't know. You've bypassed having a plan.

Then you get an error because you have said that you'll defer the
plan, and you didn't.

At least I think that was the behaviour that Ovid was after.

Adrian

Adrian Howard

Nov 20, 2007, 10:21:15 AM

to Perl QA

On 19 Nov 2007, at 23:04, A. Pagaltzis wrote:
[snip]

> A deferred plan is clearly not as good as a predeclared plan,
> but is definitely much safer than no plan at all.

[snip]

I don't get this. Why is saying "I know this test script outputs 8
test results" at the start better than saying it at the end?

Adrian

Andy Lester

Nov 20, 2007, 10:27:40 AM

to Adrian Howard, Perl QA

On Nov 20, 2007, at 9:19 AM, Adrian Howard wrote:

> Then you get an error because you have said that you'll defer the
> plan, and you didn't.

That there is a "there's a plan coming later" part is what I missed.
Now I get it.

A. Pagaltzis

Nov 20, 2007, 12:02:04 PM

to per...@perl.org

* Adrian Howard <adr...@quietstars.com> [2007-11-20 16:25]:

> I don't get this. Why is saying "I know this test script
> outputs 8 test results" at the start better than saying it
> at the end?

I assume that if you knew up front how many tests you are going
to run, then you’d just say it.

So you’d defer the plan in cases where the number of tests is
predetermined but maybe hard to precompute, or where it’s
variable. So in both cases you are calculating the number at run
time, which is immediately subject to more bugs than providing a
constant.

Additionally, it’s more likely for a bug in the calculation to
line up with a bug in the corresponding test code, so that you
end up with a plan that matches the number of tests run even
though you *intended* to run fewer/more tests.

And lastly, even a runtime-calculated predeclared plan separates
the test code and calculation code at least in time (while
running) and probably also in space (in the source code).
Therefore it seems to me that bugs are somewhat less likely
to line up.

So a deferred plan should be used only if you really can’t
determine the number of tests ahead of time or it is *very*
hard to do so.