Six Million Reports!

Barbie

Nov 23, 2009, 8:43:16 AM

to cpan-teste...@perl.org

http://blog.cpantesters.org/diary/58

Cheers,
Barbie.
--
Birmingham Perl Mongers <http://birmingham.pm.org>
Memoirs Of A Roadie <http://barbie.missbarbell.co.uk>
CPAN Testers Blog <http://blog.cpantesters.org>
YAPC Conference Surveys <http://yapc-surveys.org>

Barbie

Nov 23, 2009, 12:51:15 PM

to Tim Bunce, cpan-teste...@perl.org

Hi Tim,

> I've been meaning to ask, though, about the diminishing returns from
> further growth in the number of "typical" testers. Each new tester with a
> "typical" perl & platform adds far less value now than a new tester with
> a more unusual perl config & platform.
>
> I wonder if more could be done to encourage testers, especially those
> running automated smoke testing, to test with a wider range of perl
> configurations. For example enable/disable: usethreads, usemultiplicity,
> use64bitint, use64bitall, uselongdouble, usemymalloc, etc.

This is one area I do try and encourage people to think about when I do
the BOFs at YAPCs. However, the emphasis previously has been more on
encouraging testing on underrepresented platforms.

> Related to that is the fact it's hard to explore the cpan-testers data
> looking for results for platforms that match a given set of those
> configuration options.

Currently the information just isn't stored in the stats data, though it
does reside in most of the articles (alas not all). However, part of
the plan for CPAN Testers 2.0 is to store this information so that it
can be queried, and you can build a complex query to discover exactly
what you want.

There has been a lot of work towards CT 2.0, especially by David Golden
and Ricardo Signes, but there is still work to be done before we can
actually implement these kinds of queries.

It is coming, and it will be awesome (tm) :)

Martin J. Evans

Nov 23, 2009, 3:11:47 PM

to Barbie, Tim Bunce, cpan-teste...@perl.org

Barbie wrote:
> Hi Tim,
>
>> I've been meaning to ask, though, about the diminishing returns from
>> further growth in the number of "typical" testers. Each new tester with a
>> "typical" perl & platform adds far less value now than a new tester with
>> a more unusual perl config & platform.
>>
>> I wonder if more could be done to encourage testers, especially those
>> running automated smoke testing, to test with a wider range of perl
>> configurations. For example enable/disable: usethreads, usemultiplicity,
>> use64bitint, use64bitall, uselongdouble, usemymalloc, etc.

I am running a smoker on Linux (x86) and, as of today, one on AIX 5.1
(64-bit PowerPC): 27000 reports so far, but slowing now that I've done
pretty much everything once. So far I've made no attempt to build a perl
other than with the defaults, but I am happy to rebuild the ones I do run
with a more unusual configuration. If someone wants to suggest a
configuration for the two I run, I will happily rebuild them tomorrow.

The Linux one currently is:

config_args='-Dprefix=/home/cpantester/perl10 -de'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=undef, use64bitall=undef, uselongdouble=undef
usemymalloc=n, bincompat5005=undef

and the AIX one is:

argh - cannot tell you, as we only have a one-user license and I cannot
log in from home whilst the smoker is running.

> This is one area I do try and encourage people to think about when I do
> the BOFs at YAPCs. However, the emphasis previously has been more on
> encouraging testing on underrepresented platforms.

As we are posting 1M reports every 3 months (so I hear), perhaps now is a
good time to try to expand the configurations. I didn't see any advice
on the wiki regarding this.

>> Related to that is the fact it's hard to explore the cpan-testers data
>> looking for results for platforms that match a given set of those
>> configuration options.
>
> Currently the information just isn't stored in the stats data, though it
> does reside in most of the articles (alas not all). However, part of
> the plan for CPAN Testers 2.0 is to store this information so that it
> can be queried and you can have a complex query to discover exactly what
> you want.
>
> There has been a lot of work towards CT 2.0, especially by David Golden
> and Ricardo Signes, but there is still work to be done before we can
> actually implement these kinds of queries.
>
> It is coming, and it will be awesome (tm) :)
>
> Cheers,
> Barbie.

BTW, 6000000 reports - well done. SALVA++.

Martin

David Golden

Nov 23, 2009, 3:18:34 PM

to Barbie, Tim Bunce, cpan-teste...@perl.org

The "plan" for 2.0 is to use Config::Perl::V to provide a meaningful subset
of Config and capture it as structured data.

More generally, I'd like to see more "everyday" users configure smoke
reporting, so we get real-world perls with diverse versions of prereqs,
PERL5LIB, odd install locations, etc. as they go about their day-to-day use
of CPAN.

Smoke reporting is good, but not enough.

CT2.0 work will resume once I discharge my toolchain obligations in
preparation for Perl 5, Version 12.

David

Tim Bunce

Nov 23, 2009, 10:41:20 AM

to Barbie, cpan-teste...@perl.org

On Mon, Nov 23, 2009 at 01:43:16PM +0000, Barbie wrote:
> Subject: Re: Six Million Reports!
> http://blog.cpantesters.org/diary/58

That's wonderful!

I've been meaning to ask, though, about the diminishing returns from
further growth in the number of "typical" testers. Each new tester with a
"typical" perl & platform adds far less value now than a new tester with
a more unusual perl config & platform.

I wonder if more could be done to encourage testers, especially those
running automated smoke testing, to test with a wider range of perl
configurations. For example enable/disable: usethreads, usemultiplicity,
use64bitint, use64bitall, uselongdouble, usemymalloc, etc.

Related to that is the fact it's hard to explore the cpan-testers data
looking for results for platforms that match a given set of those
configuration options.

So, for example, it's hard to answer the question "what proportion of
x86_64-linux-thread-multi platforms have uselongdouble enabled?".

You could say I'm asking for the definition of 'platform' to be
extended, but I think that wouldn't scale well. Something like
'configuration tags' could be a more extensible mechanism.
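
To make the 'configuration tags' idea concrete, here is a minimal sketch in Python (not part of any CPAN Testers code). It assumes a report's perl config is available as a plain dict of option name to value, and that "define" or "y" means an option is switched on; the function names and the option list are hypothetical.

```python
# Hypothetical list of options worth tagging, taken from this thread.
TAG_OPTIONS = (
    "usethreads", "useithreads", "usemultiplicity", "use64bitint",
    "use64bitall", "uselongdouble", "usemymalloc",
)

# Assumption: these values mean "enabled" in a perl config summary.
ENABLED_VALUES = {"define", "y"}


def config_tags(config, options=TAG_OPTIONS):
    """Return the set of option names that are switched on in config."""
    return frozenset(
        name for name in options if config.get(name) in ENABLED_VALUES
    )


def matches(config, required=(), forbidden=()):
    """True if config has every required tag and none of the forbidden ones."""
    tags = config_tags(config)
    return set(required) <= tags and not (set(forbidden) & tags)
```

A query such as "threaded builds without long doubles" then becomes matches(config, required=["useithreads"], forbidden=["uselongdouble"]), without having to extend the platform name itself.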

Tim.

David Cantrell

Nov 26, 2009, 11:55:09 AM

to cpan-teste...@perl.org

On Mon, Nov 23, 2009 at 05:51:15PM +0000, Barbie wrote:
> Hi Tim,

> > I wonder if more could be done to encourage testers, especially those
> > running automated smoke testing, to test with a wider range of perl
> > configurations. For example enable/disable: usethreads, usemultiplicity,
> > use64bitint, use64bitall, uselongdouble, usemymalloc, etc.

Unfortunately, many of the rarer machines are quite slow by modern
standards, so to test even a few of those is difficult. eg, the
NetBSD/Alpha machine I use wouldn't be able to do more than three
different combinations - and even with just one, it occasionally gets
backlogged.

> Currently the information just isn't stored in the stats data, though it
> does reside in most of the articles (alas not all). However, part of
> the plan for CPAN Testers 2.0 is to store this information so that it
> can be queried and you can have a complex query to discover exactly what
> you want.

Also, things like CPAN::Reporter may need patching to keep track of
which perl config options were used for each distribution, so that it
DTRT with regard to duplicate reports.
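
As an illustration of what such tracking could look like, here is a minimal sketch in Python rather than CPAN::Reporter's own Perl. It is not CPAN::Reporter's actual history format: the class, the function names, and the choice of config keys are all hypothetical. The idea is simply that the duplicate check is keyed on the distribution, the perl version, and a digest of the relevant config options.

```python
import hashlib

# Hypothetical subset of perl config options that should distinguish reports.
CONFIG_KEYS = (
    "archname", "useithreads", "usemultiplicity", "use64bitint",
    "use64bitall", "uselongdouble", "usemymalloc",
)


def config_digest(config):
    """Return a short, stable digest of the selected config options."""
    canonical = ";".join(
        "{}={}".format(key, config.get(key, "undef")) for key in CONFIG_KEYS
    )
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:12]


class History:
    """In-memory record of which (dist, perl, config) combinations were sent."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, dist, perl_version, config):
        """Record the combination; return True if it was already recorded."""
        key = (dist, perl_version, config_digest(config))
        if key in self._seen:
            return True
        self._seen.add(key)
        return False
```

With a key like this, the same distribution tested on the same perl version but with uselongdouble enabled would count as a new report rather than a duplicate.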

--
David Cantrell | Official London Perl Mongers Bad Influence

Human Rights left unattended may be removed,
destroyed, or damaged by the security services.

Tim Bunce

Nov 27, 2009, 4:43:13 AM

to David Cantrell, cpan-teste...@perl.org

On Thu, Nov 26, 2009 at 04:55:09PM +0000, David Cantrell wrote:
> On Mon, Nov 23, 2009 at 05:51:15PM +0000, Barbie wrote:
> > Hi Tim,
> > > I wonder if more could be done to encourage testers, especially those
> > > running automated smoke testing, to test with a wider range of perl
> > > configurations. For example enable/disable: usethreads, usemultiplicity,
> > > use64bitint, use64bitall, uselongdouble, usemymalloc, etc.
>
> Unfortunately, many of the rarer machines are quite slow by modern
> standards, so to test even a few of those is difficult. eg, the
> NetBSD/Alpha machine I use wouldn't be able to do more than three
> different combinations - and even with just one, it occasionally gets
> backlogged.

As the volume of releases grows, that issue is likely to affect more
testers. We don't want to lose testers because they can't devote enough
resources to keep up.

Perhaps some mechanism could be added to randomly skip 1-in-N releases
(except those in the Volatile 100 and Fail 100).
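
A minimal sketch of that skip rule, in Python for illustration only. It assumes the smoker already has the names of the Volatile 100 and Fail 100 distributions as a set called protected; that set, the function name, and the parameters are hypothetical and not part of any existing smoker.

```python
import random


def should_test(dist, protected, n, rng=random):
    """Return False for roughly 1 in n releases, never for protected ones.

    dist      -- distribution name of the release, e.g. "Foo-Bar"
    protected -- set of distribution names that must always be tested
    n         -- skip rate; n=4 skips about a quarter of releases
    rng       -- random source with a randrange() method (injectable for tests)
    """
    if dist in protected:
        return True
    # randrange(n) is uniform over 0..n-1, so it is 0 about once in n calls.
    return rng.randrange(n) != 0
```

A slow machine could raise n when its backlog shrinks and lower it when the backlog grows, while the protected distributions keep getting full coverage. Note that n=1 would skip every unprotected release.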

Tim.

Andreas J. Koenig

Nov 27, 2009, 4:02:15 PM

to Tim Bunce, Barbie, cpan-teste...@perl.org

>>>>> On Mon, 23 Nov 2009 15:41:20 +0000, Tim Bunce <Tim....@pobox.com> said:

> So, for example, it's hard to answer the question "what proportion of
> x86_64-linux-thread-multi platforms have uselongdouble enabled?".

Fortunately I have a fairly representative sample of reports on my disk,
collected over many months:

distros: 2517
reports: 528662
Of those:
archname=~/^x86_64-linux-thread-multi(-ld)?$/ : 40555
Of those:
uselongdouble=define : 9840
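
Turning those counts into the proportion the question asked about is simple arithmetic; a small Python sketch follows. The helper name is made up for illustration, and the numbers are the ones listed above.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


reports = 528662          # all reports in the sample
matching_arch = 40555     # archname matches x86_64-linux-thread-multi(-ld)?
with_longdouble = 9840    # of those, uselongdouble=define

print(share(with_longdouble, matching_arch))  # 24.3
print(share(matching_arch, reports))          # 7.7
```

So in this sample roughly a quarter of the x86_64-linux-thread-multi reports come from perls built with uselongdouble.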

I can answer more such questions if you like:)

--
andreas

Tim Bunce

Nov 27, 2009, 7:51:06 PM

to Andreas J. Koenig, Tim Bunce, Barbie, cpan-teste...@perl.org

On Fri, Nov 27, 2009 at 10:02:15PM +0100, Andreas J. Koenig wrote:
> >>>>> On Mon, 23 Nov 2009 15:41:20 +0000, Tim Bunce <Tim....@pobox.com> said:
>
> > So, for example, it's hard to answer the question "what proportion of
> > x86_64-linux-thread-multi platforms have uselongdouble enabled?".
>
> Fortunately I have a fairly representative sample of reports on my disk,
> collected over many months:
>
> distros: 2517
> reports: 528662
> Of those:
> archname=~/^x86_64-linux-thread-multi(-ld)?$/ : 40555
> Of those:
> uselongdouble=define : 9840

Bless you Andreas! :)

> I can answer more such questions if you like:)

Thanks. I'm not sure how well that service would scale though :)

Tim.

Andreas J. Koenig

Nov 28, 2009, 10:40:47 AM

to Tim Bunce, Barbie, cpan-teste...@perl.org

>>>>> On Sat, 28 Nov 2009 00:51:06 +0000, Tim Bunce <Tim....@pobox.com> said:

>> I can answer more such questions if you like:)

> Thanks. I'm not sure how well that service would scale though :)

You can download my sample as a whole if you prefer.

ftp://pause.perl.org/scratch/cpantesters.sample.20091128.json.gz

This is not the same sample as I used yesterday, it is shifted by one
day and recalculated.

The file is 211904144 bytes large. wc tells us

% zcat cpantesters.sample.20091128.json.gz|wc
532546 46914106 3071436559

Every line is one test as a hash in JSON format. The key-value pairs are
documented in CPAN::Testers::ParseReport but seem intuitive enough to
me that I think you can get away without reading the manpage.

The sample contains only tests for distros that have been released
within the last three years and are still current releases and have at
least three PASSes and three FAILs.

No Archive-Tar-0.23 because it is not current.
No DBIx-Romani-0.0.16 because it has no PASS.
No Data-Compare-1.2101 because it has no FAIL.
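
For anyone who wants to repeat the uselongdouble count on the download, a minimal Python sketch of reading such a one-JSON-hash-per-line file follows. The key names "conf:archname" and "conf:uselongdouble" are an assumption about how the hash is keyed, so check them against a real line (or the CPAN::Testers::ParseReport documentation) first; the function name is hypothetical.

```python
import json
import re

# Same pattern as used for the counts earlier in the thread.
ARCH_RE = re.compile(r"^x86_64-linux-thread-multi(-ld)?$")


def longdouble_counts(lines,
                      arch_key="conf:archname",       # assumed key name
                      ld_key="conf:uselongdouble"):   # assumed key name
    """Return (total, matching_arch, with_longdouble) for an iterable of
    lines, each holding one test report as a JSON hash."""
    total = matching = with_ld = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        report = json.loads(line)
        total += 1
        if ARCH_RE.match(str(report.get(arch_key, ""))):
            matching += 1
            if report.get(ld_key) == "define":
                with_ld += 1
    return total, matching, with_ld
```

The function takes any iterable of lines, so the file can be streamed without unpacking it to disk by passing the handle returned by gzip.open("cpantesters.sample.20091128.json.gz", "rt", encoding="utf-8").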

HTH, let me know if I can help you with it,
--
andreas