HWMCC'14 results announced

76 views
Skip to first unread message

Armin Biere

unread,
Jul 21, 2014, 1:00:14 PM7/21/14
to hw...@googlegroups.com
HWMCC'14 results have been announced at the FLoC olympics.
Slides and benchmarks are available on the competition site.

http://fmv.jku.at/hwmcc

More information will be provided in the next days ...

Armin

Armin Biere

unread,
Jul 23, 2014, 9:33:25 AM7/23/14
to hw...@googlegroups.com
The log files and statistics are online as well as pictures.

Armin

Alberto Griggio

unread,
Aug 8, 2014, 5:20:34 AM8/8/14
to hw...@googlegroups.com
Hello Armin,

> The log files and statistics are online as well as pictures.

First, thanks for organizing once again a great competition!

I have a question on the deep bounds track. I started analyzing the
results, and I'm not sure I understood how the scores are computed. I
tried to recompute the scores using the CSV data you provided, with the
formula at slide 8 of your presentation, but I get slightly different
numbers than what reported in the slides, so I think I'm doing something
wrong. The Python script I'm using is in attach. With it, I get the
following:

ranking on 79 benchmarks
tipbmc 93.19% 3471
blimc 92.54% 3401
shiftbmc 92.47% 3244
aigbmc 92.23% 3062
nuxmv 91.89% 2714
v3 91.82% 3084
v3db v3db 91.77% 3180
supdeep 87.76% 3267
tip 83.18% 510
iimc 15.88% 579

Notice that the order of the solvers is the same, and also the numbers
are close, but not quite the same as on the slides... If you (or anybody
else on the list) have the chance to take a look at my script, it would
be great.


Second, I noticed something potentially strange in some of the
results. In particular, there are two instances for which only tip was
able to report a bound > 0: 6s399b02 and 6s399b03. Both tip and tipbmc
report a bound of 2, whereas all the others are stuck at 0. However, the
log files for tipbmc seem to indicate that it also could not get past
0. For your convenience, I'm attaching the log for 6s399b02: you see
that tip tries to unroll the first 2 cycles, but gets stuck while
proving k=1 (or at least that is my interpretation of the log). Is there
something strange (e.g. the "filter.tip" script gets confused by the
output), or am I missing something? Notice that even by assiging a score
of 0 to the two instances, tipbmc is still the winner, so in any case
the results of the competition won't change. Nevertheless, I think it
would still be important to understand what is going on.


Best,
Alberto

analyze-deepbounds.py
6s399b02.log

Armin Biere

unread,
Aug 11, 2014, 4:34:16 AM8/11/14
to hw...@googlegroups.com
I triple checked this one. First I used the scripts
I had from last year and then my analyze script in C
(attached) and finally recomputed it with an Excel
script. After some tweaking (debugging) they all
agreed. The logic for this is implemented in 'prdeepbound':


#define CAPPED 100

...
capped = s->bound;
if (capped < 0) capped = -1;
else if (capped >= INT_MAX - 2) capped = INT_MAX - 2;
delta = 1.0 - 1.0 / (2 + capped);
assert (capped >= 0 || !delta);
stat->depth += delta;
printf (" %6d %5.1f%%", s->bound, 100.0*delta);
if (++capped > CAPPED) capped = CAPPED;
stat->sumbounds += capped;
...

for (j = 0; j < nstats; j++) {
stat = stats[j];
if (!hasdeepbounds (stat->pretty)) continue;
printf ("%11s %7.2f %6.2f %5d\n",
stat->pretty, 100.0*stat->depth/(double)deep, stat->depth,
stat->sumbounds);
}
...

I will have a look at the 6s399b02 issue ...

Armin
> --
> You received this message because you are subscribed to the Google Groups
> "HWMCC" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to hwmcc+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
analyze.c

Armin Biere

unread,
Aug 11, 2014, 4:43:48 AM8/11/14
to hw...@googlegroups.com
There is one catch here which unfortunately makes
the scoring scheme dependent on the participating
solvers, e.g., something I think one should avoid
in competitions. Otherwise, comparing a new solver
(or algorithm in a paper) with the competition results
becomes difficult. Here the catch is that the set of
submitted solvers determines the set of unsolved
instances which are used for computing the score
of the deep bound track. I have included in the CSV
file the results for the two pdtrav versions, but since
pdtrav showed a discrepancy, all instances, which only
pdtrav was able to solve, should still be used for the
deep bound track. So you need to remove those lines
from the CSV file before applying your script.

Armin

Armin Biere

unread,
Aug 11, 2014, 4:49:05 AM8/11/14
to hw...@googlegroups.com
There are two shell scripts missing, needed
for determining the bounds. Now attached.

Armin

On Mon, Aug 11, 2014 at 10:34 AM, Armin Biere <bi...@jku.at> wrote:
grepsatbound.sh
grepunsbound.sh

Alberto Griggio

unread,
Aug 11, 2014, 7:42:06 AM8/11/14
to hw...@googlegroups.com
Hello,

> I triple checked this one. First I used the scripts
> I had from last year and then my analyze script in C
> (attached) and finally recomputed it with an Excel
> script. After some tweaking (debugging) they all
> agreed. The logic for this is implemented in 'prdeepbound':

Thanks for triple-checking. It turns out that I was using a wrong logic
for computing the score, i.e. I was using the capped bound in the
computation of the depth, instead of the original bound. I was also
adding bound instead of bound+1 to sumbounds. With these fixes, my
script (run on the csv file available from the website) now agrees with
the results in the slides. The fixed script is in attach if anyone is
interested.

> I will have a look at the 6s399b02 issue ...

Thanks, that still puzzles me...

Alberto
analyze-deepbounds.py

Niklas Sörensson

unread,
Aug 25, 2014, 4:45:57 AM8/25/14
to hw...@googlegroups.com
Hi all,

I think I can answer what happened here. The script doesn't get tip's depth right in the case when it get stuck during temporal decomposition. It always adds the temporal decomposition depth regardless of whether it completed or not. Armin, here's a ugly hack to grepunsbound.sh that fixes the problem as far as I can tell:

    if [ ! x"$unrolled" = x ]
    then
        tmp="`grep circ-stats: $1|wc -l`"
        if [ $tmp -ge 4 ]
        then
            last=`expr $last + $unrolled`
        fi
    fi

In my test this changed two benchmarks that went from k of 2 to 0, which noticeably changed the final score, but (barely) preserved the order.

/Niklas




Alberto

--
You received this message because you are subscribed to the Google Groups "HWMCC" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hwmcc+unsubscribe@googlegroups.com.

Fabio Somenzi

unread,
Aug 25, 2014, 12:07:00 PM8/25/14
to hw...@googlegroups.com
Thanks Niklas,

Just for the record, the dismal performance of IImc in the deep-bound
track was due to us forgetting to flush cout after writing a u-line.

Embarrassing, I know. At least, it's unlikely to happen again. I
only say "unlikely" because in the single track, the two segfaults we
collected were due running IImc with unlimited stack size. We knew it
was a problem from last year's competition, but we forgot to remind
Armin.

For those curious to know, we did run the 79 models of the deep-bound
track on a machine that is reasonably close to the Aalto cluster's
machines and we measured a score of 92.44, with a sum of the capped
bounds of 3404. Of course, there is no guarantee that this would have
been the exact result in the competition had we had that fateful
"<< flush."


Fabio

>>>>> "NS" == Niklas Sörensson <nikl...@gmail.com> writes:

NS> Hi all,
NS> I think I can answer what happened here. The script doesn't get tip's depth
NS> right in the case when it get stuck during temporal decomposition. It always
NS> adds the temporal decomposition depth regardless of whether it completed or
NS> not. Armin, here's a ugly hack to grepunsbound.sh that fixes the problem as
NS> far as I can tell:

NS>     if [ ! x"$unrolled" = x ]
NS>     then
NS>         tmp="`grep circ-stats: $1|wc -l`"
NS>         if [ $tmp -ge 4 ]
NS>         then
NS>             last=`expr $last + $unrolled`
NS>         fi
NS>     fi

NS> In my test this changed two benchmarks that went from k of 2 to 0, which
NS> noticeably changed the final score, but (barely) preserved the order.

NS> /Niklas

NS> On Mon, Aug 11, 2014 at 1:41 PM, Alberto Griggio <alberto...@gmail.com>
NS> wrote:

NS> Hello,

NS> I triple checked this one.  First I used the scripts
NS> I had from last year and then my analyze script in C
NS> (attached) and finally recomputed it with an Excel
NS> script.  After some tweaking (debugging) they all
NS> agreed.  The logic for this is implemented in 'prdeepbound':

NS> Thanks for triple-checking. It turns out that I was using a wrong logic
NS> for computing the score, i.e. I was using the capped bound in the
NS> computation of the depth, instead of the original bound. I was also
NS> adding bound instead of bound+1 to sumbounds. With these fixes, my
NS> script (run on the csv file available from the website) now agrees with
NS> the results in the slides. The fixed script is in attach if anyone is
NS> interested.

NS> I will have a look at the 6s399b02 issue ...

NS> Thanks, that still puzzles me...

NS> Alberto

NS> --
NS> You received this message because you are subscribed to the Google Groups
NS> "HWMCC" group.
NS> To unsubscribe from this group and stop receiving emails from it, send an
NS> email to hwmcc+un...@googlegroups.com.
NS> For more options, visit https://groups.google.com/d/optout.

NS> --
NS> You received this message because you are subscribed to the Google Groups
NS> "HWMCC" group.
NS> To unsubscribe from this group and stop receiving emails from it, send an
NS> email to hwmcc+un...@googlegroups.com.
NS> For more options, visit https://groups.google.com/d/optout.

Gianpiero Cabodi

unread,
Oct 9, 2014, 5:24:06 AM10/9/14
to hw...@googlegroups.com, quer
Hi Armim,

In view of the JSAT special issue, and after chatting with
Brayton/Mishchenko and Baumgartner, we are considering to prepare an
experimental paper based on analysis and characterization of results at
the HWMCC 2014 competition (single track), with the main purpose of
providing insights (as deep as possible) on tools and benchmarks.
Put in other works, we'd like to write what all competitors have been
trying to do (in very partial, hidden and/or focussed ways), in a fair
(as much as possible) way, addressing both competitors and and
industrial audience.

Our idea was to

- start analizing the results you published, build a table/grid of
problem/tool coverage, understand the power of individual MC techniques

- concentrate on selected tools/problems. Run tools ourselves (if
possible and authors agree) in order to obtain more data or to weight
the power of individual engines in portfolios.

- whenever not possible or as an alternative, interact with authors to
get more data.

As far as we can understand, such a work would be highly appreciated by
industry, and also by toolmakers, but we are basically missing
volunteers (at least this is what Bob Brayton told us).
Our group can volunteer, taking the opportunity of the special issue.
We know that we have just 1 month, but we can do our best.

We definitely need to interact with partecipant groups, so we plan to
exploit the HWMCC mailing list and Google group.

What do you think about this idea ?
By the way, are you interested, as organizer, to take part and give some
contribution (and co-author a sumission) ?



Any comment/suggestion from you is defintely appreciated.

Cheers,

Gianpiero and Stefano

Armin Biere

unread,
Oct 9, 2014, 12:20:07 PM10/9/14
to hw...@googlegroups.com, quer
Let us do some (offline) discussion on this first.

Armin
> --
> You received this message because you are subscribed to the Google Groups
> "HWMCC" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to hwmcc+un...@googlegroups.com.

Gianpiero Cabodi

unread,
Nov 18, 2014, 3:46:39 PM11/18/14
to hw...@googlegroups.com
Ciao Fabio,

avrai forse letto qualche email a proposito di un paper (con molti autori) che stiamo preparando per descrivere la HWMCC 2014, per la special issue JSAT sulle competition a CAV 2014.
Sei interessato a darci qualche info in più’ su IIMC ?
Vanno bene sia info testuali (sul template che ho già’ inviato, se serve te lo rimando) sia dati nuovi o che possano mettere in risalto l’impatto di alcuni engines (es. IC3 o varianti di IC3).
Se non hai tempo e vuoi che facciamo un po’ di esperimenti per te, ho un po’ di manpower in questo momento (dottorandi abbastanza allenati…).

A parte questo spero che tu stia bene e non sia oberato di lavoro…
Ah, se preferisci che contattiamo qualcuno dei tuoi, dimmi chi e provvedo a scrivergli.

Gianpiero
Reply all
Reply to author
Forward
0 new messages