Static analysis tinderbox

Joshua Cranmer

unread,

Aug 30, 2011, 12:41:45 PM8/30/11

to

I would like to propose that we reintroduce the static analysis
tinderbox for mozilla-central and related tinderboxes; furthermore, I
would also like to have a policy of requiring that this tinderbox go
green for a checkin to land.

Bug 681634 is tracking current issues that prevent mozilla-central from
being built with static-analysis. The major ones I've found so far:
1. NS_OVERRIDE failure in content
2. elf-hack causes a failure in dehydra
3. JS doesn't have proper includes for jsstaticcheck.h (could just be a
transient JS failure, not sure)
4. Linker errors at libxul
5. Two of the HTML 5 parser files cause the scripts to run out of memory.

The following is a list of static analysis errors:
* NS_OVERRIDE, NS_FINAL [These are equivalent to @Override and final in
Java, and are also keywords that will be present in C++11]
* "nsQueryFrame<%s> found, but %s::kFrameIID is not declared"
* non-assigning call to operator new
* Flow not going through a given label
* NS_MUST_OVERRIDE (all subclasses must override this function)

There is also a collection of warnings. To be useful, I think the
tinderbox should go orange if the number of warnings of a particular
type exceeds a set parameter (e.g., no more than 100 warnings on misuse
of outparams).

Are there any thoughts, questions, comments, concerns, etc.?

Benjamin Smedberg

unread,

Aug 30, 2011, 1:49:13 PM8/30/11

to Joshua Cranmer, dev-pl...@lists.mozilla.org

On 8/30/2011 12:41 PM, Joshua Cranmer wrote:
> I would like to propose that we reintroduce the static analysis
> tinderbox for mozilla-central and related tinderboxes; furthermore, I
> would also like to have a policy of requiring that this tinderbox go
> green for a checkin to land.

There are two ways to introduce this:

* turn static-checking on by default for some of the existing builds on
Linux (debug, release, or both)
* create a new build specifically for static checking

I think it's probably preferable to use existing builds, but we'd need
to understand the build-time cost of the option.

>
> There is also a collection of warnings. To be useful, I think the
> tinderbox should go orange if the number of warnings of a particular
> type exceeds a set parameter (e.g., no more than 100 warnings on
> misuse of outparams).

Why do you think this, and what warnings are there?

The outparam warnings have turned out to be largely useless, partly
because people and code still disagree about the correct behavior of
outparams in failure cases, and partly because the annotation burden is
probably too high. I think we should probably completely disable the
outparam analysis (as cool as it is).

--BDS

Joshua Cranmer

unread,

Aug 30, 2011, 3:08:55 PM8/30/11

to

On 8/30/2011 12:49 PM, Benjamin Smedberg wrote:
> On 8/30/2011 12:41 PM, Joshua Cranmer wrote:
>> I would like to propose that we reintroduce the static analysis
>> tinderbox for mozilla-central and related tinderboxes; furthermore, I
>> would also like to have a policy of requiring that this tinderbox go
>> green for a checkin to land.
> There are two ways to introduce this:
>
> * turn static-checking on by default for some of the existing builds
> on Linux (debug, release, or both)
> * create a new build specifically for static checking
>
> I think it's probably preferable to use existing builds, but we'd need
> to understand the build-time cost of the option.

Static analysis is pretty slow, although I think most of that is the
outparam analysis (I believe that's the only one using ESP).

A non-parallel static analysis build on my computer took several hours,
where a normal build is around a half-hour.

>>
>> There is also a collection of warnings. To be useful, I think the
>> tinderbox should go orange if the number of warnings of a particular
>> type exceeds a set parameter (e.g., no more than 100 warnings on
>> misuse of outparams).
> Why do you think this, and what warnings are there?

I want to see the warnings count trend down to 0. Making it go orange on
>0 immediately would never turn the box green. I should also make it
clear that I don't want to limit this counting to just static-analysis
warnings but any warning the compiler produces, e.g., shadowed variable
warnings, etc.

Benjamin Smedberg

unread,

Aug 30, 2011, 3:17:28 PM8/30/11

to Joshua Cranmer, dev-pl...@lists.mozilla.org

On 8/30/2011 3:08 PM, Joshua Cranmer wrote:
>
> A non-parallel static analysis build on my computer took several
> hours, where a normal build is around a half-hour.

Then you want a non-parallel build? Do you also want all the
warning-counting and reporting features that the original box had? If
so, we're probably going to need some significant build power, and this
may slow down complete try results significantly.

>>>
>>> There is also a collection of warnings. To be useful, I think the
>>> tinderbox should go orange if the number of warnings of a particular
>>> type exceeds a set parameter (e.g., no more than 100 warnings on
>>> misuse of outparams).
>> Why do you think this, and what warnings are there?
>
> I want to see the warnings count trend down to 0. Making it go orange
> on >0 immediately would never turn the box green. I should also make
> it clear that I don't want to limit this counting to just
> static-analysis warnings but any warning the compiler produces, e.g.,
> shadowed variable warnings, etc.

I want to see warnings go down too, but I'm not convinced that is a
super-high priority that should take releng resources at the present
time. Let's make that decision separately from the error cases, which we
could deal with in parallel builds now?

--BDS

Joshua Cranmer

unread,

Aug 30, 2011, 3:27:51 PM8/30/11

to

On 8/30/2011 2:17 PM, Benjamin Smedberg wrote:
> On 8/30/2011 3:08 PM, Joshua Cranmer wrote:
>>
>> A non-parallel static analysis build on my computer took several
>> hours, where a normal build is around a half-hour.
> Then you want a non-parallel build? Do you also want all the
> warning-counting and reporting features that the original box had? If
> so, we're probably going to need some significant build power, and
> this may slow down complete try results significantly.

I definitely don't think the analysis builds should be done with -j1
equivalency. Since you suggest disabling the outparam analysis, I'll try
my next rebuild with that disabled and with -j5 and report on
from-scratch build times with and without static checking.

> I want to see warnings go down too, but I'm not convinced that is a
> super-high priority that should take releng resources at the present
> time. Let's make that decision separately from the error cases, which
> we could deal with in parallel builds now?

The warnings is lower on my priority; my intention was to eventually put
it on the tinderbox, but omit warning counts as a first pass.

Scott Johnson

unread,

Aug 30, 2011, 5:08:49 PM8/30/11

to dev-pl...@lists.mozilla.org

On 08/30/2011 11:41 AM, thus spoke Joshua Cranmer:

> I would like to propose that we reintroduce the static analysis
> tinderbox for mozilla-central and related tinderboxes; furthermore, I
> would also like to have a policy of requiring that this tinderbox go
> green for a checkin to land.

I think this is a great idea in general, but one thing I've observed
from my previous work in Java with static analysis tools (e.g. Coverity
and FIndbugs) is that developers tend to rely on the tool too heavily.
What I mean is that sometimes Coverity or findbugs would find something
it considered to be a bug, but it's really a false positive. The
developer wouldn't think it through, and just make a change. Since most
of these changes are easy to make, there was this blanket code fixing
that went on, introducing many regressions, as the changes weren't
thought through properly.

Obviously, I'm not saying this will happen, I'm just stating that it's
something to be aware of. One thing we could never decide on was whether
there should be a single bug opened for each flagged static analysis
bug, one bug for each collection of static analysis findings, or one bug
for all static analysis findings. This isn't a problem with new code
that's written - if your code that you're adding has problems in it, you
fix it under the same bug you're fixing to begin with. It's the large
amount of existing code we'd have problems with (unless there are no
static analysis problems within the existing code, which would be hard
to believe).

Another issue is test coverage. How can we track that a fix to a static
analysis bug didn't introduce some new regressions without really great
automated test coverage? (I don't have an answer for this, I'm just
thinking out loud here).

~Scott Johnson

Joshua Cranmer

unread,

Aug 30, 2011, 5:33:40 PM8/30/11

to

On 8/30/2011 4:08 PM, Scott Johnson wrote:
>
> On 08/30/2011 11:41 AM, thus spoke Joshua Cranmer:
>> I would like to propose that we reintroduce the static analysis
>> tinderbox for mozilla-central and related tinderboxes; furthermore, I
>> would also like to have a policy of requiring that this tinderbox go
>> green for a checkin to land.
> I think this is a great idea in general, but one thing I've observed
> from my previous work in Java with static analysis tools (e.g.
> Coverity and FIndbugs) is that developers tend to rely on the tool too
> heavily. What I mean is that sometimes Coverity or findbugs would find
> something it considered to be a bug, but it's really a false positive.
> The developer wouldn't think it through, and just make a change. Since
> most of these changes are easy to make, there was this blanket code
> fixing that went on, introducing many regressions, as the changes
> weren't thought through properly.

Right now, most of our scripts (5 out of 8) are triggered by explicit
annotations--e.g., checking that a function must override a base class
function. Another two checks are triggered automatically in some pretty
narrow circumstances (one is if it's a child of nsIFrame, the other if
it's a static initializer) and have narrow goals (that it must have a
frame IID in the first case, the latter warns about functions being
called during static initialization). The final one is the most open-ended:
/*
* Check that only JS_REQUIRES_STACK/JS_FORCES_STACK functions, and
functions
* that have called a JS_FORCES_STACK function, access cx->fp directly or
* indirectly.
*/

The js-stack check, static-init check, and the outparams check (which
I'm not including in this list) are all warnings while the other
failures are errors. I suspect that we should adopt a policy of making
static-analysis failures warnings if they are likely to have several
failures in the tree already or if false-positives could be a problem.

Benjamin Smedberg

unread,

Aug 30, 2011, 5:42:24 PM8/30/11

to Scott Johnson, dev-pl...@lists.mozilla.org

On 8/30/2011 5:08 PM, Scott Johnson wrote:
>
> On 08/30/2011 11:41 AM, thus spoke Joshua Cranmer:

>> I would like to propose that we reintroduce the static analysis
>> tinderbox for mozilla-central and related tinderboxes; furthermore, I
>> would also like to have a policy of requiring that this tinderbox go
>> green for a checkin to land.

> I think this is a great idea in general, but one thing I've observed
> from my previous work in Java with static analysis tools (e.g.
> Coverity and FIndbugs) is that developers tend to rely on the tool too
> heavily. What I mean is that sometimes Coverity or findbugs would find
> something it considered to be a bug, but it's really a false positive.
> The developer wouldn't think it through, and just make a

One of the primary advantages of our system is that we have complete
control over it. The analysis scripts are checked into the tree, and we
can control them with annotations to suppress particular cases if
necessary. I don't think this is a problem.

To the extent that it is a problem, we don't need to worry about
bug-filing or any of that, because the errors are all issued at compile
time and turn the build red.

--BDS

Joshua Cranmer

unread,

Aug 30, 2011, 7:18:46 PM8/30/11

to

On 8/30/2011 12:49 PM, Benjamin Smedberg wrote:

> On 8/30/2011 12:41 PM, Joshua Cranmer wrote:
>> I would like to propose that we reintroduce the static analysis
>> tinderbox for mozilla-central and related tinderboxes; furthermore, I
>> would also like to have a policy of requiring that this tinderbox go
>> green for a checkin to land.
> There are two ways to introduce this:
>
> * turn static-checking on by default for some of the existing builds
> on Linux (debug, release, or both)
> * create a new build specifically for static checking
>
> I think it's probably preferable to use existing builds, but we'd need
> to understand the build-time cost of the option.

A build with --disable-debug --enable-optimize --disable-elf-hack
completed in 31min under -j5.

I don't have timing for the static-checking build, since it died in
js/jsd and I apparently forgot to say time, but around 30 minutes after
I started the build, I found it in js/jetpack (again under -j5). For
reference, in the same amount of time, the static analysis failed to
make it even to content and layout. I would expect, then, that a static
analysis build would take around the amount of time that a windows PGO
build is taking at present.

Relevant other notes:
* Linux x86-64 VM
* 4 GiB of memory, 4 CPUs (on a quad-core with hyperthreading)
* Disk speed did not appear to be a limiting factor [it wasn't pegged]
* The non-static-checking build used gcc 4.6 and the static-checking one
used gcc 4.5
* A top executed in the middle indicated only one cc1plus active [at
100%]; the others were present but at 0%. I suspect memory is a limiting
factor.
* The outparams analysis was removed from the list.

Taras Glek

unread,

Aug 31, 2011, 6:36:27 PM8/31/11

to Scott Johnson

I have my doubts about usefulness of general purpose analyses.

Luckily, our analyses are Mozilla-specific. If they are ever producing
unhelpful results that's a bug in the analysis. This is why outparams is
being phased out.

>
> Another issue is test coverage. How can we track that a fix to a static
> analysis bug didn't introduce some new regressions without really great
> automated test coverage? (I don't have an answer for this, I'm just
> thinking out loud here).

Same as any other test failure

Taras