Coverage vs. Exceptions


carste...@gmail.com

unread,
Nov 3, 2014, 12:15:51 PM
to jac...@googlegroups.com
Dear all,

JaCoCo does not detect that certain instructions are run (covered) if these lead to an implicit exception. In the following example, if c() throws an exception, the result indicates no coverage for b().

public void foo() {
    a();
    try {
        b();
        c();
    } catch (Throwable t) {}
}

In the following example, no coverage at all is reported, assuming c() again throws an exception.

public void bar() {
    a();
    b();
    c();
}

Note that the last example is quite common when writing JUnit 4 tests using the "expected" annotation attribute, as in the following example.

@Test(expected = SomeException.class)
public void testBar() {
    ...
}
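
For illustration, here is a hypothetical test for the bar() method shown above (the Example class name is an assumption for this sketch): a() and b() do run before c() throws, but because the exception prevents any probe in bar() from being reached, none of these calls are reported as covered.

@Test(expected = SomeException.class)
public void testBar() {
    // c() throws SomeException; no probe in bar() is reached,
    // so a(), b() and c() are all reported as uncovered.
    new Example().bar();
}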

I experimented with adding more probes, inserted in front of opcodes that may throw an (implicit) exception. With such probes, more precise coverage results are obtained, at the cost of having more probes. This means slower execution (and analysis) and bigger instrumented class files.

In my _preliminary_ experiments I see about 1-2% increased coverage if coverage of the test classes is included in the report. This comes at the cost of about 1% more runtime compared to the current release of JaCoCo.

I am now interested in your opinions on this. Do you think having a more precise analysis as described is worthwhile? Should the feature be in JaCoCo, or not? Should it be configurable, possibly disabled by default?

On a more technical note, it is possible to add probes only for some of the opcodes which may throw an exception. For example, one could add probes for the INVOKE opcodes (which would help with the JUnit tests, based on my experience), and not add probes for PUTFIELD and array opcodes.
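
To illustrate the idea (this is a plain ASM sketch, not JaCoCo's actual probe machinery; the ProbeArray holder class, its HITS array, and the probe-id bookkeeping are assumptions for the example), an extra probe in front of every INVOKE* instruction could be emitted roughly like this:

import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// Sketch: records a probe hit immediately before each method invocation,
// so that reaching the invocation is counted even if the invoked method throws.
class InvokeProbeMethodVisitor extends MethodVisitor {

    private int nextProbeId;

    InvokeProbeMethodVisitor(MethodVisitor mv, int firstProbeId) {
        super(Opcodes.ASM5, mv);
        this.nextProbeId = firstProbeId;
    }

    @Override
    public void visitMethodInsn(int opcode, String owner, String name,
                                String descriptor, boolean isInterface) {
        emitProbe(nextProbeId++);
        super.visitMethodInsn(opcode, owner, name, descriptor, isInterface);
    }

    // Assumed probe representation: a static boolean[] HITS on a holder class.
    // JaCoCo's real probe array and id assignment work differently.
    private void emitProbe(int id) {
        super.visitFieldInsn(Opcodes.GETSTATIC, "com/example/ProbeArray", "HITS", "[Z");
        super.visitLdcInsn(id);
        super.visitInsn(Opcodes.ICONST_1);
        super.visitInsn(Opcodes.BASTORE);
    }
}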

Best regards,
Carsten

Marc R. Hoffmann

unread,
Nov 3, 2014, 3:31:53 PM
to jac...@googlegroups.com
Hi,

we have discussed this limitation many times before. Currently JaCoCo is
designed to add the smallest possible overhead to the application under
test, as JaCoCo is mainly used for large-scale projects.

I like the idea of adding additional probes before every INVOKE*
instruction, as this is the most likely place for exceptions
(besides implicit ones like NPE, IOOB, CCE, etc.). It would be
interesting to see how this increases

1) the total number of probes
2) the execution time

for a reasonably sized test set, e.g. Apache Commons Collections.

BTW, I don't think this feature could be configurable: the way probes
are determined has to be absolutely deterministic at runtime and at
analysis time, so it would have to be ensured that the config setting is
the same in both places. Alternatively, the setting could be written to
the exec file, which makes merging exec files with different settings
impossible.

Cheers,
-marc

Carsten Otto

unread,
Nov 5, 2014, 9:18:56 AM
to jac...@googlegroups.com
Hi,

I had trouble with several Apache Commons projects, but I nevertheless managed to compile some interesting results.

In short:
- Adding probes for every INVOKE opcode results in about 2.4x as many probes.
- The runtime ranges from 100% to 146% of the current JaCoCo release.
- The number of instructions found to be covered increases by 0 to 2.5 percentage points (meaning, for example, the updated version finds 96% covered instructions while the current release finds 94%).

I am satisfied with the results (finding higher coverage percentages). But I also see that, depending on the analyzed code, having more probes can dramatically increase the runtime.

Do you need more numbers? (I'd need help then)
Do you think changing JaCoCo in order to produce results outlined in this mail is a good idea?
What do you think about having a separate version of JaCoCo which is more precise (and, sadly, slower)?
Can you think of a way to speed up JaCoCo while still having better coverage results?

Details:
Apache Commons Beanutils:
4677 probes -> 12529 probes (2.7x)
Runtime: 101%
Coverage: +2.0%

Apache Commons Collections:
19220 probes -> 59283 probes (3.1x)
Runtime: 99%
Coverage: +2.1%

Apache Commons Math:
49340 probes -> 148996 probes (3.0x)
Runtime: 146%
Coverage: +2.3%

Apache Commons Primitives:
11187 probes -> 28936 probes (2.6x)
Runtime: 124%
Coverage: +2.5%

Jenkins Core:
101964 probes -> 188196 probes (1.8x)
Runtime: 102%
Coverage: +0.0%

Fitnesse:
66503 probes -> 163509 probes (2.5x)
Runtime: 100%
Coverage: +0.3%

Best regards,
Carsten


Marc R. Hoffmann

unread,
Nov 10, 2014, 3:18:49 PM
to jac...@googlegroups.com
Hi Carsten,

thanks for this detailed analysis! The increase in probes is quite substantial. It is interesting to see how much the factor differs between the different projects.

What does the increased coverage mean in the end? The tests themselves are not improved by using a different JaCoCo version. The report is now more correct in terms of the amount of code that is covered by a given test suite, but the tests still seem to lack coverage of some of the non-exception execution paths.

It would be interesting to see some examples of highlighted source code for both versions of JaCoCo.

The modified version definitely reports more of the code that has actually been executed and gives a better picture in the case of anticipated exceptions. On the other hand, for many years I have tried to teach people that they should not care about tested code or percentage values. The focus should be on providing additional tests for yet-untested code.

> What do you think about having a separate version of JaCoCo which is more precise (and, sadly, slower)?

Speaking for the JaCoCo project, we will stay with one version. We can't handle multiple versions, and it would also confuse users, as many of them use JaCoCo through different tools.


> Can you think of a way to speed up JaCoCo while still having better coverage results?

Not sure whether this can be implemented, but I have been thinking about installing exception handlers within the instrumented code for quite some time. And as said before: the only reasonable way to get "better coverage results" is to improve your tests. Anything else does not add any value to your project.

Cheers,
-marc 

Carsten Otto

unread,
Nov 11, 2014, 6:00:17 AM
to jac...@googlegroups.com
Hi Marc,

after some trouble understanding your mail, I think I now see what your main question is.

You are correct in that, whatever JaCoCo reports as a result, the analyzed code stays the same. It is exactly as good or bad as it was before running the analysis. No change made to JaCoCo in the past, including the change I propose here, affects that at all.

However, and this is the important part, JaCoCo is not used to change code. It is used to see where code should be changed by the developer. For example, developers see which code is not tested. This makes the developer write more tests. These tests are of real value to the project (you might say tests are not strictly necessary for perfect code, but I think we both agree that good tests are very valuable). So, JaCoCo helps in identifying necessary changes ("Here's a scary blob of red code!"), and making sure the problem is solved after the developer adds more tests ("Hey, look, it's green now! Go find some other red blob!").

We need to talk about false positives now. I believe JaCoCo is correct ("sound") in the sense that no uncovered code is marked as covered/green. However, it is "incorrect" in the sense that covered code might be reported as uncovered/red. I'd like to call this problem "incomplete" instead of "incorrect" or "unsound". Any opcode marked as uncovered, although it in fact is covered, is a false positive. The obvious goal is to have as few of these false positives as possible, while still being correct/sound as explained above. JaCoCo is quite good at this, and this (together with the reporting, EclEmma, and a reasonable execution speed penalty) is the main reason for its success. I really like JaCoCo and EclEmma!

However, there still are false positives. Every single false positive is annoying, and confuses the user. As an extreme example, a virus scanner which detects all infected files, but also has a false positive every few minutes is just useless. You can't write a letter without being disturbed ten times by that virus scanner. The same principle can be applied to JaCoCo. If there are too many false positives reported by JaCoCo, the developer using it gets annoyed and needs more manual analysis. Depending on how demanding the developer is, and how many false positives are reported, this could lead to the developer not using or trusting JaCoCo at all.

The change I am proposing has the sole advantage of decreasing the number of false positives. In other words, it does not change code, and it does not directly add value to the project. However, it helps JaCoCo users, and helps them get the results they really need. Thus, they spend less time dealing with false positives and can use this time to work on the actual uncovered code (leading to better tests, and more value). If you look at past comments and questions, some posted to this mailing list, you see that JaCoCo users are confused by false positives and would like to have this problem solved.

So, I hope this covers the "what does it all mean" part of your question.

As a rather personal remark, I'd like to add that the percentage number reported by JaCoCo can also be used to provide value. For example, in a multi-million-Euro project for a well-known company with several hundred thousand employees, we used measurements including code coverage to answer the following questions: Should we throw away the existing code and start all over? Should we just refactor it? Where exactly are the problems that should be solved soon? Which parts of the code base are not as bad? How bad is it, and how much do we want that to change? A more precise percentage reported by JaCoCo leads to better decisions and, thus, to more value.

I answered your more detailed questions below.

On Mon, Nov 10, 2014 at 9:18 PM, Marc R. Hoffmann <hoff...@mountainminds.com> wrote:
> It would be interesting to see some examples of highlighted source code for both versions of JaCoCo.

Sadly, my reports (generated by Maven) do not allow me to see highlighted source code. Do you know how that can be changed easily? I can provide the HTML reports generated for the projects I mentioned in my previous mail, though. In these you can see (and directly compare) the percentages and absolute numbers of missed instructions for individual packages/classes/methods.
 
>> What do you think about having a separate version of JaCoCo which is more precise (and, sadly, slower)?

> Speaking for the JaCoCo project, we will stay with one version. We can't handle multiple versions, and it would also confuse users, as many of them use JaCoCo through different tools.

OK, so we're talking about either including the changes or a fork (as I really feel the need to get rid of this annoying incompleteness).
 
>> Can you think of a way to speed up JaCoCo while still having better coverage results?

> Not sure whether this can be implemented, but I have been thinking about installing exception handlers within the instrumented code for quite some time.

I also experimented with this approach. Sadly, this approach is worse, as you'd need a probe for every single opcode which might throw an exception, in addition to said handler(s).

Best regards,
Carsten

Carsten Otto

unread,
Nov 20, 2014, 5:00:36 PM
to jac...@googlegroups.com
I'm looking forward to some kind of reply, if you find the time.

Carsten Otto

unread,
Nov 30, 2014, 3:34:42 PM
to jac...@googlegroups.com
Marc?

Marc Hoffmann

unread,
Dec 5, 2014, 8:18:54 AM
to jac...@googlegroups.com
Hi Carsten,

> I also experimented with this approach. Sadly, this approach is
> worse, as you'd need a probe for every single opcode which might
> throw an exception, in addition to said handler(s).

Why, you can do the same approximation as for invoke* opcodes: Just
declare handlers for method invocations.

Cheers,
-marc

Carsten Otto

unread,
Jan 6, 2015, 9:04:58 AM
to jac...@googlegroups.com

Hi,

first of all, sorry for the late reply. Holidays got in the way, and I also finally handed in my PhD thesis.

To your question:
The code change I proposed inserts a probe in front of each INVOKE opcode; that basically is the only change. This way one can see whether the code leading up to the INVOKE opcode has been executed or not. Currently, this cannot be determined in the case where an INVOKE opcode throws an exception; instead, only partial information (derived from other probes) is used for the computation.

If I understand you correctly, the change you're suggesting goes along these lines:

BEFORE:
beforeDummy++;       // probe
methodInvocation();
afterDummy++;        // probe

AFTER:
beforeDummy++;       // probe
try {
    methodInvocation();
} catch (Throwable t) {
    caughtDummy++;   // additional probe: if this block is reached, we know the invocation threw an exception
    throw t;
}
afterDummy++;        // probe

This way, you can also detect whether the method invocation goes through. However, in addition to the necessary code instrumentation, you'd need to add more than one probe.
One could also extend the "try" block to cover, for example, the complete method. However, in that case there is no way to determine which of the possibly many opcodes threw an exception.
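
For illustration, the whole-method variant might look roughly like this at the source level (handlerProbe and exitProbe are just assumed counters standing in for probes; this compiles on Java 7+ as long as a(), b() and c() declare no checked exceptions):

public void bar() {
    try {
        a();
        b();
        c();
    } catch (Throwable t) {
        handlerProbe++; // single probe: only records that something in bar() threw
        throw t;
    }
    exitProbe++;        // probe at the normal method exit
}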
Maybe I'm not getting you, though. Could you provide a (pseudocode) example of your idea?

PS: Could you please give me your current gut feeling regarding the choice between inclusion and a fork? I've been approached by a user who was interested in the additional coverage results reported by my changes, and I strongly believe there is more interest in the community (this is an invitation to raise your voice, community!).

Carsten Otto

unread,
Jan 23, 2015, 2:26:29 PM
to jac...@googlegroups.com
Marc, do you have a comment on this? Especially the "PS" is important to me right now.

Marc R. Hoffmann

unread,
Jan 24, 2015, 4:48:23 AM
to jac...@googlegroups.com
Hi Carsten,

one of the main features of JaCoCo is its low overhead and scalability for *large* projects. To be honest, I'm not willing to sacrifice this for this feature. Before that, we would need some serious testing with large-scale JaCoCo use cases. There are other tools like Cobertura which follow a different strategy and e.g. insert a probe for every source line, which would give more precise results in the exception cases.

Regarding forks: I'm more than happy to see JaCoCo forks! This is a great opportunity to test alternative approaches or major reworks with an actual user base. Still, if we see that a major change works out well for representative projects, we can bring it back to JaCoCo.

Cheers,
-marc