Using JaCoCo for Differential Code Tracing


Toby Tobkin

Jan 3, 2017, 4:16:15 PM
to JaCoCo and EclEmma Users
Hi JaCoCo devs and users,

I wanted to know if there would be any interest in me writing up a feature and some documentation for something I recently found JaCoCo very useful for: finding bugs in large, legacy codebases that can't be put under a test harness.

I was working on one of those big, old, multimillion-line, legacy Java codebases for a client recently. One of those all too common cases where nobody knows anything about it, no tests to help you make changes, etc, etc, etc.

On this codebase, there was a bug that caused improper tax calculation in one scenario but not in another. In total, the procedure that included the tax calculation touched thousands of methods, far more than I wanted to dissect and analyze by hand. To add another constraint, the software had to be run on a specially configured server, which ruled out analyzing it with, e.g., a unit test framework.

"You know what would be perfect for solving this problem?" I thought. "If I could somehow get a diff of all the methods used when the bug occurs versus when it doesn't occur." I knew that if I could find some way to do this, it would immediately narrow my search for the bug in this giant codebase, without requiring any "whitebox" knowledge of the code.

A quick Google search revealed some suggestions, but nothing that would work given my constraints:

Having thought about the problem for a few days, and knowing that JaCoCo was the underlying software that powers code coverage tools, I eventually came up with a solution leveraging JaCoCo:
1. Configure my server to use the JaCoCo Java agent
2. Create JaCoCo outputs for (1) the execution that produced the correct tax calculation, and (2) the execution that produced the incorrect calculation
3. Use the JaCoCo ant tasks to produce CSVs of these two executions
4. Use a script to create a diff between the two executions in the CSVs
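
For anyone curious what step 4 might look like, here is a minimal sketch in Python. It is not the script I actually used. It assumes the two CSVs use the standard JaCoCo CSV report columns (PACKAGE, CLASS, INSTRUCTION_COVERED, and so on), which report counters per class rather than per method, so the diff is at class granularity. The function names `covered_classes` and `diff_runs` are made up for illustration.

```python
import csv
import io


def covered_classes(csv_text):
    """Map 'package.Class' to its covered-instruction count,
    keeping only classes that executed at least one instruction."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        covered = int(row["INSTRUCTION_COVERED"])
        if covered > 0:
            result[row["PACKAGE"] + "." + row["CLASS"]] = covered
    return result


def diff_runs(good_csv, bad_csv):
    """Compare the class-level coverage of a correct run and a buggy run."""
    good = covered_classes(good_csv)
    bad = covered_classes(bad_csv)
    return {
        # Classes exercised only when the result was correct.
        "only_in_good": sorted(good.keys() - bad.keys()),
        # Classes exercised only when the bug occurred.
        "only_in_bad": sorted(bad.keys() - good.keys()),
        # Classes hit in both runs, but with different amounts of code executed.
        "changed": sorted(
            name for name in good.keys() & bad.keys() if good[name] != bad[name]
        ),
    }
```

To use it, read each report into a string (for example with `open("good.csv").read()`, where the file names are placeholders) and pass both to `diff_runs`. The "only_in_bad" list is usually the first place to look.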

It worked perfectly, and within a few minutes another engineer and I used the data to pin the problem on an outdated Java class that was supposed to have been removed years ago.

Here are my questions:
1. Does this application of JaCoCo make sense? Or is there something I missed such as another existing tool that would have done the same thing?
2. If yes to 1, would it be worth it for me to add and maintain features/documentation in JaCoCo to support these use cases?

Any other feedback is welcome too! Thank you devs for maintaining such a great tool.

Toby