Adventures in Golang: Mutation testing in Go

Ondekoza

Feb 16, 2013, 3:25:57 PM

to golan...@googlegroups.com

============================================

I always thought that error seeding and mutation testing are cool ideas. You intentionally introduce errors into your source code and see what happens when you run your tests. The theory behind error seeding also claims to be able to predict the total number of errors in a body of code, but I am interested for another reason. Error seeding allows me to enhance the coverage and quality of my unit tests.

The plan works like this:

Make an atomic change to the source code.

Run the tests.

If the tests report 'fail', that's good, because they actually caught the patch. But if the tests report 'ok', the modification, which represents a coding error, was not detected! In the literature this procedure is actually called 'mutation testing' rather than 'error seeding'.

Now: manual error seeding is cumbersome and error-prone, difficult to manage without the help of specialized tools.

The obvious solution is to patch the source code automatically with a tool. This makes it possible to patch code effectively, reproducibly, and with a large number of strategies.

But another problem persists. If you re-compile a substantial body of code written in C/C++, the time-consuming compilation quickly becomes an obstacle. You can make only one modification per cycle, because as soon as you introduce more than one error, the errors might cancel out. Or, if you introduce a larger number of modifications, it becomes cumbersome to figure out which of them produced the problem, and you may even have non-linear dependencies between the modifications.

Thus, error seeding as a concept is usually dropped.

Enter Go

-----------------

The quick compilation, lean syntax, and excellent test framework out-of-the-box make error seeding suddenly a feasible strategy.

I haven't spoken about the mechanics of error seeding yet. I want to do this by way of example. In this example, I am using a Perl script to replace one '==' comparison for equality with a '!=' test for inequality. I would have preferred a Go-only version for this text, but I don't know enough about the parser to write this error seeder at a low level, and Perl is always a good fit when manipulating text.

Applying it to 'tabwriter'

------------------------------------

Next, I need a package as a victim. I chose a harmless package from the standard library: 'tabwriter'. It has only one source file 'tabwriter.go' and one test file 'tabwriter_test.go'. I renamed the package to 'mytabwriter' in these two files and copied them to a temporary folder (to avoid namespace collision with the original files).

Then I wrote the Perl script 'mutator.pl'. It takes the original mytabwriter source code and creates 'mutants', modified versions named 'mytabwriter.go.NUMBER'. NUMBER means that the n-th comparison operator was changed from '==' to '!='.
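The text-level mutation described here can be sketched in Go as well. This is not the Perl script from the post, only a rough equivalent under the same assumption that the source is treated as raw text; the name mutateNth is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// mutateNth returns src with the n-th (1-based) occurrence of "==" replaced
// by "!=". The boolean is false if src has fewer than n occurrences.
// Because this works on raw text, it also touches occurrences inside
// comments and string literals.
func mutateNth(src string, n int) (string, bool) {
	if n < 1 {
		return src, false
	}
	pos := 0
	for i := 1; ; i++ {
		idx := strings.Index(src[pos:], "==")
		if idx < 0 {
			return src, false
		}
		idx += pos
		if i == n {
			return src[:idx] + "!=" + src[idx+2:], true
		}
		pos = idx + 2
	}
}

func main() {
	src := "// if a == b\nif a == b && c == nil {\n}\n"
	total := strings.Count(src, "==")
	for n := 1; n <= total; n++ {
		mutant, _ := mutateNth(src, n)
		fmt.Printf("--- mutant %d ---\n%s", n, mutant)
	}
}
```

Each mutant differs from the original in exactly one operator, which keeps a failing test attributable to a single change.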

I found that tabwriter.go contains sixteen '==' operators, so mutator.pl produced 16 mutants, each with one operator patched. The script then copies each mutant into place in turn, runs 'go test', and checks the result. Finally it renames the n-th mutant, appending either OK or FAIL to the filename. The result is the following list:

mytabwriter.go

mytabwriter.go.1.OK

mytabwriter.go.2.FAIL

mytabwriter.go.3.OK

mytabwriter.go.4.FAIL

mytabwriter.go.5.FAIL

mytabwriter.go.6.FAIL

mytabwriter.go.7.FAIL

mytabwriter.go.8.FAIL

mytabwriter.go.9.OK

mytabwriter.go.10.FAIL

mytabwriter.go.11.FAIL

mytabwriter.go.12.FAIL

mytabwriter.go.13.FAIL

mytabwriter.go.14.FAIL

mytabwriter.go.15.FAIL

mytabwriter.go.16.FAIL

This tells me that for modifications 1, 3 and 9, the 'go test' run did not catch the change. I examined these particular patches to figure out why that was the case.
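The run-and-classify step could look roughly like this in Go. It is a sketch, not the author's script: goTestPasses assumes a hypothetical directory that holds one mutant plus the test file, and the demo in main simply replays the OK results from the listing above instead of running anything.

```go
package main

import (
	"fmt"
	"os/exec"
)

// goTestPasses runs "go test" in dir and reports whether it exited cleanly.
// dir is a hypothetical folder holding one mutant plus the test file.
// A mutant that fails to compile also counts as "not passing" here.
func goTestPasses(dir string) bool {
	cmd := exec.Command("go", "test")
	cmd.Dir = dir
	return cmd.Run() == nil // a non-zero exit status yields a non-nil error
}

// survivors calls passes for mutants 1..n and returns those the tests missed.
func survivors(n int, passes func(mutant int) bool) []int {
	var alive []int
	for i := 1; i <= n; i++ {
		if passes(i) {
			alive = append(alive, i)
		}
	}
	return alive
}

func main() {
	// Stand-in for real runs: replay the OK results from the listing.
	ok := map[int]bool{1: true, 3: true, 9: true}
	alive := survivors(16, func(i int) bool { return ok[i] })
	fmt.Printf("survivors: %v, killed %d of 16\n", alive, 16-len(alive))
}
```

For real runs, the callback would wrap goTestPasses with whatever directory layout the mutants are written to.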

CASE 1:

diff mytabwriter.go mytabwriter.go.1.OK

165c165

< // if padchar == '\t', the Writer will assume ...

---

> // if padchar != '\t', the Writer will assume ...

This line is a comment, not code, so the change has no effect on behavior. mutator.pl does not recognize comments, so this is no big deal.
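One way to avoid this kind of false survivor, sketched here as an assumption rather than something the script does, is to count operators as tokens with go/scanner instead of as raw text. With the default scan mode, comments are skipped, and a "==" inside a string literal is part of a STRING token. The helper name countEQL is made up.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// countEQL counts '==' tokens in Go source. Comments are skipped by the
// scanner, and string literals are returned as single STRING tokens.
func countEQL(src string) int {
	fset := token.NewFileSet()
	file := fset.AddFile("src.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: do not report comments
	n := 0
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			return n
		}
		if tok == token.EQL {
			n++
		}
	}
}

func main() {
	src := "package p\n\n// if padchar == '\\t' ...\nvar same = 1 == 1\n"
	fmt.Println("'==' tokens outside comments:", countEQL(src))
}
```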

CASE 3:

diff mytabwriter.go mytabwriter.go.3.OK

216c216

< if n != len(buf) && err == nil {

---

> if n != len(buf) && err != nil {

This gets more interesting. There is no test case that checks 'err'!
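A check that should catch this mutant might look like the following sketch. It assumes that the current text/tabwriter still behaves like the line in the diff, that is, Flush reports io.ErrShortWrite when the underlying writer accepts fewer bytes than requested without returning an error. The shortWriter type is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"text/tabwriter"
)

// shortWriter claims to have written nothing, yet reports no error.
type shortWriter struct{}

func (shortWriter) Write(p []byte) (int, error) { return 0, nil }

// shortWriteDetected reports whether Flush surfaces io.ErrShortWrite when
// the underlying writer silently drops data.
func shortWriteDetected() bool {
	w := tabwriter.NewWriter(shortWriter{}, 0, 8, 1, ' ', 0)
	fmt.Fprint(w, "a\tb\n") // two cells, so the text is buffered until Flush
	return w.Flush() == io.ErrShortWrite
}

func main() {
	fmt.Println("short write detected:", shortWriteDetected())
}
```

With the mutated condition (err != nil), no error would be raised for this writer, so a test built on this check would fail and the mutant would be killed.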

CASE 9:

diff mytabwriter.go mytabwriter.go.9.OK

415c415

< if b.flags&StripEscape == 0 {

---

> if b.flags&StripEscape != 0 {

Another interesting case. There is obviously no test case that makes a difference for the flag StripEscape (at least in this special case). In other words: whether this if-branch is entered or not does not make any difference to the tests. I tried to write a test case that would actually test this. I examined a variable named b.cell.width, but I could not figure out what it does. This variable is not read, only set in tabwriter.go. I probably simply don't get it.

Since there is no check for this particular case, I don't even know if the original code is correct!

Conclusion

-------------------

So, using a fairly primitive technique - I could easily come up with a whole bunch of other more intricate modifications (cycle through the various comparisons, add 1 to any computation, change + to -, and so on) - I found two potential problems in the tests, perhaps even the code.

Perhaps it's possible to create a framework for Go to apply patterns of errors to packages, and produce some nicely formatted HTML output to report the result. This would allow developers to enhance the quality and robustness of Go's tests and this would be of benefit for everyone involved.

Code was re-indented for readability. The platform is Windows with Go 1.0.3. The script is on GitHub:

https://github.com/StefanSchroeder/Golang-Mutation-testing

Nate Finch

Feb 16, 2013, 4:23:48 PM

to Ondekoza, golan...@googlegroups.com

That's a very interesting way to check how well your code is tested. Most tools just tell you if your tests run a function, but not if they'll actually detect errors in the function. It's basically a test tester. Very cool.

Using the go parser you could do all kinds of interesting mutations without having to write crazy regular expressions. I think I'd use it only on functions that are particularly error prone, but I think it has merit in those cases.

I'd much prefer a go-only solution of course. I try never to touch Perl if at all possible.


John Asmuth

Feb 16, 2013, 4:36:38 PM

to golan...@googlegroups.com, Ondekoza

I think that this is excellent, and I agree with Nate's suggestion of using go/parser and go/printer.

Kamil Kisiel

Feb 16, 2013, 5:17:40 PM

to golan...@googlegroups.com, Ondekoza

I'm at home with an ugly cold and a bit bored, so I've rewritten this basic example with go/parser and go/printer. It could at least serve as a starting point for future experimentation. Will put it up on github as soon as I fix up a few things to do with file handling.

Ondekoza

Feb 16, 2013, 5:35:06 PM

to golan...@googlegroups.com, Ondekoza

Excellent! That's what I was hoping for and only 2h after the original post!

Kamil Kisiel

Feb 16, 2013, 5:53:00 PM

to golan...@googlegroups.com, Ondekoza

Here's some proof of concept code: https://github.com/kisielk/mutator

It's still quite rough: it only supports the two operators in your original example, and it doesn't actually run the tests. It just generates the mutated files into directories at the location specified by the -o flag.

It does all the work using the go/parser, go/ast, and go/printer packages.
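A minimal sketch of that approach, for readers who want to see the shape of it. This is not the code from the linked repository; the function name flipNthEQL and the traversal-order numbering of operators are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

// flipNthEQL parses src, turns the n-th '==' binary expression (1-based, in
// ast.Inspect traversal order) into '!=', and returns the printed result.
// ok is false if there is no n-th '=='. Comments are never touched, because
// they are not binary expressions in the syntax tree.
func flipNthEQL(src string, n int) (out string, ok bool, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.ParseComments)
	if err != nil {
		return "", false, err
	}
	seen := 0
	ast.Inspect(file, func(node ast.Node) bool {
		if be, isBin := node.(*ast.BinaryExpr); isBin && be.Op == token.EQL {
			seen++
			if seen == n {
				be.Op = token.NEQ
				ok = true
			}
		}
		return true
	})
	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", false, err
	}
	return buf.String(), ok, nil
}

func main() {
	src := "package p\n\n// f reports whether x == 1.\nfunc f(x int) bool {\n\treturn x == 1\n}\n"
	out, ok, err := flipNthEQL(src, 1)
	fmt.Println(ok, err)
	fmt.Print(out)
}
```

Working on the syntax tree means the comment containing "==" is left alone, which avoids the false survivor from CASE 1.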

Dan Kortschak

Feb 16, 2013, 6:12:03 PM

to Kamil Kisiel, golan...@googlegroups.com, Ondekoza

Taking a Mutate(orig token.Token) (token.Token, bool) function would
probably be the best way to do this for the general case. When go/types
is available, replacement of evaluated expressions with type-equivalents
becomes possible. At this stage Go GAs become potentially interesting.
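
A function with that signature might look like the sketch below. The particular operator pairs are an assumption chosen for illustration, not a proposal from this thread; the demo helper exists only to show the result.

```go
package main

import (
	"fmt"
	"go/token"
)

// Mutate returns a replacement for orig and true, or orig and false when
// this sketch has no mutation for that token. Without type information some
// replacements will not compile (for example '+' to '-' on strings).
func Mutate(orig token.Token) (token.Token, bool) {
	switch orig {
	case token.EQL:
		return token.NEQ, true
	case token.NEQ:
		return token.EQL, true
	case token.LSS:
		return token.GEQ, true
	case token.GTR:
		return token.LEQ, true
	case token.ADD:
		return token.SUB, true
	case token.SUB:
		return token.ADD, true
	}
	return orig, false
}

// demo formats the outcome of Mutate for a few sample tokens.
func demo() []string {
	var lines []string
	for _, tok := range []token.Token{token.EQL, token.ADD, token.IDENT} {
		m, ok := Mutate(tok)
		lines = append(lines, fmt.Sprintf("%s -> %s (mutated: %v)", tok, m, ok))
	}
	return lines
}

func main() {
	for _, line := range demo() {
		fmt.Println(line)
	}
}
```

A mutator that walks the syntax tree could take such a function as a parameter, so new mutation strategies would not require changes to the walker.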

Just musing.

Kamil Kisiel

Feb 16, 2013, 6:35:01 PM

to golan...@googlegroups.com

Sure, it's entirely possible to architect this idea every which way :) I was just shooting for not doing it in Perl and this is what I came up with.

Dan Kortschak

Feb 16, 2013, 6:43:04 PM

to Kamil Kisiel, golan...@googlegroups.com

And a very fine target to shoot for too.

David DENG

Feb 16, 2013, 7:02:20 PM

to golan...@googlegroups.com, Ondekoza

Line 92 should be the following?:

exp.Op = operators[*rep]

David

David DENG

Feb 16, 2013, 7:33:58 PM

to golan...@googlegroups.com

This is very interesting!

Go's parser (as mentioned by someone else) is a powerful package for writing cool tools like this. I think the best part is that it is always upgraded as the language is upgraded.

David

Kamil Kisiel

Feb 16, 2013, 8:03:27 PM

to golan...@googlegroups.com

Thanks, I renamed a bunch of the variables before uploading to github and messed that one up :)

Kevin Gillette

Feb 16, 2013, 8:14:02 PM

to golan...@googlegroups.com

Interesting concept! It does seem like it can raise a whole lot of false-positives that are guaranteed to work if the other parts of the system that a block of code interacts with are implemented correctly. I think this technique could find additional use in analyzing input and output parameters for a given function in order to generate separate wrappers that mutate function inputs and outputs. Mutating input wrappers would serve to test the behavior of that function, while mutating output wrappers would test the behavior of callers. In all cases, however, there's a limit in Go where it's not reasonable to account for every unreasonable case (those often end up as panics anyway) -- trading, sometimes considerable, code reductions in order to improve simplicity and maintainability (so that it's trivial to verify that the code works correctly without computer assistance).

David DENG

Feb 16, 2013, 10:25:14 PM

to golan...@googlegroups.com

A good refactoring tool is necessary.

David

Kamil Kisiel

Feb 17, 2013, 2:00:45 AM

to golan...@googlegroups.com

I spent some more time on it this evening and got it running the tests. Also it now acts on an entire package instead of a single file. The program copies the package to a temporary directory and modifies each file in turn before running the tests. The reporting still needs more work and I don't think it works as well as the original yet, but it's nearly a useful tool.

Rory McGuire

Feb 18, 2013, 3:49:46 AM

to golan...@googlegroups.com

Interesting idea, especially if you combine it with the other idea floating around somewhere on the NG: finding all changes in golang projects on GitHub that touch fewer than, say, 2-3 lines, as these are likely to be bug fixes.

You could use that idea to automate the introduction of common bugs.

Hǎiliàng

Feb 26, 2013, 5:16:28 AM

to golan...@googlegroups.com

How is mutation testing compared to static code coverage analysis?

Hǎiliàng

Nate Finch

Feb 26, 2013, 4:12:03 PM

to golan...@googlegroups.com

On Tuesday, February 26, 2013 5:16:28 AM UTC-5, Hǎiliàng wrote:

> How is mutation testing compared to static code coverage analysis?

Coverage ensures that your tests *use* the code, but it doesn't ensure that your tests will fail if the code is broken.

For example:

package foo

import "testing"

// Foo surrounds the text with brackets.
func Foo(s string) string {
	return "[" + s + "]"
}

// TestFoo executes every line of Foo but never checks the brackets.
func TestFoo(t *testing.T) {
	s := Foo("abc")
	if s == "" {
		t.Error("string returned from Foo shouldn't be empty!")
	}
}

Coverage would tell you that you have 100% test coverage of your code, but you're not fully testing its functionality. Yes, the returned string shouldn't be empty, but we never actually tested that the given string was returned surrounded by brackets. Mutation testing would go in and change one of the brackets in the function to something else, like "abc", and see if any of your tests fail. If none do, then you know you haven't fully tested that function, regardless of what coverage may say.
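
To make the difference concrete, here is a small sketch with a hand-written mutant. The names fooMutant, weakCheck and strongCheck are made up; weakCheck is the closest compilable equivalent of the check in the example (a Go string cannot be nil, so it tests for an empty string instead).

```go
package main

import "fmt"

// Foo surrounds the text with brackets (the function from the example).
func Foo(s string) string { return "[" + s + "]" }

// fooMutant is a hand-written mutant: the closing bracket was replaced.
func fooMutant(s string) string { return "[" + s + "abc" }

// weakCheck only rejects an empty result, like the test in the example.
func weakCheck(f func(string) string) bool { return f("abc") != "" }

// strongCheck compares against the exact expected output.
func strongCheck(f func(string) string) bool { return f("abc") == "[abc]" }

func main() {
	fmt.Println("weak check passes mutant:  ", weakCheck(fooMutant))
	fmt.Println("strong check passes mutant:", strongCheck(fooMutant))
}
```

Both checks give the original the same coverage, but only the strong one fails on the mutant.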

Markus Zimmermann

Dec 30, 2014, 2:03:34 PM

to golan...@googlegroups.com

I implemented a mutation testing tool called go-mutesting ( https://github.com/zimmski/go-mutesting ), which I announced here: https://groups.google.com/forum/?fromgroups=#!topic/golang-nuts/oo46Uh6k2F0 . I hope that somebody is still interested in the technique, and I would appreciate feedback on the project as well as your experience with the tool and with mutation testing in general.
