Bug in StatsUtils.median

Sjors

May 20, 2012, 7:40:05 AM
to epo...@googlegroups.com, Sjors van Berkel, Andrei Pruteanu
Hi guys,

While using the EpochX framework for a project, I uncovered a subtle bug that can have a disastrous effect on the outcome of a run. The bug is in the StatsUtils.median function, which sorts the array it is given in place as an unintended side effect. Other methods with similar functionality generally clone the array before modifying it, and I think the median function should do the same.

The effect of this bug can propagate throughout the program. For example, consider the following piece of pseudocode:

on generation end:
1. get median fitness of programs
2. get best program of generation

In step 1, the array of fitnesses gets sorted and cached as a statistic for that generation. In step 2, getting the best program assumes that the array of fitnesses is ordered in the same way as the array of programs, such that the index of the lowest fitness is also the index of the best program. However, because the fitness array has been sorted, the two arrays no longer line up, and the returned program could be any program in the population.
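To make the mismatch concrete, here is a minimal standalone sketch of the two steps. It does not use EpochX itself: the class and method names are made up for illustration, medianInPlace only imitates the in-place sort described above, and it assumes a non-empty array where lower fitness is better.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of EpochX.
class SortSideEffectDemo {

    // Imitates the buggy behaviour: sorts the caller's array in place.
    // Assumes a non-empty array.
    static double medianInPlace(double[] values) {
        Arrays.sort(values);
        int mid = values.length / 2;
        return (values.length % 2 == 0)
                ? (values[mid - 1] + values[mid]) / 2.0
                : values[mid];
    }

    // Index of the lowest value (lower fitness is assumed to be better).
    static int indexOfLowest(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] programs = {"A", "B", "C"};
        double[] fitnesses = {3.0, 1.0, 2.0}; // B has the lowest fitness

        // Before the median call the indices line up: prints B
        System.out.println(programs[indexOfLowest(fitnesses)]);

        // Step 1: the median call reorders fitnesses to {1.0, 2.0, 3.0}
        medianInPlace(fitnesses);

        // Step 2: index 0 now holds the lowest fitness: prints A
        System.out.println(programs[indexOfLowest(fitnesses)]);
    }
}
```

The programs array is never touched, so after the median call index 0 of the fitnesses points at program A even though B was the best one.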

The fix is to clone the array and sort the clone, leaving the original untouched.
Are you willing to release an update containing this fix, or should I create a custom version for my project?

Thanks in advance!
Kind regards,

Sjors

Tom Castle

May 23, 2012, 4:25:59 PM
to epo...@googlegroups.com, Sjors van Berkel, Andrei Pruteanu
Hi Sjors,

Thanks for uncovering this and for the suggested fix! I'll make sure this gets fixed in the next release. That won't be immediate though, so the best workaround for now is probably to define your own replacement Stat to use. If you're willing to post your code for that here, others may find it useful.

Thanks again!
Tom

Sjors van Berkel

May 29, 2012, 8:38:40 AM
to epo...@googlegroups.com, Andrei Pruteanu
Hi Tom,

Thanks for your reply! The fix is very small:

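In StatsUtils.java, in the org.epochx.stats package, go to the median functions (there's one for integers and one for doubles), and change:
Arrays.sort(values);
to
values = values.clone();
Arrays.sort(values);
in both functions. The sort then runs on a local copy, the rest of the function reads from that sorted copy, and the caller's array is left untouched.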
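One thing to watch: the sorted copy has to be the array the median is then read from, otherwise the sort has no effect on the result. As a rough sketch of what a side-effect-free median could look like (this is not the actual EpochX source; the class name, the local variable and the empty-array check are my own assumptions):

```java
import java.util.Arrays;

// Hypothetical standalone sketch, not the EpochX StatsUtils class.
class MedianSketch {

    // Returns the median without modifying the caller's array.
    static double median(double[] values) {
        // Assumption: an empty input is treated as an error.
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }

        // Sort a private copy and read the result from that copy.
        double[] sorted = values.clone();
        Arrays.sort(sorted);

        int mid = sorted.length / 2;
        return (sorted.length % 2 == 0)
                ? (sorted[mid - 1] + sorted[mid]) / 2.0
                : sorted[mid];
    }
}
```

With this shape, any array of programs that is indexed in parallel with the fitnesses stays aligned after the median has been computed.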

That's it! I've been running my version for a week now and I haven't found any regressions, so that's good news.

Kind regards,

Sjors