I have been chasing a weird bug for some time now, and I finally
found it. I thought MapLike.mapValues() was a shortcut variant of
map() that maps only the values. Easy and less code.
The ScalaDoc of mapValues reads "Transforms this map by applying a
function to every retrieved value". That is a generic explanation
that could mean a lot of things, and until now it always did what
I expected. I thought I was safe anyway because I made sure to
use only immutable maps.
However, mapValues() causes the transformation function to run
each time a map value is accessed. In my case that is a big no-no
because the transformation takes a lot of time, so I want to run
it just once and store the results.
This is solved by using map() instead. It does the transformation
once immediately.
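
A minimal sketch of the difference, assuming a Scala 2.x standard
library where mapValues() returns a lazy wrapper around the
original map (newer versions deprecate the method but, as far as I
can tell, keep the lazy behaviour). The names MapValuesSketch,
expensive and calls are made up for illustration; expensive() just
stands in for a slow transformation and counts how often it runs.

```scala
object MapValuesSketch {
  // Hypothetical stand-in for a slow transformation; counts its own calls.
  var calls = 0
  def expensive(x: Int): Int = { calls += 1; x * 2 }

  val base: Map[String, Int] = Map("a" -> 1, "b" -> 2, "c" -> 3)

  // mapValues() only wraps the original map; nothing is transformed yet.
  val viaMapValues = base.mapValues(expensive)
  val callsAfterMapValues = calls

  // Every lookup runs the function again for the requested key.
  viaMapValues("a")
  viaMapValues("a")
  val callsAfterTwoLookups = calls

  // map() builds a new map right away: one call per entry, up front.
  calls = 0
  val viaMap = base.map { case (k, v) => k -> expensive(v) }
  val callsAfterMap = calls

  // Later lookups read the stored results and do not call the function.
  viaMap("a")
  viaMap("a")
  val callsAfterMapLookups = calls
}
```

Comparing the four counters shows the effect: the count does not
move when mapValues() is called, then grows with every lookup,
while map() pays for all entries once and never again.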
While I understand the need for such functionality, it also
introduces the danger of misunderstanding it. It is poorly
documented, and it breaks the assumption of being safe when using
immutable maps: immutable => no side effects involved => safe. In
my opinion, this is not an immutable map, because the values can
change, even though by definition I understand the whole structure
is immutable. That's a debate I won't even try to start. :-)
So, what's the rationale? Is it part of a concept I don't know
about yet? A concept in which "mapValues" would mean something
other than mapping the values alone?
And more importantly: are there any other such traps hiding in
Scala's API? :-)
Is there a way to recognize them without checking every method
explicitly first?
Thanks!
Jan
====================================================================
import org.scalatest.FunSpec

class MapTransformationLearningSpec extends FunSpec {

  // Counts how often addStatistics() runs, to expose repeated evaluation.
  var addStatisticsCalled = 0

  describe("a Scala programmer") {
    it("should know how map transformations work") {
      val e = new Variable("e", "energy", classOf[Double])
      val m = new Variable("m", "mass", classOf[Double])
      val c = new Variable("c", "speed of light", classOf[Double])
      val vars = Seq(e, m, c)
      assert(addStatisticsCalled == 0,
        "addStatistics() was called %d times".format(addStatisticsCalled))

      val map: Map[String, Variable] =
        vars.map(v => v.name -> v).toMap.mapValues(v => addStatistics(v))

      // the map isn't empty, is it?
      assert(!map.isEmpty)
      assert(map.size === 3)

      // I expect addStatistics() to be called already - was it?
      println("Before accessing the statistics: addStatistics() was called %d times"
        .format(addStatisticsCalled))

      // do the operation to check the statistics - does it call
      // addStatistics then?
      map.values.foreach(v => {
        assert(v.stats != None)
      })

      // do the operation to check the statistics - does it call
      // addStatistics all over again?
      map.values.foreach(v => {
        assert(v.stats != None)
      })

      // By now it should ideally have been called three times only
      println("After accessing the statistics: addStatistics() was called %d times"
        .format(addStatisticsCalled))

      // Once per variable, no more and no less
      assert(addStatisticsCalled == 3,
        "addStatistics() was called an unexpected number of times: %d"
          .format(addStatisticsCalled))
    }
  }

  // The transformation under test: attaches statistics to a copy of the variable.
  def addStatistics(in: Variable): Variable = {
    addStatisticsCalled += 1
    val stat = new Statistics(0, 10, 100)
    in.copy(stats = Some(stat))
  }
}

case class Variable(name: String, title: String, dataType: Class[_],
                    stats: Option[Statistics] = None)

case class Statistics(min: Double, max: Double, count: Int)