Should case class members use concrete collection types?

Daniel Armak

Oct 11, 2015, 6:39:53 AM
to scala-user
Hi all,

This is a question about trade-offs. I know different libraries adopt different approaches. Is there a known best practice or consensus? Do people care about the difference?

When modeling data with case classes whose members are collections, we can use concrete collection types (List, Vector) or traits (Seq, etc). 

Pro concrete types: this makes both performance and semantics definite. If a library has many methods operating on the case classes, they may be more efficient with particular collection types. Some people also like using List explicitly for its :: operator.

Con: conversions of data to/from the case class are expensive and explicit. And if different libraries used different concrete types, we would need to convert all the time at the interface between them. 

Pro abstract types: using the least specific type that suffices (usually Seq) makes it cheap to construct case classes around existing collections of any type. It also avoids what is essentially premature optimization (e.g. List vs Vector). (There's also a sort of middle ground: requiring LinearSeq or IndexedSeq.)
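To make the options concrete, here is a minimal sketch of the three member-type choices discussed so far. The case class and value names (ConcreteTrack, AbstractTrack, IndexedTrack, and so on) are hypothetical and only serve as illustration.

```scala
import scala.collection.immutable

object MemberTypes {
  // Concrete: callers must hand over exactly a Vector.
  final case class ConcreteTrack(points: Vector[Double])

  // Abstract: any immutable sequence is accepted as-is.
  final case class AbstractTrack(points: immutable.Seq[Double])

  // Middle ground: promises indexed access without naming an implementation.
  final case class IndexedTrack(points: immutable.IndexedSeq[Double])

  val list = List(1.0, 2.0, 3.0)

  val concrete = ConcreteTrack(list.toVector)    // explicit conversion at the boundary
  val abstractFromList = AbstractTrack(list)     // wraps the existing List, no conversion
  val abstractFromVector = AbstractTrack(Vector(1.0, 2.0, 3.0))
  val indexed = IndexedTrack(Vector(1.0, 2.0, 3.0)) // passing a List here would not compile
}
```

Since Seq equality compares elements rather than implementations, abstractFromList and abstractFromVector above compare equal even though one holds a List and the other a Vector.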

Con: any algorithm with performance guarantees still has to convert to a concrete type. If the user knows this and constructs the case classes using the right data type to begin with, the conversion doesn't cost anything, but the code is still full of calls to toList or toVector, which is ugly.
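As an illustration of that conversion noise, here is a sketch of an algorithm that wants indexed access (a binary search) over a case class that only promises a Seq. Samples and containsSorted are hypothetical names, and the input is assumed to be sorted in ascending order.

```scala
import scala.annotation.tailrec
import scala.collection.immutable

object ConversionNoise {
  final case class Samples(values: immutable.Seq[Int])

  // Binary search needs fast indexed access, but the static type is only Seq,
  // so the method converts defensively before doing any work.
  def containsSorted(samples: Samples, target: Int): Boolean = {
    val v = samples.values.toVector // the kind of call the post describes as ugly

    @tailrec
    def go(lo: Int, hi: Int): Boolean =
      if (lo > hi) false
      else {
        val mid = lo + (hi - lo) / 2
        if (v(mid) == target) true
        else if (v(mid) < target) go(mid + 1, hi)
        else go(lo, mid - 1)
      }

    go(0, v.length - 1)
  }
}
```

The same call works whether the caller built Samples from a List or from a Vector; only the cost of the toVector step differs.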

Also, using Seq results 9 times out of 10 in using scala.Seq instead of scala.collection.immutable.Seq by mistake, and you almost never want a mutable collection in a case class. (I should use a codebase-wide inspection to warn on all usages of scala.Seq...)
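A small sketch of why this matters. It spells out scala.collection.Seq explicitly, so it does not depend on what the unqualified Seq aliases; Loose and Strict are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

object SeqPitfall {
  // scala.collection.Seq is a supertype of both mutable and immutable sequences.
  final case class Loose(items: scala.collection.Seq[Int])

  // The immutable variant rejects mutable collections at compile time.
  final case class Strict(items: scala.collection.immutable.Seq[Int])

  val buffer = ArrayBuffer(1, 2, 3)
  val loose = Loose(buffer)        // compiles: the buffer is shared, not copied

  val hashBefore = loose.hashCode
  buffer += 4                      // mutates the data held by the case class
  val hashAfter = loose.hashCode   // typically differs from hashBefore

  // Strict(buffer) would not compile; an explicit immutable copy is needed.
  val strict = Strict(buffer.toList)
}
```

A case class whose contents and hash code can change after construction is risky to use as a Map key or Set element, which is one reason to spell out the immutable type.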

Which do you prefer?

Thanks,

Daniel Armak

martin odersky

Oct 11, 2015, 10:53:42 AM
to Daniel Armak, scala-user
I go with "premature optimization is the root of all evil", so
abstract wins. And, yes, it would be much better if Seq was immutable.
Hopefully we can fix this in some future iteration of the stdlib.

Btw, choosing collection.Seq as the default was motivated by the fact that
vararg parameters are specified to be Seq's, so

def f(xs: String*): Seq[String] = xs

must be typeable. But Java varargs are passed as arrays, which are
mutable, and the standard wrapper from arrays to sequences also
produces a mutable Seq. All this did not fit with immutable.Seq, so
defining

type Seq[+A] = scala.collection.Seq[A]

in package scala was an easy way out. I think the right approach
should be to support an immutable Array as a separate type. This is
harder to implement, and will most likely require some compiler
support. But then varargs could be immutable arrays, and there would
be no problem to map this to an immutable Seq. So I think that's the
right thing to do.

- Martin

--
Martin Odersky
EPFL