Tips for improving performance

Donald McLean

Apr 13, 2012, 3:30:26 PM
to scala-user
A while back, there was an email about Scala used in a corporate
system that included some observations on specific
practices/constructs to avoid and alternate practices/constructs. I
haven't been able to find it - would someone happen to have a link to
it in the repository?

Are there any other sources of performance related material?

I'm going to be working on profiling/tweaking my app and would like to
have some solid leads for things to look at.

Thank you,

Donald

Daniel Sobral

Apr 13, 2012, 4:01:23 PM
to Donald McLean, scala-user
General guidelines are difficult and subject to contention, but here
are some anyway:

* Never do something like implicit def f(x) = new { ... }; instead,
create a class, and do new XXX.
* In fact, be sure implicit conversions are not doing something
unexpected. For example, "abc".size is much slower than "abc".length.
For this reason, avoid JavaConversions and use JavaConverters instead.
* List has very particular performance characteristics. Use it only if
you understand them well. Otherwise, play it safe and use Vector --
it's often better anyway. You should learn the performance
characteristics of the collections anyway -- they are documented
somewhere on docs.scala-lang.org.
* There are trade-offs between immutable and mutable collections --
understand them and pick the right collection.
* Chaining multiple operations may lead to cache misses due to
excessive object allocation. In such cases, converting first to
Iterator and then back to the expected collection at the end can
improve performance significantly. In many cases, you don't even need
to convert back to a collection.
* Avoid structural types. By the way, doing "val x = new XXX { .... }"
leads to structural types. Doing "val x: XXX = new XXX { ... }" in
these cases avoids them.
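
A minimal sketch of two of the tips above: using a named class for an implicit conversion, and running a chain of operations through an Iterator. The names ChainingSketch, RichCount, toRichCount and twice are made up for illustration, and the pipeline itself is arbitrary. Whether the Iterator version is actually faster depends on the data and the Scala version, so measure before relying on it.

```scala
import scala.language.implicitConversions

object ChainingSketch {
  // Hypothetical named wrapper class. Returning `new { ... }` from an
  // implicit def would give a structural type instead.
  class RichCount(val n: Int) {
    def twice: Int = n * 2
  }
  implicit def toRichCount(n: Int): RichCount = new RichCount(n)

  // Uses the conversion: n is wrapped in RichCount, then twice is called.
  def demo(n: Int): Int = n.twice

  // Strict chain: every step builds a complete intermediate Vector.
  def strict(xs: Vector[Int]): Vector[Int] =
    xs.map(_ + 1).filter(_ % 2 == 0).map(_ * 3)

  // Same pipeline through an Iterator: elements flow through one at a time,
  // and only the final toVector builds a collection.
  def viaIterator(xs: Vector[Int]): Vector[Int] =
    xs.iterator.map(_ + 1).filter(_ % 2 == 0).map(_ * 3).toVector

  // Often there is no need to convert back to a collection at all.
  def total(xs: Vector[Int]): Int =
    xs.iterator.map(_ + 1).filter(_ % 2 == 0).map(_ * 3).sum
}
```

Both strict and viaIterator return the same result; the difference is only in how many intermediate collections are allocated along the way.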

These are cheap tips. Real performance needs two things:

1. Measurements. Use a library that collects timing statistics and
sprinkle calls to it along the processing pipelines of your code.
2. Profiling. Use tools to gather data about what the application is doing.

These two are related, but the former will miss everything you miss,
and may not tell you _why_ something is slow. On the other hand, it
can, and should, be performed on production as well. The latter can
find things you missed or did not know about, but often results in
serious performance degradation.
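
To make the "measurements" idea concrete, here is a rough sketch of a timing helper built on System.nanoTime. It is not any particular library: TimingSketch, timed and report are hypothetical names, the registry is a plain in-memory map, and it is not thread-safe. A real statistics library would do considerably more.

```scala
import scala.collection.mutable

object TimingSketch {
  // Hypothetical in-memory registry: label -> (call count, total nanoseconds).
  // Not thread-safe; illustration only.
  private val stats = mutable.Map.empty[String, (Long, Long)]

  // Runs body, records how long it took under the given label,
  // and returns the result of body unchanged.
  def timed[A](label: String)(body: => A): A = {
    val start = System.nanoTime()
    try body
    finally {
      val elapsed = System.nanoTime() - start
      val (count, total) = stats.getOrElse(label, (0L, 0L))
      stats(label) = (count + 1, total + elapsed)
    }
  }

  // Immutable snapshot of everything recorded so far.
  def report(): Map[String, (Long, Long)] = stats.toMap
}
```

A pipeline stage would then be wrapped as TimingSketch.timed("parse") { ... }, and report() consulted periodically to see where the time goes.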

--
Daniel C. Sobral

I travel to the future all the time.

amulya rattan

Apr 13, 2012, 4:14:22 PM
to Daniel Sobral, Donald McLean, scala-user
This should help:

It's the "Effective Scala" talk by Josh Suereth. His book "Scala in Depth" is a fantastic read on that front too.

~Amulya

Rex Kerr

Apr 13, 2012, 4:19:43 PM
to Daniel Sobral, Donald McLean, scala-user
On Fri, Apr 13, 2012 at 4:01 PM, Daniel Sobral <dcso...@gmail.com> wrote:
* Avoid structural types. By the way, doing "val x = new XXX { .... }"
leads to structural types. Doing "val x: XXX = new XXX { ... }" in
these cases avoids them.

This is #1 on my list of poorly designed Scala features (as opposed to sensibly designed but still buggy features, or lack-of-feature-that-I-wish-was-there).  Scala does the wrong thing ~99% of the time.  (In my code, as a Bayesian estimate.)

I hope that these will someday _not_ be structural types and all the "don't do that" advice will, thankfully, become irrelevant.

  --Rex

Luke Vilnis

Apr 13, 2012, 4:28:17 PM
to amulya rattan, Daniel Sobral, Donald McLean, scala-user
Curious: does  "val x = new XXX { .... }" lead to structural types if you only overload abstract members? So like would:

trait Foo { def bar: Int }
(new Foo { val bar = 12 }).bar

lead to a reflective call?

Anyhow, thanks for the great list - definitely consistent with my limited experience. 

Here's an example of a big performance boost I got from avoiding chained operations: using "xs groupBy { something } mapValues { _.length }" to get word counts is much slower than iterating through the sequence and accumulating the counts in a mutable hashmap - this gets rid of intermediate lists and (depending on implementation) intermediate maps.
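
A sketch of the two word-count approaches described above, assuming the input is a Seq[String] of already-split words. WordCountSketch, chained and accumulated are illustrative names. The chained version uses map with a pattern match rather than mapValues so that it compiles the same way across Scala versions.

```scala
import scala.collection.mutable

object WordCountSketch {
  // Chained version: groupBy builds a Map of word -> list of occurrences,
  // then a second pass turns each list into its length.
  def chained(words: Seq[String]): Map[String, Int] =
    words.groupBy(identity).map { case (word, occurrences) => word -> occurrences.length }

  // Single pass over the input, accumulating counts in a mutable HashMap.
  // No per-word intermediate lists are created.
  def accumulated(words: Seq[String]): Map[String, Int] = {
    val counts = mutable.HashMap.empty[String, Int]
    for (word <- words) counts(word) = counts.getOrElse(word, 0) + 1
    counts.toMap
  }
}
```

The final toMap is only there to hand back an immutable result; if the caller can work with the mutable map directly, that copy can be skipped as well.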

Erik Osheim

Apr 13, 2012, 4:30:10 PM
to Daniel Sobral, Donald McLean, scala-user
I agree with everything Daniel wrote, and I would add a few things:

If you are dealing with a lot of primitives (e.g. huge arrays of
Doubles or Ints) you may want to avoid situations which will cause each
element to be boxed/unboxed when accessed/stored. Right now in 2.9 that
means that you may want to try to store these in Array[Int] rather than
List[Int] or Vector[Int] (Currently Array is the only collection that
doesn't cause boxing of primitives).

Similarly, if you find yourself operating on millions of values, you
may want to start paying attention to how many objects you create,
since at this point the allocation/gc overhead may start to be a
problem. One example of this would be to represent an array of points
as an array of X coordinates and an array of Y coordinates (rather than
an array of tuples) to avoid allocating the extra objects.
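
As a sketch of the points example, here is the same centroid computation over an array of tuples and over two parallel coordinate arrays. PointsSketch and both method names are made up for illustration. With tuples, every point is a separate object on the heap; with parallel arrays there are just two arrays of primitives.

```scala
object PointsSketch {
  // Array of tuples: one Tuple2 object allocated per point.
  def centroidOfTuples(points: Array[(Double, Double)]): (Double, Double) = {
    var sumX = 0.0
    var sumY = 0.0
    var i = 0
    while (i < points.length) {
      sumX += points(i)._1
      sumY += points(i)._2
      i += 1
    }
    (sumX / points.length, sumY / points.length)
  }

  // Parallel arrays: X and Y coordinates live in two primitive arrays,
  // so no per-point objects exist at all.
  def centroidOfArrays(xs: Array[Double], ys: Array[Double]): (Double, Double) = {
    require(xs.length == ys.length, "coordinate arrays must have equal length")
    var sumX = 0.0
    var sumY = 0.0
    var i = 0
    while (i < xs.length) {
      sumX += xs(i)
      sumY += ys(i)
      i += 1
    }
    (sumX / xs.length, sumY / ys.length)
  }
}
```

The price of the second layout is that the two arrays have to be kept in sync by hand, which is why it is usually only worth it for very large data sets.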

Finally, there is a class of situations involving collections and
loops where using higher-level constructs will impose a penalty. For
instance, while loops are currently still faster than foreach/for in
Scala (although hopefully not for long). It's hard to enumerate these,
but you can discover them through testing (and reading the source in
some cases).
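
For illustration, here is the same sum over an Array[Int] written with foreach and with a while loop. LoopSketch and its method names are hypothetical. Which one is faster, and by how much, depends on the Scala version and the JIT, so this only shows the shape of the rewrite, not a guaranteed speedup.

```scala
object LoopSketch {
  // Higher-level version: foreach takes a closure that updates total.
  def sumForeach(xs: Array[Int]): Long = {
    var total = 0L
    xs.foreach(x => total += x)
    total
  }

  // Hand-written while loop over the array indices: no closure involved,
  // just primitive reads from the array.
  def sumWhile(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }
}
```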

In general I don't think these things are a problem for most people,
but they do come up.

-- Erik

HamsterofDeath

Apr 13, 2012, 5:26:34 PM
to scala...@googlegroups.com
really? i always assumed it would only be a structural type if i add a method, for example:

val x = new JLabel {def notUsuallyThere = "hello"}
because if i don't - there is no reason for it to be a structural type

Daniel Sobral

Apr 13, 2012, 5:51:26 PM
to Luke Vilnis, amulya rattan, Donald McLean, scala-user
On Fri, Apr 13, 2012 at 17:28, Luke Vilnis <lvi...@gmail.com> wrote:
> Curious: does  "val x = new XXX { .... }" lead to structural types if you
> only overload abstract members? So like would:
>
> trait Foo { def bar: Int }
> (new Foo { val bar = 12 }).bar
>
> lead to a reflective call?

Probably not -- I'd check with javap to make sure, though. The problem
happens when you add something that is not an override, and that leads
to structural types.

Rex Kerr

Apr 13, 2012, 5:52:39 PM
to HamsterofDeath, scala...@googlegroups.com
On Fri, Apr 13, 2012 at 5:26 PM, HamsterofDeath <h-s...@gmx.de> wrote:
On 13.04.2012 22:19, Rex Kerr wrote:
On Fri, Apr 13, 2012 at 4:01 PM, Daniel Sobral <dcso...@gmail.com> wrote:
* Avoid structural types. By the way, doing "val x = new XXX { .... }"
leads to structural types. Doing "val x: XXX = new XXX { ... }" in
these cases avoids them.

This is #1 on my list of poorly designed Scala features (as opposed to sensibly designed but still buggy features, or lack-of-feature-that-I-wish-was-there).  Scala does the wrong thing ~99% of the time.  (In my code, as a Bayesian estimate.)

I hope that these will someday _not_ be structural types and all the "don't do that" advice will, thankfully, become irrelevant.

  --Rex

really? i always assumed it would only be a structural type if i add a method, for example:

This is entirely true, but it's not exactly helpful to make it a structural type instead of an anonymous class (which can then match a structural type if you ask for one).

When doing anything nontrivial, it's really easy to add a method and not mark it private.

  --Rex

Daniel Sobral

Apr 13, 2012, 6:21:05 PM
to Rex Kerr, HamsterofDeath, scala...@googlegroups.com

Those were the words I was searching for. :-)

Rodrigo Cano

Apr 13, 2012, 6:30:09 PM
to Daniel Sobral, Rex Kerr, HamsterofDeath, scala...@googlegroups.com
When doing anything nontrivial, it's really easy to add a method and not
mark it private.

Ok, but I expected that only the "structural methods" (that is, all the non-inherited ones) are dispatched via reflection. So if I extend something and add a non-private method, for example, and I only use that instance through its parent definition, then there should be no performance penalty, right? After all, those methods need not use reflection, right?

Rex Kerr

Apr 13, 2012, 7:07:58 PM
to Rodrigo Cano, Daniel Sobral, HamsterofDeath, scala...@googlegroups.com
On Fri, Apr 13, 2012 at 6:30 PM, Rodrigo Cano <ioni...@gmail.com> wrote:
When doing anything nontrivial, it's really easy to add a method and not
mark it private.

Ok, but I expected that only the "structural methods" (that is, all the non-inherited ones) are dispatched via reflection. So if I extend something and add a non-private method, for example, and I only use that instance through its parent definition, then there should be no performance penalty, right? After all, those methods need not use reflection, right?

This is also true as far as I know (I don't recall a counterexample).  But it is easy to do something logically equivalent to this:

class UnfortunateReflection {
  class X { def x = 1 }
  val y = new X { def w = 2 }
  def sad = y.x * y.w
}

That is, you have a new instance with a bit of handy new capability which you then use.

It doesn't impact _every_ use case, but it's an easy one to run into.
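
Two possible ways around the UnfortunateReflection pattern, sketched under the assumption that the extra member is either genuinely needed by callers or only used internally. NamedSubclassSketch, XWithW, NoReflection and AscribedParent are hypothetical names; only X, x, w and y come from the example above.

```scala
object NamedSubclassSketch {
  class X { def x = 1 }

  // Option 1: give the extended class a name. w is then an ordinary member
  // of XWithW, so y.w is a normal method call with no structural type.
  class XWithW extends X { def w = 2 }

  class NoReflection {
    val y = new XWithW
    def happy: Int = y.x * y.w
  }

  // Option 2: keep the anonymous class, but ascribe the parent type and keep
  // the helper private, so nothing outside ever calls a non-inherited member.
  class AscribedParent {
    val y: X = new X {
      private def w = 2
      override def x = 1 + w
    }
    def fine: Int = y.x
  }
}
```

As suggested earlier in the thread, javap on the compiled classes is the way to confirm what a given compiler version actually emits.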

  --Rex
