shaving nanos by the billion

Paul Phillips

May 8, 2011, 3:13:18 AM
to Odersky Martin, scala-i...@googlegroups.com
I took 30+% off all startup times tonight.

% time rcscalac -J-d32 -d /tmp/out /scala/trunk/src/library/scala/Immutable.scala
real 0m2.515s

% time pscalac -J-d32 -d /tmp/out /scala/trunk/src/library/scala/Immutable.scala
real 0m1.584s

% time rcscala -J-d32 -nc -e '5+5'
real 0m3.161s

% time pscala -J-d32 -nc -e '5+5'
real 0m2.278s
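
As a rough check on the four timings above, the saving can be worked out directly. This is only a sketch: the object and method names are made up for illustration, and each figure comes from a single run, so treat the result as approximate.

```scala
object StartupGain {
  // Percentage of wall-clock time removed, relative to the "before" time.
  def percentSaved(before: Double, after: Double): Double =
    (before - after) / before * 100.0

  // "real" times copied from the runs above, in seconds.
  val scalacSaved: Double = percentSaved(2.515, 1.584) // rcscalac vs pscalac
  val scalaSaved: Double  = percentSaved(3.161, 2.278) // rcscala vs pscala

  def main(args: Array[String]): Unit = {
    println(f"scalac: $scalacSaved%.1f%% saved")
    println(f"scala:  $scalaSaved%.1f%% saved")
  }
}
```

That comes to roughly 37% for the scalac run and roughly 28% for the scala -e run.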

Hey scala fans, these kinds of gains have been dangling from the fruit tree for years, waiting for the arrival of someone with a clue how to wield a profiler. No such luck, and eventually a blind squirrel found his way to one of the nuts. If you were holding back because you thought all the easy coconuts had already been picked, now you know better. I bet this was the first nut someone who knew something about profiling would have gone for. I'm sure I'll be hammering on it until only brown powder and coconut residue remain.

http://lampsvn.epfl.ch/trac/scala/changeset/24909

Well I missed 2.9, but now we all have an advance reason to move to 2.9.1.

Sébastien Bocq

May 8, 2011, 4:41:42 AM
to scala-i...@googlegroups.com, Paul Phillips

2011/5/8 Paul Phillips <pa...@improving.org>


Nice! Was there really a significant overhead for drop and dropRight compared to calling substring directly? If I look at how StringOps is implemented (2.8) there are only a few levels of indirection between the two.

Thanks,
Sébastien

Francois

May 8, 2011, 4:43:36 AM
to scala-i...@googlegroups.com
On 08/05/2011 09:13, Paul Phillips wrote:
> I took 30+% off all startup times tonight.
>
> [...]

Wow, that's just amazing! Congrats.

> http://lampsvn.epfl.ch/trac/scala/changeset/24909
>
> Well I missed 2.9, but now we all have an advance reason to move to 2.9.1.

That's too bad, but you're right, now we will all be waiting for 2.9.1
with eagerness.


(that's just impressive...)

--
Francois ARMAND
http://fanf42.blogspot.com
http://www.normation.com

Ruediger Keller

May 8, 2011, 4:50:14 AM
to scala-i...@googlegroups.com, Odersky Martin
Paul, very nice!

I thought you had written before that you were using a profiler to
find spots for optimization. But perhaps then you were profiling the
compiler, instead of the "startup infrastructure". Just wondering.

Regarding your changeset's comment, it makes me want to have more
optimizations in Scala. ;-)

> Frequent guest stars in the parade of slowness were: using Lists and ListBuffers
> when any amount of random access is needed,

Ok, not much that can be done about this besides changing the code.
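
To make the List point concrete, here is a minimal sketch. It is not code from the changeset; the object and method names are hypothetical. Indexing into a List walks the cons cells from the head, so an index-driven loop is quadratic, while the same loop over an Array is linear.

```scala
object RandomAccessSketch {
  // xs(i) on a List is O(i), and xs.length is O(n), so this loop is O(n^2).
  def sumListByIndex(xs: List[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }

  // Array length and indexing are O(1), so the same loop is O(n).
  def sumArrayByIndex(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }
}
```

Both return the same sum; only the cost differs, and it only shows up once the collection is large or the method is called very often.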

> using Strings as if one shouldn't have to supply 80 characters of .substring
> noise to drop a character here and there, imagining that code can be reused in any way
> shape or form without a savage slowness burn being unleashed upon you
> and everything you have ever loved,

Hmm, perhaps some String specific optimizations to drop and take, etc.
would help, but I see you did just that by optimizing the underlying
slice. What remains is optimizing away the implicit conversion, I
guess.

> String.format,

Hmm, probably not the fastest method to create a String.
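
For illustration, a small sketch of the two styles (names are hypothetical, not from the compiler sources). String.format has to parse the format string on every call and takes its arguments as boxed objects, whereas plain concatenation compiles down to StringBuilder-style appends.

```scala
import java.util.Locale

object FormatSketch {
  // Parses the pattern on each call; the Int must be boxed to fit Object... varargs.
  // Locale.ROOT is passed only to keep the digits locale-independent in this sketch.
  def viaFormat(name: String, n: Int): String =
    String.format(Locale.ROOT, "%s has %d entries", name, Int.box(n))

  // No pattern parsing and no boxing of n.
  def viaConcat(name: String, n: Int): String =
    name + " has " + n + " entries"
}
```

The two produce the same text here; which one matters for speed depends entirely on how hot the call site is.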

> methods which return tuples,

I remember you writing a patch for optimizing this a lot, also
removing the secretly created tuple field when used in a constructor.
AFAIR Martin was against it, so it never went in. Perhaps someone
should convince Martin, that we need this. ;-)

> and any method written with appealing scala features which
> turns out to be called a few orders of magnitude more often than the
> author probably supposed.

I have a feeling that implicit conversions (and the allocations they
cause) might have a rather big part in this. I remember someone
posting an analysis of the optimizer and its (rather severe)
shortcomings. If I remember correctly implicit conversions are
(almost?) never optimized. Is there currently anyone maintaining the
optimizer? Any chances of some enhancements coming in after 2.9?

Btw. up to now I thought the Scala team was already profiling the
compiler and time-critical infrastructure, but if you are not, I can
take a stab at it. Perhaps I can find something...

Ah, and also when I began rewriting the Scaladoc HTML generator, I
noticed that it uses ++ for String concatenation. That seems to be a
whole lot slower than plain String concatenation with +.
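
A sketch of what I mean, with hypothetical names (this is not the Scaladoc code). My assumption is that on the 2.8/2.9-era library, ++ on a String goes through the implicit StringOps wrapper and the generic collection builder, while + is ordinary JVM string concatenation; later library versions may well special-case it.

```scala
object ConcatSketch {
  // ++ comes from the collections side of String (via StringOps).
  def viaPlusPlus(parts: Seq[String]): String =
    parts.foldLeft("")((acc, p) => acc ++ p)

  // + is plain java.lang.String concatenation.
  def viaPlus(parts: Seq[String]): String =
    parts.foldLeft("")((acc, p) => acc + p)

  // For many pieces, one explicit builder avoids the intermediate Strings entirely.
  def viaBuilder(parts: Seq[String]): String = {
    val sb = new java.lang.StringBuilder
    parts.foreach(p => sb.append(p))
    sb.toString
  }
}
```

All three give the same result, so switching between them is a mechanical change.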

Regards,
Rüdiger

PS: Which profiler do you use? I'm using VisualVM at home, because I
don't know a better free alternative. Is there?


Paul Phillips

May 8, 2011, 11:40:04 AM
to Sébastien Bocq, scala-i...@googlegroups.com
On 5/8/11 1:41 AM, Sébastien Bocq wrote:
> Nice! Was there really a significant overhead for drop and dropRight
> compared to calling substring directly?

Anything I touched in that patch is a direct result of repeatedly going
after the biggest lump under the carpet.

> If I look at how StringOps is implemented (2.8) there are only a few
> levels of indirection between the two.

It looks like less than it is. For one thing all the chars are boxed.
Unnecessary boxing overhead was probably the biggest factor overall.
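
A deliberately exaggerated sketch of the difference, not the actual StringOps code from 2.8 (the names here are made up). The generic version treats the string as a collection of elements, so every Char ends up boxed as a java.lang.Character; the direct version never looks at individual chars.

```scala
object DropSketch {
  // Generic route: a List[Char] stores each character as a boxed object,
  // and the String is then rebuilt one element at a time.
  def genericDrop(s: String, n: Int): String =
    s.toList.drop(n).mkString

  // Direct route: one bounds clamp, one substring call, no per-char work.
  def directDrop(s: String, n: Int): String =
    s.substring(math.min(math.max(n, 0), s.length))
}
```

Both agree on the result, including when n is larger than the string; the direct one simply skips the boxing.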

Paul Phillips

May 8, 2011, 12:44:53 PM
to scala-i...@googlegroups.com, Ruediger Keller
On 5/8/11 1:50 AM, Ruediger Keller wrote:
> I thought you had written before that you were using a profiler to
> find spots for optimization. But perhaps then you were profiling the
> compiler, instead of the "startup infrastructure". Just wondering.

I've profiled plenty - whole days have vanished into its gaping maw. I
just haven't been very good at translating that effort into anything
useful. I didn't set out to optimize startup last night; all I did was
stop trying to profile anything interesting and focus on profiling the
compilation of one program, which was:

trait Immutable

> PS: Which profiler do you use? I'm using VisualVM at home, because I
> don't know a better free alternative. Is there?

Free, no. I mean, I don't know; I've tried them all at various times,
but none of them compared to YourKit. Some people find the NetBeans
profiler useful.
