I'd suggest trying to get rid of the Option objects altogether (either encode using null or use a separate bool flag).
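To make the suggestion concrete, here is a minimal sketch of the two encodings side by side. The object and method names are hypothetical and this is not the matcher's actual output; it only contrasts an Option-based chain, which allocates a Some per successful step, with a version that tracks success in a plain local boolean.

```scala
object OptionFreeMatch {
  // Option-based encoding: each successful step allocates a Some.
  def headViaOption(xs: List[Int]): Option[Int] =
    if (xs.isInstanceOf[::[Int]]) Some(xs.asInstanceOf[::[Int]].head) else None

  def sumHeadsOption(a: List[Int], b: List[Int]): Int = {
    val r = headViaOption(a).flatMap(x => headViaOption(b).map(y => x + y))
    if (r.isEmpty) -1 else r.get
  }

  // Flag-based encoding: success lives in a local boolean, no wrapper objects.
  def sumHeadsFlag(a: List[Int], b: List[Int]): Int = {
    var matched = false
    var result = 0
    if (a.isInstanceOf[::[Int]]) {
      val x = a.asInstanceOf[::[Int]].head
      if (b.isInstanceOf[::[Int]]) {
        val y = b.asInstanceOf[::[Int]].head
        result = x + y
        matched = true
      }
    }
    if (matched) result else -1
  }
}
```

Both methods return the same values; the second one simply never builds an Option on the way.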
Indeed, and type tests aren't that cheap, especially when compared to
a stack boolean.
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
val o47: Option[AnyVal] = if
(x1.isInstanceOf[collection.immutable.::[Int]])
if (x1.isInstanceOf[collection.immutable.::[Int]])
if (x1.isInstanceOf[collection.immutable.::[Int]])
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
if (x1.isInstanceOf[collection.immutable.::[Int]]) {
val o47: Option[AnyVal] = if
(x1.isInstanceOf[collection.immutable.::[Int]])
if (x1.isInstanceOf[collection.immutable.::[Int]])
if (x1.isInstanceOf[collection.immutable.::[Int]])
At least record the result of every type test which must be executed.
It'll get a little fuzzier when you get into non-mandatory code paths;
but you could allocate almost-free "lazy val booleans" in the form of
tri-state ints.
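A rough sketch of what a tri-state int could look like, assuming the encoding 0 = not tested yet, 1 = tested and true, 2 = tested and false (the encoding, the names and the testsRun counter are all made up for illustration). The type test runs at most once, and only if some code path actually needs it.

```scala
object TriStateTypeTest {
  // Counts how often the real type test runs; for demonstration only.
  var testsRun = 0

  private def isCons(x: List[Int]): Boolean = {
    testsRun += 1
    x.isInstanceOf[::[Int]]
  }

  def describe(x1: List[Int], wantHead: Boolean, wantLast: Boolean): String = {
    // 0 = not tested yet, 1 = tested and true, 2 = tested and false
    var state = 0
    var out = ""
    if (wantHead) {
      if (state == 0) state = if (isCons(x1)) 1 else 2
      if (state == 1) out += "head=" + x1.asInstanceOf[::[Int]].head
    }
    if (wantLast) {
      // Reuses the recorded answer if the first branch already ran the test.
      if (state == 0) state = if (isCons(x1)) 1 else 2
      if (state == 1) out += " last=" + x1.asInstanceOf[::[Int]].last
    }
    out
  }
}
```

When neither branch is taken the test is never executed, which is the "lazy" part; a mandatory test could just be stored in an ordinary boolean instead.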
Are these numbers also representative for “usual code”? A performance regression by 15-25% looks scary ...
Say, do these numbers include the suggested optimization of converting
Option tests into null tests?
--
Daniel C. Sobral
I travel to the future all the time.
They're fast, but no matter how fast they are they can't be faster
than reading a boolean on the stack without introducing some
additional factor. The use of $tag was trying to avoid the instanceof
tests entirely, and it brought overhead. Here the first instanceof
test is being performed, only successive identical ones are being
avoided, and the additional overhead is in the neighborhood of zero.
My guess is that the jvm is already caching and reusing the result of
the first instanceof test, and their cache is closer to the metal than
ours is.
on the other hand, it could be that the JVM is just good at instanceof tests :-)
I've also implemented tri-state ints for lazy bools for caching repeated isInstanceOf tests:
[strap.lib.timer: 1:59.914 sec] 120s ==> +19%
[quick.lib.timer: 1:41.313 sec] 101s
[strap.comp.timer: 1:49.832 sec] 110s ==> +35%
[quick.comp.timer: 1:21.208 sec] 81s
so that's a bit of a bummer... the overheads went up since last time i measured (before this optimization)
Could you test it with a list of lists? Maybe the JVM inlines the list into the pattern matcher, thus completely eliminating the tests.
Is the result of asInstanceOf also cached? if not the jvm will still need to do instance checks as part of asInstanceOf.
In the end i don't think we'll reach the performance baseline without generating jumps,
so maybe now is a good opportunity to think about how we can get there in a clean, high-level way.
Is the result of asInstanceOf also cached? if not the jvm will still need to do instance checks as part of asInstanceOf.

no, I only cached the isInstanceOf test, so yes, this could very well explain why the "optimization" didn't help
I'm not sure caching both would help, though, since other evidence suggests the JVM is simply good at this kind of caching/profiling
we could try caching the asInstanceOf anyway, but that brings me to your next point

In the end i don't think we'll reach the performance baseline without generating jumps,
so maybe now is a good opportunity to think about how we can get there in a clean, high-level way.

The isInstanceOf caching experiment was just that, and, after sleeping on it for a night, I think I'll punt on it for now.
I agree we need a more sophisticated analysis that generates clean code -- preferably without jumps, though.
Cheers
-- Martin
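For reference, caching the asInstanceOf result as well (the variant mentioned above but not tried) might look roughly like this. It is only a sketch with hypothetical names: the cast value is computed once, null stands for "the test failed", and later uses need just a null check. Whether this beats what the JVM already does is exactly the open question in this thread.

```scala
object CachedCast {
  def headPlusLast(x1: List[Int]): Int = {
    // One type test and one cast; null encodes "test failed".
    val cons: ::[Int] =
      if (x1.isInstanceOf[::[Int]]) x1.asInstanceOf[::[Int]] else null
    var sum = 0
    // Later uses only compare against null, no further instanceof or checkcast.
    if (cons ne null) sum += cons.head
    if (cons ne null) sum += cons.last
    sum
  }
}
```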
So, it was the Scanner that crawled, right?
Yes, emitting switches is essential. You can simply leave the match expression in the trees for them; the backend will take care of the rest.
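As a small illustration of the kind of match that should end up as a switch, here is a scanner-style sketch (the object name and token names are invented). The `@switch` annotation from `scala.annotation` asks the compiler to check that the match on literal constants is compiled to a tableswitch or lookupswitch rather than a chain of tests.

```scala
import scala.annotation.switch

object SwitchEmission {
  // A match over Char literals, annotated so the compiler complains
  // if it cannot be emitted as a JVM switch instruction.
  def tokenKind(ch: Char): String = (ch: @switch) match {
    case '(' => "LPAREN"
    case ')' => "RPAREN"
    case '{' => "LBRACE"
    case '}' => "RBRACE"
    case _   => "OTHER"
  }
}
```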