Pattern matching inefficiencies

391 views
Skip to first unread message

martin odersky

unread,
Oct 10, 2012, 9:15:25 AM10/10/12
to scala-internals
I looked at the bytecode generated by the pattern matcher, and discovere that matching against Nil yields the following bytecode:
{{
   36: getstatic #63; //Field scala/collection/immutable/Nil$.MODULE$:Lscala/collection/immutable/Nil$;
   39: aload 5
   41: astore 7
   43: dup
   44: ifnonnull 56
   47: pop
   48: aload 7
   50: ifnull 64
   53: goto 74
   56: aload 7
   58: invokevirtual #67; //Method java/lang/Object.equals:(Ljava/lang/Object;)Z
   61: ifeq 74
}}
That's quite a load! What happens is that the pattern matcher expands this to 

  Nil == scrutinee

and then uses the usual, involved == logic.

Question 1: It looks like the prelude to the equals call contains two null pointer checks. The first is redundant because we test an object, which should be always non-null (right?) The second is also redundant because equals should handle the case of a null argument anyway (right?). Can we get rid of the tests?

 Question 2: It seems attractive in the interest of performance to change the spec of pattern matching for case objects: these could use an instanceof check instead of an equals operation. That would speed up tests against Nil considerably. On the other hand, a Vector() would not longer be matched by a Nil. But is that even desirable?

Cheers

 - Martin


Paul Phillips

unread,
Oct 10, 2012, 10:03:01 AM10/10/12
to scala-i...@googlegroups.com
On Wed, Oct 10, 2012 at 6:15 AM, martin odersky <martin....@epfl.ch> wrote:
> Question 1: It looks like the prelude to the equals call contains two null
> pointer checks. The first is redundant because we test an object, which
> should be always non-null (right?) The second is also redundant because
> equals should handle the case of a null argument anyway (right?). Can we get
> rid of the tests?

I think you're misreading it. The logic being applied is:

if (l eq null) r eq null else l.equals(r);

We can't lose the null check on the left because:

var x: Nil.type = null
x match { ...

And the null check on the right is only performed if the left hand side is null.

> Question 2: It seems attractive in the interest of performance to change
> the spec of pattern matching for case objects: these could use an instanceof
> check instead of an equals operation. That would speed up tests against Nil
> considerably. On the other hand, a Vector() would not longer be matched by a
> Nil. But is that even desirable?

I'm not sure it's desirable for Vector() to match Nil either, but it
would be such a dangerously breaking change, we'd be looking at the
far future. Any change which affects pattern matcher semantics is
deadly because there will be no indication in the affected code - it
will just stop/start matching.

Paul Phillips

unread,
Oct 10, 2012, 10:05:42 AM10/10/12
to scala-i...@googlegroups.com
Oh, at some point we swapped from 'scrutinee == pattern' to 'pattern =
scrutinee', right? If Nil is on the left we must have. In that case I
suppose it can't be null.

Paul Phillips

unread,
Oct 10, 2012, 10:06:40 AM10/10/12
to scala-i...@googlegroups.com
But can we distinguish the real Nil from this:

val NilLike: Nil.type = null

foo match { case NillLike => ... }

martin odersky

unread,
Oct 10, 2012, 10:49:26 AM10/10/12
to scala-i...@googlegroups.com
yes, exactly. So it seems the null checks can go. 

Regarding changing the semantics of pattern matching. Yes, I know it's very hard, so fixing the general case might be unboundedly far out. We can special-treat some common patterns however. If the scrutinee is a List and the pattern is a Nil then we know that we can get by with an instanceof or eq comparison. Same for None in Option. Same again for any case object that gets the synthesized equals instead of a user-defined one.
Because these cases should be quite common it's probably worth to do it.

Cheers

 - Martin

Adriaan Moors

unread,
Oct 10, 2012, 12:11:38 PM10/10/12
to scala-i...@googlegroups.com
ok, so to make sure I understood correctly:
"when a constant pattern is statically known to have a synthetic equals method that simply calls `eq`, do not perform a null check and call `eq` directly"


cheers
adriaan

Jeff Olson

unread,
Oct 10, 2012, 12:14:56 PM10/10/12
to scala-i...@googlegroups.com


On Wednesday, October 10, 2012 8:15:47 AM UTC-5, martin wrote:
 Question 2: It seems attractive in the interest of performance to change the spec of pattern matching for case objects: these could use an instanceof check instead of an equals operation. That would speed up tests against Nil considerably. On the other hand, a Vector() would not longer be matched by a Nil. But is that even desirable?



A year or so ago, I advocated for changing the behavior of pattern matching on case objects: https://groups.google.com/d/topic/scala-language/KAOFOMHxUv8/discussion. At the time, both you and Paul protested. I'm still very much in favor of the change, both for efficiency and soundness. A empty Vector() should never match Nil.

-Jeff

martin odersky

unread,
Oct 10, 2012, 12:38:49 PM10/10/12
to scala-i...@googlegroups.com
On Wed, Oct 10, 2012 at 6:11 PM, Adriaan Moors <adriaa...@typesafe.com> wrote:
ok, so to make sure I understood correctly:
"when a constant pattern is statically known to have a synthetic equals method that simply calls `eq`, do not perform a null check and call `eq` directly"


Yes, or instanceof. - Martin
 
cheers
adriaan


On Wed, Oct 10, 2012 at 7:49 AM, martin odersky <martin....@epfl.ch> wrote:


On Wed, Oct 10, 2012 at 4:06 PM, Paul Phillips <pa...@improving.org> wrote:
But can we distinguish the real Nil from this:

val NilLike: Nil.type = null

foo match { case NillLike => ... }

yes, exactly. So it seems the null checks can go. 

Regarding changing the semantics of pattern matching. Yes, I know it's very hard, so fixing the general case might be unboundedly far out. We can special-treat some common patterns however. If the scrutinee is a List and the pattern is a Nil then we know that we can get by with an instanceof or eq comparison. Same for None in Option. Same again for any case object that gets the synthesized equals instead of a user-defined one.
Because these cases should be quite common it's probably worth to do it.

Cheers

 - Martin





--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Paul Phillips

unread,
Oct 10, 2012, 5:59:45 PM10/10/12
to scala-i...@googlegroups.com


On Wed, Oct 10, 2012 at 9:38 AM, martin odersky <martin....@epfl.ch> wrote:
Yes, or instanceof. - Martin

I am betting instanceof will be faster, maybe even a lot faster.  For sure we had best measure both.

Rex Kerr

unread,
Oct 10, 2012, 7:13:34 PM10/10/12
to scala-i...@googlegroups.com
At least in the collections benchmark I posted for the dynamic dispatch thread, instanceof (i.e. isInstanceOf) is much faster than eq X where X is a non-final val in an object (though as usual, optimization thresholds prevent this from being an entirely clear-cut story):

"_ eq F"
             size  foreach time (ns) / iterator time (ns)
             ----  --------------------------------------
T List          2  Filter:   7.4/9.0 
T List         50  Filter:  13.6/7.9  
T List       1250  Filter:  13.3/10.7  
T List      31250  Filter:  13.5/11.0  

"_.isInstanceOf[False]"
             size  foreach time (ns) / iterator time (ns)
             ----  --------------------------------------
T List          2  Filter:   4.9/6.4
T List         50  Filter:   3.8/9.2
T List       1250  Filter:   7.9/9.4 
T List      31250  Filter:   8.0/9.6

isInstanceOf is a win in 7 of 8 cases (and favors foreach over iterator more strongly in filter).

Making a local val (to avoid object dereferencing) does not change anything (data not shown).

  --Rex

P.S. This takes place in the context of T,F instances (one each) with True,False <: Bool hierarchy.  See
  http://pastebin.com/NXhNkEx9
for the code; I just changed lines 196 and 197 to use isInstanceOf instead of eq.

Adriaan Moors

unread,
Oct 10, 2012, 8:38:47 PM10/10/12
to scala-i...@googlegroups.com
interesting, thanks! -- I'll give this a try and benchmark when I start work on this ticket

Paolo G. Giarrusso

unread,
Oct 10, 2012, 10:00:04 PM10/10/12
to scala-i...@googlegroups.com
Do you imply that isInstanceOf is faster in general, or just because it's isInstanceOf[C] where C has no subclasses?
Also, isn't eq just a pointer comparison? How can the comparison in isInstanceOf be faster? 

Paul Phillips

unread,
Oct 11, 2012, 10:11:58 AM10/11/12
to scala-i...@googlegroups.com
On Wed, Oct 10, 2012 at 7:00 PM, Paolo G. Giarrusso <p.gia...@gmail.com> wrote:
Do you imply that isInstanceOf is faster in general, or just because it's isInstanceOf[C] where C has no subclasses?

I imply it's in general faster than intuition predicts.
 
Also, isn't eq just a pointer comparison? How can the comparison in isInstanceOf be faster? 

A pointer comparison to what? First you have to go get it, then you can compare it.

Josh Suereth

unread,
Oct 11, 2012, 10:14:44 AM10/11/12
to scala-i...@googlegroups.com
The JVM made instanceof FAST, thanks to typical Java code.  Faster than it has any right to be, which is fun for all sorts of compiler tricks.

Suggested new tagline for JVM: This ain't your grandpappy's VM.

Rex Kerr

unread,
Oct 11, 2012, 11:01:49 AM10/11/12
to scala-i...@googlegroups.com

I'm not set up to test it in general, and given how fickle JVMs are  I don't want to speculate at this point.

One reason it can be faster is that most JVMs run using CompressedOops.  This does have a performance penalty, including potentially for things like reference equality (have not tested how much is intrinsic and how much is just a consequence of impacting constants set for optimization).

But after further testing, I think in this case the answer may be entirely "it depends".  I was using 1.6.0_31 before.  If we look in more detail, we see all sorts of highly reproducible (timings accurate to +- 0.2 ns or so) yet essentially unpredictable behavior:

"_ eq F"  with JDK 1.6.0_31 with -J-XX:-UseCompressedOops
T List          2  Map:   7.3/6.8     Filter:   6.9/6.8     Fold:   6.3/5.6  
T List         50  Map:   8.5/9.1     Filter:   5.1/9.0     Fold:   7.4/7.4  
T List       1250  Map:  10.1/9.8     Filter:   9.2/10.2    Fold:   8.3/7.5  
T List      31250  Map:  10.3/10.1    Filter:   9.4/10.4    Fold:   8.3/7.6  

"_ eq F" with JDK 1.6.0_31 with -J-XX:+UseCompressedOops
T List          2  Map:   5.8/7.5     Filter:   7.4/9.0     Fold:   4.2/5.1  
T List         50  Map:   6.9/9.6     Filter:  13.6/9.5     Fold:   5.5/6.4  
T List       1250  Map:   8.5/11.8    Filter:  13.2/10.7    Fold:   6.5/7.3  
T List      31250  Map:   8.7/12.0    Filter:  13.4/10.9    Fold:   6.6/7.4  

"_.isInstanceOf[False]" with JDK 1.6.0_31
T List          2  Map:   6.0/7.7     Filter:   5.0/6.2     Fold:   5.4/5.0  
T List         50  Map:   7.0/11.0    Filter:   7.4/8.9     Fold:   7.1/6.9  
T List       1250  Map:   8.6/11.7    Filter:   7.8/9.0     Fold:   7.8/7.2  
T List      31250  Map:   8.8/11.9    Filter:   8.0/9.2     Fold:   7.9/7.3  

(Last one: look, the magic 3.8 ns/op 50-item list filter is gone!  But isInstanceOf still wins 7/8.)

Unfortunately, I'm not set up to spit out the assembly generated by the inliner, so I'm not entirely sure what is going on in detail.  Also, this is 1.6.0_31; 1.7.0_07 is better with optimizing when using UseCompressedOops and when using eq:

"_ eq F" with JDK 1.7.0_07 (x64), -UseCompressedOops
T List          2  Map:   5.7/6.2     Filter:   6.1/5.6     Fold:   4.3/4.9  
T List         50  Map:   7.4/8.5     Filter:   7.2/7.6     Fold:   5.7/7.1  
T List       1250  Map:   8.6/9.2     Filter:   9.9/8.9     Fold:   6.8/7.3  
T List      31250  Map:   8.7/9.4     Filter:  10.0/9.0     Fold:   6.9/7.3  

"_ eq F" with JDK 1.7.0_07 (x64), +UseCompressedOops (server mode)
T List          2  Map:   5.6/5.9     Filter:   6.1/5.5     Fold:   4.2/4.5  
T List         50  Map:   7.2/7.8     Filter:   5.4/5.1     Fold:   5.9/5.6  
T List       1250  Map:   8.5/9.0     Filter:   9.2/8.8     Fold:   7.0/6.9  
T List      31250  Map:   8.7/9.2     Filter:   9.5/9.0     Fold:   7.0/7.0

"_.isInstanceOf[False]" with JDK 1.7.0_07 (x64)
T List          2  Map:   5.6/6.2     Filter:   4.7/5.6     Fold:   4.2/4.9  
T List         50  Map:   7.5/8.7     Filter:   3.9/8.3     Fold:   6.0/7.3  
T List       1250  Map:   8.5/9.2     Filter:   8.3/9.4     Fold:   7.0/7.2  
T List      31250  Map:   8.7/9.4     Filter:   8.6/9.5     Fold:   7.0/7.3  

(Last one: look, the magic 3.9 ns/op filter is back again!)

Now it's only a win in 4 of 8 cases, a statistical tie in one case, and a loss in 3/8.  (It racks up its wins on the faster foreach side of things, though.)

This kind of situation--where what seems like an answer in favor of one thing can flip and argue for the opposite in a slightly different context--is unfortunately very common in JVMs where the key is getting the JVM to supply the most aggressive optimizations which depends on a myriad of factors and arbitrary thresholds set in the JVM.  (Optimizing for CPUs in general instead of for one architecture is also quite tricky.)

  --Rex

Adriaan Moors

unread,
Jan 31, 2013, 2:16:20 PM1/31/13
to scala-i...@googlegroups.com, martin odersky
Hi Martin,

The following PR implements some of the various suggestions on improving the bytecode emitted by the pattern matcher.


I use bytecode a/b testing to ensure a match compiles (under -optimize) to the same bytecode as a hand-written if/then/else.

I've been trying to figure out https://issues.scala-lang.org/browse/SI-6508 (the result of this particular thread),
but am not sure how the bytecode you mentioned was produced.

There are some long-term improvements implied by this bug, but they'd require a spec rewrite and would break existing code.
(Switching from == to eq for objects would change matching on Nil.)

If there's anything else we can do in time for 2.10.1-RC1 to improve patmat performance, I'd love to hear about it.

cheers
adriaan
Reply all
Reply to author
Forward
0 new messages