virtpatmat devirtualized

434 views
Skip to first unread message

Adriaan Moors

unread,
Feb 24, 2012, 6:20:08 AM2/24/12
to scala-i...@googlegroups.com
Hi,

Bits and pieces have been revealed, but I guess it is about time to tell you a little more about -Yvirtpatmat (aka the virtualized pattern matcher (aka the 2.10 pattern matcher)).

Short story: last summer, we decided to reimplement the compilation of pattern matching (after Paul and others spent 2 years trying to fix the previous implementation). To you, Scala user, nothing should change, except for long-standing bugs being fixed. New ones will surely prop up (please try the new implementation using -Yvirtpatmat and report bugs!), but they will be much easier to fix.

If you'd like to know a bit more about virtpatmat's internals, keep reading -- if not, please go and give it a spin!



During type checking, after a match and its cases have been type checked, the match is expanded using a scheme closely related to the translation of for comprehensions: all patterns are turned into extractors, functions of the shape T => Option[P], where P typically is a product type (TupleN), pattern nesting is translated as flatMapping. Consider this simple match:

x match { case P(a, b) => a + b }

it is translated to:

PX(x).flatMap{ o => Some(o._1 + o._2) }

where PX is the extractor corresponding to the pattern P (if P is a case class, this is simply P.unapply)

To illustrate nesting:

x match { case P(Q(x), R(y)) => x + y }

becomes:

QX(x).flatMap((x1 => RX(x1._1).flatMap((x2 => SX(x1._2).flatMap((x3 => Some(x2 + x3)))))))


in general, a case is flattened using a depth-first traversal of its patterns


The translation of P to PX (the extractor that implements the test implied by pattern P) depends on the shape and type of P (we distinguish: efficient case class matching, calling user-defined unapply[Seq], doing type-tests before calling an unapply with a specific argument type, doing a length-check on the result of an unapplySeq, inlining the equality-checking extractor, inlining the type-testing extractor, which also checks equality of the prefixes). However, the way extractors are combined is the same for all of them. Note that a guard is also considered an extractor, albeit one whose result is Option[Unit]. Even the case body is modelled as a degenerate kind of extractor. In short, a match is modelled a list of lists of extractor calls.

In the implementation, this translation (i.e., making the tree that calls the right extractor, synthesizing the extractor if necessary) is performed by the corresponding subclass of TreeMaker. These tree makers are then chained (flatMap'ed) together to form a case, and cases are orElse'ed together to form a match.


Obviously, this naive translation scheme is horribly slow. Options are generated only to be decomposed by flatMap.
This why there are two code generation schemes:

1) the "pure", or "virtualizing" one, which is parametric in the zero-plus monad to be used for matching. It generates `flatMap`, `orElse`, `one` and `zero` calls. It is only used when there's a value of __match in scope that defines a method `one', which determines the match monad. This is illustrated by https://gist.github.com/1900166  (comments show output of scalac-2.10.0-M2 -Xprint:typer -Yvirtpatmat)


trait VirtualizeMatch {
  type Pure[T] = T
  type Monad[T] = Option[T]

  object __match {
    def zero: Monad[Nothing] = None
    def one[T](x: Pure[T]): Monad[T] = Some(x)
    def guard[T](cond: Pure[Boolean], then: => Pure[T]): Monad[T] = if(cond) Some(then) else None
    def runOrElse[T, U](in: Pure[T])(matcher: Pure[T] => Monad[U]): Pure[U] = matcher(in) getOrElse (throw new MatchError(in))
    def isSuccess[T, U](x: Pure[T])(f: Pure[T] => Monad[U]): Pure[Boolean] = !f(x).isEmpty
  }
}

class Test extends VirtualizeMatch {
  case class P(a: Int, b: Int)
  case class Q(a: R, b: S)
  case class R(x: Int)
  case class S(y: Int)

  val x: P = ???

  x match { case P(a, b) => a + b }
// __match.runOrElse(x)((x1 => P.unapply(x1).flatMap((x2 => __match.one(x2._1 + x2._2)))))

  val y: Q = ???

  y match { case Q(R(x), S(y)) => x + y }
// __match.runOrElse(y)((x1 => Q.unapply(x1).flatMap((x4 => R.unapply(x4._1).flatMap((x5 => S.unapply(x4._2).flatMap((x6 => __match.one(x5 + x6)))))))))
}

Have a look at https://github.com/scala/scala/blob/master/test/files/run/virtpatmat_staging.scala for how to use this mechanism to stage matches.
You could also imagine swapping in the Parser, Stream or Logic monads.


2) if you don't intend to virtualize matching, and thus don't have a __match in scope (which would be the common case), the Option monad is assumed, and flatMap and friends are inlined immediately (during expansion at type-checking time, this goes beyond the abilities of the current inliner). No Options are allocated, cases are subsequent if-then-else's, each guarded by a check that we should "keep going" (i.e., a match has not been found). This is to avoid forward jumps (which cause trouble for e.g. scala+gtw). In future versions we may optimize further using jumps etc.

To observe this translation, comment out the "extends VirtualizeMatch" above and re-run the compiler. For the second match, you'll get:


val
zero9: Option[Nothing] = None
val x1: Q = y
var matchRes8: Int = 0
var keepGoing7: Boolean = true

val o12: Option[(R, S)] = Q.unapply(x1)
if (o12.isEmpty) zero9
else {
  val o11: Option[Int] = R.unapply(o12.get._1)
  if (o11.isEmpty) zero9
  else {
    val o10: Option[Int] = S.unapply(o12.get._2)
    if (o10.isEmpty) zero9
    else {
      keepGoing7 = false
      matchRes8 = o11.get.+(o10.get)
      zero9
    }
  }
}

if (keepGoing7) throw new MatchError(x1)
else matchRes8


Furthermore, the current implementation does common subcondition elimination (think `(a, b) match { case (1, 2) =>  case (3, 4) => }` --> the outer Pair extractor is only called once).
We're planning on doing pattern matching-specific DCE as well. First, we still need to finish implementing unreachability and exhaustivity checking.


That is all. From me. For now.

cheers
adriaan

Ismael Juma

unread,
Feb 24, 2012, 6:28:17 AM2/24/12
to scala-i...@googlegroups.com
Thanks for the explanation Adriaan. Useful to know.

Best,
Ismael

Meredith Gregory

unread,
Mar 1, 2012, 7:31:10 PM3/1/12
to scala-i...@googlegroups.com
Dear Adriaan,

Very nice!

Best wishes,

--greg
--
L.G. Meredith
Managing Partner
Biosimilarity LLC
7329 39th Ave SW

Shelby

unread,
Sep 14, 2013, 11:04:37 PM9/14/13
to scala-i...@googlegroups.com, adriaa...@epfl.ch
Cross-referencing to example bugs that this effort fixed:

Reply all
Reply to author
Forward
0 new messages