Hi,
Bits and pieces have been revealed, but I guess it is about time to tell you a little more about -Yvirtpatmat (aka the virtualized pattern matcher (aka the 2.10 pattern matcher)).
Short story: last summer, we decided to reimplement the compilation of pattern matching (after Paul and others spent 2 years trying to fix the previous implementation). To you, Scala user, nothing should change, except for long-standing bugs being fixed. New ones will surely prop up (please try the new implementation using -Yvirtpatmat and report bugs!), but they will be much easier to fix.
If you'd like to know a bit more about virtpatmat's internals, keep reading -- if not, please go and give it a spin!
During type checking, after a match and its cases have been type checked, the match is expanded using a scheme closely related to the translation of for comprehensions: all patterns are turned into extractors, functions of the shape T => Option[P], where P typically is a product type (TupleN), pattern nesting is translated as flatMapping. Consider this simple match:
x match { case P(a, b) => a + b }
it is translated to:
PX(x).flatMap{ o => Some(o._1 + o._2) }
where PX is the extractor corresponding to the pattern P (if P is a case class, this is simply P.unapply)
To illustrate nesting:
x match { case P(Q(x), R(y)) => x + y }
becomes:
QX(x).flatMap((x1 => RX(x1._1).flatMap((x2 => SX(x1._2).flatMap((x3 => Some(x2 + x3)))))))
in general, a case is flattened using a depth-first traversal of its patterns
The translation of P to PX (the extractor that implements the test implied by pattern P) depends on the shape and type of P (we distinguish: efficient case class matching, calling user-defined unapply[Seq], doing type-tests before calling an unapply with a specific argument type, doing a length-check on the result of an unapplySeq, inlining the equality-checking extractor, inlining the type-testing extractor, which also checks equality of the prefixes). However, the way extractors are combined is the same for all of them. Note that a guard is also considered an extractor, albeit one whose result is Option[Unit]. Even the case body is modelled as a degenerate kind of extractor. In short, a match is modelled a list of lists of extractor calls.
In the implementation, this translation (i.e., making the tree that calls the right extractor, synthesizing the extractor if necessary) is performed by the corresponding subclass of
TreeMaker. These tree makers are then chained (flatMap'ed) together to form a case, and cases are orElse'ed together to form a match.
Obviously, this naive translation scheme is horribly slow. Options are generated only to be decomposed by flatMap.
This why there are two code generation schemes:
1) the "pure", or "virtualizing" one, which is parametric in the zero-plus monad to be used for matching. It generates `flatMap`, `orElse`, `one` and `zero` calls. It is only used when there's a value of __match in scope that defines a method `one', which determines the match monad. This is illustrated by
https://gist.github.com/1900166 (comments show output of scalac-2.10.0-M2 -Xprint:typer -Yvirtpatmat)
trait VirtualizeMatch {
type Pure[T] = T
type Monad[T] = Option[T]
object __match {
def zero: Monad[Nothing] = None
def one[T](x: Pure[T]): Monad[T] = Some(x)
def guard[T](cond: Pure[Boolean], then: => Pure[T]): Monad[T] = if(cond) Some(then) else None
def runOrElse[T, U](in: Pure[T])(matcher: Pure[T] => Monad[U]): Pure[U] = matcher(in) getOrElse (throw new MatchError(in))
def isSuccess[T, U](x: Pure[T])(f: Pure[T] => Monad[U]): Pure[Boolean] = !f(x).isEmpty
}
}
class Test extends VirtualizeMatch {
case class P(a: Int, b: Int)
case class Q(a: R, b: S)
case class R(x: Int)
case class S(y: Int)
val x: P = ???
x match { case P(a, b) => a + b }
// __match.runOrElse(x)((x1 => P.unapply(x1).flatMap((x2 => __match.one(x2._1 + x2._2)))))
val y: Q = ???
y match { case Q(R(x), S(y)) => x + y }
// __match.runOrElse(y)((x1 => Q.unapply(x1).flatMap((x4 => R.unapply(x4._1).flatMap((x5 => S.unapply(x4._2).flatMap((x6 => __match.one(x5 + x6)))))))))
}
You could also imagine swapping in the Parser, Stream or Logic monads.
2) if you don't intend to virtualize matching, and thus don't have a __match in scope (which would be the common case), the Option monad is assumed, and flatMap and friends are inlined immediately (during expansion at type-checking time, this goes beyond the abilities of the current inliner). No Options are allocated, cases are subsequent if-then-else's, each guarded by a check that we should "keep going" (i.e., a match has not been found). This is to avoid forward jumps (which cause trouble for e.g. scala+gtw). In future versions we may optimize further using jumps etc.
To observe this translation, comment out the "extends VirtualizeMatch" above and re-run the compiler. For the second match, you'll get:
val zero9: Option[Nothing] = None
val x1: Q = y
var matchRes8: Int = 0
var keepGoing7: Boolean = true
val o12: Option[(R, S)] = Q.unapply(x1)
if (o12.isEmpty) zero9
else {
val o11: Option[Int] = R.unapply(o12.get._1)
if (o11.isEmpty) zero9
else {
val o10: Option[Int] = S.unapply(o12.get._2)
if (o10.isEmpty) zero9
else {
keepGoing7 = false
matchRes8 = o11.get.+(o10.get)
zero9
}
}
}
if (keepGoing7) throw new MatchError(x1)
else matchRes8
Furthermore, the current implementation does common subcondition elimination (think `(a, b) match { case (1, 2) => case (3, 4) => }` --> the outer Pair extractor is only called once).
We're planning on doing pattern matching-specific DCE as well. First, we still need to finish implementing unreachability and exhaustivity checking.
That is all. From me. For now.
cheers
adriaan