I spent much of yesterday and this morning trying to understand the subtleties of using regexes in pattern matching.
I ended up making a list of pitfalls for my colleagues, so that our code stays fresh and working in the future as well. The problem seems to be that quite a lot is left to the runtime, which runs counter to the overall goals of the Scala language. So I wonder whether something can be done about this in Scala language development.
Sample code:
val Re = """.*cd""".r
val tmp =
  "abcd" match {
    case Re()     => 0   // gets here if there are no captures
    case Re(_)    => 1   // gets here if there is one capture
    case Re(_, _) => 2   // gets here if there are two captures
    case _        => -1
  }
tmp shouldBe 0
This compiles and runs.
In any use case I can think of, a regex always has a predetermined number of capture groups (0 or more). If the compiler knew this number, it could report an error for the cases that can never match.
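As a rough illustration of the point that the number of groups is fixed once the pattern is written, here is a minimal sketch that reads it back at runtime. The helper name groupCount is hypothetical (it is not part of scala.util.matching.Regex); it only forwards to java.util.regex.Matcher.groupCount via the underlying Pattern. This is a runtime check, not the compile-time check asked about here.

```scala
import scala.util.matching.Regex

// Hypothetical helper, not a standard library method:
// number of capture groups declared in the pattern.
def groupCount(re: Regex): Int =
  re.pattern.matcher("").groupCount

val NoGroups  = """.*cd""".r
val OneGroup  = """.*(c)d""".r
val TwoGroups = """.*(c)(d)""".r

println(groupCount(NoGroups))   // 0
println(groupCount(OneGroup))   // 1
println(groupCount(TwoGroups))  // 2
```

A check like this could be placed in a unit test next to each regex definition, so that a changed pattern at least fails a test rather than silently changing which case matches.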
Let's modify the code a bit:
val Re = """.*(c)d""".r
val tmp =
  "abcd" match {
    case Re() => ...
  }
I've added a capture group but am only matching against the no-capture extractor pattern. This compiles and runs without any warnings, but the match can never succeed (since there's no 'Re(_)' case); at runtime it ends in a MatchError. If the compiler knew the number of captures in a 'Regex', it could refuse to compile such code.
Is such an improvement (more compile-time checks when using regexes in pattern matching) possible in future Scala versions?
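To make the runtime behaviour concrete, here is a small sketch of the modified example, meant to be run as a script or in the REPL. The result value 0 in the case body is a placeholder of mine, and wrapping the match in Try is only there to observe the failure without crashing; neither is part of the original code.

```scala
import scala.util.Try

val Re = """.*(c)d""".r

// Re() only matches when the regex yields zero groups, but this
// pattern always yields one, so no case applies and the match throws.
val result = Try {
  "abcd" match {
    case Re() => 0   // placeholder result, never reached
  }
}

println(result.isFailure)   // true: the failure is a scala.MatchError
```

Using 'case Re(_) => ...' instead makes the match succeed, which is exactly the mismatch a compile-time check on the number of groups would catch.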
- Asko Kauppi