I don't have actual code handy, but for a parser that extends
RegexParsers you can do something like this:
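import scala.util.parsing.combinator.RegexParsers

class MyREParser
extends RegexParsers
{
  val number = "[0-9]+".r
  // Note the trailing *: without it, only two-character names match.
  val name = "[A-Z_a-z][A-Z_a-z0-9]*".r
  // Here you can use number and name as terminals
  // in your combinator parser grammar.
}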
By default, white-space is skipped before attempting to match any
regular-expression terminal (token). You can control that, as well as
what constitutes inter-token white-space.
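To make the white-space remark concrete, here is a rough sketch of the two knobs RegexParsers exposes: the whiteSpace regex and the skipWhitespace flag. The object names, the '#' comment syntax and the productions are made up for illustration. On Scala 2.11 and later this assumes the separate scala-parser-combinators module is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical parser that treats '#' line comments as inter-token
// white-space, in addition to ordinary blanks and newlines.
object CommentSkippingParser extends RegexParsers {
  override protected val whiteSpace = """(\s|#[^\n]*)+""".r

  val number = "[0-9]+".r
  val name   = "[A-Z_a-z][A-Z_a-z0-9]*".r

  // A sequence of name/number terminals, in any order.
  def items: Parser[List[String]] = rep(name | number)
}

// Hypothetical parser that turns white-space skipping off entirely,
// so blanks have to be matched explicitly by the grammar.
object StrictParser extends RegexParsers {
  override val skipWhitespace = false

  def number: Parser[String] = "[0-9]+".r

  // Two numbers separated by exactly one space.
  def pair: Parser[(String, String)] =
    number ~ (" " ~> number) ^^ { case a ~ b => (a, b) }
}
```

With these definitions, CommentSkippingParser.parseAll(CommentSkippingParser.items, "foo 42 # a comment\nbar") should yield the three terminals foo, 42 and bar, while StrictParser accepts "1 2" but rejects "1  2" (two spaces).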
The second edition of Programming in Scala has a RegexParsers example in
its Combinator Parsing chapter. I believe the first edition did as well.
In case you don't know, the first edition is available free online in HTML [1].
[1] http://www.artima.com/pins1ed/
Randall Schulz
But that isn't a lexer.
--
Daniel C. Sobral
I travel to the future all the time.
So?
It answers the question "what to do with tokens once one has them."
(Consume them in non-terminal productions, of course.) It clearly is a
(extremely minimal, in fact entirely incomplete) "... parser that
tokenize[s] via regexes..."
Randall Schulz
See scala.util.parsing.json, particularly Lexer and Parser. Looking
for subclasses in the Scaladoc is often useful in such situations.
Why?
> Thanks,
> Ken
Randall Schulz
The combinator parsers discussed in the "Programming in Scala" book (the ones that extend RegexParsers) effectively combine the lexing and parsing phases; this is what Randall did. I'm interested in seeing an example where the two phases are distinct, i.e. where the lexer simply produces a series of tokens and the parser's productions operate on those tokens (as opposed to RegexParsers, where the productions operate directly on string input). I'd like to see what this looks like, and it might be a better fit for what I'm trying to do.
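For what it's worth, one way the two-phase arrangement could look is sketched below. It is only an illustration under some assumptions: the token types (Num, Plus), the TokenReader adapter and the toy "sum of numbers" grammar are all invented here, and on Scala 2.11 and later the scala-parser-combinators module has to be on the classpath. The lexer is a RegexParsers that only produces a List[Token]; the parser extends plain Parsers with Elem set to Token, so its productions never see characters.

```scala
import scala.util.parsing.combinator.{Parsers, RegexParsers}
import scala.util.parsing.input.{NoPosition, Position, Reader}

// Hypothetical token type for a toy language of sums such as "1 + 20".
sealed trait Token
case class Num(value: Int) extends Token
case object Plus extends Token

// Phase 1: the lexer turns characters into tokens and does nothing else.
object SumLexer extends RegexParsers {
  def num: Parser[Token]  = "[0-9]+".r ^^ (s => Num(s.toInt))
  def plus: Parser[Token] = "+" ^^^ Plus
  def tokens: Parser[List[Token]] = rep(num | plus)

  def lex(input: String): Either[String, List[Token]] =
    parseAll(tokens, input) match {
      case Success(ts, _) => Right(ts)
      case ns: NoSuccess  => Left(ns.msg)
    }
}

// Adapter that lets the combinators read from a List[Token].
// Positions are not tracked in this sketch.
class TokenReader(tokens: List[Token]) extends Reader[Token] {
  def first: Token = tokens.head
  def rest: Reader[Token] = if (atEnd) this else new TokenReader(tokens.tail)
  def pos: Position = NoPosition
  def atEnd: Boolean = tokens.isEmpty
}

// Phase 2: the parser's terminals are tokens, not regular expressions.
object SumParser extends Parsers {
  type Elem = Token

  // Accepts a Num token and extracts its value.
  def number: Parser[Int] = accept("number", { case Num(n) => n })

  // sum ::= number { Plus number }
  def sum: Parser[Int] =
    number ~ rep(elem(Plus) ~> number) ^^ { case n ~ ns => n + ns.sum }

  def parse(tokens: List[Token]): Either[String, Int] =
    phrase(sum)(new TokenReader(tokens)) match {
      case Success(result, _) => Right(result)
      case ns: NoSuccess      => Left(ns.msg)
    }
}
```

The two phases can then be chained, e.g. SumLexer.lex("1 + 20 + 300").flatMap(SumParser.parse) (this relies on the right-biased Either of Scala 2.12 and later). The library's own StdLexical and StandardTokenParsers follow the same split, if a ready-made lexer is preferred over a hand-written one.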