(Manual) transformation of an ANTLR 4 grammar into the Scala parser combinator DSL - a viable solution?


Jürgen Pfundt

Jun 29, 2013, 4:21:20 PM
to antlr-di...@googlegroups.com
At the moment no Scala target is available for ANTLR 4. Suppose that building a bridge between Scala and a Java parser generated from an ANTLR 4 grammar is not an option. Is the (manual) conversion of an existing, well-written ANTLR 4 grammar into a Scala parser combinator a viable solution?

To illustrate this with a concrete example, I use the ANTLR 4 grammar for CSV from Terence Parr's book:

grammar CSV;

// start of parser
file  : hdr row+ ;
hdr   : row ;
row   : field (',' field)* '\r'? '\n' ;
field : TEXT | STRING | ;

// start of lexer
TEXT   : ~[,\n\r"]+ ;
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote

Here is a brief description of the Scala notation used in the example, given p1 and p2 of type scala.util.parsing.combinator.Parsers.Parser:

p1 ~ p2    // sequencing: must match p1 followed by p2
p1 | p2    // alternation: must match either p1 or p2,
           // with preference given to p1
p1 ||| p2  // alternation: if p1 and p2 both succeed,
           // the parser that consumed the most characters accepts
p1.?       // optionality: may match p1 or not
p1.+       // repetition: matches 1 or more repetitions of p1
p1.*       // repetition: matches any number of repetitions of p1
p1 ~> p2   // sequential composition which keeps only the right result
p1 <~ p2   // sequential composition which keeps only the left result
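The difference between | and |||, and the effect of ~> and <~, is easiest to see on a tiny parser. The following is only a minimal sketch: the object CombinatorDemo and all parser names in it are hypothetical and chosen for illustration, and it assumes the scala-parser-combinators library is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo object; none of these names come from the CSV example.
object CombinatorDemo extends RegexParsers {
  val ab: Parser[String]  = "ab"
  val abc: Parser[String] = "abc"

  // Ordered choice: "ab" succeeds first, so "abc" is never tried.
  val ordered: Parser[String] = ab | abc

  // Longest match: both alternatives are tried, the longer match wins.
  val longest: Parser[String] = ab ||| abc

  // ~> and <~ drop the parentheses and keep only the word between them.
  val inParens: Parser[String] = "(" ~> "[a-z]+".r <~ ")"

  // ~ and .* combined: one word followed by any number of ",word" pairs.
  val words: Parser[List[String]] =
    "[a-z]+".r ~ ("," ~> "[a-z]+".r).* ^^ { case w ~ ws => w :: ws }

  def main(args: Array[String]): Unit = {
    println(parse(ordered, "abc").get)       // ab
    println(parse(longest, "abc").get)       // abc
    println(parseAll(inParens, "(xyz)").get) // xyz
    println(parseAll(words, "a,b,c").get)    // List(a, b, c)
  }
}
```

Note that parse accepts a prefix of the input, whereas parseAll requires the whole input to be consumed, which is why the first two calls use parse.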

In the CSVParser trait below, each parser combinator rule closely resembles the corresponding ANTLR 4 rule. The Scala grammar is extended into a complete program.

import util.parsing.combinator.RegexParsers

trait CSVParser extends RegexParsers {
  // adjust handling of white space to ANTLR 4 characteristics
  override val skipWhitespace = false
  override val whiteSpace = """[ \t]""".r

  // start of parser
  def file: Parser[List[List[String]]] = hdr ~ row.+ ^^ {
    case header ~ rows => header :: rows
  }

  def hdr: Parser[List[String]] = row

  def row: Parser[List[String]] = field ~ ("," ~> field).* <~ "\r".? <~ "\n" ^^ {
    case field ~ fields => field :: fields
  }

  def field: Parser[String] = TEXT ||| STRING | EMPTY

  // start of lexer
  lazy val TEXT: Parser[String] = ("[^,\n\r\"]".r).+ ^^ makeText
  lazy val STRING: Parser[String] = "\"" ~> ("\"\"" | "[^\"]".r).* <~ "\"" ^^ makeString
  lazy val EMPTY: Parser[String] = "" ^^ makeEmpty

  // signatures
  def makeText: List[String] => String
  def makeString: List[String] => String
  def makeEmpty: String => String
}


trait CSVLexerAction {
  // remove leading and trailing blanks
  def makeText = (text: List[String]) => text.mkString("").trim

  // the enclosing quotation marks are already dropped by ~> and <~ in STRING;
  // replace two consecutive quotes by a single quote
  def makeString = (string: List[String]) => string.mkString("").replaceAll("\"\"", "\"")

  // modify result of EMPTY token if required
  def makeEmpty = (string: String) => ""
}


import java.io.FileReader

object CSVParserCLI extends CSVParser with CSVLexerAction {
  // parse the CSV file named by the first command-line argument
  def main(args: Array[String]): Unit = {
    println(parseAll(file, new FileReader(args(0))))
  }
}
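The skipWhitespace override in the CSVParser trait matters because RegexParsers by default skips any whitespace matching \s+ before every literal or regex, which would also swallow the line ends that the row rule has to see. The following is a minimal sketch of that default behaviour. The names TwoWords, Skipping and NotSkipping are hypothetical, and the scala-parser-combinators library is assumed to be on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo trait: two lowercase words in a row.
trait TwoWords extends RegexParsers {
  def word: Parser[String] = "[a-z]+".r
  def twoWords: Parser[List[String]] = word ~ word ^^ { case a ~ b => List(a, b) }
}

// Default behaviour: whitespace between tokens is skipped silently.
object Skipping extends TwoWords

// Same grammar, but whitespace is significant, as in the CSVParser trait.
object NotSkipping extends TwoWords {
  override val skipWhitespace = false
}

object WhitespaceDemo {
  def main(args: Array[String]): Unit = {
    // The blank is skipped, so both words are matched.
    println(Skipping.parseAll(Skipping.twoWords, "ab cd").successful)       // true
    // The blank is not skipped, so the second word cannot start.
    println(NotSkipping.parseAll(NotSkipping.twoWords, "ab cd").successful) // false
  }
}
```

With skipping disabled, every blank, tab and line end has to be matched explicitly by some rule, which is what the row and TEXT rules of the CSV grammar rely on.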

The transformation of the ANTLR 4 CSV grammar into Scala parser combinators is admittedly a simple example, but I believe that a lot of existing ANTLR 4 grammars can be transformed into Scala using the parser combinators DSL.

Is this a viable way to work around the missing Scala language target for ANTLR 4?

rba...@gmail.com

Jul 3, 2013, 5:45:04 AM
to antlr-di...@googlegroups.com
If you are using listeners, you don't need a Scala target to be able to use an ANTLR 4 grammar with Scala.
If your grammar is named "Expr", then ANTLR will generate a Java file ExprBaseListener.java.
You just need to extend the ExprBaseListener class in your own Scala file and implement the various listener methods, and you are almost done.

Ronan

