|A different route to a Python target||Athanasios Anastasiou||5/6/14 9:52 AM|
PyParsing is a Python module that provides an object-oriented application programming interface for creating parsers. It looks like this:
from pyparsing import *
def str2num(s, loc, toks):
    '''Converts a string to a number (a parse action).'''
    return int(toks[0])
# ("number") and the subsequent similar calls provide a label for the nodes of the produced AST. setParseAction attaches the action to run on a match.
number = Regex("[0-9]+")("number").setParseAction(str2num)
string = Regex("'.*?'")("string")
# "^" is the OR operator (there is also an EACH operator, represented by "&").
atom = (number ^ string)("atom")
# Suppress parses the token but does not add it to the AST, delimitedList is a convenience function, "+" stands for concatenation.
list = (Suppress("[") + delimitedList(atom, delim=',') + Suppress("]"))("list")
# Other multiplicity qualifiers are Optional and ZeroOrMore. Their argument can be any other PyParsing expression. Group collects any deeper parsing into a single node in the AST.
input = OneOrMore(Group(list))("list")
input.parseString("[1,2,3,4,5,'Blah'] ['A','B']")  # Or input.parseFile to parse a file.
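To give an idea of what the parser returns, here is a self-contained sketch of the same grammar with the result printed as nested lists. The names lst and grammar are my own renamings (to avoid shadowing Python's built-in list and input), and the explicit imports replace the star import; nothing else is meant to differ.

```python
from pyparsing import Regex, Suppress, Group, OneOrMore, delimitedList

def str2num(s, loc, toks):
    # Parse action: convert the matched digits to an int.
    return int(toks[0])

number = Regex("[0-9]+")("number").setParseAction(str2num)
string = Regex("'.*?'")("string")
atom = (number ^ string)("atom")
# "lst" and "grammar" are renamed from "list" and "input" (illustrative names).
lst = (Suppress("[") + delimitedList(atom, delim=",") + Suppress("]"))("list")
grammar = OneOrMore(Group(lst))("lists")

result = grammar.parseString("[1,2,3,4,5,'Blah'] ['A','B']")
parsed = result.asList()  # plain nested Python lists, one per bracketed group
print(parsed)
```

Note that the string tokens keep their surrounding quotes, since the regular expression matches them and no parse action strips them.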
For a more practical example, you might want to check: https://bitbucket.org/aanastasiou/parseac3d
Having used pyparsing and coming across ANTLR's meta-language made me wonder if something could be done to get ANTLR to produce a pyparsing representation. A python with antlers, naturally.
So, this is the alternative route to Python: parsing ANTLR grammars and transforming their representation to pyparsing code.
The preliminary work for this is available from: https://bitbucket.org/aanastasiou/antlr2pyparsing
At the moment, it is possible to parse ANTLR with pyparsing, and there is also a set of rules for going from the AST (in Python, a list of lists or dictionaries) to the PyParsing representation. These rules, however, were all derived from example grammars visualised through the test-rig. In any case, one limitation that emerged from this analysis is that in PyParsing there seems to be no direct way to express something like ~('Alpha'|'Beta'), because no negation operator matches "anything except" a whole parser segment. It could be approximated with a regexp character class like [^Alpha], but that excludes the individual characters in any order, not the ordered string 'Alpha'.
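One partial workaround that may be worth checking, offered as a sketch and not as a confirmed equivalent of ANTLR's ~: pyparsing has a negative lookahead, NotAny, also written as a prefix ~. It consumes no input by itself, so it has to be followed by an expression that matches the token that is allowed. The identifier rule below is a hypothetical stand-in for "any token", and using Keyword for 'Alpha' and 'Beta' is my assumption about how those literals would be mapped.

```python
from pyparsing import Keyword, Word, alphas, OneOrMore, ParseException

# Hypothetical token rule standing in for "any identifier-like token".
identifier = Word(alphas)

# Negative lookahead: proceed only if the next token is neither keyword.
# NotAny (the prefix ~) consumes nothing, so "identifier" does the matching.
not_alpha_beta = ~(Keyword("Alpha") | Keyword("Beta")) + identifier

words = OneOrMore(not_alpha_beta)

accepted = words.parseString("Gamma Delta", parseAll=True).asList()
print(accepted)

try:
    words.parseString("Gamma Alpha", parseAll=True)
    rejected = False
except ParseException:
    # "Alpha" is refused by the lookahead, so the whole input cannot be parsed.
    rejected = True
print(rejected)
```

Whether this covers every use of ~ in an ANTLR grammar (lexer character sets versus parser token sets) would still need to be checked against real grammars.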
If anyone else would also be interested in this kind of brainteaser, it would be nice to work together, at least in firming up the transformation rules. I am just worried that I may have missed or misinterpreted something just by looking at ASTs in the test-rig. Currently I envisage the transformation as a one-step AST traversal. Later on, a more elaborate process might have to be worked out for more complex grammars.
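To make the one-step traversal idea a bit more concrete, here is a minimal sketch of what such a pass could look like. The node shapes are hypothetical and are not the actual antlr2pyparsing AST format; the mapping of ANTLR alternatives to "|" (first match) instead of "^" (longest match) is likewise an assumption that would need to be agreed on.

```python
# Hypothetical AST node shapes (NOT the real antlr2pyparsing format):
#   {"type": "literal", "value": "["}
#   {"type": "ref", "name": "atom"}
#   {"type": "seq" or "alt", "items": [node, ...]}
#   {"type": "repeat", "op": "*" or "+" or "?", "item": node}

REPEAT_WRAPPERS = {"*": "ZeroOrMore", "+": "OneOrMore", "?": "Optional"}

def to_pyparsing(node):
    """Return pyparsing source code (as a string) for one AST node."""
    kind = node["type"]
    if kind == "literal":
        return "Literal(%r)" % node["value"]
    if kind == "ref":
        # A reference to another rule becomes a plain Python name.
        return node["name"]
    if kind == "seq":
        return "(" + " + ".join(to_pyparsing(n) for n in node["items"]) + ")"
    if kind == "alt":
        # Assumption: ANTLR alternatives map to first-match "|".
        return "(" + " | ".join(to_pyparsing(n) for n in node["items"]) + ")"
    if kind == "repeat":
        wrapper = REPEAT_WRAPPERS[node["op"]]
        return "%s(%s)" % (wrapper, to_pyparsing(node["item"]))
    raise ValueError("unsupported node type: %r" % kind)

# ANTLR-style rule:  list : '[' atom (',' atom)* ']' ;
rule = {"type": "seq", "items": [
    {"type": "literal", "value": "["},
    {"type": "ref", "name": "atom"},
    {"type": "repeat", "op": "*", "item": {"type": "seq", "items": [
        {"type": "literal", "value": ","},
        {"type": "ref", "name": "atom"}]}},
    {"type": "literal", "value": "]"}]}

generated = to_pyparsing(rule)
print(generated)
```

A single recursive function like this is enough as long as each node can be translated without looking at its siblings or parents; rules that refer to each other recursively would probably need pyparsing's Forward, which is one reason a more elaborate process may be needed later.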
Please let me know if you have any ideas or any sort of feedback and, in any case, many thanks for ANTLR. I really like the meta-language's features.
All the best