ANN: parsing 1.0.0 - Python class library for defining and executing text parsers

Paul McGuire

Dec 16, 2003, 3:04:38 AM

The parsing module is an alternative to the traditional lex/yacc approach,
or to regular expressions, for creating and executing simple grammars. It
provides a library of classes that client code uses to construct the grammar
directly in Python code.

Here is a program to parse "Hello, World!" (or any greeting of the form
"<salutation>, <addressee>!"):

from parsing import Word, alphas
greet = Word( alphas ) + "," + Word( alphas ) + "!"
hello = "Hello, World!"
print hello, "->", greet.parseString( hello )

The program outputs the following:

Hello, World! -> ['Hello', ',', 'World', '!']

The Python representation of the grammar is quite readable, owing to the
self-explanatory class names, and the use of '+', '|' and '^' operator
definitions.
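
To make the operators concrete, here is a minimal sketch. It assumes the present-day `pyparsing` package (the successor to this module, imported as `pyparsing` rather than `parsing`), so names and behavior may differ in detail from the 1.0.0 release announced here. The grammars and variable names are illustrative only.

```python
# Assumption: the modern `pyparsing` package is installed; the 1.0.0
# release described in this post was imported as `parsing` instead.
from pyparsing import Word, Literal, alphas, nums

integer = Word(nums)
identifier = Word(alphas)

# '+' builds a sequence: each piece must match, in order.
assignment = identifier + "=" + integer

# '|' tries the alternatives left to right and keeps the first match.
first_match = Literal("in") | Literal("int")

# '^' tries every alternative and keeps the longest match.
longest_match = Literal("in") ^ Literal("int")

seq_tokens = assignment.parseString("x = 42").asList()
first_tokens = first_match.parseString("int").asList()
longest_tokens = longest_match.parseString("int").asList()

print(seq_tokens)      # ['x', '=', '42']
print(first_tokens)    # ['in']
print(longest_tokens)  # ['int']
```

The difference between '|' and '^' only shows up when one alternative is a prefix of another, as with "in" and "int" above.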

The parsed results returned from parseString() can be accessed as a nested
list, a dictionary, or an object with named attributes.
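
As a rough illustration of those access styles, the sketch below names two of the tokens in the greeting grammar and reads them back three ways. It assumes the present-day `pyparsing` package and its `setResultsName()` method; the result names "salutation" and "addressee" are chosen here for illustration.

```python
# Assumption: the modern `pyparsing` package; the field names below are
# hypothetical labels picked for this example.
from pyparsing import Word, alphas

greet = (Word(alphas).setResultsName("salutation") + ","
         + Word(alphas).setResultsName("addressee") + "!")
result = greet.parseString("Hello, World!")

as_list = result.asList()      # plain list of all matched tokens
by_key = result["addressee"]   # dictionary-style lookup by result name
by_attr = result.salutation    # attribute-style lookup by result name

print(as_list)   # ['Hello', ',', 'World', '!']
print(by_key)    # World
print(by_attr)   # Hello
```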

The parsing module handles some of the problems that are typically vexing
when writing text parsers:
- extra or missing whitespace (the above program will also handle
"Hello,World!", "Hello , World !", etc.)
- quoted strings
- embedded comments
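
The three points above can be sketched as follows, again assuming the present-day `pyparsing` package and its `quotedString` and `cStyleComment` helpers, which may not match the 1.0.0 names exactly. The `setting` grammar is a made-up example, not one of the parsers shipped in the .zip file.

```python
# Assumption: the modern `pyparsing` package; `setting` is a hypothetical
# "name = quoted value" grammar used only for this illustration.
from pyparsing import Word, alphas, quotedString, cStyleComment

# Whitespace between tokens is skipped automatically.
greet = Word(alphas) + "," + Word(alphas) + "!"
tight = greet.parseString("Hello,World!").asList()
loose = greet.parseString("Hello , World !").asList()

# quotedString matches a single- or double-quoted string, quotes included.
setting = Word(alphas) + "=" + quotedString

# ignore() tells the grammar to skip C-style comments wherever they occur.
setting.ignore(cStyleComment)
tokens = setting.parseString('name /* the user */ = "Paul McGuire"').asList()

print(tight)   # ['Hello', ',', 'World', '!']
print(loose)   # ['Hello', ',', 'World', '!']
print(tokens)  # ['name', '=', '"Paul McGuire"']
```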

The .zip file includes examples of a simple SQL parser, simple CORBA IDL
parser, a config file parser, a chemical formula parser, and a four-function
algebraic notation parser. It also includes a simple how-to document, and a
UML class diagram of the library's classes.

The parsing module can be found at
http://sourceforge.net/projects/pyparsing/.

Please let me know if you find this package helpful.

Regards,
-- Paul McGuire

