lex, yacc

Bjorn Pettersen

Oct 7, 1997, 3:00:00 AM

As I was struggling way too hard to write a parser for a very simple language, I kept wondering
whether anyone had ported lex/yacc (or one of their incarnations) to Python, i.e. to emit Python code.

I looked at the parser module at python.org, but the syntax was sufficiently different that I didn't
feel like messing with it...

-- bjorn

Stefane Fermigier

Oct 8, 1997, 3:00:00 AM

Hi,

I'd say that kjParser is probably the best choice, but there is another
possibility: use whatever compiler construction tools you have at hand
to dump a representation of the abstract syntax tree, then read it into Python
and do whatever middle-end and back-end processing you need there.

I personally used some of the ``cocktail compiler tools''
(ftp://ftp.gmd.de/gmd/cocktail/) last year for a small compiler project
(that was dropped as soon as it was nearly finished), specifically:

- rex (lex lookalike)

- lalr (yacc lookalike)

- ast : Generator for Abstract Syntax Trees

- plus ten lines or so of C to dump the AST in python readable form.

The point was that kjParser doesn't support (to my knowledge) operator
precedence, like yacc or lalr do, and that ASTs are slightly easier to deal with
than concrete syntax trees.
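
A minimal sketch of the Python side of that approach, assuming the C dumper
writes the tree as a nested Python tuple literal of the form
(node_type, child, ...). The tuple layout, the node names, and the helper
names load_tree and evaluate are made up for illustration; they are not
something the cocktail tools produce by themselves.

```python
import ast

# Hypothetical output of the C-side dumper: the AST for 1 + 2 * 3,
# written as a nested Python tuple literal.
DUMPED = "('add', ('num', 1), ('mul', ('num', 2), ('num', 3)))"

def load_tree(text):
    """Turn the dumped text into nested tuples without executing any code."""
    return ast.literal_eval(text)

def evaluate(node):
    """Walk the tree; a stand-in for real middle-end and back-end work."""
    kind = node[0]
    if kind == 'num':
        return node[1]
    left, right = evaluate(node[1]), evaluate(node[2])
    if kind == 'add':
        return left + right
    if kind == 'mul':
        return left * right
    raise ValueError('unknown node type: %r' % (kind,))

tree = load_tree(DUMPED)
result = evaluate(tree)
```

Using ast.literal_eval rather than eval keeps the reader from running
arbitrary code if the dump file is ever malformed or tampered with.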

I didn't use the other tools because I couldn't see how to relate them to
my project, and because I wanted to do some programming in Python.

Cheers,
S.
--
Stéfane Fermigier, MdC à l'Université Paris 7. Tel: 01.44.27.61.01 (Bureau).
Mathematician, hacker, bassist. http://www.math.jussieu.fr/~fermigie/
"In mathematics you don't understand things. You just get used to them."
Johann Von Neumann.

Aaron Watters

Oct 8, 1997, 3:00:00 AM

>The point was that kjParser doesn't support (to my knowledge) operator
>precedence, like yacc or lalr do, and that ASTs are slightly easier to deal with
>than concrete syntax trees.

Right. kwParsing is quite basic -- you have to reflect
the precedence in the grammar itself. Also, the parsing
algorithm used is pretty weak, with none of the
disambiguation conventions people seem to like, and it uses
nonstandard notation, etc. -- it's certainly not commercial
quality or anything.
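
To illustrate what "reflecting the precedence in the grammar itself" means,
here is a small hand-written recursive-descent sketch. It is not kwParsing
code and does not use its notation; the function name parse and the tuple
shape of the result are assumptions for the example. Each precedence level
gets its own rule, and the tighter-binding operator sits one rule further
down.

```python
def parse(tokens):
    """Parse a list of token strings into nested tuples.

    expr   : term ('+' term)*      lowest precedence, outermost rule
    term   : factor ('*' factor)*  binds tighter, so it sits one level down
    factor : NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() == '+':
            advance()
            node = ('+', node, term())
        return node

    def term():
        node = factor()
        while peek() == '*':
            advance()
            node = ('*', node, factor())
        return node

    def factor():
        tok = peek()
        if tok == '(':
            advance()
            node = expr()
            if peek() != ')':
                raise SyntaxError('expected )')
            advance()
            return node
        if tok is not None and tok.isdigit():
            return int(advance())
        raise SyntaxError('unexpected token: %r' % (tok,))

    tree = expr()
    if peek() is not None:
        raise SyntaxError('trailing input: %r' % (peek(),))
    return tree

# Tokens are whitespace-separated here to keep the sketch short.
tree = parse('1 + 2 * 3'.split())
```

No precedence table is consulted anywhere: '*' groups before '+' only
because term is reached through expr.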

As you suggest, the "red carpet" way to go would be to
use Lex/Yacc (or similar) to generate C code for parsing
that feeds into a Python extension, which passes the
parsed structures up to Python callbacks for interpretation.
There are quite a lot of layers of complexity involved,
but ultimately it's the most powerful, fastest-running, and most flexible
approach, if that's important.

If you want to get something running soon I think it
might be faster to translate your rules to kwParsing format
and use kwParsing. It won't get anywhere near the Yacc
approach for speed of execution, of course.
-- Aaron Watters
===
http://starship.skyport.net/crew/aaron_watters/term.cgi?418297

Harri Pasanen

Oct 9, 1997, 3:00:00 AM

If you are used to bison or yacc, I urge you to take a look at
PyBison. Scott's old announcement is below, courtesy of DejaNews:

Hope this helps,

Harri

----------------------------

Subject: ANNOUNCE: PyBison -- a Python-based parser generator (similar to bison and yacc)
From: Scott Hassan <has...@cs.stanford.edu>
Date: 1997/02/04
Message-Id: <32F7B0C3...@cs.stanford.edu>
Newsgroups: comp.lang.python

PyBison 1.0b

PyBison enables you to write bison/yacc language grammars with
*Python* actions. Note: I haven't written the lex/flex translator,
so you have to write your own tokenizer.

Here is the code and three examples:

http://coho.stanford.edu/~hassan/Python/pybison.tar.gz

Here is a simple calculator language grammar. Notice the Python
code embedded in the /*py */ comment fields.

--------------------------------------------------------------------
%token NUMBER, MUL, PLUS, LPARA, RPARA

%%

lines : expr
      ;

expr : expr PLUS term
       {
       /*py
       $$ = $1 + $3
       */
       }
     | term
     ;

term : term MUL factor
       {
       /*py
       $$ = $1 * $3
       */
       }
     | factor
     ;

factor : LPARA expr RPARA
         {
         /*py
         $$ = $2
         */
         }
       | NUMBER
       ;
--------------------------------------------------------------------
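
Since the tokenizer has to be written by hand, here is a rough sketch of one
for the five tokens declared above. The token names come from the %token
line; the regular expressions, the tokenize function, and the
(token_name, value) pair format are assumptions for illustration. The
interface PyBison actually expects from a tokenizer is not described in this
announcement, so the output would likely need adapting.

```python
import re

# Token names follow the %token line above; the patterns are assumptions.
TOKEN_SPEC = [
    ('NUMBER', r'\d+'),
    ('MUL',    r'\*'),
    ('PLUS',   r'\+'),
    ('LPARA',  r'\('),
    ('RPARA',  r'\)'),
    ('SKIP',   r'\s+'),   # whitespace, matched but not emitted
]
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Yield (token_name, value) pairs; whitespace is dropped."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError('unexpected character %r at %d' % (text[pos], pos))
        pos = match.end()
        kind = match.lastgroup
        if kind == 'SKIP':
            continue
        value = int(match.group()) if kind == 'NUMBER' else match.group()
        yield kind, value

tokens = list(tokenize('2 * (3 + 4)'))
```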

Cheers,

Scott Hassan
Stanford University

Christopher Tavares

Oct 10, 1997, 3:00:00 AM

For what it's worth, I tried using PyBison a while ago for my email address
parser, and it didn't work. In fact, the example code included with PyBison
didn't work either.

-Chris

=================================================================
This message brought to you by the National Non-Sequitur Society.
We may not make sense, but the panda is a giant racoon.
---------------...@connix.com-----------------------