Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Generating parsers in Lisp

146 views
Skip to first unread message

Jocelyn Paine

unread,
May 15, 1995, 3:00:00 AM5/15/95
to
What tools would readers recommend for defining the grammar of a
programming language in Lisp and generating a parser for it? One could
write a recursive-descent parser by hand, but I'm thinking of a "little
language" for grammar definition, something like Prolog Definite Clause
Grammars. Some books do this via ATNs, but are these the only popular
method?

And how about tokenisation? What's the generally recommended method for
building lexical analysers, given the syntax of the tokens in a
language?


Jocelyn Paine

Henry Baker

unread,
May 16, 1995, 3:00:00 AM5/16/95
to
In article <GEERT.95M...@sparc.aie.nl>, ge...@sparc.aie.nl
(Geert-Jan van Opdorp) wrote:

> In article <1995May15.170933.32369@oxvaxd> po...@vax.oxford.ac.uk
(Jocelyn Paine) writes:
>
> Jocelyn> What tools would readers recommend for defining the grammar
> Jocelyn> of a programming language in Lisp and generating a parser for
> Jocelyn> it?

You might look at the paper

ftp://ftp.netcom.com/pub/hb/hbaker/Prag-Parse.html (also .ps.Z)

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html

Lou Steinberg

unread,
May 16, 1995, 3:00:00 AM5/16/95
to
In article <1995May15.170933.32369@oxvaxd> po...@vax.oxford.ac.uk (Jocelyn Paine) writes:

What tools would readers recommend for defining the grammar of a
programming language in Lisp and generating a parser for it?

Is this a situation where you can define the syntax of the language
yourself (as opposed to one where you must parse some
already-specified syntax)? If you can choose your syntax, then the
best thing to do is to make the syntax as lispish as possible - each
expression or statement should be a list, with the car specifying the
operation and the cdr a list of operands. Things that need more
structure should be handled positionally - use, e.g., let or do in
common lisp as a model. Tokens should be as close as possible to
lisp's token syntax. Then your parser is simply the lisp reader,
and interpreters and compilers are very easy to write.

Luigi Semenzato

unread,
May 16, 1995, 3:00:00 AM5/16/95
to po...@vax.oxford.ac.uk
In article <1995May15.170933.32369@oxvaxd>, po...@vax.oxford.ac.uk
(Jocelyn Paine) writes:

|> What tools would readers recommend for defining the grammar of a
|> programming language in Lisp and generating a parser for it?

I know this doesn't answer the question, but one common approach
is to use the ubiquitous lex and yacc to scan and parse the input,
and output a parenthesized notation that the Lisp reader understands. ---Luigi

Shriram Krishnamurthi

unread,
May 16, 1995, 3:00:00 AM5/16/95
to
l...@cs.rutgers.edu (Lou Steinberg) writes:

> In article <1995May15.170933.32369@oxvaxd> po...@vax.oxford.ac.uk (Jocelyn Paine) writes:

> [Make your syntax look as much like LISP as possible.]


> Then your parser is simply the lisp reader,

No:

(let ((foo)) bar)

is not legitimate LISP syntax. A reader is just that; it is not a
parser. Of course, the task of writing a parser may be hugely
simplified.

> and interpreters and compilers are very easy to write.

It's not clear how the input syntax makes much of a difference in
writing the back end of a compiler.

'shriram

Christian Lynbech

unread,
May 17, 1995, 3:00:00 AM5/17/95
to
(second time today I get a chance to recommend it, but it *is* nifty :-)

You may want to check out Henry Bakers article "Pragmatic Parsing in
Common Lisp" which describes the META parsing technique which is a
compact and rather efficient way of building recursive descent parsers.

The code in the article actually works (in less than a page!)
but one would probably want to wrap it in some interface for more
serious usage. It is of course in Common Lisp, but I think it should
port to Scheme relatively easy, with the reader macro magic being the
major obstacle.

The ftp address is /anon...@ftp.netcom.com:/pub/hb/hbaker and it is
also available through WWW (except that netcom.com is heavily loaded).


You may also want to check out the ZEBU parser generator, which
generates lalr parsers in the manner of YACC. It started as a scheme
program (written by William Wells) and it is available at various
places such as the Scheme Repository at ftp.cs.indiana. Joachim
Laubsch has ported ZEBU to Common Lisp and enhanced it a great deal. I
am currently using it for a parser to extract specifications from a
LaTeX document and I think it works fine.

ZEBU has some support for customization of the lexing process, and it
has worked for me so far. You can also replace the lexer completely,
but the documentation seems a bit out of date on some of the necessary
internals, and I kept running into some rather mysterious problems
with undefined variables, and not being very experienced with Common
Lisp, I gave up when I found out how much was actually handled
automatically by the default lexer.

It appear to have been tested under Lucid and MCL, but I was able to
get it up under CMU CL with a moderate amount of hassle (just make
sure you compile it in test mode rather production mode, helps a lot
:-).

It is available from /ftp.cs.cmu.edu:/user/ai/lang/lisp/code/zebu,
according to the documentation (I got it through the WWW interface to
the AI archive at CMU, so I haven't tested the address myself).

There several other alternatives, in both the AI and the Scheme
Repositories, but this is what I have looked at myself.

------------------------------------------------------------------------------
Christian Lynbech (R0.33) | Hit the philistines three times over the
phone: +45 8942 3217 | head with the Elisp reference manual.
email: lyn...@daimi.aau.dk | - pet...@hal.com (Michael A. Petonic)
------------------------------------------------------------------------------

Lou Steinberg

unread,
May 17, 1995, 3:00:00 AM5/17/95
to

I wrote


> [Make your syntax look as much like LISP as possible.]
> Then your parser is simply the lisp reader,

shr...@asia.cs.rice.edu (Shriram Krishnamurthi) replied:


No:
(let ((foo)) bar)
is not legitimate LISP syntax. A reader is just that; it is not a
parser.

That depends. Is a parser a string -> parse-tree transformer with
syntax checking as a side effect or is it a syntax checker which
produces the parse tree as a side effect? Either answer makes sense
depending on your perspective, but to me a parser is a primarily a
string -> parse-tree transformer, and I don't really care if some
errors that have a syntactic flavor get caught by other parts of the
compiler instead of by the parser.

So, I claim that list (let foo bar) *is* the internal parse tree
produced from the string "(let foo bar)" by a parser that just happens
not to catch this particular syntax error. In fact, if I want I can
simply define the "syntax" to allow (let foo bar) but say that
semantic rules forbid it.


Erik Naggum

unread,
May 17, 1995, 3:00:00 AM5/17/95
to
[Shriram Krishnamurthi]

| l...@cs.rutgers.edu (Lou Steinberg) writes:
|
| > In article <1995May15.170933.32369@oxvaxd> po...@vax.oxford.ac.uk (Jocelyn Paine) writes:
|

| > [Make your syntax look as much like LISP as possible.]
| > Then your parser is simply the lisp reader,
|

| No:
|
| (let ((foo)) bar)
|
| is not legitimate LISP syntax.

1. this is prefectly valid Lisp syntax.
2. in CLtL1, it produced an error if evaluated as code, but not in CLtL2.
3. the ability to have the same syntax for code and data is one of Lisp's
many valuable characteristics.

I don't see your point, if any

#<Erik>
--
if you see this as a new article after 1995-05-24, please notify me by mail.
--
NETSCAPISM /net-'sca-,pi-z*m/ n (1995): habitual diversion of the mind to
purely imaginative activity or entertainment as an escape from the
realization that the Internet was built by and for someone else.

Clint Hyde

unread,
May 19, 1995, 3:00:00 AM5/19/95
to
an alternative is to use a heavily modified readtable. this can do all
sorts of wild things, but is not easy to produce.

I've worked with teeny changes to readtables and found them not easy.

-- clint


Stuart Watt

unread,
May 23, 1995, 3:00:00 AM5/23/95
to
> In article <1995May15.170933.32369@oxvaxd>, po...@vax.oxford.ac.uk
> (Jocelyn Paine) writes:
>
> |> What tools would readers recommend for defining the grammar of a
> |> programming language in Lisp and generating a parser for it?

I have an integrated lex and yacc equivalent set of tools written in
Common Lisp. I have used these to generate complete parsers for a variety
of different languages. They are not as elegant or as efficient as either
lex or yacc (or, better, flex or bison) but they get the job done, and
they aren't hard to use.

People can mail me directly if you want the sources, at S.N.K...@open.ac.uk.

An example lexical analyser rule:

(defaction (mike "[0-9]+" :kind lexical :context context)
(list 'integer (parse-integer (get-token-string stream))))

And example parser generator rules:

(defaction (mike (top.level.forms --> top.level.forms top.level.form)
:kind syntax :context context))

(defaction (mike (top.level.forms --> )
:kind syntax :context context))

Regards, Stuart

--
Stuart Watt; Human Cognition Research Laboratory, Open University,
Walton Hall, Milton Keynes, MK7 6AA, UK.
Tel: +44 1908 654513; Fax: +44 1908 653169.
WWW: http://kmi.open.ac.uk/~stuart/stuart.html

0 new messages