%T Compact Recursive-descent Parsing of Expressions
%A D. R. Hanson
%J SOFTWARE - Practice and Experience
%V 15
%N 12
%P 1205-1212
%D December 1985
%K Recursive-descent parsing, Expression parsing, Precedence, Compilers
I have seen a similar technique used before (in a C compiler written by
David Conroy -- cf. micro-emacs), circa 1979/80, but David might well have
been using it for some time. (I haven't got the source code of that
compiler, and so I can't say how closely the two implementations
correspond in detail.) I liked it enough then to use it in a parser for
Pascal, and later in a parser for regular expressions.
Still, like most good ideas that someone probably patented last year, this
one has been about for some time!
--
Send compilers articles to comp...@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
>It's used by lcc, but you needn't look there for a description:
>%T Compact Recursive-descent Parsing of Expressions
>%A D. R. Hanson
>%J SOFTWARE - Practice and Experience
>%V 15
>%N 12
>%P 1205-1212
>%D December 1985
>%K Recursive-descent parsing, Expression parsing, Precedence, Compilers
I wrote up a method I'd been teaching for a couple of years at least,
after reading David Hanson's 1985 paper. You don't need complicated
tables for my method, and the code is very simple too! The program can
parse an expression using a number of procedure calls approximately equal
to the number of nodes in its abstract syntax tree, independent of the
number of precedence levels in the grammar. The operator symbols of any
particular expression grammar appear only in two small functions, which
frequently removes the need for grammar transformation before the method
can be used. Hanson's paper shows how the number of procedure definitions
used in a recursive descent parser can be reduced, but does not show how
the number of procedure activations can be reduced.
Two simple functions Priority and RightAssoc are used to map character
values to numbers (representing precedences) and booleans. I give sum,
product and exponentiation the usual behaviour.
proc Priority(c) =
case c of
"+" : 1; "*" : 2 ; "^" : 3
otherwise 0
endcase
endproc
proc RightAssoc(c) =
case c of "^": true;
otherwise false
endcase
endproc
Then there are two routines for doing the parsing; one does atomic things
(factors, primaries - e.g. bracketed expressions, numbers), the other does
infix expressions.
proc parseE(prec) =
initialise p:=parsePrimary();
while Priority(input^) >= prec do begin
let oper = get(input);
let oprec = Priority(oper);
p := mknode(p,
oper,
parseE(if RightAssoc(oper) then oprec else oprec+1))
end;
return p
endproc
proc parsePrimary() =
case input^ of
"(" :
begin
get(input);
let p=E(1);
if input^ <> ")" then fail fi;
return p
end;
otherwise:
if input^ in ['a'..'z'] then
return mkleaf(input^)
else
fail
fi
endcase
endproc
The research report is available still, I think:
K.M.Clarke (1986) "The top-down parsing of expressions", Research Report
383, Dept of Computer Science, QMC.
QMC is now QMW, Mile End Rd, London, UK. I mostly do a sort of
proof-by-transformation.
>Still, like most good ideas that someone probably patented last year, this
>one has been about for some time!
My method turned out to have been used in Dave Turner's SASL compiler
(University of St Andrews) - later 70s or early 80s?; a very similar
method was later described in Richard O'Keefe's Prolog book. Larry
Paulson's (1991) ML book has a similar algorithm. Still, I've never come
across it in a compiling textbook.
--
Keith Clarke
QMW, University of London.