Problem with top down parsing

anan...@gmail.com

unread,

Oct 24, 2006, 12:35:58 PM10/24/06

to

Hi,
In the book "Principles of compiler design, Aho Ullman" the
following exercise caught my attention. The grammar given is

S -> aSa | aa

It is quoted that a "top down parse with backtracking" can establish
the inputs with 2,4 or 8 a's but not 6 a's .... How is this possible ?

Ujjwal

unread,

Nov 22, 2006, 9:17:50 PM11/22/06

to

Hi,

I think 'top down parsing with backtracking' can also produce 6 a's by
menas of the following way:

S -> aSa -> aaSaa (using the production S -> aSa)
-> aaaaaa (using the production S -> aa)

Please correct me in case I missed anything.

With best regards,
Ujjwal.

A Johnstone

unread,

Nov 24, 2006, 8:20:55 AM11/24/06

to

> S -> aSa | aa

> It is quoted that a "top down parse with backtracking" can establish
> the inputs with 2,4 or 8 a's but not 6 a's .... How is this possible ?

The exercise (5.6 on page 193 of the first edition of the Dragon Book) refers
to limited backtrack parsing. Here's an extract question:

...
By tracing through the steps of a top-down parser (with backtrack)
which tries the alternate aSa before aa, show that S succeeds on 2, 4, or
8 a's but not on 6 a's
...

The trick here is to realise that many backtracking parsers use what
we call a singleton match strategy, in which as soon as the parse
function for a rule finds a match, it returns. In general, parse
functions need to return a set of putative matches. Just try working
it through, and you'll see that a singelton match parser misses some
of the possible derivations.

Sadly, there are real parser generators our there that claim to be
general but which exhibit this behaviour. They usually have a getout,
which says something like 'rules must be ordered by longest
match'. This turns out to be impossible in general: there are grammars
for which different orderings are required for different strings in
the language. (See our paper, below, for an example.) Backtrack
parsers? Caveat emptor.

If you want a fuller expostion, see chapter 6 of 'The theory of
parsing, translation and compiling', Alfred V Aho and Jeffrey
D. Ullman, Prentice-Hall, 1972. For a compact exposition which includes
singleton match and truly general backtrack top down parser, see our paper

'Generalised recursive descent parsing and follow determinism', Adrian
Johnstone and Elizabeth Scott, in: Kai Koskimies ed, Proc. 7th
Intnl. Conf. Compiler Construction (CC'98), Lecture Notes in Computer
Science, 1383, Springer-Verlag, Berlin 16-30 (1998)

There's a journal paper by the same authors that might appear
sometime. If it does, you'll be able to get it from our web pages at
http://www.cs.rhul.ac.uk/research/languages/index.shtml.

Now, I hope that wasn't a homework question...

Adrian

--
Dr Adrian Johnstone, Senior Lecturer in Computing, Computer Science Dep,
Royal Holloway, University of London, Egham, Surrey, TW20 0EX, England.
Email a.joh...@rhul.ac.uk Tel:+44(0)1784 443425 Fax:+44(0)1784 439786

Sylvain Schmitz

unread,

Nov 24, 2006, 7:00:41 PM11/24/06

to

A Johnstone wrote:
>> S -> aSa | aa
>
>> It is quoted that a "top down parse with backtracking" can establish
>> the inputs with 2,4 or 8 a's but not 6 a's .... How is this possible ?
>
> The exercise (5.6 on page 193 of the first edition of the Dragon Book) refers
> to limited backtrack parsing.

> [...]
> If you want a fuller exposition, see chapter 6 of 'The theory of

> parsing, translation and compiling', Alfred V Aho and Jeffrey

> D. Ullman, Prentice-Hall, 1972. [...]

More references on the subject:

This limited backtrack parsing technique and the languages it allows
to recognize were investigated in a thesis by Alexander Birman (and
supervised by Jeffrey Ullman). The parsers run in linear time using
memoization techniques. The main results are also accessible in

Alexander Birman and Jeffrey D. Ullman, Parsing Algorithms with
Backtrack, Information and Control 23, 1--34 (1973).
<http://dx.doi.org/10.1016/S0019-9958(73)90851-6>.

The grammatical formalism and the corresponding parsing technique are
now better known as Parsing Expression Grammars (PEGs) and packrat
parsers after the recent work of Bryan Ford. More information is
accessible from his packrat webpage
<http://pdos.csail.mit.edu/~baford/packrat/>, including electronic
copies of his papers and of Birman's PhD thesis.

The technique is currently implemented in a small number of parser
generators, for instance Robert Grimm's Rats!, and used in some ways in
the recent ANTLR 3 by Terence Parr.

--
Hopes that helps,

Sylvain