foo.pyl:
INITIAL:
",": return "COMMA"
"[[:digit:]]+": return "CONSTANT"
"[[:blank:]]": return
"\n": return
foo.pyg:
expression -> COMMA list COMMA;
list ->
CONSTANT
| CONSTANT COMMA CONSTANT
;
Conceptually, this should be able to work. If it sees ", 1 ," it should
use the first pattern in "list", and if it sees ", 1 , 2 ," it should
take the second. It looks its getting confused walking down both
posibilites. I was under the impression though that GLR parsers can
handle this situation correctly. Shouldn't it be something like this?
read "COMMA" or error
read "CONSTANT" or error
read "COMMA" or error
read next token. if "$eof$", match with first list branch. if
"CONSTANT" drop first branch and continue checking against second
branch.
Or do I misunderstand GLR parsers? Also, I did try the "nonassoc SHORT
LONG" for matching if-then-else's, but that didn't appear to work. Any
help would be greatly appreciated.
-e
You should specify that the "CONSTANT COMMA CONSTANT" rule be
either right associative or left associative (depending on
your preferences). This is directly analogous to the "E -> E + E"
example. "id + id + id" could be parsed either as (id + id) + id
or id + (id + id). Or in yoru example "c,c,c" could be (c,c),c
or c,(c,c). If you specify left or right associtiativity one
of these possibilities will be disallowed.
%left COMMA
> take the second. It looks its getting confused walking down both
> posibilites. I was under the impression though that GLR parsers can
> handle this situation correctly. Shouldn't it be something like this?
Yes, it can. It would give you an ambiguous parse tree with
all possible parses.
> -e
Tim Newsham
http://www.lava.net/~newsham/
expression -> list;
list ->
COMMA CONSTANT COMMA
| COMMA CONSTANT COMMA CONSTANT COMMA
;
So does pyggy only consider the multiple branches under the same
grammar branch?
No. This is an entirely different grammar which does not have
any ambiguities.
Tim Newsham
http://www.lava.net/~newsham/